We use a standard pathtracing algorithm with a binary bounding volume hierarchy (BVH) as the acceleration structure, and we store one triangle in each leaf node. Probably the most common and efficient way to do ray-triangle intersection tests with a BVH is to maintain a stack of pointers to the nodes that still need to be tested. The basic tree traversal algorithm is outlined in the code listing below. A nice property of this algorithm is that no node is visited more than once.
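closest_hit = infinity
stack.push(root)
while (stack.size() > 0)
    currentnode = stack.pop()
    if (currentnode.is_leaf == true)
        // Do ray primitive intersection and update hit record
        hit = ray_triangle_intersection(ray, currentnode.triangle)
        if (hit < closest_hit)
            closest_hit = hit
    else if (ray_box_intersection(ray, currentnode.bbox) < closest_hit)
        // The box is entered before the closest hit so far: test both children
        // (the child field names are assumed here)
        stack.push(currentnode.left)
        stack.push(currentnode.right)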
In GLSL it is not possible to implement a fully dynamic stack. However, in GLSL 3.3 it is straightforward to implement a fixed-size stack of pointers as an integer array with a stack counter, e.g.
int currentnode = stack[--stackcounter];  // pop
stack[stackcounter++] = child;            // push
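To make the fixed-size stack concrete, the following is a minimal C++ sketch of the traversal loop above, using an integer array and a stack counter in place of a dynamic stack. The node layout, the names `Node`, `Ray`, `MAX_STACK` and `closest_hit_distance`, and the choice of a slab test and a Möller-Trumbore test are assumptions made for illustration; they are not taken from the implementation described here.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <limits>
#include <utility>
#include <vector>

using Vec3 = std::array<float, 3>;
const float INF = std::numeric_limits<float>::infinity();

static Vec3 sub(const Vec3& a, const Vec3& b) { return {a[0] - b[0], a[1] - b[1], a[2] - b[2]}; }
static float dot(const Vec3& a, const Vec3& b) { return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]; }
static Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]};
}

struct Ray { Vec3 origin, dir; };

// Hypothetical flat node layout: children are referenced by array index.
struct Node {
    Vec3 bmin, bmax;     // bounding box
    int left, right;     // child indices (inner nodes only)
    bool is_leaf;
    Vec3 v0, v1, v2;     // triangle (leaf nodes only)
};

// Slab test: entry distance into the box, or INF on a miss.
// Relies on IEEE infinities when a direction component is zero.
float ray_box_intersection(const Ray& r, const Node& n) {
    float tmin = 0.0f, tmax = INF;
    for (int a = 0; a < 3; ++a) {
        float t0 = (n.bmin[a] - r.origin[a]) / r.dir[a];
        float t1 = (n.bmax[a] - r.origin[a]) / r.dir[a];
        if (t0 > t1) std::swap(t0, t1);
        tmin = std::max(tmin, t0);
        tmax = std::min(tmax, t1);
    }
    return tmin <= tmax ? tmin : INF;
}

// Moeller-Trumbore test: hit distance, or INF on a miss.
float ray_triangle_intersection(const Ray& r, const Node& n) {
    const Vec3 e1 = sub(n.v1, n.v0), e2 = sub(n.v2, n.v0);
    const Vec3 p = cross(r.dir, e2);
    const float det = dot(e1, p);
    if (std::fabs(det) < 1e-8f) return INF;   // ray parallel to the triangle
    const Vec3 s = sub(r.origin, n.v0);
    const float u = dot(s, p) / det;
    if (u < 0.0f || u > 1.0f) return INF;
    const Vec3 q = cross(s, e1);
    const float v = dot(r.dir, q) / det;
    if (v < 0.0f || u + v > 1.0f) return INF;
    const float t = dot(e2, q) / det;
    return t > 1e-6f ? t : INF;
}

const int MAX_STACK = 32;  // assumed upper bound on the stack depth

float closest_hit_distance(const std::vector<Node>& nodes, const Ray& ray) {
    int stack[MAX_STACK];
    int stackcounter = 0;
    stack[stackcounter++] = 0;                            // push the root (index 0)
    float closest_hit = INF;
    while (stackcounter > 0) {
        const Node& cur = nodes[stack[--stackcounter]];   // pop
        if (cur.is_leaf) {
            closest_hit = std::min(closest_hit, ray_triangle_intersection(ray, cur));
        } else if (ray_box_intersection(ray, cur) < closest_hit) {
            assert(stackcounter + 2 <= MAX_STACK);        // fixed-size stack must not overflow
            stack[stackcounter++] = cur.left;             // push both children
            stack[stackcounter++] = cur.right;
        }
    }
    return closest_hit;
}
```

Because each inner node pops one entry and pushes two, the stack grows by at most one entry per tree level, so for a reasonably balanced tree a small fixed capacity is enough. A very unbalanced tree could still exceed it, which is what the assertion guards against.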
Unfortunately, the OpenGL ES Shading Language 1.0 does not allow us to access an array element with a variable index, so a stack-based approach is not feasible in WebGL. Thus, we implemented the stackless BVH traversal proposed in Ref. , which was reported to be about 30% slower than the stack-based traversal.
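To illustrate how a BVH can be traversed without any stack, here is a minimal C++ sketch of one well-known stackless scheme, often called a skip-pointer or threaded BVH. It is not necessarily the method of the cited reference. The sketch assumes that the nodes are stored in depth-first order, so that the first child of an inner node directly follows it in memory, and that each node carries a `skip` index pointing to the node to visit once its subtree is finished. The names `ThreadedNode` and `traverse_stackless` are hypothetical, and the geometric tests are passed in as callables so that the sketch stays focused on the traversal order.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <limits>
#include <vector>

// One node of a BVH stored in depth-first order (hypothetical layout).
struct ThreadedNode {
    bool is_leaf;
    int skip;   // next node once this subtree is done; -1 ends the traversal
};

// box_hit(i):  entry distance of the ray into the box of node i, +inf on a miss.
// leaf_hit(i): hit distance of the ray with the triangle of leaf i, +inf on a miss.
float traverse_stackless(const std::vector<ThreadedNode>& nodes,
                         const std::function<float(int)>& box_hit,
                         const std::function<float(int)>& leaf_hit) {
    assert(!nodes.empty());
    float closest_hit = std::numeric_limits<float>::infinity();
    int i = 0;                          // start at the root
    while (i != -1) {
        const ThreadedNode& n = nodes[i];
        if (n.is_leaf) {
            closest_hit = std::min(closest_hit, leaf_hit(i));
            i = n.skip;                 // continue with the next subtree
        } else if (box_hit(i) < closest_hit) {
            i = i + 1;                  // descend: the first child follows its parent
        } else {
            i = n.skip;                 // prune: jump over the whole subtree
        }
    }
    return closest_hit;
}
```

The only traversal state is the single index `i`, so no array has to be indexed with a variable. The price is a fixed visiting order: the loop cannot choose to visit the nearer child first, which is one plausible reason why stackless schemes tend to be slower than a stack-based traversal.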
An additional problem with WebGL is that Windows browsers by default translate OpenGL calls to DirectX through a layer called ANGLE. Our experience with this translation is that loops in shaders get unrolled, so a shader with very long loops may cause the shader compiler to run out of resources and fail to compile. With the typical scenes and BVH trees that we have tested, the ANGLE shader compiler can only compile a shader that traverses the tree a single time, i.e., we can only implement a pathtracer with a single bounce. Our solution to this problem is to run each bounce in a separate pass and save the state between the passes. This method potentially involves some overhead, because intermediate results must be read from and written to a texture, and because we must issue an extra draw call for each trace pass.
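The multipass idea can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the two vectors stand in for the textures that hold the intermediate state, the outer loop stands in for the per-bounce draw calls, and `PathState`, `render_multipass` and `trace_bounce` are hypothetical names. The fields actually stored between the passes are not specified in the text, so the ones below are a guess at a minimal set.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Per-pixel state carried from one bounce to the next (assumed fields).
struct PathState {
    float origin[3] = {0.0f, 0.0f, 0.0f};   // ray origin for the next bounce
    float dir[3] = {0.0f, 0.0f, 1.0f};      // ray direction for the next bounce
    float throughput = 1.0f;                // accumulated path weight (grayscale)
    float radiance = 0.0f;                  // light gathered so far
    bool alive = true;                      // false once the path has terminated
};

// Runs `bounces` passes. trace_bounce traces one tree traversal for one pixel
// and returns the updated state.
std::vector<PathState> render_multipass(
        std::vector<PathState> initial, int bounces,
        const std::function<PathState(const PathState&)>& trace_bounce) {
    assert(bounces >= 0);
    std::vector<PathState> src = std::move(initial);   // "texture" read in this pass
    std::vector<PathState> dst(src.size());            // "texture" written in this pass
    for (int pass = 0; pass < bounces; ++pass) {       // one draw call per bounce
        for (std::size_t p = 0; p < src.size(); ++p)   // one shader invocation per pixel
            dst[p] = src[p].alive ? trace_bounce(src[p]) : src[p];
        std::swap(src, dst);                           // ping-pong the two buffers
    }
    return src;
}
```

Each pass contains only a single tree traversal, which is what keeps the unrolled shader small enough to compile. The extra cost is one read and one write of the state per pixel and per bounce.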
Linux and Mac browsers use the native OpenGL shader compiler. The Nvidia compiler that we have tested compiles the full pathtracer in a single shader without any problems.
The benchmarks were run with the following hardware and drivers:
Nvidia GeForce 470 GTX
Intel E5620 2.4 GHz Quad Core
Linux Nvidia drivers version 304.43
Windows Nvidia drivers version 306.97
Our benchmark results are shown in the table and figure below. If we compare the stackless and stack-based versions in GLSL 3.3, we see that the stackless version is almost 50% slower, which is somewhat disappointing compared to the slowdown of about 30% reported for the stackless method. WebGL/GLSL ES 1.0 seems to be slightly slower than GLSL 3.3 when we do the full pathtracing in a single shader. The WebGL multipass version does not seem to be affected much by storing intermediate results between the passes. In fact, the fastest multipass results are comparable with the results obtained with GLSL 3.3.
| |C++/GLSL 3.3 Linux|WebGL Chrome Linux|WebGL Firefox Linux|WebGL Chrome Windows (native)|WebGL Firefox Windows (native)|WebGL Chrome Windows (ANGLE)|WebGL Firefox Windows (ANGLE)|
|---|---|---|---|---|---|---|---|
|Singlepass with stack|34.4| | | | | | |