| Tracing light rays from a virtual camera into a 3D scene |
One of the graphical features that are easy to obtain using ray tracing is physically accurate and pixel-perfect shadows. As is apparent in the figure to the right, rendering a shadow is as simple as casting a ray from the surface of an object toward the center of a light source. If that ray intersects some object before it reaches the light source, the originating surface point is in shadow.
Simple!
Ever since I was introduced to GLSL, I wondered if it would be feasible to implement ray tracing on the GPU. The GPU, after all, is well-suited to geometrically-oriented computations. I decided to take the opportunity to find out by attempting to implement it as a final project for Chuck Hansen's course on Interactive Computer Graphics at the University of Utah. As a first pass, I decided to use ray tracing simply as a means of computing the shadows of an interactive OpenGL scene.
Read on for how I did it.
Inspiration
Most of the ideas I used for implementing ray tracing on the GPU came from Ray Tracing on Programmable Graphics Hardware (RTOPGH) by T. Purcell, I. Buck, W. Mark, and P. Hanrahan. Their approach is to implement a complete ray tracer to generate the color of every pixel in a frame. My goals differed in that my intent was to utilize ray tracing only to generate shadows in an otherwise conventionally-rendered OpenGL scene. Nonetheless, the core work of the task is the same.
Before I discovered the aforementioned paper, the biggest question I had about this project was how to load the scene data into the memory of the graphics card. There is potentially quite a bit of geometry in a 3D scene of any human interest, and loading it into a shader program by means of uniform variables would be tedious to say the least, not to mention highly limiting. The authors of RTOPGH realized that there is already a built-in mechanism for loading large amounts of data into graphics memory: texture units.
Acceleration
Before describing how the scene data is laid out in texture memory, it is worth noting that one of ray tracing's drawbacks is running time. A naive implementation of ray tracing tests a ray for intersection with every polygon (or primitive) in the scene. Significant speed ups can be gained by spatially partitioning the scene into large and/or small chunks containing references to all of the geometry contained within them. Typically these chunks, which I will call voxels, are axis-aligned cuboids that can very quickly be checked for intersection with a ray. There are many approaches that can be taken to partition the space. The simple yet effective method I chose (based largely on familiarity) is called 3D-DDA.
![]() |
| Data structures laid out in texture memory. |
Every entry in the triangle list texture is an index into the global vertex list texture.
Finally, the vertex list texture contains 3D points stored as 32-bit floating point RGB values.
Implementation Details
Because it can be difficult to keep track of coordinate spaces in OpenGL, I took care to specify all vertices both in the textures and in the calls to glVertex() in world coordinates. I also explicitly set a uniform variable containing the location of the light source in world coordinates.
Tracking points along the surface of each polygon is trivial thanks to built-in features of GLSL. One can simply take advantage of the "varying" variable types in GLSL which simply interpolate data across vertices. My vertex shader program consists of little more than setting a varying vector, which I called surfacePoint, to the value of gl_Vertex. Thus, for every fragment at which I want to compute a shadow (which really is every fragment in the frame), I have a location from which to generate a ray toward the light.
One of the tricker implementation details was that of accounting for limited texture space. Specifically, when building a texture, OpenGL will throw an error if the client program attempts to create a texture larger than a specific platform-dependent size. The implication of this is that loading scenes of any "interesting" size requires splitting the scene data into multiple textures. To me, this felt akin to the good old days of spanning a ZIP file across multiple floppy disks. There was a significant amount of extra book-keeping I had not anticipated at the start.
At the heart of every good ray tracer is a fast triangle intersection function. I found a great one at Intersections of Rays, Segments, Planes and Triangles in 3D by Dan Sunday. My slightly modified version is listed here:
float intersectTriangle(vec3 orig, vec3 dir, vec3 vertices[3], float lastHitT)
{
const float INFINITY = 1e10;
vec3 u, v, n; // triangle vectors
vec3 w0, w; // ray vectors
float r, a, b; // params to calc ray-plane intersect
// get triangle edge vectors and plane normal
u = vertices[1] - vertices[0];
v = vertices[2] - vertices[0];
n = cross(u, v);
w0 = orig - vertices[0];
a = -dot(n, w0);
b = dot(n, dir);
if (abs(b) < 1e-5)
{
// ray is parallel to triangle plane, and thus can never intersect.
return INFINITY;
}
// get intersect point of ray with triangle plane
r = a / b;
if (r < 0.0)
return INFINITY; // ray goes away from triangle.
vec3 I = orig + r * dir;
float uu, uv, vv, wu, wv, D;
uu = dot(u, u);
uv = dot(u, v);
vv = dot(v, v);
w = I - vertices[0];
wu = dot(w, u);
wv = dot(w, v);
D = uv * uv - uu * vv;
// get and test parametric coords
float s, t;
s = (uv * wv - vv * wu) / D;
if (s < 0.0 || s > 1.0)
return INFINITY;
t = (uv * wu - uu * wv) / D;
if (t < 0.0 || (s + t) > 1.0)
return INFINITY;
return (r > 1e-5) ? r : INFINITY;
}
Results
Here is what it looks like. This scene is simple enough that it was easily rendered interactively without the need for any acceleration structure such as 3D-DDA.
Stuff that I'm lying about in this post
The two major components of this project that I haven't actually completed as of the preliminary presentation date are:
- The 3D-DDA acceleration algorithm
- Splitting scene data to span multiple texture units
Updated Results
After working on the two items above, I have some new results to share. Unfortunately, the results are not too great. First of all, I gave up on the 3D-DDA acceleration grid. I wrote code to voxelize the shadow-casting geometry in C++. However, encoding it as a texture and then making use of it within the shader program was turning out to be much more difficult than I had originally imagined.
Finally it occurred to me to wonder how much of a speedup, if any, would be gained by the grid. Initially, my simple scene consisted of a single cube (consisting of 12 triangles) casting a shadow. I added four more cubes, taking the triangle count up to 60. At that point, there was a very noticeable slow down. So in order for the 3D-DDA to be worth doing on the teapot model (which was my target), I would need the grid to be partitioned such that no single voxel contained more than about 30 or 40 triangles. I queried the grid I had already written in C++ and discovered the following numbers:
- Maximum triangles in a voxel: 1,258
- Minimum triangles in a voxel: 0
- Average triangles in a voxel: 330
I could improve those numbers by refining the grid to a higher resolution. Increasing the resolution of the grid for the teapot to 50x50x50, I was able to reduce those numbers to the following:
- Maximum triangles in a voxel: 50
- Minimum triangles in a voxel: 0
- Average triangles in a voxel: 0.14
Clearly those numbers are better. They nonetheless come with a cost: 125,000 total voxels. The higher voxel count leads to issues in the voxel-to-texture encoding, similar to spanning the triangles over multiple texture units. With a limited number of available texture units, this quickly becomes infeasible.
On the topic of spanning texture units, I was able to split the triangles across up to 6 texture units. However, an unexpected problem arose, related, I believe, to the platform I am working on. Visually, the result I saw was that the teapot's shadow had holes in it. Each of the holes appeared to be in the shape of a triangle.
The only hypothesis I have for why it might have been this way is what could be characterized as a sort of aliasing. To look up any given triangle and vertex in the set of triangle textures, I need to map the index to some number between 0 and 1. As the number of triangles I stuff into a single texture goes up, so does the likelihood that some pair of discrete floating values between 0 and 1 would map to the same triangle. Without some sort of speed up in the ray tracing algorithm, it is difficult to test this (or any other) hypothesis, but it seems at least plausible to me.
Overall, this was a very enlightening project for me. I was forced to think in creative ways about how to debug a GLSL program, and I feel as though I became quite familiar with the platform. Finally, here is one more screenshot.



0 comments:
Post a Comment