Friday, September 28, 2012

Paper summary

Over the past month, I've been reading a lot of papers: first specifically about acceleration structures, and in the past weeks more broadly about other means of accelerating global illumination. My goal was to find suitable algorithms or ideas to combine with probabilistic visibility.

I'm going to present the first batch of papers I've read and give a little summary:

Instant Ray Tracing: The Bounding Interval Hierarchy

This paper presents a new acceleration structure that is a cross-over between kD-trees and bounding volume hierarchies (BVHs). It combines the best of both worlds, leading to lower memory consumption and fast, kD-tree-like traversal.

It works by storing not six planes for each node (like a BVH), but only two planes, both perpendicular to the split axis. Traversal is like in a BVH, but since the nodes are spatially ordered, it can be made more efficient by intersecting the child closest along the ray first (as in kD-tree traversal). As with a BVH, it is possible for a ray to intersect no child at all, when it passes between two non-overlapping children (see Figure 1).
Figure 1: Situation when the ray (blue line) misses the left and the right child.
Construction can be done in O(N log N) time, and the constants can be reduced by a bucket-sorting preprocess. Furthermore, it is also possible to construct the hierarchy lazily to further reduce the memory footprint, since construction is avoided in unvisited parts of the scene.

The number of object references in this technique is equal to the number of objects. This is an advantage over kD-trees, which do not guarantee this and may reference a primitive multiple times.
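To make the node layout and traversal order concrete, here is a minimal C++ sketch; the field names, the bit packing and the child test are my own assumptions, not code from the paper.

```cpp
#include <cstdint>
#include <utility>

// Sketch of a BIH inner node: instead of a full box (six planes),
// it stores two clip planes perpendicular to the split axis.
struct BIHNode {
    float clip[2];          // clip[0]: max bound of the left child,
                            // clip[1]: min bound of the right child
    uint32_t axisAndOffset; // low 2 bits: split axis (3 marks a leaf),
                            // remaining bits: offset to the children
};

// Decide which children a ray can hit, given its origin o and inverse
// direction invD along the split axis and its interval [tmin, tmax].
// Because the children may leave a gap, both tests can fail and the
// whole node is skipped (the situation in Figure 1).
inline std::pair<bool, bool> childrenHit(const BIHNode &n, float o,
                                         float invD, float tmin, float tmax) {
    float tLeft  = (n.clip[0] - o) * invD; // ray crosses the left clip plane
    float tRight = (n.clip[1] - o) * invD; // ray crosses the right clip plane
    if (invD >= 0.0f)  // ray points in +axis: the left child is the near one
        return {tmin <= tLeft, tRight <= tmax};
    else               // ray points in -axis: the right child is the near one
        return {tLeft <= tmax, tmin <= tRight};
}
```

The near child is then visited first and the far child pushed on a stack, exactly as in kD-tree traversal.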

Reference: Carsten Wächter and Alexander Keller, 2006. Instant Ray Tracing: The Bounding Interval Hierarchy. Eurographics Symposium on Rendering.

Visibility driven BVH build up algorithm for ray tracing

BVHs and kD-trees are usually built with the surface area heuristic (SAH), and this heuristic performs well. However, the assumptions behind its cost function are rather unrealistic:
  • A uniform distribution of rays in scene space
  • Shadow rays do not terminate after their first intersection
The paper modifies the cost function to include visibility information. If the visibility assumptions are (mostly) valid, the algorithm will perform better on average.

The proposed method is useful in animations, where we have knowledge about visibility from previous frames, and in scenarios where a large number of rays needs to be traced.

With this information about approximate visibility, a cost function is defined that treats visible and invisible triangles differently. This separates the visible and invisible geometry in the hierarchy, which results in shallower subtrees for the visible triangles. Figure 2 visualizes a BVH as the SAH might generate it (on the left) versus the new heuristic, called OSAH (on the right). The blue triangle represents a visible triangle, while the gray ones are invisible.
Figure 2: SAH vs OSAH hierarchy
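As a rough illustration (this is not the paper's exact cost function, and all names here are hypothetical), a visibility-aware SAH variant could discount children that contain mostly invisible triangles, pushing them deeper into the tree:

```cpp
// Hypothetical visibility-weighted SAH cost for one split candidate.
// It only illustrates where the visibility information enters the
// classic SAH terms; the paper's actual formula differs.
struct SplitCandidate {
    float areaLeft, areaRight; // surface areas of the two child bounds
    int   numLeft, numRight;   // triangle counts in the children
    int   visLeft, visRight;   // triangles marked visible (e.g. in previous frames)
};

float visibilityWeightedCost(const SplitCandidate &s, float areaParent,
                             float traversalCost, float isectCost,
                             float alpha /* 0 = plain SAH, 1 = full weighting */) {
    // Classic SAH: intersection cost weighted by the geometric probability
    // that a ray hitting the parent also hits the child.
    float costL = (s.areaLeft  / areaParent) * s.numLeft  * isectCost;
    float costR = (s.areaRight / areaParent) * s.numRight * isectCost;
    // Fraction of each child's triangles that is visible.
    float visL = s.numLeft  > 0 ? (float)s.visLeft  / s.numLeft  : 0.0f;
    float visR = s.numRight > 0 ? (float)s.visRight / s.numRight : 0.0f;
    // Blend: mostly-invisible children are discounted, so the builder
    // prefers splits that separate visible from invisible geometry.
    return traversalCost
         + ((1.0f - alpha) + alpha * visL) * costL
         + ((1.0f - alpha) + alpha * visR) * costR;
}
```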
Reference: Marek Vinkler, Vlastimil Havran, Jiří Sochor, 2012. Visibility driven BVH build up algorithm for ray tracing. Computers & Graphics 36 (2012), p. 283-296.

Accelerating Shadow Rays Using Volumetric Occluders and Modified kD-tree Traversal

An algorithm is described that marks the nodes of a kD-tree using two additional bits. These bits differentiate between three possible situations. A node is either:
  1. Opaque (completely inside the model)
  2. Boundary (contains a triangle of the model)
  3. Clear (is completely outside the model)
Traversal down the kD-tree can stop as soon as an opaque node is reached. The traversal is also modified to favor opaque nodes, since for a shadow ray any intersection suffices, not necessarily the closest one.
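A minimal sketch of the classification and the shadow-ray early-out; the encoding and names are mine, the paper only requires that the three states be distinguishable:

```cpp
#include <cstdint>

// Two extra bits per node distinguish three states.
enum class NodeState : uint8_t {
    Clear,    // completely outside the model: nothing to hit here
    Boundary, // contains triangles of the model: test them as usual
    Opaque    // completely inside the model: guaranteed occlusion
};

struct KdNode {
    NodeState state;
    // ... split plane, children, triangle references for boundary leaves ...
};

// For a shadow ray, any intersection occludes, so traversal can stop
// the moment it enters an opaque node.
bool shadowSegmentBlocked(const KdNode &node) {
    switch (node.state) {
        case NodeState::Opaque:   return true;  // blocked: stop immediately
        case NodeState::Clear:    return false; // skip the whole subtree
        case NodeState::Boundary: break;        // fall through: test triangles
    }
    // ... recurse into the children / intersect the triangles here ...
    return false;
}
```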

Performance is gained here because the opaque nodes are large and sit high in the tree. Further performance can be gained by using a cache of recently used volumetric occluders to exploit ray coherence.

The downside of the technique is that only watertight models can be used. With open models, no distinction can be made between opaque and clear nodes. In addition, the bounding boxes of the volumetric occluders are not explicitly known, since they are only discovered by traversing the kD-tree.

Despite this, the idea of volumetric occluders is very interesting for probabilistic visibility, since we could use the larger occluders as stand-ins for smaller geometry, perhaps reducing the variance.

Reference: Peter Djeu, Sean Keely, Warren Hunt, 2009. Accelerating Shadow Rays Using Volumetric Occluders and Modified kD-tree Traversal. High Performance Graphics 2009, p. 69-76.

Memory-conserving Bounding Volume Hierarchies with Coherent Ray Tracing

The authors propose to store the bounding boxes of the nodes as unsigned 8-bit integers instead of floating point numbers. They achieve this by dividing the parent's bounding box into 256 steps along each axis and snapping the child bounds to these discrete positions. This results in a slight loss of performance, since the bounding boxes tend to become slightly larger. However, the memory savings allow the nodes to be cache-aligned and more compact, improving cache coherence.
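A minimal sketch of the quantization along one axis, assuming the scheme described above (the names are mine). The important detail is rounding outward, so the decompressed boxes are always conservative:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Quantize a child bound relative to the parent's extent [pMin, pMax].
// Minima round down and maxima round up, so boxes only ever grow.
uint8_t quantizeMin(float v, float pMin, float pMax) {
    int q = (int)std::floor((v - pMin) / (pMax - pMin) * 255.0f);
    return (uint8_t)std::max(0, std::min(255, q));
}
uint8_t quantizeMax(float v, float pMin, float pMax) {
    int q = (int)std::ceil((v - pMin) / (pMax - pMin) * 255.0f);
    return (uint8_t)std::max(0, std::min(255, q));
}

// Decompression during traversal: the parent's box is known because
// traversal proceeds top-down, so the bound can be reconstructed.
float dequantize(uint8_t q, float pMin, float pMax) {
    return pMin + (pMax - pMin) * (q / 255.0f);
}
```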

They also recommend using coherent ray tracing, so that decompressing the hierarchy from the 8-bit unsigned integers only needs to be done once per packet.

Test results show a 79% decrease in memory usage compared to double precision and 63% compared to single precision floating point numbers. However, there is a 41% increase in rendering time, caused by the overhead of decompressing the nodes during traversal and by a small amount of extra traversal due to the slightly larger bounding boxes. This can be remedied with coherent ray tracing, which amortizes the calculations over entire packets of rays.

Reference: Jeffrey A. Mahovsky, Brian Wyvill, 2006. Memory-conserving Bounding Volume Hierarchies with Coherent Ray Tracing. IEEE Transactions on Visualization and Computer Graphics.

Saturday, September 22, 2012

Octree

Since I will probably need to create or adapt an acceleration structure for ray tracing, I thought it was a good idea to implement a new one to familiarize myself with the available methods and interfaces.

PBRT already comes with a bounding volume hierarchy, a uniform grid and a kD-tree acceleration structure, so I decided to add an octree to the list.

An octree subdivides the scene into eight equally sized voxels, and each non-empty voxel is recursively subdivided into eight voxels again. Speedups are gained when a ray misses a voxel, since none of that voxel's children need to be checked.

Furthermore, an octree does not suffer from the "teapot in the stadium" problem, since it adapts to irregularly placed geometry by only subdividing non-empty voxels.
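Below is a simplified C++ sketch of the recursive construction as described above; octant(), overlapping() and the termination constants are hypothetical helpers, not PBRT interfaces or my actual implementation.

```cpp
#include <vector>

struct Bounds { float min[3], max[3]; };

struct OctreeNode {
    std::vector<int> prims;       // primitive indices, filled in leaves only
    OctreeNode *children[8] = {}; // nullptr for empty octants
};

Bounds octant(const Bounds &b, int i);  // i-th of eight equal sub-boxes
std::vector<int> overlapping(const std::vector<int> &prims, const Bounds &b);

const int kMaxDepth = 16;  // tunable termination criteria
const int kMaxPrims = 8;

void build(OctreeNode *node, const Bounds &b,
           const std::vector<int> &prims, int depth) {
    if ((int)prims.size() <= kMaxPrims || depth == kMaxDepth) {
        node->prims = prims;       // make a leaf
        return;
    }
    for (int i = 0; i < 8; ++i) {
        std::vector<int> sub = overlapping(prims, octant(b, i));
        if (sub.empty()) continue; // only non-empty voxels are subdivided:
                                   // this avoids "teapot in the stadium"
        node->children[i] = new OctreeNode;
        build(node->children[i], octant(b, i), sub, depth + 1);
    }
}
```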

The false color image below shows the number of intersection tests (including voxel intersections) for the Stanford Bunny scene.
Stanford Bunny scene (demonstrating the correctness of the octree implementation).
False color image when using the octree.
The maximum number of intersections is 2324 (red).
The total number of intersections is 102,265,467.
Next we compare the rendering times against the other acceleration structures:

Scene            BVH with SAH   Uniform Grid   kD-tree with SAH   Octree
Stanford Bunny        2.9s           5.3s             3.1s          5.3s
Utah Teapot          27.7s          34.2s            31.6s         36.2s
Killeroo Scene      160.1s         415.7s           233.7s        669.5s

These results show that my implementation of the octree accelerates ray tracing, but still needs some tuning. I guess "make it work, make it right, make it fast" is the key here. The implementation already works, and the "aggregatetest" test assures that it is also right. Now I still need to make it fast.

Friday, September 21, 2012

Shadow photons

In my blog post about Occlusion Mapping I presented a method to reduce the variance in the probabilistic visibility estimate. Since this method looked promising, my promotor encouraged me to do a bit more research into photon mapping. I therefore read the book Realistic Image Synthesis Using Photon Mapping by Henrik Wann Jensen, which is cited in a lot of papers on photon mapping.

To my surprise, this book already proposes a similar method, using a photon map with "Shadow Photons" to reduce the number of shadow rays in a scene. The method is described in the paper Efficiently Rendering Shadows Using the Photon Map, also by Henrik Wann Jensen.

Similar to my method, they trace photons from the light source which are only stored after their first intersection. These photons are the "Shadow Photons". In the paper they also trace the regular photons, called "Illumination Photons" (see Figure 1).
Figure 1: Shadow photon creation.
Courtesy: Henrik Wann Jensen, "Efficiently Rendering Shadows Using the Photon Map", 1995
These shadow and illumination photons are used to identify regions that are completely in shadow or completely illuminated. When a ray is traced through the scene, a photon lookup is performed. Based on the numbers of shadow and illumination photons, three distinct cases can be identified:
  1. there are only shadow photons: no shadow rays have to be traced since the point is completely in shadow
  2. there are only illumination photons: no shadow rays have to be traced since the point is completely illuminated
  3. there is a mix of shadow and illumination photons: shadow rays have to be traced.
Jensen also proposes to estimate the visibility as v = ni / (ni + ns), where ni is the number of illumination photons and ns is the number of shadow photons. However, this requires a great number of photons.
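A small sketch of the three-way decision after a lookup, including Jensen's estimate for the mixed case (the types and names here are mine):

```cpp
// Result of a photon-map lookup around a shading point.
struct LookupResult { int nIllum, nShadow; };

// Returns the visibility estimate; sets *needShadowRays in the mixed case.
float estimateVisibility(const LookupResult &r, bool *needShadowRays) {
    *needShadowRays = false;
    if (r.nShadow == 0) return 1.0f; // only illumination photons: fully lit
    if (r.nIllum == 0) return 0.0f;  // only shadow photons: fully in shadow
    *needShadowRays = true;          // mixed region: trace shadow rays, or,
                                     // with enough photons, use v directly:
    return (float)r.nIllum / (r.nIllum + r.nShadow);
}
```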

The results from the paper indicate an average reduction of 90% in the number of shadow rays.

Thursday, September 20, 2012

Introduction to PBRT

The algorithm(s) for my thesis will be developed in the PBRT ray tracer. I familiarized myself with its ideas, code and algorithms by reading the book Physically Based Rendering, and today I spent some time going through the source code.

As a first test, and as a useful tool, I created a class which outputs a false color image of the number of intersection tests during rendering. Here is some of the output when run on the standard PBRT scenes:
The Stanford Bunny
False color image when using the Bounding Volume Hierarchy.
The maximum number of intersections is 963.
The total number of intersections is 31,468,464.
False color image when using a kD-tree.
The maximum number of intersections is 772.
The total number of intersections is 8,967,463.
False color image when using a uniform grid.
The maximum number of intersections is 3519.
The total number of intersections is 40,690,247.
The number of intersections also includes the number of box and voxel intersections for the bounding volume hierarchy and the grid.
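The counting itself can be done with a thin wrapper around the aggregate. The sketch below is only an illustration: the exact Primitive/Aggregate signatures differ between PBRT versions, and pixelIndexForRay() is a hypothetical hook, since PBRT does not hand the pixel coordinates down with the ray.

```cpp
#include <vector>

int pixelIndexForRay(const Ray &ray); // hypothetical ray-to-pixel mapping

// Hypothetical counting wrapper; see the caveats above.
class CountingAggregate : public Aggregate {
public:
    CountingAggregate(Primitive *inner, int nPixels)
        : inner(inner), counts(nPixels, 0) {}
    BBox WorldBound() const { return inner->WorldBound(); }
    bool Intersect(const Ray &ray, Intersection *isect) const {
        ++counts[pixelIndexForRay(ray)]; // count, then forward the call
        return inner->Intersect(ray, isect);
    }
    bool IntersectP(const Ray &ray) const {
        ++counts[pixelIndexForRay(ray)];
        return inner->IntersectP(ray);
    }
    // After rendering, counts[] is normalized and written out as a
    // false color image.
private:
    Primitive *inner;
    mutable std::vector<long> counts; // one bin per pixel
};
```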

The next thing to do is to set up an SVN or Git repository for the code.

Monday, September 3, 2012

Occlusion Mapping

After reading through a number of papers during the last weeks, I spent the last couple of days implementing an idea for an algorithm to improve the probabilistic visibility estimation.

The idea

The idea is as follows: in a preprocessing step, a number of rays is traced from the light sources through the scene. For each ray being traced, an empty list of primitives is created. After the first intersection, the intersected primitive is added to the list. Then the ray is traced further. Upon each following intersection, an occlusion photon is created, which stores the position of the intersection and the current list of occluding primitives. After the photon is created, the currently intersected primitive is added to the front of the list of occluding primitives.

The result is a map of "occluding photons", which contain an approximation of the occluding geometry at a position in the scene.
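In code, the shooting step looks roughly like the sketch below; Ray, Hit and Scene are hypothetical stand-ins here, not PBRT types.

```cpp
#include <vector>

struct Hit   { float p[3]; int primitiveId; };
struct Ray   { /* origin, direction, parametric range ... */ };
struct Scene {
    bool intersect(const Ray &r, Hit *h) const;          // hypothetical interface
    Ray continuePast(const Ray &r, const Hit &h) const;  // re-trace beyond a hit
};

struct OcclusionPhoton {
    float position[3];
    std::vector<int> occluders; // primitives between the light and this point
};

void shootOcclusionRay(const Scene &scene, Ray ray,
                       std::vector<OcclusionPhoton> &map) {
    std::vector<int> occluders; // starts empty for every light ray
    Hit hit;
    while (scene.intersect(ray, &hit)) {
        // Every intersection after the first deposits a photon carrying
        // the list of primitives the ray has already passed through.
        if (!occluders.empty())
            map.push_back({{hit.p[0], hit.p[1], hit.p[2]}, occluders});
        // Only afterwards is the current primitive added to the front.
        occluders.insert(occluders.begin(), hit.primitiveId);
        ray = scene.continuePast(ray, hit); // continue beyond the hit
    }
}
```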

The following figures show the resulting occlusion map for a test scene.
Cornell box with two triangles.
Hot-to-cold representation of the density of the occlusion map with 1,000,000 photons.

When tracing camera rays through this test scene, we can differentiate between four situations:
  • No occlusion
  • Occlusion by the yellow triangle
  • Occlusion by the green triangle
  • Occlusion by both of the triangles
When there is no occlusion at all, we don't need to trace any shadow rays to the light source. When there is only one occluding primitive, we only need to trace rays to that primitive.

The occlusion map allows us to differentiate between these situations. When an intersection in the scene is found, we perform a lookup in the photon map. When no nearby photons are found, we assume that the visibility is always one. When the nearby photons all reference a single primitive, we only trace shadow rays to that primitive. However, when the photons reference multiple primitives, we use the probabilistic visibility estimator as described in the previous post about probabilistic visibility.
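The lookup logic then reduces to a three-way classification (reusing the OcclusionPhoton record from the sketch above; the function and names are again hypothetical):

```cpp
enum class OcclusionClass { Unoccluded, SingleOccluder, MultipleOccluders };

// Classify the nearby photons; in the single-occluder case, *prim
// receives the one primitive to which shadow rays must be traced.
OcclusionClass classifyLookup(const std::vector<const OcclusionPhoton*> &nearby,
                              int *prim) {
    if (nearby.empty())
        return OcclusionClass::Unoccluded;   // visibility assumed to be one
    *prim = -1;
    for (const OcclusionPhoton *p : nearby)
        for (int id : p->occluders) {
            if (*prim == -1) *prim = id;
            else if (*prim != id)            // two different occluders seen:
                return OcclusionClass::MultipleOccluders; // use the estimator
        }
    return OcclusionClass::SingleOccluder;   // shadow rays only to *prim
}
```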

Results

The test scenes have been rendered with an increasing number of shadow rays, using both deterministic visibility and the proposed algorithm. When we plot the total number of intersection tests for both methods, we see that the number of intersection tests is much lower with the new algorithm. Furthermore, the difference becomes even larger when more shadow rays are traced per pixel. This is because a large portion of the scene is unoccluded by both triangles; in those regions, no shadow rays have to be traced with the occlusion mapping method.
Graph displaying the number of intersection tests (in millions) for a given number of shadow rays.
The images below show the results of rendering with different numbers of shadow rays. These images look a lot better than those produced with the regular probabilistic visibility estimation, because the shadows in the penumbra and the unoccluded regions are now exact.

4 shadow rays

64 shadow rays

256 shadow rays

900 shadow rays
However, the occlusion mapping technique introduces one extra artifact in regions that didn't receive enough photons. An example is given in the figure below.
Artifact introduced by the occlusion mapping approach
The penumbra region here shows some gaps where there were no nearby photons. In the proposed algorithm, visibility will always evaluate to one, which is an incorrect result.

Conclusion

The proposed algorithm greatly reduces the number of intersection tests. The variance introduced by probabilistic visibility is also reduced, and is now only visible in regions with multiple occluding objects.

A downside of this technique is that a great number of photons needs to be shot. Furthermore, this photon shooting has to be done for every light source, which can be an expensive preprocessing step in a real-time application with many light sources.

Storing the photons in a kD-tree for every light source separately can also be a burden on memory. If this is the case, we could use implicit object presorting to create an acceleration structure, as described in M. Eisemann et al., Geometry Presorting for Implicit Object Space Partitioning.