This is a one-hour reading class on spatial data structures. For Fall 2008, we meet Fridays at 4pm in 3102 SC.
Topics:
Sept. 5, 2008 Quad trees
Sept. 19, 2008 Data structures for unmanned vehicles
Sept. 26, 2008 KD trees for real-time ray tracing of dynamic scenes
Oct. 3, 2008 KD trees for photon mapping in dynamic scenes
Oct. 10, 2008 Scale Selection for Classification of Point-sampled 3-D Surfaces
Oct. 17, 2008 Perfect Spatial Hashing by Lefebvre et. al.
Nov. 7, 2008 Ray Tracing on Multi-Core and Parallel Algorithms for Trees
Sanjay Patel on spatial data structures for physics engines.
- Point quad trees split along each dimension at the position coinciding with the data points. This is in contrast to the PR (point-region) quad trees we are more familiar with, which splits each rectangular space half-way along each dimension. Another difference is: in point quad trees data points are stored in both interior nodes and leaf nodes while in PR quad trees data points are only stored in leaf nodes.
- Operations in point quad trees:
- Insertion
- Deletion
- Naive approach
Suppose we want to remove x, then remove x and re-insert nodes in the subtree rooted at x.
- Smarter approach (as described in Samet Sec. 1.4.1.2)
The general idea is as follows. We would like to replace node x by some other node in the tree. In most cases, doing so makes the tree "incorrect." For example, some node y that was previously in quadrant II with respect to x may end up in a different quadrant because the "point of decomposition" has changed and yet y is still stored in the quadrant II child. These incorrect nodes need to be identified and re-inserted. The text describes this in detail. The trick for saving work is to choose a replacement node that minimizes the number nodes made "incorrect," thus reducing the number of re-insertions, which ultimately results in a faster deletion algorithm than the naive approach.
- Search
- Trie-based quad trees: MX quad trees and PR quad trees
Samet Section 1.4
Victor's Slides
- Intended application is for laser-based terrain classification for use in unmanned vehicles
- Output from laser system is a stream of points representing points in space that are part of some surface; this collection of points is referred to as a point cloud.
- To help the unmanned vehicle make better decisions, it is useful to classify points based on the type of surface it belongs to (ie. road, tree, building, etc)
- One approach to point cloud classification is to define a set of feature values for a point, forming a feature vector. Computing the feature vector of a point effectively maps a point in 3-D to some point in a n dimensional feature space, where n is the number of feature values. First, feature vectors are computed for a training set (point cloud with points correctly labelled) and techniques such as SVM are used to find surfaces in the n dimensional feature space that separates feature vectors with different labels. WIth this said, a feature values are commonly defined in terms of numbers resulting from performing PCA (principle component analysis). The PCA procedure is applied over points in some ball. For this reason, a data structure supporting efficient nearest neighbor queries is important in such applications.
- Data structure as presented in paper: Efficient scrolling data structure
- Voxelize space (uniform grid)
- Approaches for querying points living in some cube centered at some given point.
- Direct approach
- For each query search all cubes that overlap with the query range.
- Naive approach for data reuse
- Optimized scan approach for data reuse
1 & 2
Sept. 26 2008 - KD Trees for Real-Time Ray Tracing of Dynamic Scenes
- Conventional KD-tree acceleration structure
- SAH cost, as it is defined, tries to measure the average computation time a ray takes to traverse the tree and finally arrive either at a nearest intersection or no intersection at all.
- Exact evaluation of SAH cost is too expensive. A common approach is instead to approximate the cost by a greedy algorithm. Vaguely speaking... for a given node, the cost is computed assuming its children nodes are leafs.
- Candidate splitting planes are the set of all planes that coincide with a side of any of the AABB (axis aligned bounding box) of primitives in the node. The candidate splitting plane yielding the lowest SAH cost is chosen.
- Real-Time ray tracing in dynamic scenes
- Scene is changing; therefore, kd tree must get updated/re-constructed to reflect change in scene for every frame.
- Kun Zhou's implementation of real-time kd-tree construction
- Given a list large nodes. Major steps:
- Divide node into chunks (of 256 triangles).
- Create bounding box of nodes
- First, perform reduction over all AABB in each chunk to obtain a per chunk bounding boxes. Next, perform segmented reduction over all chunks to obtain per node bounding boxes.
- Perform as many empty space splitting as possible followed by a spatial median split.
- Criterion for a empty space split is that the empty space created by the split exceeds some threshold along the splitting direction.
- Sort triangles into children nodes using split primitive (derived from scan primitive).
- Triangle-node association list rearranged so that indices of triangles from the same node form a contiguous part of the list.
- Count the number of triangles and decide if node is large or small. We count the number of triangles by performing reduction within each chunk and segmented reduction over results of each chunk.
- Small nodes
- Conventional node splitting strategy
- Relevant Literature
- K. Zhou, Q. Hou, R. Wang & B. Guo, Real-Time KD-Tree Construction on Graphics Hardware. Tech Report: MSR-TR-2008-52, April 2008.
- Sengupta, S., Harris, M., Zhang, Y., and Owens. J. D. 2007. Scan primitives for GPU computing. In Proceedings of Graphics Hardware, 41-50.
- Presents GPU implementation of the segmented scan primitive
- Harris M., Sengupta S., Owens, J. D. Parallel prefix sum (scan) with CUDA. In GPU Gems 3, Nguyen H., (Ed.). Addison Wesley, Aug. 2007
- Describes GPU implementation of the standard (un-segmented) scan primitive
- Pharr, M., Humphreys, G. Physically Based Rendering: From Theory to Implemenation. 198-220
- Describes the conventional KD-tree acceleration structure used for a ray tracer
- Also explains surface area heuristic (SAH)
Oct. 3, 2008 - KD Trees for Photon Mapping
- An important step in rendering images with indirect lighting and caustic effects is computing radiance estimates from a photon map. The photon map is created by shooting photons from light sources into the scene and recording the locations at which the photons impinge on as "packets of flux." The radiance estimate at some point x on some surface is essentially a weighted sum of the "flux packets" impinging on the surface in some neighbourhood of x. Therefore, these "flux packets" need to be stored in a data structure supporting efficient KNN (k-nearest neighbor) queries.
- KD trees for Photon Mapping vs. KD trees for Ray Tracing
- In photon map KD trees, nodes that overlap the query sphere are traversed. In ray tracing KD trees, nodes that overlap with the query ray are traversed.
- Photon map KD trees store points. Ray tracing KD trees store primitives occupying some spatial extant.
- Conventional (Jensen 2001) Photon Map KD Tree
- Heap-like structure for storing photons
- Node at index i has left child at index 2i and right child at index 2i+1
- Balanced tree construction
- Split along "longest dimension"
- Split node at median point
- KNN Query
- Traverse nodes that might contain points within query ball
- Use priority queue (max heap) to keep track of k nearest points
- Kun Zhou's Photon Map KD Tree for GPU
- Tree construction
- Large nodes (number of points > 32)
- Maintains 3 sorted point list, each containing indices of points sorted along a distinct dimension using cudppSort. This allows for convenient retrieval of bounding intervals, children sorting (how?), and point counting.
- No chunks this time
- Again, empty space splitting followed by spatial median splitting
- Small nodes (number of points <= 32)
- Uses a "iterative KNN search algorithm based on range searching" (how does this work?)
- Relevant Literature/Resources
Oct. 10, 2008 - Scale Selection for Classification of Point-Sampled 3-D Surfaces
- Sparsely sampled surfaces require a larger nearest neighbor query radius while a densely sampled surface may require a smaller query radius. The motivation of this work, it seems, was to develop a technique that automatically finds the optimum nearest neighbor query range for per-sample feature vector computation.
- Relevant Literature
Oct. 17, 2008 - Perfect Spatial Hashing
Nov. 7, 2008 Ray Tracing on Multi-Core and Parallel Algorithms
- Shevtsov's KD-tree construction algorithm for Multi-Core
- Min-max binning (one of the contributions of this work)
- 1 bin for storing min bound of AABB, 1 bin for storing max bound of AABB
- Initial Clustering (here, T is the number of threads)
- Distribute primitives among threads. Divide set of all primitives into equal-sized subsets
- Build thread-local min-max bins. Along some chosen dimension, say x, perform min-max binning on each subset to obtain thread-local bins
- Build global min-max bin. Compute summation of all thread-local bins to obtain a global bin
- Perform median split recursively. Perform median splits recursively along dimension x, until the space has been decomposed, along the x dimension, into T partitions. Since median splits were used, the number of primitives in each partition is roughly equal. Position of median splits are obtained somehow from the min-max bin (how? need to compute prefix-sum?)
- Per-clustering KD-tree Construction
- For each node in higher levels,
- First pass: min-max binning in parallel. To save time, count every l-th primitive, where l = log10(N) and N is number of primitives in the node. Their experiment suggests 32 bins is sufficient for all levels.
- Determine optimal split position. Here, min-max bin is used for evaluating SAH. Specifically, min-max bin can be used for looking up the number of primitives to the left or right of a candidate splitting plane.
- Second pass: geometry splitting in parallel. Create array of reference to primitives in children nodes.
- For nodes deeper down the tree, perform exact SAH evaluation
- Results
- Initial clustering does not degrade tree quality by much (Figure 7)
- implementation is scalable with number of threads (Figure 8)
- Future work
- Auto-tuning kd-tree parameters for different image resolution (motivation: Figure 6)
- For each node, having to perform multiple passes over primitives is a memory bandwidth hungry task
- Lazy construction?
- Ize's BVH construction algorithm for Multi-Core
- Suppose N threads total
- 1 thread performs BVH construction asynchronously
- N-1 threads perform per-frame vertex updates, triangle updates, BVH refits, and render. Furthermore, at the start of each per-frame processing, BVH is replaced with new one, if available.
- Relevant Literature
Some more papers:
|
|
|