Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin
[2014/01/21] [post by mpeterson] [Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin]
>> jbikker wrote: On CPU, the fastest practical approach is straight MBVH traversal. For first bounce diffuse rays, Tsakok's MBVH/RS is optimal. I presented a paper with an approach that outperforms both (by ~20%), but it's a complex algorithm:
[LINK http://arauna2.ompf2.com/files/cgf_article.pdf]
It goes even much faster than this. Stay tuned...
[2014/01/21] [post by voxelium] [Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin]
Paper/code or it didn't happen! ;)
I'm also using single-ray MBVH on CPU & MIC, but for me it's better than MBVH/RS even for first bounce rays: [LINK http://www.cs.ubbcluj.ro/~afra/publications/afra2013tr_mbvh8.pdf]
A ~20% speedup sounds perfectly reasonable (what about multi-threaded perf though?), but I would be really surprised if we could improve this by an order of magnitude for incoherent rays.
[2014/03/06] [post by voxelium] [Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin]
My paper can now be downloaded from here: [LINK http://cg.iit.bme.hu/~afra/publications/afra2013cgf_mbvhsl.pdf]
[2018/01/22] [post by joedizzle] [Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin]
This is an excellent paper; I'll use it for my GPU tracer. One thing I've noticed: the pseudo-code for BVH2 might lead to infinite loops for a novice implementer who uses the code as it is. I tested the code on a simple square plane made up of 2 triangles, so all nodes have the same bounding box. For a properly defined BVH (one inner node and 2 leaves, i.e. the 2n-1 property: 3 nodes for n = 2 leaves), this scenario forces the ray to intersect both leaf bounding boxes and run the intersection test in each leaf. With the code from the paper, however, the ray traversal never enters the second leaf. Analysis of the code points to a problem in the leaf node code below.
Code:
// Leaf node
if (!isInner(nodeId)) {
    ...
}
which should be
Code:
// Leaf node
if (!isInner(nodeId)) {
   // do necessary intersection first
   ...
   
   // update parent and sibling status
   parentId = nid.x;
   siblingId = nid.y;
}
The modified leaf node code above resolved the issue, after a strenuous week of debugging. Lol
Has anyone encountered a similar problem?
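For anyone who wants to reproduce the two-triangle scenario, here is a minimal C++ sketch of a stackless BVH2 traversal with parent and sibling links. It is not the paper's pseudo-code. It assumes a hypothetical `Node` layout whose `parent` and `sibling` fields stand in for `nid.x` and `nid.y`, a 32-bit trail of pending siblings, and a leaf "intersection" that only records the leaf id. The names `hitBox`, `makeTwoLeafScene` and `traverse` are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

struct Aabb { float lo[3], hi[3]; };
struct Ray  { float org[3], invDir[3]; float tMax; };

// Hypothetical layout: left/right are child ids (-1 marks a leaf);
// parent/sibling play the role of nid.x / nid.y in the snippet above.
struct Node {
    Aabb box;
    int left, right;
    int parent, sibling;
};

static bool isInner(const Node& n) { return n.left >= 0; }

// Slab test; tNear receives the entry distance along the ray.
static bool hitBox(const Aabb& b, const Ray& r, float& tNear) {
    float t0 = 0.0f, t1 = r.tMax;
    for (int a = 0; a < 3; ++a) {
        float tA = (b.lo[a] - r.org[a]) * r.invDir[a];
        float tB = (b.hi[a] - r.org[a]) * r.invDir[a];
        if (tA > tB) std::swap(tA, tB);
        t0 = std::max(t0, tA);
        t1 = std::min(t1, tB);
    }
    tNear = t0;
    return t0 <= t1;
}

// Square plane of 2 triangles: one inner node, 2 leaves, identical boxes.
std::vector<Node> makeTwoLeafScene() {
    const Aabb box{{0.0f, 0.0f, -0.01f}, {1.0f, 1.0f, 0.01f}};
    return {
        {box,  1,  2, -1, -1},  // 0: root (inner)
        {box, -1, -1,  0,  2},  // 1: leaf, sibling is node 2
        {box, -1, -1,  0,  1},  // 2: leaf, sibling is node 1
    };
}

// Returns the ids of the leaves visited, in visiting order.
std::vector<int> traverse(const std::vector<Node>& nodes, const Ray& ray) {
    std::vector<int> visited;
    float t;
    if (nodes.empty() || !hitBox(nodes[0].box, ray, t)) return visited;

    int nodeId = 0, parentId = -1, siblingId = -1;
    std::uint32_t bits = 0;  // 1 = sibling still pending; caps depth at 32

    for (;;) {
        assert(nodeId >= 0 && nodeId < static_cast<int>(nodes.size()));
        const Node& n = nodes[nodeId];
        if (isInner(n)) {
            parentId = n.parent;
            siblingId = n.sibling;
            float tl, tr;
            const bool hl = hitBox(nodes[n.left].box, ray, tl);
            const bool hr = hitBox(nodes[n.right].box, ray, tr);
            if (hl || hr) {
                bits = (bits << 1) | ((hl && hr) ? 1u : 0u);
                nodeId = (hl && (!hr || tl <= tr)) ? n.left : n.right;
                continue;
            }
            // Neither child hit: fall through and backtrack from here.
        } else {
            visited.push_back(nodeId);  // stand-in for primitive tests
            // The update discussed above: refresh the links for the leaf.
            parentId = n.parent;
            siblingId = n.sibling;
        }
        // Backtrack: climb until a level with a pending sibling is found.
        while ((bits & 1u) == 0) {
            if (bits == 0) return visited;  // nothing pending anywhere
            nodeId = parentId;
            parentId = nodes[nodeId].parent;
            siblingId = nodes[nodeId].sibling;
            bits >>= 1;
        }
        nodeId = siblingId;
        bits ^= 1u;
    }
}
```

Calling `traverse(makeTwoLeafScene(), ray)` with a ray through the square visits leaf 1 and then leaf 2. In this sketch, dropping the two assignments in the leaf branch would make backtracking use the stale links of the root, so the second leaf would not be reached.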
[2018/01/31] [post by mpeterson] [Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin]
Again: this is 10-year-old crap. Go with the state of the art -> Google is your friend!
[2018/02/05] [post by joedizzle] [Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin]
I definitely will; it's on my TODO list of latest algorithms for [LINK https://github.com/joedizzle/CLTracer] (I will properly organize the repository soon).
- But if you're dealing with platforms that have limited support for stacks, the latest algorithm that I found, though with limited publicly available information, is "Efficient stackless hierarchy traversal on GPUs with backtracking in constant time": [LINK http://research.nvidia.com/publication/2016-06_Efficient-stackless-hierarchy].
- This is well implemented in "RadeonGPU Raytracer": [LINK https://github.com/GPUOpen-LibrariesAndSDKs/RadeonRays_SDK]. I might implement it once I understand the algorithm.
- The final accelerator to implement for GPU is "GPU Ray Tracing using Irregular Grids" (solves the teapot-in-a-stadium problem): [LINK https://graphics.cg.uni-saarland.de/index.php?id=939], which seems to be quite superior.
My previous comment was just a simple identification of a problem that I haven't seen discussed anywhere on the internet.