A new acceleration structure back

(L) [2007/02/26] [MishoM] [A new acceleration structure] Wayback!

Hi,

In this post, I'd like to present the acceleration structure I'm using in my first ray tracer. The ray tracer is still very, very slow, because it is not optimized (it has a lot of branches, slow bounding box test etc.) but I decided to describe the structure anyway, so that people with more time could try it and say how it performs in real-life when optimized.

In my opinion the pros and cons of the two major acceleration trees used (the BIH and KD trees) are:

The strength of the KD tree is that there is no overlapping between sibling nodes. Its weaknesses are that it needs to either split triangles or include some triangles in more than one leaf and that it needs empty nodes to cut off empty space around objects.

The strength of the BIH tree is that it doesn't split triangles (and decent trees can be built fast). Its major weakness is that in many cases there is overlap between sibling nodes which means that even when an intersection is found in one branch, the other branch still may need to be examined.

I may be totally wrong about some or all of these statements, because I didn't try neither BIH nor KD-tree.

My structure tries to combine some of the strengths of both KD and BIH and diminish their weaknesses. Of course everything comes at a price, so it has its own weaknesses when compared to each of them, and it still remains to be seen how it performs when fully optimized.

The structure is similar to a BIH tree, but in fact it was inspired by another data structure - [LINK http://en.wikipedia.org/wiki/Interval_tree the interval tree]. The main difference of my structure from the BIH and interval tree is that the tree is not binary, instead it is ternary. Every node has zero to three nodes. The nodes are referred to as left, right and middle child.

The tree has the following data:

1) Scene bounding box

2) The root node of the tree.

Each node has the following data:

1) Orientation (axis)

2) Left and right boundary along the axis specified by the orientation.

3) Pointer to an array of triangles

4) Number of triangles in the node

5) Pointers to three child nodes - left, middle and right

In my implementation all triangles are stored in a contiguous array. Here is how the tree is built. For every node we do the following:

1) We find the longest axis of the node bounding box and try to split the bounding box along this axis into intervals with each interval associated with a child node.

2) If the node wasn't split successfully, we try to split by the second longest axis.

3) If this failed too, we try to split by the shortest axis.

4) If all splits failed, this is a leaf node.

Here is how the nodes are split:

1) First we find the minimal interval along the split axis which contains all triangles in the node. We then use the center of this interval as the split plane.

2) We traverse the array from both sides and swap triangles so that all triangles that are entirely to the left of the split plane end up in the beginning of the array. If such triangles are found and they are not all triangles in the node, we create a left node and point it to the found triangles. During the traversal we keep track of the minimal and maximal extents of the triangles along the axis and store them in the node (if created).

3) We traverse the remainder of the array (the triangles that are not entirely to the left) from both sides and swap triangles so that all triangles that are entirely to the right of the split plane end up in the end of the array. If such triangles are found and they are not all triangles in the node, we create a right node and point it to the found triangles. Again. during the traversal we keep track of the minimal and maximal extents of the triangles and store them in the node.

4) If there are remaining triangles between the found beginning and ending portions of the array they are put in the middle node and traversed to find the boundaries of the interval along the axis and store them in the node. These are triangles that intersect the split plane. In the case of the middle node, if it ends up containing all triangles in the node being split, but it is smaller, we still create it. This is done to cut off empty space.

5) We split the created nodes in the same way using their own bounding boxes and subarrays.

So, what happens is:

1) Each node is split into intervals as in BIH, but the intervals are three instead of two. The left and right intervals don't overlap but each of them may overlap the middle interval. The benefit of this is that if an intersection is found in the left node, the right node doesn't need to be traversed. This should help to eliminate triangles earlier than BIH. The triangles in the middle node would be split in KD.

2) Unlike BIH, the nodes have two boundaries, instead of one, so the resulting tree is lower. It is possible to keep only one boundary for each of the left and right node at the cost of a higher tree (more nodes => more memory + more traversal steps), but the middle one requires the two boundaries, so I preferred to keep two boundaries for all nodes.

3) I suppose the tree will be lower than KD and BIH because of the three child nodes.

4) In the nodes closer to the root, most triangles will go to the left and right nodes, so finding an intersection in one of these nodes will eliminate about 40% of the tree without traversing it.

5) If an area of the scene contains a mixture of big triangles and small triangles, the big triangles are likely to end up in the middle node. The bad thing about this is that probably most often these big triangles will be tested after (about half of) the small ones, though they are more likely to be intersected. I'll try to find a solution for this.

6) The leafs are mostly middle nodes of middle nodes created to cut off empty space.

Traversal is just like in KD and BIH, so I won't describe it.

For the most of the scenes that I tested (the RAW scenes contributed by toxie), the leafs of the built tree contain less than two triangles on the average, and the maximum number of triangles in a leaf is between 20 and 30. During traversal an average of 2 to 6 triangle tests per ray are performed. The average number of bounding box intersection tests is about 20-30 times greater than the number of triangle intersections. At the moment my bounding box intersection is crap (probably 5 times more expensive than my signed volume triangle intersection test) and also I should reduce its cost further by using the data from the previous intersections. What I'm trying to say is that my code cannot be used for comparison, that's why I don't provide any timings. I have very little time for coding, so I won't be able to optimize it soon. If someone has the time and is willing to implement it in an optimized manner alongside the other methods, I'd appreciate the comparison results.
_________________
--------

MishoM

(L) [2007/02/26] [goodbyte] [A new acceleration structure] Wayback!

It is an interesting idea, and definitely worth some testing/evaluation.

Traversal cost should be slightly higher than BIH (you may have to push two nodes onto the stack), but I'm not sure it will actually be faster; when the first intersection has been found in a BIH you can adjust the far plane and will only intersect with primitives from the "indeterminate" zone of each parent BIH node. This backtrack can therefore be quite fast.

I'm currently implementing a BIH-SAH compiler, and this AS should be easier to adopt to an existing KD-SAH compiler than what I'm currently writing (at least if you want it to be O(n log n)).

(L) [2007/02/27] [MishoM] [A new acceleration structure] Wayback!

I too don't know if it will be faster. That's why I posted it, so that people who have more time than me could implement it decently and compare it to other acceleration structures. It just seams to me that though the cost for traversing a single node will be higher than with BIH, the number of nodes traversed will be smaller and the overall cost will be the same or lower. But without tests this is just a speculation.

I will definitely write a decent implementation myself, but I have only 3-5 hours a week for coding so I won't be ready in the near 2-3 weeks.
_________________
--------

MishoM

(L) [2007/02/27] [slartybartfast] [A new acceleration structure] Wayback!

Hi MishoM

I think you are under the false assumption that we are all just sitting about with nothing to do [SMILEY Smile]

I think before you ask anybody to try your recipe, you have to cook it yourself. I have a day job too and a family, so I get even less than 3-5 hours a week to play with my own ray-tracer.

I hope this doesn't sound too flamish - I don't intend to flame. I'm just writing this so you don't get too disappointed if you don't get a lot of response.

Just to add my 2cents worth about your algorithm: While I haven't tried your algorithm (obviously), I have played around with various variations of BIH and all of them are slower. The BIH algorithm is so lean-n-mean that any implementations screams through both the tree construction and traversal stages. In my own particular ray-tracer, the bottleneck is now the triangle intersection test, not BIH traversal. Having said that - there are still things that cause BIH a problem - like the large triangle problem. I think the best way to solve that might be to tesselate large triangles up-front so there are no large triangle. It depends on prior-knowledge of the scene, but you could control the maximum size of triangles pretty easily.
_________________
S. Hurry, or you'll be late.

A. Late ? Late for what ?

S. Late. As in "The Late Dent-Arthur-Dent"

(L) [2007/02/27] [MishoM] [A new acceleration structure] Wayback!

Hi slartybartfast

No offence taken, I got your point.

Sorry if my post sounded as if I'm waiting around for someone to do this for me. As I said, I will optimize my own implementation, so if no one picks up the idea, that's OK. I'd just like to see how well implementations by other people would do.

I don't think that all of you have more time than me, however I suppose there are some people here who have much more time than me or you, so if they are interested, and want to try this idea, it would be nice. And it would give me some preliminary results, because they will certainly manage to do it before me [SMILEY Wink].

Also the ray tracers of many people here are in a much more advanced stage than mine, and they have much more experience in ray tracing and graphics programming than me so they are in a much better position to create a well optimized implementation. They also have BIH and KD implementations to compare with.

Anyway, about tessellating the big triangles - I hope a solution can be found which doesn't change the original scene, because this will require additional maintenance for animated scenes especially if the animation logic is not integrated into the ray tracer.
_________________
--------

MishoM

(L) [2007/02/27] [Phantom] [A new acceleration structure] Wayback!

That's exactly what I did recently! I now have 2.5 hours raw coding time (no phone, no internet, no distractions) in which I can do tons of work. Works fine for me. That doesn't mean I'll do your project though. [SMILEY Wink]

BTW I really like it that you search for alternatives that try to overcome some of the hurdles other approaches have. That's quite valuable in my opinion, even if this one doesn't work out. It will also improve your understanding greatly.
_________________
--------------------------------------------------------------

Whatever

(L) [2007/02/28] [MishoM] [A new acceleration structure] Wayback!

Thanks for the ideas.

What I do is, I carry a little notepad (paper [SMILEY Wink]) with me and write down ideas in it, down to step by step descriptions of algorithms. However this doesn't increase my coding and debugging time. A laptop would help some, but I cannot afford it right now.

Anyway, when I manage to optimize my code (better BB intersection, less branches, stack instead of recursion) I'll post my results. Anyone who's interested can probably expect some statistics in one or two months.
_________________
--------

MishoM

(L) [2007/02/28] [greenhybrid] [A new acceleration structure] Wayback!

[SMILEY Very Happy]

Even if it does not increase your coding time, it makes it more efficient, because you can carefully study a paper without having the fingers near the keyboard...dunno if you get what I mean, but, ..., [SMILEY Smile]

EDIT: oops, forgot the core of my text: Then,...! After arriving at home: bat the hell out of the keyboard. No more study at home. Code. Rush.

but I don't wanna spam too much off topic here...
_________________
[LINK http://greenhybrid.net/ greenhybrid.net]

Real Men code Software Graphics.

(L) [2007/02/28] [slartybartfast] [A new acceleration structure] Wayback!

MishoM

I'm not sure I understand your comment about animation management. Assuming the geomerty is given to the ray tracer exactly how it should be rendered, then splitting a big triangle into smaller ones should give you exactly the same result.

Why would the code doing the animation have to get involved ?

greenhybrid

I like the idea of doing work on the trian. Unfortunately, my commute is too short to make it worth it. Instead, I get all my reading / coding done after the family has gone to bed. The down-side to this - of course - is that a lot of papers are very conducive to sleep [SMILEY d'oh!]

However - I do get useful work done sometimes. Last night I wrote a really crude PLY file reader so I could read in the "bunny" and "dragon" data sets. It's the first time I've ever run that many triangles ( > 800,000 ) through my ray tracer. It's really started to work hard [SMILEY Wink]
_________________
S. Hurry, or you'll be late.

A. Late ? Late for what ?

S. Late. As in "The Late Dent-Arthur-Dent"

(L) [2007/02/28] [MishoM] [A new acceleration structure] Wayback!

What I meant is that if you split large triangles into smaller triangles, this would probably mean replacing the the big triangle with the smaller ones. Which would mean that the rendering part of the program would modify its input data. This is no problem, if this data is used only by the ray tracer, but if other parts of the program would need to work with the triangles between frames, for example to move some objects, there is trouble.

I envy all of you who can work after the rest of the family has gone to bed. Before I got married, I used to spend several hours on the computer every night, but now this is impossible, because we have only one room and a kitchen. There is no room for my PC in the kitchen, and my wife and the baby sleep in the other room, so when it's time to go to bed, coding is out of the question. The only solution is a laptop which I could use in the kitchen, but it is not in the family budget for the near years [SMILEY Crying or Very sad].

Anyway, that's enough whining on my part, the good thing is that I'll get to upgrade to a Pentium D in a month or so, so I will be able to try multi threaded techniques on two cores [SMILEY Very Happy].
_________________
--------

MishoM

(L) [2007/03/01] [beason] [A new acceleration structure] Wayback!

I feel for you, MishoM. It's too bad "One Laptop Per Child" isn't a law, or else maybe that baby could score you one. [SMILEY Razz]

(sorry, that was dumb, I know...)

(L) [2007/03/01] [slartybartfast] [A new acceleration structure] Wayback!

MishoM

OK. I'm going to stop complaining about anything, ever. My life is good. [SMILEY Rolling Eyes]

As for the triangle thing - Yes, if something needs to use the triangles between frames, then modifying the geometry could be a problem. However, I was assuming the rendering part of whatever application is running would be a seperate entity and that the triangles would be "thrown over the wall" to the renderer to do with as it pleases.
_________________
S. Hurry, or you'll be late.

A. Late ? Late for what ?

S. Late. As in "The Late Dent-Arthur-Dent"

(L) [2007/03/02] [MishoM] [A new acceleration structure] Wayback!

I think that if the scene size is in the million(s) triangles range, copying them 20 times per second will reduce performance considerably. So in my opinion, for games dynamics, animation and rendering should be integrated if possible.
_________________
--------

MishoM

(L) [2007/03/12] [toxie] [A new acceleration structure] Wayback!

to get back on your first post: i like the idea. but i don't yet get it completly. are you storing 2 values (=one interval) per child (which sums up to 6 floats per node) or is it less?
_________________
The box. You opened it. We came.

(L) [2007/03/13] [MishoM] [A new acceleration structure] Wayback!

Well, things have changed a little since I wrote the first post. Then I was storing one bounding box for the whole scene (the minimum and maximum values for each axis) and for every node I was storing its orientation (axis) and two limits for the interval - the minimum and maximum value for the axis specified by orientation of the node.

However then my bounding box intersection test was very bad - it was full of branches and didn't work for flat boxes so I artificially expanded flat boxes with an epsilon value. It also didn't use the information from the parent node intersection. I wrote a new test which has no branches and no divisions and works (most importantly [SMILEY Very Happy]). But with this test using the information from the parent will reduce its cost with only one third, so now I am storing the full bounding box in each node. I don't think that's bad, because it reduces the height of the tree which means fewer intersection tests, even if they are more expensive. I didn't actually compare the performance of the two variants, but I personally favour the one with the full bounding box.

I still don't have results that can be used for comparisson, because at the moment both the tree building and traversal are recursive and not very optimized. I made some minor optimizations and now my speed is about 40000 primary rays per second for the Fairy Forest scene on a 2.2 GHz Northwood 128 Celeron. The tree is built in about 680 ms. However I still haven't implemented any advanced optimizations and don't use SSE. The next things on my list are implementing a stack based implementation and a solution of the big triangles problem. Next I will probably implement shadow rays with their own intersection test.

Which brings me to the solution I think I have found for the big triangles problem but still haven't tried. As I mentioned earlier, with the ternary tree the big triangles will end in middle nodes high in the tree. The bigger the triangle, the higher it will go. So, what I think can be done is to test if the middle child node overlaps with the side child nodes and if so, the middle node is tested before the side node. Of course this test should be performed separately for left to right rays and for right to left rays. The way I imagine it is: during tree building, when all child nodes for a parent node are created, the middle node is tested for overlap with the side nodes and the two results are stored in the parent node in the form of two arrays of pointers which specify the order in which the children should be traversed by left to right rays and right to left rays respectively. So, instead of storing the pointers to the children once, we store them twice, once for each of the arrays. The other variant is to store them in one array and store indexes to the array in two bytes (three indexes in each byte). The middle node should be tested first if it overlaps for example one forth (or maybe one half) of the side node. And for shadow rays it may be advantageous to always test the middle node first, because that's where most of the big triangles will be.

I don't know if I've managed to explain my ideas clearly, so if there are some unclear points ask me. Also, if you want to try to implement this structure, I'll send you some code in a private message or by e-mail. I don't want to post it here because it is still ugly [SMILEY Embarassed].
_________________
--------

MishoM

(L) [2007/03/13] [Ho Ho] [A new acceleration structure] Wayback!

[OT]

(L) [2007/03/14] [MishoM] [A new acceleration structure] Wayback!

Ho Ho and lycium,

Thank you both for the advice. I was going to buy it in several days. so you were right on time [SMILEY Very Happy]. I don't know much about different processor models, so Pentium D seemed a good choice. I'll look for alternatives. BTW, what is your opinion on Athlon64 X2?
_________________
--------

MishoM

(L) [2007/03/14] [lycium] [A new acceleration structure] Wayback!

glad to be of service :)

i use an athlon x2 4400+, and am quite happy with it. however, the new hotness (not literally, as one might expect from the infernal pentium d) is intel's core 2 duo line of processors, which are twice as fast as the athlons when performing simd computations, and generally quite a bit faster overall (something like 20-30% clock-for-clock). they also overclock like absolute crazy, if you're into getting something like 2ghz for free ;) the trick there is to get a decent motherboard and good memory. however, if you get a socket AM2 x2 you'll be able to upgrade it to amd's brand new quadcore architecture when that comes out, which should in turn trump intel's current quadcore processors...

in any case, you'll be well served by either of those chips, completely night and day different from the experience you'll get with a pentium-d: for one thing, your family will complain about heating up the house, no jokes! that thing runs in the 70-80 range when under load, with a chainsaw-fan, compared to 30-50 for the newer chips, and has absolutely no performance to show for it.

(L) [2007/03/14] [Ho Ho] [A new acceleration structure] Wayback!

If you can you might want to wait until the end of April. There will be a big pricedrop then: [LINK http://www.hkepc.com/bbs/itnews.php?tid=755292&starttime=1173657600&endtime=1173744000]

Of cource in June there will be another, even greater drop: [LINK http://www.hkepc.com/bbs/itnews.php?tid=755282&starttime=1173657600&endtime=1173744000]

[LINK http://www.hkepc.com/bbs/itnews.php?tid=754873&starttime=1173657600&endtime=1173744000]

Q6600 for $266 sounds nice. OC it to 3.6Ghz and you have about the same performance as Phantoms Octopussy [SMILEY Smile]

As for the new AMD quadcore, I don't expect it to be majorly better than Core2. At best, it is 10-15% faster at same clocks. As it will have more or less the same performance in SIMD as Core2 it probably won't be that much better. Of cource it will also have SSE4 with a few nice instructions that might be quite interesting ...
_________________
In theory, there is no difference between theory and practice. But, in practice, there is.

Jan L.A. van de Snepscheut

(L) [2007/03/20] [hanatos] [A new acceleration structure] Wayback!

Hey MishoM,

as i'm still a student i found some time to implement your idea. even though the results are not yet really convincing, i'd like to share them. i do think there is some room for improvement here.

here is a comparison of my implementations of kd/BIH/ternary tree,

stanford bunny 69451 triangles, mono ray, 512x512

(1.6GHz pentium-m, if that matters. not supposed to be strikingly fast)

kdtree

average number of tris/leaf: 2.14

depth is 20.63 avg, 25 max

for a total of 196227 leafs with 419760 tris.

3.59 fps

BIH (+clipnodes), global heuristic

counted 43952 nodes, 18.84% clipnodes, max depth=27

accel construction took 0.051 seconds

3.41 fps

ternary interval hierarchy, adapted global heuristic

counted 39350 nodes,

20821 leaf nodes with an average of 3.34 triangles each,

average depth 13.03, max 18

plus 3378 + 3560 + 3500 (unstored) empty L, M, R children, resp.

accel construction took 0.042 seconds

2.78 fps

ternary interval hierarchy, aabb/2 as split candidate

counted 30875 nodes,

20327 leaf nodes with an average of 3.42 triangles each,

average depth 11.02, max 17

plus 191 + 81 + 166 (unstored) empty L, M, R children, resp.

accel construction took 0.072 seconds

0.98 fps

these screenshots are made using cpu cycles (rdtsc) as the pixel color (all scaled by the same factor to fit the range)

kd:

[IMG #1 ]

BIH

[IMG #2 ]

tih adapted global heuristic

[IMG #3 ]

tih aabb/2

[IMG #4 ]

the timings were made using gcc 4.3.0 20070302. interestingly the performance of tih/gh drops to 1.7 fps when i use icc 9.1. i suspect a serious nobrainer in my traversal code which is discovered by gcc and not by icc..

sourcecode is here:

[LINK http://www.uni-ulm.de/~s_jhanik/ternary.h www.uni-ulm.de/~s_jhanik/ternary.h]

[LINK http://www.uni-ulm.de/~s_jhanik/ternary.c www.uni-ulm.de/~s_jhanik/ternary.c]

ps: how do you call your new accel btw? sorry for choosing tih if you don't like that..

[IMG #1]:Not scraped: https://web.archive.org/web/20071028084845im_/http://www.uni-ulm.de/~s_jhanik/kd-rdtsc-3.59fps.png
[IMG #2]:Not scraped: https://web.archive.org/web/20071028084845im_/http://www.uni-ulm.de/~s_jhanik/bih-rdtsc-3.41fps.png
[IMG #3]:Not scraped: https://web.archive.org/web/20071028084845im_/http://www.uni-ulm.de/~s_jhanik/tih-rdtsc-2.78fps.png
[IMG #4]:Not scraped: https://web.archive.org/web/20071028084845im_/http://www.uni-ulm.de/~s_jhanik/tih-aabb-rdtsc-0.98fps.png

(L) [2007/03/20] [lycium] [A new acceleration structure] Wayback!

awesome, awesome work! thank you for sharing these results and providing code to experiment with :D

(L) [2007/03/20] [MishoM] [A new acceleration structure] Wayback!

Hi, hanatos

Thank you very much for trying out this idea. Is it too much to ask you to provide some more statistics? What I'd like to see is the average number of nodes traversed per ray and average number of triangles intersected per ray for all four structures if possible.
_________________
--------

MishoM

(L) [2007/03/20] [hanatos] [A new acceleration structure] Wayback!

not at all too much asked, just a matter of dumping the info.

here you are:

            avg nodes,          avg tris

kd               14.96              1.58

bih              11.80              3.89

tih, gh           10.98              4.99

tih, aabb/2    11.89             11.99

..and as images, color code: green ~ num nodes,  blue ~ num tris touched.

kd

[IMG #1 ]

bih

[IMG #2 ]

tih, gh

[IMG #3 ]

tih, aabb/2

[IMG #4 ]

[IMG #1]:Not scraped: https://web.archive.org/web/20071028084845im_/http://www.uni-ulm.de/~s_jhanik/kd.png
[IMG #2]:Not scraped: https://web.archive.org/web/20071028084845im_/http://www.uni-ulm.de/~s_jhanik/bih.png
[IMG #3]:Not scraped: https://web.archive.org/web/20071028084845im_/http://www.uni-ulm.de/~s_jhanik/tih-gh.png
[IMG #4]:Not scraped: https://web.archive.org/web/20071028084845im_/http://www.uni-ulm.de/~s_jhanik/tih-aabb.png

(L) [2007/03/21] [MishoM] [A new acceleration structure] Wayback!

Thank you. When I have some time (most probably about ten days from now), I will write some code to be able to read in the bunny model and will compare statistics. Could you please provide your camera position and direction/look at?

I looked through your code and my implementation is very different from yours. The most important difference is that I changed my initial implementation and now store the full bounding box of each node in its data structure. Something like this (I don't have the source with me now):

(L) [2007/03/21] [hanatos] [A new acceleration structure] Wayback!

camera position. unfortunately i didn't invest a lot of time in the code there, so this might be a mess:

i think my coordinate system's x-axis hast the wrong sign, also my version of the bunny is scaled, so the aabb is:

aabb: -946.90 329.87 -618.74 610.09 1873.21 588.00

..given that and dumping {0,0,1} through the quaternion transform, my camera is:

position 120.473740 1035.674438 2233.336426

direction -0.161088 0.062284 -0.984993

hope that helps.

if you want a precise comparison maybe we have to pick the same model..

(L) [2007/03/21] [MishoM] [A new acceleration structure] Wayback!

Thanks. I will post my results as soon as possible. Also thanks for posting the RA2 export and import scripts for Blender. I will certainly use them.

Using the same scenes for comparison sounds good. At the moment my program supports RA2 files only, and I've only been testing with the scenes uploaded by toxie. So it would be nice if you can upload on your site several of the scenes you are using or send them to my e-mail (mihaylov dot mihail at gmail dot com). If that's a problem, I'll just download the scenes from where you downloaded them, and convert them with the script you provided. Or maybe tbp will agree to host them in the scene repository? Then more people will have the same scenes to test on and compare results.

I will post precise results for all scenes I have, but I won't be able to test before the end of the week. I'd also prefer to implement a stack-based variant of the traversal and building routines and to optimize some things before I post my results, but that will wait for now.
_________________
--------

MishoM

(L) [2007/03/21] [hanatos] [A new acceleration structure] Wayback!

okay, here are some stats for sponza, 66453 triangles:

aabb: -348.055 -18.1338 -155.97 348.345 313.066 156.03

position 75.014046 18.364111 28.899490

direction -0.978106 0.154402 -0.139699

timings were made on the same machine as above, again with rdtsc shading (so the images below might have had a different rate, but very similar. just to be precise on that.).

kd

average number of tris/leaf: 2.62

depth is 22.04 avg, 60 max (which is my max-depth. something still wrong here..?)

for a total of 177413 leafs with 465266 tris

avg nodes touched: 65.96 avg num tris touched: 20.62

1.63 fps

BIH counted 44805 nodes, 21.72% clipnodes, max depth=33

accel construction took 0.054 seconds

avg nodes touched: 104.71 avg num tris touched: 14.78 tris

1.13 fps

ternary interval hierarchy counted 46651 nodes,

20289 leaf nodes with an average of 3.28 triangles each,

average depth 14.55, max 24

plus 6995 + 7005 + 7326 (unstored) empty L, M, R children, resp.

accel construction took 0.039 seconds

avg nodes touched: 64.02 avg num tris touched: 35.72

1.09 fps

ternary tree aabb/2:

sorry, my implementation fails to boil the number of tris/leaf down to below 6.. i don't think the timings here would be fair. some leaves end up with > 20 triangles.

and the color coded images that go with it:

kd

[IMG #1 ]

bih

[IMG #2 ]

tih

[IMG #3 ]

[IMG #1]:Not scraped: https://web.archive.org/web/20071028084845im_/http://www.uni-ulm.de/~s_jhanik/sponza-kd.png
[IMG #2]:Not scraped: https://web.archive.org/web/20071028084845im_/http://www.uni-ulm.de/~s_jhanik/sponza-bih.png
[IMG #3]:Not scraped: https://web.archive.org/web/20071028084845im_/http://www.uni-ulm.de/~s_jhanik/sponza-tih.png

(L) [2007/03/21] [MishoM] [A new acceleration structure] Wayback!

Again thank you very much. I will refrain from more comments for now, until I get the chance to test as much of your scenes as possible [SMILEY Wink]
_________________
--------

MishoM

(L) [2007/03/21] [hanatos] [A new acceleration structure] Wayback!

welcome.. glad to be helpful. [SMILEY Smile]

(L) [2007/03/26] [MishoM] [A new acceleration structure] Wayback!

Here are some preliminary results. I didn't have time to implement tree height counting or to find your view points based on the bounding box, so my screen shots are rendered from camera origins different from yours. I took the stanford bunny model directly from the original source. The Sponza scene is taken from some version of phantom's Arauna demo, so it may be transformed in some way.

Here are the stats (not full) and visualizations of the tree colour coded as in your visualisations:

Bunny

Number of leaves in tree: 44073

Average number of triangles per leaf: 1.58

Max number of triangles per leaf: 6

Average BB intersections per ray: 26.27

Average triangle intersections per ray: 1.174

[IMG #1 ]

Sponza

Number of leaves in tree: 39557

Average number of triangles per leaf: 1.925

Max number of triangles per leaf: 18

Average BB intersections per ray: 78.73

Average triangle intersections per ray: 2.85

[IMG #2 ]
_________________
--------

MishoM

[IMG #1]:Not scraped: https://web.archive.org/web/20071028084845im_/http://layoutsource.net/uploads/images/2007-03-26/HO2dYeTU33.png
[IMG #2]:Not scraped: https://web.archive.org/web/20071028084845im_/http://layoutsource.net/uploads/images/2007-03-26/spo3Is9mkO.png

(L) [2007/03/26] [tbp] [A new acceleration structure] Wayback!

Is it me wrongly interpreting those renders or both implementations have lots of issues with traversal noise?
_________________
May you live in interesting times.

[LINK https://gna.org/projects/radius/ radius] | [LINK http://ompf.org/ ompf] | [LINK http://ompf.org/wiki/ WompfKi]

(L) [2007/03/26] [hanatos] [A new acceleration structure] Wayback!

wow, very cool! very promising numbers of average tri intersections/ray!

one reason is definitely that i stop building the tree at 6 tris/node.

but maybe you also have a better traversal strategy? is it very different from mine?

the next interesting point would be how the full aabb test during traversal and the increased memory usage is going to affect the speed..

sorry, for some reason i missed your post about file formats. i must have typed mine in the meantime..

so: i can only import .ra2 as well. the blender scripts work for most (small) meshes, but i still like the idea of sharing meshes somewhere. i just don't really have enough space for a lot of meshes on my student account. also i'm not sure about legal issues?

but telling from the numeric artifacts on your sponza i assume that it's also still axis aligned, so it's really only the scale factor that would be different. my bunny is also just a scaled original ply version, so i guess that's fine.

(L) [2007/03/26] [toxie] [A new acceleration structure] Wayback!

@tbp: i think these are the cases where 2 of the splitplanes have the same value AND a triangle is exactly on these splitplanes. Then sometimes precision of the triangle test is good enough to stop the traversal, other times the second node is also traversed?!
_________________
The box. You opened it. We came.

(L) [2007/03/26] [MishoM] [A new acceleration structure] Wayback!

tbp:

I'm not experienced enough with ray tracing yet, so I don't understand what you mean by "traversal noise". Do you mind explaining?

hanatos:

Yes, the number of triangle intersections per ray is good, however the node checks are much more expensive, so maybe the overall cost of tracing a ray will still be greater than other methods. My traversal strategy is exactly the same as yours. I think the main reason for the small number of triangle intersections is the small number of triangles in the leafs of the tree.

As for scenes, there is a scene repository here, which is maintained by tbp. Anyone should be able to upload scenes there, as long as he approves it.
_________________
--------

MishoM

(L) [2007/03/26] [MishoM] [A new acceleration structure] Wayback!

Now I understand what you are talking about. I noticed it too, but haven't had the time to look into it yet. But toxie may be onto the problem, because in axis aligned scenes my tree produces a lot of flat cells. hanatos's implementation should have them too, but deeper in the tree, which would explain why they are less noticeable in his case (if this is the cause of the noise).
_________________
--------

MishoM

(L) [2007/03/26] [MishoM] [A new acceleration structure] Wayback!

After inspecting the noisy areas in the wireframe view, it seems to me that in these areas the tree should have overlapping flat nodes. In my implementation there are no empty nodes, so each of these overlapping leaves contain axis aligned triangles, most probably only one.

So what I think happens is: after the ray hits a triangle, it gets clipped to prevent hitting triangles behind, but in some cases this prevents it from entering the next overlapping node, and in others it doesn't, because of precision loss when clipping the ray. This theory is supported by the fact that the noisy areas seem to outline actual triangles in the model.

If this is really what happens, I don't think it is a problem, because the correct hit is found and that's what is needed.
_________________
--------

MishoM

(L) [2007/03/26] [hanatos] [A new acceleration structure] Wayback!

@tbp: yes, my implementations definitely don't solve that problem. except maybe for BIH, which just inherently works. i did not spend more time on that than thinking hard about every < or <=. especially the kd can't handle flat nodes at all. i didn't claim the opposite... really just wanted to say that it's probably axis aligned.

(L) [2007/08/20] [MishoM] [A new acceleration structure] Wayback!

Hi, gfyffe

Thanks for your response. I have also given up on the ternary tree for the moment. My implementation of it was a bounding volume tree with each node a full-blown bounding volume. It turned out that the more expensive traversal of the nodes in comparison to KD-trees and BIH outweighed the benefits of having a middle node. I hadn't the time to implement the splitting plane variant. I suppose it could give performance similar to BIH and KD, but I have another idea now. It is still in a very early stage, so I don't want to discuss it yet. When it's a little bit more clear I'll probably post something here.
_________________
--------

MishoM

back