Performance issues with Wald & Havran's O(N log N) kd-tree construction


[2019/02/07] [Post by olliej] [Performance issues with Wald & Havran's O(N log N) kd-tree construction]

I've been attempting to implement Wald and Havran's O(N log N) kd-tree construction for a photon map implementation, and it consistently has vastly worse construction performance. I'm not sure how much of that is due to my splitting semantics.

If I implement the naive O(N log^2 N) algorithm using a median split as suggested by Jensen, then the split logic (assuming a sorted list) is (pseudo code):
Code:
split_axis = {longest axis}
List<Photons> the_photons = {list of photons sorted along split_axis}
split_index = the_photons.len / 2
split_plane = the_photons[split_index].position[split_axis]
left = the_photons[0..split_index)
right = the_photons[split_index..the_photons.len)

This fairly obviously gives a guaranteed worst-case time complexity of O(N log^2 N), and measured performance does seem to grow at approximately that rate.
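
For concreteness, here is an untested C++ sketch of that naive build (Photon and KdNode are just placeholder types, not my real ones; the median photon goes into the right child, matching the pseudo code above):

Code:
// Untested C++ sketch of the naive O(N log^2 N) build described above.
#include <algorithm>
#include <memory>

struct Photon { float position[3]; };

struct KdNode {
    int axis = -1;                      // -1 marks a leaf
    float split_plane = 0.0f;
    const Photon* photon = nullptr;     // only set for leaves
    std::unique_ptr<KdNode> left, right;
};

// Axis with the largest extent over [begin, end).
int longest_axis(const Photon* begin, const Photon* end) {
    float lo[3] = {1e30f, 1e30f, 1e30f}, hi[3] = {-1e30f, -1e30f, -1e30f};
    for (const Photon* p = begin; p != end; ++p)
        for (int a = 0; a < 3; ++a) {
            lo[a] = std::min(lo[a], p->position[a]);
            hi[a] = std::max(hi[a], p->position[a]);
        }
    int best = 0;
    for (int a = 1; a < 3; ++a)
        if (hi[a] - lo[a] > hi[best] - lo[best]) best = a;
    return best;
}

std::unique_ptr<KdNode> build_naive(Photon* begin, Photon* end) {
    if (begin == end) return nullptr;
    auto node = std::make_unique<KdNode>();
    if (end - begin == 1) { node->photon = begin; return node; }

    int axis = longest_axis(begin, end);
    // Full sort at every level: O(N log N) per level, O(N log^2 N) in total.
    std::sort(begin, end, [axis](const Photon& a, const Photon& b) {
        return a.position[axis] < b.position[axis];
    });
    Photon* mid = begin + (end - begin) / 2;

    node->axis = axis;
    node->split_plane = mid->position[axis];
    node->left = build_naive(begin, mid);
    node->right = build_naive(mid, end);  // median photon goes right, as in the pseudo code
    return node;
}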

When I try to use the W&H approach the constant factors seem to be vastly higher, and the tree ends up being much worse. I believe the issue is how I'm splitting.

Here's what I'm doing (pseudo code again)
Code:
 initial setup:
   events = []
   for photon in photons {
      for axis in {x, y, z} {
         events.push({axis: axis, photon: &photon}) // actual data structure is saner, but functionally equivalent
      }
   }
   sorted_events = sort events by the photon's position along each event's own axis

 splitting:
   events = sorted list of events for this node
   split_axis = {longest axis}
   left_count = events.len / 3 / 2 // we have 3 events per photon, and we want half of the photons on the left

   // find the actual split point: the left_count-th event on the split axis
   split_event = {}
   max_axis_count = 0 // actual variable name is also terrible :D
   for event in events {
      if (event.axis != split_axis)
        continue
      max_axis_count++
      if (max_axis_count != left_count)
        continue
      split_event = event
      break
   }
   ...
 
At this point I have to actually split around split_event, but I can't work out how to do so efficiently.

The reason my current code goes wrong is that I split on event.position[split_axis] < split_event.position[split_axis].

This trivially ends up breaking the invariant that the left and right children each get half of the photons along the split axis.

The problem I can't work out is how to do the split efficiently, given that the events on either side of split_event (for the other axes) may belong to photons that should end up in the other half. What we logically want is something like:
Code:
   ...
   left_events = set{} // photons whose split-axis event lands on the left of the split
   max_axis_count = 0
   for event in events {
      if (event.axis != split_axis)
        continue
      left_events.add(event.photon)
      max_axis_count++
      if (max_axis_count != left_count)
        continue
      split_event = event
      break
   }

then our split logic easily becomes a matter of splitting on left_events.contains(event.photon).
But that introduces a hash table or similar which both uses a large amount of memory and is expensive to build.

The alternative method I can think of is to store a left/right flag in the photon structure itself and update it as appropriate during the split search. Doing that seems kind of ugly, but it also seems like it would be the computationally fastest way to do it?
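
To make that concrete, here is an untested sketch of the flag idea with made-up types (my real structures look different): sweep the split-axis events once to tag each photon with a side, then distribute all events in a single in-order pass.

Code:
// Untested sketch of the "store the side in the photon" idea; names are made up.
// Phase 1: sweep the split-axis events and tag each photon left or right.
// Phase 2: distribute every event (all three axes) by its photon's tag. Because we
// walk the already-sorted event list in order and only append, both outputs stay sorted.
#include <cstddef>
#include <vector>

struct Photon {
    float position[3];
    int side;                  // scratch field: 0 = left, 1 = right
};

struct Event {
    int axis;                  // axis this event was generated for
    Photon* photon;
    float position;            // cached photon->position[axis], used for sorting
};

void split_events(const std::vector<Event>& events, int split_axis, std::size_t left_count,
                  std::vector<Event>& left_out, std::vector<Event>& right_out) {
    // Phase 1: the first left_count photons along split_axis go left, the rest go right.
    std::size_t seen = 0;
    for (const Event& e : events) {
        if (e.axis != split_axis) continue;
        e.photon->side = (seen < left_count) ? 0 : 1;
        ++seen;
    }
    // Phase 2: stable, in-order distribution of all events.
    for (const Event& e : events) {
        if (e.photon->side == 0) left_out.push_back(e);
        else                     right_out.push_back(e);
    }
}

The distribution pass is O(N) per level and, because it walks the already-sorted event list in order, both halves stay sorted on every axis without any hash lookups; as far as I understand, this mirrors the classification/distribution step Wald & Havran describe for triangles.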
[2019/02/08] [Post by friedlinguini] [Performance issues with Wald & Havran's O(N log N) kd-tree construction]

It seems like unnecessary complexity to use a SAH construction algorithm just to organize a point cloud, especially if the heuristic is just a median split; in that case you don't need a plane sweep or a full sort at all. I'd also get rid of mixing component values from different dimensions in a single sorted list. As you've noticed, it just makes it harder to separate them back out.

Just do a partial sort to move the median photon into place (e.g., std::nth_element in C++), using a comparison function that compares a single axis value. Now you have two subranges that represent the contents of the child subtrees. It's essentially like performing a quicksort and a tree construction simultaneously.
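
Something along these lines (untested sketch; Photon and KdNode are placeholders, and I'm round-robining the split axis rather than picking the longest one):

Code:
// Untested sketch of the nth_element-based build (Photon/KdNode are placeholders).
// Each nth_element call is expected O(n), so the whole build is O(N log N) in expectation.
#include <algorithm>
#include <memory>

struct Photon { float position[3]; };

struct KdNode {
    int axis = -1;
    const Photon* photon = nullptr;        // median photon stored in the node
    std::unique_ptr<KdNode> left, right;
};

std::unique_ptr<KdNode> build(Photon* begin, Photon* end, int depth) {
    if (begin == end) return nullptr;
    int axis = depth % 3;                  // round-robin; swap in "longest axis" if preferred

    Photon* mid = begin + (end - begin) / 2;
    // Partial sort: the median lands at mid, everything before it compares <=,
    // everything after it compares >= along the chosen axis. No full sort needed.
    std::nth_element(begin, mid, end, [axis](const Photon& a, const Photon& b) {
        return a.position[axis] < b.position[axis];
    });

    auto node = std::make_unique<KdNode>();
    node->axis = axis;
    node->photon = mid;
    node->left = build(begin, mid, depth + 1);
    node->right = build(mid + 1, end, depth + 1);
    return node;
}

// Usage: build(photons.data(), photons.data() + photons.size(), 0);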
[2019/02/10] [Post by olliej] [Performance issues with Wald & Havran's O(N log N) kd-tree construction]

I am just doing the naive median split; the problem is that doing a median split still requires finding the median element, which is O(N log N), so you end up with O(N log^2 N) construction. The W&H approach should give O(N log N) construction complexity, but my implementation of it is vastly slower (proportionally) than the naive approach.
[2019/02/10] [Post by friedlinguini] [Performance issues with Wald & Havran's O(N log N) kd-tree construction]

Calculating a median can be O(n) (see std::nth_element, the median-of-medians algorithm, etc.), so the overall algorithm I described should run in O(n log n).

If you’re using a language that doesn’t have a linear median/partition library function, then that can be a bit of a rabbit hole.
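
If you do end up rolling your own, the core of quickselect is only a couple of dozen lines. A rough, untested sketch (shown in C++ purely for illustration, with a random pivot and Lomuto partitioning):

Code:
// Moves the k-th smallest photon (ordered by position[axis]) into photons[k] and
// partitions the rest around it; expected O(n). Precondition: lo <= k <= hi.
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

struct Photon { float position[3]; };

void quickselect(std::vector<Photon>& photons, std::size_t lo, std::size_t hi,
                 std::size_t k, int axis) {
    while (lo < hi) {
        // Random pivot to avoid quadratic behaviour on (nearly) sorted input.
        std::size_t pivot_index = lo + std::rand() % (hi - lo + 1);
        float pivot = photons[pivot_index].position[axis];
        std::swap(photons[pivot_index], photons[hi]);

        // Lomuto partition: everything strictly less than the pivot ends up in [lo, store).
        std::size_t store = lo;
        for (std::size_t i = lo; i < hi; ++i)
            if (photons[i].position[axis] < pivot)
                std::swap(photons[i], photons[store++]);
        std::swap(photons[store], photons[hi]);

        if (store == k) return;            // the k-th element is now in place
        if (store < k) lo = store + 1;     // continue in the right part
        else           hi = store - 1;     // continue in the left part
    }
}

// Usage (median along 'axis', non-empty range):
// quickselect(photons, 0, photons.size() - 1, photons.size() / 2, axis);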
[2019/02/11] [Post by olliej] [Performance issues with Wald & Havran's O(N log N) kd-tree construction]

The only O(N) median searches I'm aware of require a sorted input, which (as we end up effectively alternating split axes) requires a sort at each level of recursion, giving log N levels of O(N log N) work. I *could* do a radix sort, which would technically make it linear, but I suspect I would fall down the hole of debugging why an ad-hoc in-place radix sort implementation has bugs :)
[2019/02/11] [Post by olliej] [Performance issues with Wald & Havran's O(N log N) kd-tree construction]

And immediately following that post I remembered Quickselect as a thing that existed. *sigh*
