Disappointing TriAccel results

(L) [2007/08/03] [Zakalwe] [Disappointing TriAccel results] Wayback!

I've implemented TriAccels and, compared to Moeller's test, I'm only registering marginal gains. I suppose any speedup is great, but I was really hoping to break the 30fps barrier on one of my benchmarks that is currently sitting at ~27fps [SMILEY Smile]. This is of course a mono-ray path.

What are everyone else's results using TriAccels?

Here's my current code if anyone's interested. Ignore the funny-looking variables like mBvCupBuCv. They're left over from reducing the variables (mBvCupBuCv = (-B[v]*C[u]) + (B[u]*C[v]), etc.). I'll clean them up later. What's important is that the code produces the same u,v,t as the Moeller code.

(L) [2007/08/03] [moris] [Disappointing TriAccel results] Wayback!

I use a simplified version when the triangle is axis-aligned: my_tri.n_u == 0 && my_tri.n_v == 0.

In this case the computation of "t" becomes:

(L) [2007/08/03] [Michael77] [Disappointing TriAccel results] Wayback!

Actually, I found Moeller's test to be within a 10% range of Wald's triangle test, and that's without storing additional information like the triangle edge vectors. Handling double-sided triangles is a bit more complicated for SIMD, but it gives you the advantage of doing the division late (after the inside test, because those fail most of the time). So after some tweaking, the Moeller triangle test can be just as fast on average as Wald's test (especially if you store the edge vectors).

My theory on why most people think Wald's test is faster: Wald tested it using a kd-tree. In a kd-tree you test far fewer triangles than in a BVH/BIH, and most of the time you get hits. Wald's test is pretty fast in that case. On the other hand, in a BVH/BIH the number of triangle tests increases, and most of the time a triangle is rejected based on the barycentric coordinates. And this is where Moeller's test is faster.
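To illustrate the late division, here is a minimal scalar sketch of a two-sided Moeller-Trumbore test that compares the unnormalised barycentrics against the determinant and divides only once a hit is certain. It is not Michael77's code; the Vec3 and Hit types, the function name, and the epsilon are assumptions made for the example.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Hit  { float t, u, v; };

static Vec3  sub(const Vec3& a, const Vec3& b)   { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(const Vec3& a, const Vec3& b)   { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Two-sided Moeller-Trumbore test with the division deferred to the very end.
// The edge vectors e1 and e2 could be stored per triangle instead of recomputed.
bool intersectMoellerLateDiv(const Vec3& o, const Vec3& d,
                             const Vec3& v0, const Vec3& v1, const Vec3& v2,
                             Hit& hit) {
    const Vec3 e1 = sub(v1, v0);
    const Vec3 e2 = sub(v2, v0);
    const Vec3 p  = cross(d, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < 1e-8f) return false;   // ray (nearly) parallel to the plane

    // Fold the sign of det into every term so one set of comparisons
    // serves both front-facing and back-facing triangles.
    const float sign = det < 0.0f ? -1.0f : 1.0f;
    det *= sign;                                 // det is now positive

    const Vec3  tv = sub(o, v0);
    const float u  = sign * dot(tv, p);          // u scaled by det
    if (u < 0.0f || u > det) return false;       // most rejections happen here

    const Vec3  q = cross(tv, e1);
    const float v = sign * dot(d, q);            // v scaled by det
    if (v < 0.0f || u + v > det) return false;

    const float t = sign * dot(e2, q);           // t scaled by det
    if (t < 0.0f) return false;                  // triangle is behind the origin

    const float inv = 1.0f / det;                // the only division, hits only
    hit = Hit{t * inv, u * inv, v * inv};
    return true;
}
```

In this form a ray that fails the barycentric checks never pays for the division, which is the case described above as dominant in a BVH/BIH.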

Just my 2 cents

(L) [2007/08/04] [tbp] [Disappointing TriAccel results] Wayback!

Just a remark: I find that, at least for SIMD, making that Wald intersector branchless is consistently a win (certainly by virtue of early div + queuing of many ops).

Even for mono ray you may want to delay (or remove some) branches.
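As a rough illustration of the idea, here is a scalar sketch of a projection-style (Wald-like) test with the division done up front and all rejection tests folded into a single mask at the end. The record layout, field names, and build function are assumptions made for this example, not tbp's code or Wald's exact TriAccel layout. It also assumes IEEE-754 floats, so a ray parallel to the plane yields inf or NaN and simply fails the comparisons.

```cpp
#include <cmath>

// Hypothetical precomputed record in the spirit of Wald's TriAccel.
struct TriAccelSketch {
    int k;                // projection axis (dominant axis of the normal)
    float n_u, n_v, n_d;  // plane: N[u]/N[k], N[v]/N[k], dot(A, N)/N[k]
    float b_u, b_v, b_d;  // beta  = b_u*hu + b_v*hv + b_d
    float c_u, c_v, c_d;  // gamma = c_u*hu + c_v*hv + c_d
};

inline TriAccelSketch buildTriAccel(const float A[3], const float B[3], const float C[3]) {
    const float e1[3] = {B[0] - A[0], B[1] - A[1], B[2] - A[2]};
    const float e2[3] = {C[0] - A[0], C[1] - A[1], C[2] - A[2]};
    const float N[3]  = {e1[1] * e2[2] - e1[2] * e2[1],
                         e1[2] * e2[0] - e1[0] * e2[2],
                         e1[0] * e2[1] - e1[1] * e2[0]};
    int k = 0;
    if (std::fabs(N[1]) > std::fabs(N[k])) k = 1;
    if (std::fabs(N[2]) > std::fabs(N[k])) k = 2;
    const int u = (k + 1) % 3, v = (k + 2) % 3;
    // N[k] is also the determinant of the edges projected onto the (u, v) plane.
    const float inv = 1.0f / N[k];
    TriAccelSketch t;
    t.k   = k;
    t.n_u = N[u] * inv;
    t.n_v = N[v] * inv;
    t.n_d = (A[0] * N[0] + A[1] * N[1] + A[2] * N[2]) * inv;
    t.b_u =  e2[v] * inv;  t.b_v = -e2[u] * inv;  t.b_d = (A[v] * e2[u] - A[u] * e2[v]) * inv;
    t.c_u = -e1[v] * inv;  t.c_v =  e1[u] * inv;  t.c_d = (A[u] * e1[v] - A[v] * e1[u]) * inv;
    return t;
}

// Every quantity is computed unconditionally; the tests are combined with
// bitwise & so the compiler has no short-circuit branches to emit.
// The outputs are only meaningful when the function returns true.
inline bool intersectBranchless(const TriAccelSketch& tri, const float o[3], const float d[3],
                                float tMax, float& tOut, float& betaOut, float& gammaOut) {
    const int u = (tri.k + 1) % 3, v = (tri.k + 2) % 3;
    const float t = (tri.n_d - o[tri.k] - tri.n_u * o[u] - tri.n_v * o[v])
                  / (d[tri.k] + tri.n_u * d[u] + tri.n_v * d[v]);   // early division
    const float hu    = o[u] + t * d[u];
    const float hv    = o[v] + t * d[v];
    const float beta  = tri.b_u * hu + tri.b_v * hv + tri.b_d;
    const float gamma = tri.c_u * hu + tri.c_v * hv + tri.c_d;
    const bool hit = (t > 0.0f) & (t < tMax)
                   & (beta >= 0.0f) & (gamma >= 0.0f) & (beta + gamma <= 1.0f);
    tOut = t;  betaOut = beta;  gammaOut = gamma;
    return hit;
}
```

In a packet tracer the final mask would typically be a SIMD compare result used to blend the new hit into the closest-hit record. Whether the scalar version beats an early-out variant likely depends on the hit rate and the compiler.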
_________________
May you live in interesting times.

[LINK https://gna.org/projects/radius/ radius] | [LINK http://ompf.org/ ompf] | [LINK http://ompf.org/wiki/ WompfKi]
