gcc, snapshots & the meaning of life
(L) [2006/02/07] [tbp] [gcc, snapshots & the meaning of life]
There are many reasons to use a bleeding-edge gcc, especially when dealing with C++, SSE or x86-64 (or a deadly combination of these).
The procedure is the same on *nix and Cygwin, provided you already have some version of gcc installed to bootstrap with. I will only cover this case.
a) Grab a [LINK ftp://gcc.gnu.org/pub/gcc/snapshots/ snapshot] or do an svn checkout. Snapshots are frozen each week for each of the currently active 'lines' of gcc. Because you're bold & have no fear, you'd follow the [LINK ftp://gcc.gnu.org/pub/gcc/snapshots/LATEST-4.2 LATEST-4.2] symlink.
b) Untar.
c) You now have a directory named, say, 'gcc-4.2-20060204'
d) It's time for a little incantation.
(L) [2006/02/08] [UlfJack] [gcc, snapshots & the meaning of life]
Ok, I did it and recompiled Yve RT with gcc 4.2. And to say the least, I am surprised - it is 0.09 fps faster. YMMV, but for me that falls in the range of "certainly not worth the hassle".
(L) [2006/02/08] [tbp] [gcc, snapshots & the meaning of life]
Ok, I've quickly browsed your code. I haven't done C in years, but I know it now has const, restrict and so on. Use them.
And please use that handy ternary operator, my eyes refuse to go through that many ifs :)
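To make that advice concrete, here is a minimal sketch of what C99 `const`, `restrict` and a ternary can look like together. It is not taken from the Yve RT sources; `min_arrays` is a made-up helper used only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper (not from Yve RT): component-wise minimum of two arrays.
   'const' documents that the inputs are read-only; 'restrict' is the caller's
   promise that the three arrays do not overlap. */
static void min_arrays(float *restrict out,
                       const float *restrict a,
                       const float *restrict b,
                       size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = (a[i] < b[i]) ? a[i] : b[i];   /* ternary instead of if/else */
}
```

With the no-overlap promise in place the compiler no longer has to assume that a store to `out` may change `a` or `b`, which can help it keep values in registers or vectorize the loop. Whether it actually does depends on the compiler version and flags.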
I'm not sure what exactly you feed intersectBBox_SSE, but it seems to require too many operations for what it does (it's obviously not NaN proof). The traversal, traceBBH, is a bit branchy but otherwise funny.
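For reference, a plain scalar slab test needs only two multiplies and a handful of compares per axis. The sketch below is an assumption about what such a test could look like, not a reconstruction of intersectBBox_SSE; `Box`, `Ray` and `hit_box` are hypothetical names, and the ray is assumed to carry a precomputed reciprocal direction. Like the code under discussion, it makes no attempt to be NaN proof (a zero direction component with the origin exactly on a slab plane is not handled).

```c
#include <stdbool.h>

/* Hypothetical types, for illustration only. */
typedef struct { float lo[3], hi[3]; } Box;
typedef struct { float org[3], inv_dir[3]; } Ray;   /* inv_dir[k] = 1 / dir[k] */

/* Slab test: clip the ray interval [tnear, tfar] against each pair of planes.
   Returns true if a non-empty interval survives all three axes. */
static bool hit_box(const Box *restrict b, const Ray *restrict r,
                    float tnear, float tfar)
{
    for (int k = 0; k < 3; ++k) {
        const float t0  = (b->lo[k] - r->org[k]) * r->inv_dir[k];
        const float t1  = (b->hi[k] - r->org[k]) * r->inv_dir[k];
        const float tlo = (t0 < t1) ? t0 : t1;   /* entry distance on this axis */
        const float thi = (t0 < t1) ? t1 : t0;   /* exit distance on this axis */
        tnear = (tlo > tnear) ? tlo : tnear;
        tfar  = (thi < tfar)  ? thi : tfar;
    }
    return tnear <= tfar;
}
```

The ternaries here are the kind of construct a compiler can turn into min/max instructions rather than branches, though that is up to the compiler and the flags used.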
I've only seen what you call a standard C traversal for kd-trees, and there's no doubt it can't be fast (it's not even iterative).
You can implement Wald's approach for mono rays in straight C (that is, without the slightest explicit intrinsic) and it will give you a serious boost.
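As a rough idea of what an iterative, intrinsic-free traversal for a single ray can look like, here is a sketch with an explicit stack. It follows the general near/far scheme usually associated with Wald's single-ray traversal, but the node layout, `KdNode`, `KdRay` and `kd_visit_leaves` are all assumptions made for this example. To stay short it only records which leaves the ray segment passes through, front to back; a real tracer would intersect the leaf's primitives and stop at the first hit. Rays with a zero direction component whose origin lies exactly on a split plane are not handled.

```c
#include <assert.h>

enum { KD_LEAF = 3, KD_STACK_MAX = 64 };

/* Hypothetical node layout: axis 0..2 marks an inner node, KD_LEAF a leaf. */
typedef struct {
    int   axis;
    float split;      /* inner nodes: plane position along 'axis' */
    int   child[2];   /* inner nodes: child[0] below the plane, child[1] above */
    int   leaf_id;    /* leaves: caller-defined payload */
} KdNode;

typedef struct { float org[3], dir[3], inv_dir[3]; } KdRay;

/* Walks the tree front to back without recursion and stores the ids of the
   leaves crossed by the segment [tmin, tmax]. Returns how many were stored. */
static int kd_visit_leaves(const KdNode *restrict nodes,
                           const KdRay *restrict ray,
                           float tmin, float tmax,
                           int *restrict out, int max_out)
{
    struct { int node; float tmin, tmax; } stack[KD_STACK_MAX];
    int top = 0, count = 0, cur = 0;          /* node 0 is the root */

    for (;;) {
        /* Descend to a leaf; push the far child only if the segment straddles the plane. */
        while (nodes[cur].axis != KD_LEAF) {
            const KdNode *n     = &nodes[cur];
            const int     a     = n->axis;
            const float   d     = (n->split - ray->org[a]) * ray->inv_dir[a];
            const int     first = (ray->dir[a] < 0.0f) ? 1 : 0;  /* side entered first */
            const int     last  = first ^ 1;

            if (d >= tmax) {
                cur = n->child[first];        /* segment ends before the plane */
            } else if (d <= tmin) {
                cur = n->child[last];         /* segment starts past the plane */
            } else {
                assert(top < KD_STACK_MAX);
                stack[top].node = n->child[last];
                stack[top].tmin = d;
                stack[top].tmax = tmax;
                ++top;
                cur  = n->child[first];
                tmax = d;
            }
        }

        if (count < max_out)
            out[count++] = nodes[cur].leaf_id;

        if (top == 0)
            return count;
        --top;                                /* resume with the deferred far side */
        cur  = stack[top].node;
        tmin = stack[top].tmin;
        tmax = stack[top].tmax;
    }
}
```

The only per-node work is one subtract, one multiply and two compares, and the loop never calls itself, which is the main difference from a recursive traversal.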
(L) [2006/02/10] [UlfJack] [gcc, snapshots & the meaning of life]
Ahh, the black art of code optimizations.
I've looked at the generated assembly code on x86_64 and it doesn't look like there is anything hugely wrong with it. Maybe you could get a few percent with reordering instructions so as to get better pipeline utilization, but that's about it. At least as far as I can see.