[quadrille] alpha binaries.

(L) [2006/02/22] [tbp] [[quadrille] alpha binaries.] Wayback!

Better than smashing my puter because i can't fix that damn compiler bug, i thought i'd do something constructive instead.


Bins:

[LINK http://ompf.org/alpha/quadrille-20060222.rar quadrille-20060222.rar] (win32, msvc8)


DLLs:

1. [LINK http://ompf.org/alpha/quadrille-dlls.rar quadrille-dlls.rar]

2. [LINK http://ompf.org/alpha/vcredist_x86.exe vcredist_x86.exe]

note: package #1 contains lua 5.0.2 and FreeImage 3.8.x, package #2 contains much crap and, hopefully, vcomp.dll the OpenMP dll that comes with msvc8 (but that you can't install locally, yay m$).


Scenes to play with:

[LINK http://ompf.org/vault/jagd.rar Jagdpanther SdKfz 173].

[LINK http://ompf.org/vault/dwc51.rar Dodge WC-51]

Or just put Jacco scenes to a better use [SMILEY stick out tongue]


The only warranted byproduct of running this program is a segfault. In that case, or if you hit a breakpoint, the resulting diagnosis won't help you much.


Modus operandi:

When you start it, it will look for 'bootstrap.lua' and will then proceed to load the Wavefront obj file you'll have diligently provided. Screen, camera and lights are set up via Lua.

Each scene will be compiled twice, first with the new compiler then the older one.

By default, the brand new compiler output is used. It is at best slower and will most certainly exhibit 'holes' and other traversal horrors. Just press 'b' once to switch to the other tree.

Titillate arrows, left click or hit 'l' to bring some light.
(L) [2006/02/22] [Ho Ho] [[quadrille] alpha binaries.] Wayback!

Any hope of a source release for us (hopefully many) Linux users [SMILEY Smile] ?
_________________
In theory, there is no difference between theory and practice. But, in practice, there is.

Jan L.A. van de Snepscheut
(L) [2006/02/22] [tbp] [[quadrille] alpha binaries.] Wayback!

Ah. Hmm.


The plan is to release the kd library first (which supposes that i fix that f**** bug first). Then brush up the rest, remove the cruft, and release the source.

There's 2 supported platforms, win32 and linux 32/64, and 3 compilers msvc8, icc, gcc. But as i no longer have icc, that's not going to fly.

Then the linux side hasn't been graced with my buggy kd-lib yet (f**** bug), and i have no 32 bit cross compiler/environment anyway.


All that prose to say: not now.

I guess i could push a linux 64bit binary out without too much hassle, but that wasn't what your question was about [SMILEY Wink]
(L) [2006/02/23] [tbp] [[quadrille] alpha binaries.] Wayback!

It ran?  [SMILEY Shocked]

You had to install that freaking vcredist_x86.exe i suppose. Glad it did the trick.


ICC behaves in radically different ways under windows & linux (i used to support both): on windows it mimics msvc (and you have no access to e.g. gcc-like inline asm), on linux it mimics gcc. Weird to say the least. Anyway, in 64bit mode, gcc puts it to shame, that's why i don't even bother now (plus Intel tends to ignore bug reports -> i ignore their compiler).
(L) [2006/02/23] [tbp] [[quadrille] alpha binaries.] Wayback!

Dived in to find some pix with AA.


[LINK http://ompf.org/ray/wip/pix/20060201-01.jpg]

[LINK http://ompf.org/ray/wip/pix/20060201-02.jpg]


Then there's the old gallery where the last few shots show some AA experiments.

[LINK http://ompf.org/spgm/index.php?spgmGal=wip]


Jacco if you read this, what about running that damn thing and double my end-user population?  [SMILEY Razz]
(L) [2006/02/23] [Phantom] [[quadrille] alpha binaries.] Wayback!

I'm missing freeimage.dll...
_________________
--------------------------------------------------------------

Whatever
(L) [2006/02/23] [tbp] [[quadrille] alpha binaries.] Wayback!

That's in the first dll package, along with lua.

If msvc8 isn't installed, you'll also need the 2nd dll package, for OpenMP.
(L) [2006/02/23] [Phantom] [[quadrille] alpha binaries.] Wayback!

Now it complains it can't find bootstrap.lua. I tossed everything in one dir, the exe's, obj's and dll's.
(L) [2006/02/23] [tbp] [[quadrille] alpha binaries.] Wayback!

That should definitely work; bootstrap.lua is looked for in the current dir first and then, if that fails, it's also probed for in a static place... i've registered quadrille as the handler for .obj in windows, that's why it works that way. That's not the most sensible thing to do tho.


I don't get how you now have no link errors wrt dlls and still can't boot it up with that damn file around.


edit: i'll hack in a lookup for that file in the same dir as the binary being executed.
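
A minimal sketch of that lookup order (current dir, then a fixed place, then the directory of the executable), written in Python rather than taken from quadrille. The function name and the `static_dir` / `exe_path` parameters are made up for illustration; the real program may probe differently.

```python
import os
import sys

def find_bootstrap(name="bootstrap.lua", static_dir=None, exe_path=None):
    """Return the first existing candidate path for `name`, or None.

    Assumed search order: current directory, an optional fixed
    install location, then the directory holding the executable.
    """
    exe_path = exe_path or sys.argv[0]
    candidates = [os.getcwd()]
    if static_dir:
        candidates.append(static_dir)
    candidates.append(os.path.dirname(os.path.abspath(exe_path)))
    for directory in candidates:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None
```

With this order, dumping the script next to the exe is enough even when the program is started from some other working directory.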
(L) [2006/02/23] [Ho Ho] [[quadrille] alpha binaries.] Wayback!

I "installed" it by dumping everything to one directory, including the models, materials and textures. I also installed the msvc-thingie yesterday.


Benching this thingie with default parameters. I use models that came with Phantom's last program. I think those numbers should be enough to get some useful information out of the benching. Usual running command was: quadrille.exe -f . blah.obj


sponza_clean:

~137-140ms per frame

camera.set_fovx(102.999977);

camera.set_eye(vec3(0.339127, 0.010864, 0.052900));

camera.look_at(camera.get_eye() + vec3(-0.877810, -0.226860, -0.421882));


cloister:

~130-132ms per frame

camera.set_fovx(102.999977);

camera.set_eye(vec3(0.339127, 0.010864, 0.052900));

camera.look_at(camera.get_eye() + vec3(-0.877810, -0.226860, -0.421882));


interior:

~125ms per frame

setting light0 position to (-1.221276,1.636658,0.509589).

camera.set_fovx(102.999977);

camera.set_eye(vec3(-3.316039, 0.905277, 0.017478));

camera.look_at(camera.get_eye() + vec3(0.975045, -0.087738, 0.203935));


kitchen:

~92-93ms per frame

camera.set_fovx(102.999977);

camera.set_eye(vec3(0.339127, 0.010864, 0.052900));

camera.look_at(camera.get_eye() + vec3(-0.877810, -0.226860, -0.421882));


legocar:

~43-44ms per frame

setting light0 position to (-0.007129,0.034221,0.049383).

camera.set_fovx(102.999977);

camera.set_eye(vec3(0.128133, 0.016447, 0.222503));

camera.look_at(camera.get_eye() + vec3(-0.547851, -0.420414, -0.723264));


officespace:

~78-79ms per frame

setting light0 position to (105.693924,91.342323,-23.361271).

camera.set_fovx(102.999977);

camera.set_eye(vec3(149.119873, 41.743050, -37.031517));

camera.look_at(camera.get_eye() + vec3(-0.974218, -0.209583, 0.083517));


porsche:

~176-178ms per frame

setting light0 position to (-0.270225,0.276894,-1.071087).

camera.set_fovx(102.999977);

camera.set_eye(vec3(0.127628, 0.276894, -0.801270));

camera.look_at(camera.get_eye() + vec3(-0.555278, -0.145863, 0.818774));




Does this program run on a machine without SSE2 too? If it does I could bench some more on [LINK mailto:athlonxp@2.3GHz athlonxp@2.3GHz] [SMILEY Smile]
(L) [2006/02/23] [Phantom] [[quadrille] alpha binaries.] Wayback!

Ho ho: I tried to send a mail to your [LINK mailto:Athlon@2.3Ghz Athlon@2.3Ghz], but I got an error. [SMILEY Smile]
(L) [2006/02/23] [tbp] [[quadrille] alpha binaries.] Wayback!

Damn phone diversion.


Especially for Jacco. It will now also look for bootstrap.lua where the image/binary sits.

[LINK http://ompf.org/alpha/quadrille-jacco-edition.exe]


Thanks for those numbers Ho Ho, now let me look at them [SMILEY Smile]


edit: No, it will complain if there's no SSE2 or rdtsc. I could make the mono path SSE unencumbered, but i lack the motivation.

There's a '-b' switch that tinkers some more with priorities and might help getting more stable results.


edit²: Ok, your box is running at the same pace as my old mono opteron box (2ghz) from what i collect from sponza.

As those scenes are quite small, they don't exert memory bandwidth much. Still it seems to fly.

Now i wonder what it will look like on a P4 or P-M...
(L) [2006/02/23] [Phantom] [[quadrille] alpha binaries.] Wayback!

It now starts. Here's the output:
(L) [2006/02/23] [tbp] [[quadrille] alpha binaries.] Wayback!

Jacco, throw a -f . (or point to a dir with those textures) to get them loaded properly.


I'll make it more user-friendly, that thing is obviously a pain in the behindment to get running. Sorry.


edit:
(L) [2006/02/23] [UlfJack] [[quadrille] alpha binaries.] Wayback!

I've implemented Halton (multidimensional van der Corput), Sobol and NX (Niederreiter/Xing) sequence generators. We've used them for photon mapping for point and area lights. They were not noticeably slower than simple random numbers and they gave some visual improvements for point lights. They didn't do anything for area lights, except when we put the emission points on a fixed grid and only used those sequences for the direction.


I've never heard of Hammersley or Larcher & Pillichshammer sequences (up to now), but I can give out (Java) source for the other ones, if someone is interested. Translation to C should be straightforward.
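
For reference, a minimal Python sketch of a Halton generator (not the Java source offered above; the function names are illustrative). Each dimension is the radical inverse of the sample index in a different prime base.

```python
def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` around the radix point."""
    result = 0.0
    inv = 1.0 / base
    f = inv
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * f
        f *= inv
    return result

def halton(index, bases=(2, 3)):
    """Point number `index` of the Halton sequence, one prime base per dimension."""
    return tuple(radical_inverse(index, b) for b in bases)
```

For photon emission one would typically feed `halton(i)` where a pair of uniform random numbers was used before.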


I can also recommend using the mersenne twister for random numbers, it's faster and in some instances (read: in java) more random.
(L) [2006/02/23] [Phantom] [[quadrille] alpha binaries.] Wayback!

I have code for the Mersenne twister ready in a separate source file. I once used it for my photon mapper; it gave far better results than the built-in random function of C, and it's a lot faster.


Let me know if you want me to paste the code, it's just a few lines.
(L) [2006/02/24] [Lynx] [[quadrille] alpha binaries.] Wayback!

A paper with code for van der Corput and Larcher&Pillichshammer sequences (at the end):

[LINK http://www.uni-kl.de/AG-Heinrich/EMS.pdf]

"Hammersley point set" is just the name for points with coordinates [ i/N, VanDerCorput(i) ] (in the 2-dimensional case), while v.d.Corput sequence is yet another name for the Halton sequence of base 2, if i understood that correctly. Due to base 2 it can be done with some bitwise operations rather than float operations needed for other bases.

And taking the sequence from Larcher&Pillichshammer instead of v.d.Corput is supposed to be a bit better (though the question is better for what...)

These quasi-random sequences are mainly used for multidimensional sampling, so the rest of the paper might not really interest you [SMILEY Smile]
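
A small sketch of the bitwise trick mentioned above, in Python: in base 2 the radical inverse is just a bit reversal of the index. This assumes 32-bit indices; the function names are illustrative and not taken from the paper.

```python
def van_der_corput_base2(i):
    """Radical inverse in base 2, done by reversing the 32 bits of i."""
    i &= 0xFFFFFFFF
    i = ((i << 16) | (i >> 16)) & 0xFFFFFFFF
    i = ((i & 0x00FF00FF) << 8) | ((i & 0xFF00FF00) >> 8)
    i = ((i & 0x0F0F0F0F) << 4) | ((i & 0xF0F0F0F0) >> 4)
    i = ((i & 0x33333333) << 2) | ((i & 0xCCCCCCCC) >> 2)
    i = ((i & 0x55555555) << 1) | ((i & 0xAAAAAAAA) >> 1)
    return i / 2.0**32

def hammersley_2d(n):
    """The n-point 2D Hammersley set: (i/n, VanDerCorput(i))."""
    return [(i / n, van_der_corput_base2(i)) for i in range(n)]
```

Note that the Hammersley set needs the total count `n` up front, unlike the plain sequence, which can be extended one sample at a time.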


A paper also describing the use of Hammersley point sets for AA more specifically is this one:

[LINK http://graphics.uni-ulm.de/CourseNotesSIG.pdf]

It shows how to directly compute the position inside a grid cell for AA, and shows a comparison of hammersley and stratified jittered (i.e. random position inside sub-pixel cell) sampling...(see about halfway through the paper). I think you could just completely precompute say 128x128 sample positions, with 4x4 samples per pixel they would then repeat every 32x32 pixels etc.
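
The precomputation idea could look roughly like this in Python. It uses stratified jittered offsets rather than the Hammersley construction from the course notes, and the tile layout and function names are assumptions. A 32x32 pixel tile with 4x4 samples per pixel gives the 128x128 positions mentioned above.

```python
import random

def build_jitter_tile(tile_pixels=32, grid=4, seed=1):
    """Precompute stratified jittered offsets for a square tile of pixels.

    Returns tile[py][px] -> list of grid*grid (dx, dy) offsets in [0, 1).
    """
    rng = random.Random(seed)
    cell = 1.0 / grid
    tile = []
    for _ in range(tile_pixels):
        row = []
        for _ in range(tile_pixels):
            # one random position inside each of the grid*grid sub-pixel cells
            row.append([((sx + rng.random()) * cell, (sy + rng.random()) * cell)
                        for sy in range(grid) for sx in range(grid)])
        tile.append(row)
    return tile

def samples_for_pixel(tile, x, y):
    """Sample positions for pixel (x, y); the pattern repeats every tile."""
    n = len(tile)
    return [(x + dx, y + dy) for dx, dy in tile[y % n][x % n]]
```

The renderer then only does a table lookup per pixel, at the price of a pattern that repeats every 32 pixels in each direction.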


But for just smoothing mainly straight edges a rotated grid may still be a good choice...texture AA and reducing sampling noise would be more predestined for low-discrepancy sampling...
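
For comparison, a sketch of a 4-sample rotated grid in Python. The rotation angle atan(1/2) and the scale factor are the usual choice for this pattern and are an assumption here, not something stated in this thread.

```python
import math

def rotated_grid_2x2():
    """Four sub-pixel offsets in [0, 1): a 2x2 grid rotated by atan(1/2)."""
    angle = math.atan(0.5)
    scale = math.sqrt(5.0) / 2.0  # puts the rotated samples on the 1/8 lattice
    c, s = math.cos(angle), math.sin(angle)
    pts = []
    for gy in (-0.25, 0.25):
        for gx in (-0.25, 0.25):
            x, y = gx * scale, gy * scale
            # rotate around the pixel centre (0.5, 0.5)
            pts.append((0.5 + x * c - y * s, 0.5 + x * s + y * c))
    return pts
```

Every sample ends up with a distinct x and a distinct y coordinate, which is why near-horizontal and near-vertical edges get four intensity steps instead of two.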


i already suspected that more advanced reconstruction filters are performance killers, but i wonder if GPUs could do that for you in realtime...if they are idle anyway... [SMILEY Smile]
(L) [2006/02/25] [tbp] [[quadrille] alpha binaries.] Wayback!

I've uploaded a win32 Horde enabled version while i struggle with futexes (or futicies?).

It would be nice if someone with a dual core something or a HT P4 could give it a try.

F7: horde, F8: OpenMP.


[LINK http://ompf.org/alpha/quadrille-horde-0225.exe]
(L) [2006/02/26] [Ho Ho] [[quadrille] alpha binaries.] Wayback!

A friend of mine ran the same tests as I did on his [LINK http://img113.imageshack.us/my.php?image=untitled123co.jpg P4 HT box]


Unfortunately he used the new tree* but at least he used the same coordinates and light settings as I did. He also ran the Horde version. Too bad he didn't test the OpenMP one though. Also he had some stuff running in the background (antivirus, a movie and some programs). I think the results are not a total loss and can give some hint on P4 performance.

*)I forgot to tell him to press 'b' and he didn't read this thread very carefully


After he understood how to use windows commandline it was quite easy. He just installed the runtime, extracted everything to one directory and ran it.


Here are his results on the singlethreaded test:

Sponza_Clean 146ms

Cloister 132ms

Interior 54ms

kitchen 96ms

Legocar 29ms

Officespace 48ms

porsche 135ms


And here are the Horde results:

Sponza_Clean 136ms

Cloister 123ms

Interior 37ms

kitchen 91ms

Legocar 27ms

Officespace 50ms

porsche 122ms
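
To read the two lists side by side, a few lines of Python that turn the frame times above into frames per second and a single-threaded/Horde ratio. The numbers are copied from this post; the helper name is illustrative.

```python
single = {"sponza_clean": 146, "cloister": 132, "interior": 54, "kitchen": 96,
          "legocar": 29, "officespace": 48, "porsche": 135}
horde = {"sponza_clean": 136, "cloister": 123, "interior": 37, "kitchen": 91,
         "legocar": 27, "officespace": 50, "porsche": 122}

def speedup_table(before_ms, after_ms):
    """Rows of (scene, fps before, fps after, before/after time ratio)."""
    rows = []
    for scene, t0 in before_ms.items():
        t1 = after_ms[scene]
        rows.append((scene, 1000.0 / t0, 1000.0 / t1, t0 / t1))
    return rows

for scene, fps0, fps1, ratio in speedup_table(single, horde):
    print(f"{scene:13s} {fps0:5.1f} fps -> {fps1:5.1f} fps  (x{ratio:.2f})")
```

Interior gains the most (54/37, about 1.46x), while officespace comes out slightly slower with Horde (48/50 = 0.96).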



If I'm lucky I can test the same thing at work tomorrow with my colleague's computer. She has a Northwood @2.8GHz IIRC. If I'm really lucky I can run it on a Prescott 3GHz too. Who wants to estimate which one is faster? NW has quite a bit shorter pipelines but half the L2. With branch heavy code (KDtree traversal) it should be quite a bit more efficient than Prescott.


[edit]


It would greatly simplify the benchmarking if users didn't have to modify the bootstrap.lua manually. Does anyone have enough time to write a script/program that would just run some tests automatically? I would think swapping the bootstrap.lua on a per-scene basis should be good enough for a start.
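
A rough Python sketch of such a script, under loud assumptions: the per-scene file naming `bootstrap-<scene>.lua` is hypothetical, the command line is the `quadrille.exe -f . blah.obj` form quoted earlier in the thread, and since the program is interactive each run lasts until the window is closed. Frame times would still have to be read off by hand.

```python
import shutil
import subprocess
from pathlib import Path

def plan_runs(scene_dir, exe="quadrille.exe"):
    """Pair every .obj in scene_dir with its per-scene bootstrap, if present."""
    jobs = []
    for obj in sorted(Path(scene_dir).glob("*.obj")):
        boot = obj.with_name(f"bootstrap-{obj.stem}.lua")  # hypothetical naming
        if boot.is_file():
            jobs.append((boot, [exe, "-f", ".", obj.name]))
    return jobs

def run_all(scene_dir, exe="quadrille.exe"):
    """Swap in each scene's bootstrap.lua, then launch the renderer on it."""
    for boot, cmd in plan_runs(scene_dir, exe):
        shutil.copyfile(boot, Path(scene_dir) / "bootstrap.lua")
        subprocess.run(cmd, cwd=scene_dir, check=False)
```

Scenes without a matching bootstrap file are skipped rather than run with whatever script happens to be lying around.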
(L) [2006/02/27] [toxie] [[quadrille] alpha binaries.] Wayback!

Numbers on my P4@2.8GHz:


Jagdpanzer:

F7: 17.4FPS

F8: 13FPS


B U T: There are some strange things happening! Once in a while when i fire the .exe up and press F8 i also get 17-18FPS, but only once in maybe 10runs !?!

Oh.. Just noticed that your command line dump tells me that the OMP Version just uses 1 Thread (dump_gomp_info: 2 procs, 1 threads (max 2))?!

"Forcing" it (environment variable) to use more Threads (you can only tell the maximum number) doesn't change a thing.. [SMILEY Sad]
(L) [2006/02/27] [toxie] [[quadrille] alpha binaries.] Wayback!

for the DWC:


F7: 20.2FPS

F8: 15FPS


(and same issues as the stuff mentioned above)
(L) [2006/02/27] [tbp] [[quadrille] alpha binaries.] Wayback!

First, a big thank you to Ho Ho, lyc & toxie for being bold enough to run the thing and take the time to report.


Apparently that thing runs ok on dual core and HT P4 (that's a surprise), and i have a first hand experience about dual k8. The question whether or not it scales to 4/8/etc way systems is still open though [SMILEY Smile]
(L) [2006/02/27] [tbp] [[quadrille] alpha binaries.] Wayback!

That's really interesting.

Let me say that HT is just a marketing stunt, given proper wind condition, phase of the moon and timely goat sacrifice you might get a 20% boost somewhere. That's why i was a bit sceptical.


Sum up:

F7: 20.2FPS

F8: 15FPS

kD-Vision-DWC: 20-21FPS


Or put another way, next to no gain or within a 20% tolerance: the F7/F8 modes still do some crude Schlick when the kd-vision doesn't.

At least the Horde version doesn't drown in overhead, and that makes me all warm inside.
(L) [2006/02/27] [toxie] [[quadrille] alpha binaries.] Wayback!

And now for my own numbers:  =)

I've rendered the stuff with our own engine (it uses SIMD only for tracing (4x-)rays, not shading. OpenMP is enabled (but also only used for tracing rays, not shading))

and approximately(!) same camera.


For the DWC: 19.6FPS (kD-tree build: 0.516 sec.)

For the Jagdpanzer: 16.2FPS (kD-tree build: 0.406 sec.)
(L) [2006/02/27] [tbp] [[quadrille] alpha binaries.] Wayback!

Let's put kd-tree building times aside as only my Havran style compiler might be able to compete (that's an optimistic statement).


My dirty Schlick shading, used in Horde/OpenMP modes, is SIMDified; my regular shading isn't (or not completely, and it kinda sucks). And in those modes i don't bother unbundling packets.

Yet, you match that speed and you do a full shading.


Now there's some factors:

. I only have k8 around and make absolutely no effort to accommodate those gimped P4, they are a dead-end evolution wise so why bother.

. You use ICC, it beats the crap out of msvc any day, even more so for P4 (it brings biased codegen to new heights).

. In the same vein, i guess Intel took great care to make their OpenMP fly on HT cores (that's another selling point).

. I hate msvc with passion, only kludge around the most glaring deficiencies, and for those reasons it's the slowest of all my builds.

. insert more hand waving.


What would be really interesting would be a cross benchmark, with both renderers doing the same thing this time, on those 2 architectures P4/K8 to eliminate obvious biases.


Still, i don't have much margin  [SMILEY Cool]
(L) [2006/02/27] [toxie] [[quadrille] alpha binaries.] Wayback!

in that case i might also add some "handicap"-stuff of our code:  =)


-it's not optimized solely for primary-ray-shading, but allows for adding any kind of full shaders (GI, etc.).

-(primary-)rays are not defined directly as 4-ray-bundles, but as single rays, which have to be packed (internally) by the engine to allow for 4x-traversal (=>massive slowness!).

-kD-tree is optimized for size rather than speed.

-OpenMP only does a 12/15% increase in these specific scenes.

-The code hasn't been specifically optimized for P4 (no SSE2, f.e.), but runs cool on anything which offers MMX and SSE.
(L) [2006/02/27] [tbp] [[quadrille] alpha binaries.] Wayback!

Fair enough  [SMILEY Laughing]  I mean, obviously your thingy spanks some, i'm not trying to belittle what you're achieving at all.


We don't exactly aim for the same stuff, and if you pile up hardware/platform issues it gets rather difficult to objectively analyse what's going on.


I optimize my tree for speed, or try to; you'll note that both scenes, jagd/dwc, annoy my old compiler. That's something i never got around and one of the reasons for the rewrite (look at them in kd-vision mode and you'll see the horror). In the end, i should be faster as that's my goal; "should" is the key word here atm, eh.


Then in my case SSE2 is a requirement. I could relax that rule, but that would ask for x more paths (well, there are some already), more code etc... in the end that's a one man project and i have to make choices.

Plus now i only really care about the 64bit version. The old x86, 387, cramped register set etc deserves to die. Now.


I don't agree about the P4 stuff. You're using ICC, and if you've never seen biased codegen just look at what it spits. On top of that, it does a whole bunch of neat tricks behind your back; e.g. look at what's generated for a 1/sqrt pattern.

No other compiler goes that length... hell i had to implement all that crap, and then some, for gcc [SMILEY Wink]
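
The post doesn't say what that codegen looks like. A common trick for 1/sqrt, and presumably what is meant, is a cheap hardware estimate (SSE `rsqrtps`, good to roughly 12 bits) followed by one Newton-Raphson step. The Python below only shows the refinement step; the crude estimate is faked and the function name is illustrative.

```python
import math

def refine_rsqrt(a, estimate):
    """One Newton-Raphson step towards y = 1/sqrt(a)."""
    return estimate * (1.5 - 0.5 * a * estimate * estimate)

a = 2.0
crude = (1.0 / math.sqrt(a)) * 1.01   # stand-in for a low-precision estimate
better = refine_rsqrt(a, crude)
print(abs(crude * math.sqrt(a) - 1.0), abs(better * math.sqrt(a) - 1.0))
```

Each step roughly squares the relative error, so a single iteration on a 12-bit estimate is usually enough to get close to single precision.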
(L) [2006/02/27] [Phantom] [[quadrille] alpha binaries.] Wayback!

Toxie: I wrote Intel about doing a demo; I don't have access to decent hardware (just some mono-cores) so I made them a proposal to add their logo to a 'Realstorm-replacement' in exchange for some hardware (even a loan would have done the job). They responded, and I was offered to write an article, which I did. They didn't seem to care much about a demo, and all kinds of promises about beta-hardware were broken, sadly.


But I still would like to do a RT demo one day. Each and every mention of rt on a coders forum spawns a huge discussion about how cool it is, how fast it could become and how it will replace rasterizing hardware. There's a huge audience waiting for more/better rt demos.


Besides, I have been in the demo scene. [SMILEY Smile]
(L) [2006/02/27] [toxie] [[quadrille] alpha binaries.] Wayback!

Cool.. Any releases you were involved in?

(-> [LINK http://ainc.de/] to see mine =)


If i'd had more spare time i'd be the first to start such a project, but all i got to was a 4k

for last year's buenzli-party ([LINK http://buenz.li/], 4k-procedural-gfx-competition). [SMILEY Sad]

btw: This year there will be a procedural-4k-compo again, so i'd be glad if anyone here would

send in a competition entry (rules are: 4k standalone .exe, 30sec. calculation time, max. 800x600 resolution).
(L) [2006/02/27] [Phantom] [[quadrille] alpha binaries.] Wayback!

Given the way we are working right now it would be hard to cooperate on a demo, I'm afraid. Basically I am frantically chasing you (tbp) as your performance is usually just a tad beyond mine. Wouldn't make sense to use my code in that case... Btw I'll soon be teaching wannabe game developers, including graphics guys. It might be an interesting project for them? Ray tracing code would be provided, demo including art would be required for a good grade. Exposure is practically guaranteed.
(L) [2006/02/28] [tbp] [[quadrille] alpha binaries.] Wayback!

A quick reply before i hit the lardsack.


That thing is about threads, how can it be portable? Throw in some asm and then you have to support all combo between compilers & platforms. Then on linux i've done it in 3 different ways (futex, pthreads conds & co, sem). Add some code rotting in the middle and you get like 1k. That will be always more than a one line OpenMP directive anyway [SMILEY Wink]


OpenMP doesn't have massive teams, you usually get one running thread per cpu + master unless you nest stuff. Again the only trouble with it is when you have a // section being done in a matter of millisec with a whole bunch of tiny slices, that's where you can cut some bloat. So it's certainly not worth the hassle, like most fun stuff, eh.

BTW there's really interesting things going on lately regarding wait free structures.
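
To make the "one thread per cpu, a few slices each" idea concrete, a small Python sketch that cuts a frame into contiguous row ranges and hands them to a pool. This is neither Horde nor OpenMP; the names are illustrative, and with CPython's GIL it only shows the partitioning, not a real speedup for pure-Python work.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def split_rows(height, slices):
    """Cut [0, height) into `slices` contiguous, near-equal row ranges."""
    base, extra = divmod(height, slices)
    ranges, start = [], 0
    for i in range(slices):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges

def render_parallel(height, render_rows, workers=None):
    """Call render_rows(first, last) once per slice, one slice per worker."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: render_rows(*r), split_rows(height, workers)))
```

With frames that take only a few milliseconds, the fixed cost of waking the workers and joining them again is what starts to dominate, which is the overhead the post is talking about.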


Indeed you can expect a NUMA aware kernel to hand you local mem on allocation, even if it might be more involved than that, but the thing is that xp32 is not NUMA aware.

I've done a silly 512 threads test on linux and it didn't break a sweat (about 15% perf went missing). I haven't bothered trying that on xp. Heck, i'd be happy if i could get a full time slice every now and then on it.
(L) [2006/03/01] [tbp] [[quadrille] alpha binaries.] Wayback!

Ah didn't know they ported it. But those Posix threads suck anyway.

Efficiency of such 'system calls' isn't a big deal in my case, because the goal is to do most of the stuff in user-space.
(L) [2006/03/01] [tbp] [[quadrille] alpha binaries.] Wayback!

Because it's snowing, i'll indulge you yet-another-half-baked-binary.

OpenMP path is deprecated, don't use it in comparisons.


[LINK http://ompf.org/alpha/quadrille-horde-0301.exe]
(L) [2006/03/01] [lycium] [[quadrille] alpha binaries.] Wayback!

*sigh* this business of needing to log in like 10k times really sucks :/ also, i swear i was logged in when i posted ^^
(L) [2006/03/03] [tbp] [[quadrille] alpha binaries.] Wayback!

I've done a cheap try at deferred shading (while trimming the shading code), to get the feel of it.


It's late so i can't properly test it, but even with just 1 light it's ~5% faster. And that's with the mono path.

F2: regular mono path F3: deferred.

There's some difference in the way shadow rays are generated, but depending on scene they may match allowing to measure relative performance, eh.

[LINK http://ompf.org/alpha/quadrille-horde-0303.exe]


If that really proves to be worth the hassle, i'll implement it for packets.


edit: 15% with 16 lights.
(L) [2006/03/03] [tbp] [[quadrille] alpha binaries.] Wayback!

Finally fixed the Cygwin port, so now i can directly compare msvc8 & gcc in the 32bit world.

When in the right mood, gcc's bin is >15% faster.


But it seems that my new packet shading code isn't its cup of tea. While msvc8 handles it decently, in the gcc 32bit world an [LINK http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19240 ugly codegen problem] shows up again (i guess register pressure is way too high, but then safeguards are supposed to kick in, [LINK http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19463 PR/19463],[LINK http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19252 PR/19252]) and in the 64bit world midway through the code gcc goes crazy moving things in/out of stack (first half is perfect).

Ain't that fun?


Add on top of that a miscompilation of another function, just for the thrill. When they say experimental, they mean it.
(L) [2006/03/06] [tbp] [[quadrille] alpha binaries.] Wayback!

I felt the urge to check, as is, what it would give.

1 thread, gnnnnnnniiiiii.

[IMG #1 ]


Didn't have the patience for the mono/deferred render.

On the other hand i doubt 400 lights would be enough for that purpose and that would still be slow even if 10x faster with their method.


Also, it's not that i don't like talks about GI in this thread, it's just that every thread down here tends to derail; we could just blob all posts together [SMILEY Smile]
[IMG #1]:Not scraped: https://web.archive.org/web/20061004022910im_/http://ompf.org/ray/wip/pix/20060306-01-400lights-small.jpg
(L) [2006/03/06] [lycium] [[quadrille] alpha binaries.] Wayback!

sorry, more hdrenv spam as i find it: [LINK http://gl.ict.usc.edu/research/MedianCut/] [just googled debevec to see if i'd spelt it correctly, and found this :) reminds me of octree colour quantisation a little...]


edit: and by that i mean submissive's octree colour quant algo. however, having actually RTFP, it's actually a lot simpler (when implemented using a summed area table). something they don't mention in the paper is that you can also store the x and y differences from the sample centr(e|oids) to their bounding boxes to do some jittering during rendering ;)
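
A minimal Python sketch of the summed area table part (the median cut itself is not shown; function names are illustrative). Once the table is built, the total energy of any rectangle of the environment map costs four lookups, which is what makes finding the split that balances the two halves cheap.

```python
def summed_area_table(img):
    """sat[y][x] = sum of img over rows < y and columns < x (one-cell border)."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat

def region_sum(sat, x0, y0, x1, y1):
    """Sum over the half-open rectangle [x0, x1) x [y0, y1)."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
```

Here `img` would hold per-pixel luminance of the light probe, as a list of rows.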
(L) [2006/03/08] [tbp] [[quadrille] alpha binaries.] Wayback!

[LINK http://ompf.org/alpha/quadrille-20060308.rar]


That's asking for some documentation as the lua interface has been souped up (among other things). Gimme some time to edit the first post... there.


So, now, you can control lights/materials/camera from lua on a per frame basis. You can't yet shoot rays or plot into the framebuffer (tho you can save it), so a Lua renderer is not yet possible or better said, practical.

That will be done Soon[tm].


Then compilation, kd-tree composition etc should also end up being controlled by Lua.


I've also listened to complaints and scripts, scene and textures are searched for in various repositories; that should help getting the damn thing to run.


Things have been tightened up, and an embarrassing bug squashed: on win32, the window wasn't sized properly and only a part of the framebuffer was displayed. Doh. I'm not even sure i've displayed it all ever before.
(L) [2006/03/08] [tbp] [[quadrille] alpha binaries.] Wayback!

Yet another binary.

[LINK http://ompf.org/alpha/quadrille-0308bis.exe]


And a saner basic bootstrap script (saner compared to the amusing one found in the archive posted earlier).

[LINK http://ompf.org/alpha/bootstrap.lua]
(L) [2006/03/13] [toxie] [[quadrille] alpha binaries.] Wayback!

I know it arrives kinda late, but here is the downloadable RGI-paper:

[LINK http://graphics.uni-ulm.de/Singularity.pdf]


And one hint: There is a (half mathematical, half algorithmic) trick to do this

"error-correction" faster than mentioned in the paper.


Have fun.. [SMILEY Smile]
(L) [2006/03/13] [lycium] [[quadrille] alpha binaries.] Wayback!

404 :(
(L) [2006/03/13] [Phantom] [[quadrille] alpha binaries.] Wayback!

I've put up a mirror:

[LINK http://www.bik5.com/Singularity.pdf]

It's on my slow ADSL line though.
(L) [2006/03/13] [lycium] [[quadrille] alpha binaries.] Wayback!

much appreciated :)


[btw, your 1.6 p-m is fixed it seems... true?]
(L) [2006/03/13] [Phantom] [[quadrille] alpha binaries.] Wayback!

Nah, its batteries are dead, so I can't use it on my train trip. But at least I have sse2 hardware when I am near a power outlet. [SMILEY Smile]
(L) [2006/04/04] [toxie] [[quadrille] alpha binaries.] Wayback!

>In your code do you guys really just shoot one of these rays?


It's flexible.. If you just shoot one ray you get noise in all corners (so more rays = less noise, as usual).

What we do: Trace a picture with only "few" virtual pointlights and one "correction" ray. Then generate new pointlights and accumulate on the first pic

(and so on and so on). So you could say it's basically the same as shooting multiple rays (for a good rendering).
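
The accumulation step can be sketched as a running mean over passes. This is plain Python with an illustrative function name, not the engine's code; each pass is assumed to be a full image rendered with a fresh set of virtual point lights.

```python
def accumulate(mean, new_pass, pass_index):
    """Fold pass number `pass_index` (1-based) into the running mean image."""
    w = 1.0 / pass_index
    return [[m + (p - m) * w for m, p in zip(mrow, prow)]
            for mrow, prow in zip(mean, new_pass)]
```

After n passes the result equals the average of all n images, so noise drops the same way it would with n times as many rays in a single pass.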


>The images in the paper (and in the bp video) seem relatively noise free.

>Of course its kind of hard to judge since they are not full res. Could you (toxie) post some full screen renders(+settings) if its possible?


The stuff in the paper is more or less converged. The pics in the bp-video are a bit noisy, but it's harder to notice as a) it's filmed off from the

screen and b) the scenes are more complex than a Cornell box and so it's harder to notice the noise in the "corners".


>The parameter b is pretty important it seems.


Yup. But we haven't found a -cool- solution to automatically/heuristically choose a useful b for all kinda scenes as it depends on scene size, camera, etc.


>I didn't use c because I don't have access to the brdf at the point where I'm doing my calculations

>I'm not too clear on the advantage that it brings aside from speeding things up in slightly darker regions.


It's more than noticeable! I'd recommend using the BRDF alongside.


>I'm still scratching my head about the speedup you mentioned for the error correction term.

>The cost of that is mainly going to be related to (recursively) tracing the ray isn't it ?

>One idea would be to just switch to plain path tracing for those second generation eye rays. Is that the right track?


Naah. Not really. But i'm sorry that i can't provide any more hints on this. [SMILEY Sad]


>Finally, is the conference data set available anywhere in a more practical format than mgf


I could offer a raw triangle-dump, but that misses the color data then.


As for the pic: Here is scene6 of Shirley's GI Test scenes, one pass (=1 sample per pixel, 1 correction ray), pathlength=3, lightpaths=21, b = 0.75 (uses BRDF).

The contrast/brightness has been increased to spot the noise, in the normal render it's almost not noticeable (only if you know that there should be noise in the corners -somewhere-).

Time to picture approx. 25 sec. (don't trust the HUD in the bottom left corner [SMILEY Wink] -> mostly nonsense numbers there as it's not adapted to my GI stuff).

[IMG #1 ]
[IMG #1]:Not scraped: https://web.archive.org/web/20061004022910im_/http://ainc.de/RTcoreTest/scene6RGI.png
(L) [2006/04/12] [Ho Ho] [[quadrille] alpha binaries.] Wayback!

Just a little note.


I'm upgrading my PC to P4 920. That means I'll be running a 64bit dualcore in about a week or so if everything works out OK.

Any hope of getting Linux binaries or preferably source to play with?
(L) [2006/04/13] [Phantom] [[quadrille] alpha binaries.] Wayback!

Almost the same news here: I will be running a dual core laptop in a month or so. Preparing for that event now. [SMILEY Smile]
(L) [2006/04/23] [Phantom] [[quadrille] alpha binaries.] Wayback!

I'll have a brand new 64bit dual-core machine with pretty extreme clock speed on my desk at the university early next week. [SMILEY Smile] Jay. That should give some new opportunities for optimizations. Me happy. Fresh dual-core laptop is also coming; should be somewhere in May. Jay again. [SMILEY Wink]
