(L) [2016/06/02] [ost by jbikker] [GPU VCM] Wayback!
I tried to implement BDPT based on SmallVCM, ported it to the GPU, and then decided to do the full VCM thing on the GPU. Turned out better than I hoped; here's some footage:
[LINK https://www.youtube.com/watch?v=ipbtJ64n6Yg]
Youtube is not great for noisy path tracing videos, so here's the (very large) original:
[LINK http://www.cs.uu.nl/docs/vakken/magr/materials/vcm_gpu.avi]
GPU is an old GTX580, and this was done in OpenCL.
- Jacco.
(L) [2016/06/02] [ost by atlas] [GPU VCM] Wayback!
It's interesting seeing the clear distinction between sampling methods. Great work, and great to hear from you. Time to upgrade that GPU though...
(L) [2016/06/06] [ost by jbikker] [GPU VCM] Wayback!
Here's another BDPT variant, this time with directional regularization, as in "Improving Robustness of Monte-Carlo Global Illumination with Directional Regularization" by Bouchard et al., and "Path Space Regularization for Holistic and Robust Light Transport" by Kaplanyan and Dachsbacher.
I'm looking for the optimal approach for 16-32 spp, for interactive graphics obviously. Quality of this directional regularization is very good at low sample counts, and the implementation is quite straightforward (assuming BDPT is already in place).
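To give an idea of what the regularized connection looks like for the extreme case of a perfect mirror: instead of a delta BRDF (which a connection can never hit), you evaluate a uniform lobe over a small cone around the mirror direction. The sketch below is purely illustrative - reflectance/Fresnel and the schedule for shrinking the cone with the sample index are left out (see the Kaplanyan & Dachsbacher paper for those):
[CODE]
// Illustrative sketch of a mollified (regularized) perfect mirror for BDPT
// connections. Names are made up; Fresnel/reflectance and the cosine
// convention are omitted.
__device__ inline float dot3(float3 a, float3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

__device__ float evalMollifiedMirror(float3 wo, float3 wi, float3 n, float coneAngle)
{
    // Perfect mirror direction for wo (wo and wi both point away from the surface).
    float d = dot3(wo, n);
    float3 r = make_float3(2.0f * d * n.x - wo.x,
                           2.0f * d * n.y - wo.y,
                           2.0f * d * n.z - wo.z);
    float cosCone = cosf(coneAngle);
    if (dot3(r, wi) < cosCone) return 0.0f;                  // wi outside the mollification cone
    return 1.0f / (2.0f * 3.14159265f * (1.0f - cosCone));   // uniform over the cone's solid angle
}
[/CODE]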
Youtube:
[LINK https://www.youtube.com/watch?v=CMlFooAsKj8]
High quality download:
[LINK http://www.cs.uu.nl/docs/vakken/magr/materials/dirreg_gpu.avi]
- Jacco.
(L) [2016/06/07] [ost by atlas] [GPU VCM] Wayback!
Do you mind sharing a detail? How do you create and store light vertices on the GPU - in global memory, where all blocks and threads then select from them randomly? Or in shared memory, where each block has its own unique light vertices local to the block to connect to?
P.S. - looks great, appears to be better than VCM, do you agree?
(L) [2016/06/07] [ost by jbikker] [GPU VCM] Wayback!
The BDPT is executed in two separate stages; the first one (light paths) indeed emits all vertices to a rather massive buffer in global memory. I'm rendering at 800x480, and the buffer is 234MB, for a maximum light path length of 8. During eye path construction, I connect each vertex of the eye path to the corresponding light path (so no randomness is involved).
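For a rough idea of where the 234MB comes from: with something like the vertex record below (illustrative only, the actual layout in my code differs) you end up at about 80 bytes per vertex:
[CODE]
// Hypothetical light vertex record, ~80 bytes (not the exact layout used here).
struct LightVertex
{
    float3 position;         // 12 bytes
    float3 normal;           // 12 bytes
    float3 throughput;       // 12 bytes: path throughput up to this vertex
    float3 inDir;            // 12 bytes: direction the path arrived from
    float  dVCM, dVC, dVM;   // 12 bytes: MIS quantities, as in SmallVCM
    int    pathLength;       //  4 bytes
    int    materialID;       //  4 bytes
    float  pad[3];           // 12 bytes: padding / room for extra data
};

// 800 x 480 pixels, at most 8 light vertices per pixel:
// 384,000 * 8 * 80 bytes = 245,760,000 bytes, i.e. ~234MB.
// Eye vertex connections for pixel p then read
// lightVertices[p * 8 + 0] .. lightVertices[p * 8 + lightPathLength[p] - 1].
[/CODE]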
The path length of 8 obviously only works for limited scenarios (for me it's fine, considering the goal). I don't think the massive global memory I/O is a problem here, by the way; the kernels do a lot of computation, so there's plenty of opportunity to hide latencies.
What is a problem currently is occupancy; Russian roulette makes long paths quite lonely in their warps, so this needs some wavefront path tracing. That will be another set of massive buffers. I will also need to work around the fact that an eye path of length N will potentially produce N rays for the connections; the shadow ray buffers in particular are going to be huge unless I break things up into even more stages.
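The basic building block for that would be a compaction pass along these lines (a rough sketch, not code from the renderer; all names are placeholders, and this is CUDA-flavoured while the current version is OpenCL):
[CODE]
// After each bounce, squeeze the surviving path indices into a dense queue,
// so the next extension kernel only launches threads that still have work.
// *activeCount must be reset to 0 before the launch.
__global__ void compactAlivePaths(const int* aliveFlags, int numPaths,
                                  int* activeQueue, int* activeCount)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numPaths) return;
    if (aliveFlags[i])                       // path survived Russian roulette
    {
        int slot = atomicAdd(activeCount, 1);
        activeQueue[slot] = i;               // dense list of live path indices
    }
}
// The next bounce is then launched over activeQueue[0 .. *activeCount - 1]
// instead of over the full pixel count.
[/CODE]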
I would like to reduce buffer sizes a bit so I can also render at higher resolutions; probably half floats will help here, and perhaps storing normalized vectors using 2 components, I'll see.
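One common option for getting a unit vector down to two components is an octahedral mapping; with half floats that would be 4 bytes per normal instead of 12. A generic sketch, not necessarily what I'll end up using:
[CODE]
// Octahedral encoding of a unit vector into two components (generic sketch).
// The two floats can be stored as halfs (__float2half) for 4 bytes per vector.
__device__ float2 octEncode(float3 n)   // n must be normalized
{
    float invL1 = 1.0f / (fabsf(n.x) + fabsf(n.y) + fabsf(n.z));
    float2 p = make_float2(n.x * invL1, n.y * invL1);
    if (n.z < 0.0f)                     // fold the lower hemisphere over
    {
        float px = (1.0f - fabsf(p.y)) * (p.x >= 0.0f ? 1.0f : -1.0f);
        float py = (1.0f - fabsf(p.x)) * (p.y >= 0.0f ? 1.0f : -1.0f);
        p = make_float2(px, py);
    }
    return p;
}

__device__ float3 octDecode(float2 p)
{
    float3 n = make_float3(p.x, p.y, 1.0f - fabsf(p.x) - fabsf(p.y));
    if (n.z < 0.0f)
    {
        float nx = (1.0f - fabsf(n.y)) * (n.x >= 0.0f ? 1.0f : -1.0f);
        float ny = (1.0f - fabsf(n.x)) * (n.y >= 0.0f ? 1.0f : -1.0f);
        n.x = nx; n.y = ny;
    }
    float len = sqrtf(n.x * n.x + n.y * n.y + n.z * n.z);
    return make_float3(n.x / len, n.y / len, n.z / len);
}
[/CODE]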
One thing I wanted to try is this: the connection of an eye path vertex to the light path can be done to an arbitrary light path; perhaps it's interesting to see what happens to variance if I consider several, then randomly pick one based on estimated contribution (i.e., resampled importance sampling). Calculations will be going through the roof (I use a microfacet BRDF), but on the GPU this may be worth it.
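In pseudo-code, roughly the following (everything here is a placeholder; the weights would be the unshadowed BRDF * geometry * throughput estimates for each candidate light vertex):
[CODE]
// Resampled importance sampling over a few candidate connections (sketch).
// One candidate is picked proportional to its estimated, unshadowed
// contribution; the chosen connection is then weighted by
// total / (numCandidates * w[chosen]) and multiplied by the shadow-ray visibility.
__device__ int resampleConnection(const float* w /* estimated contributions */,
                                  int numCandidates, float u /* uniform [0,1) */,
                                  float* risWeight)
{
    float total = 0.0f;
    for (int i = 0; i < numCandidates; i++) total += w[i];
    if (total == 0.0f) { *risWeight = 0.0f; return -1; }

    float target = u * total, acc = 0.0f;
    int chosen = numCandidates - 1;
    for (int i = 0; i < numCandidates; i++)
    {
        acc += w[i];
        if (target < acc) { chosen = i; break; }
    }
    *risWeight = total / (numCandidates * w[chosen]);
    return chosen;   // index of the light vertex to actually connect (and trace a shadow ray) to
}
[/CODE]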
It does look better than VCM, but this could be my implementation. [SMILEY :)] Also note that I use a massive radius for mollification: the paper suggests 1.0, I use 5.0. As a consequence I had to disable regularized connections of the light path to the camera; my mirrors became too diffuse... The massive radius helps a lot for the first couple of samples; as you can see even a single sample per pixel gives a very good impression of overall lighting.
(L) [2016/06/07] [ost by atlas] [GPU VCM] Wayback!
That's a good idea, this should be close to what you're talking about (if you haven't seen it yet): [LINK http://www-sop.inria.fr/reves/Basilic/2015/PRDD15b/PCBPT.pdf]
If you're going to create all those light paths, better to use them for something right? [SMILEY :D]
(L) [2016/06/07] [ost by jbikker] [GPU VCM] Wayback!
Haven't read that paper, and it definitely looks similar. Thanks for the link!
(L) [2016/06/10] [ost by jbikker] [GPU VCM] Wayback!
One more: [LINK https://www.youtube.com/watch?v=W87zWTuF0k0&feature=youtu.be]
This is the same algorithm, but ported to CUDA, and running on a decent GPU (GTX 980 Ti). Compared to the poor Quadro this is doing 8 full BDPT samples in ~43ms. Code is not optimized at all. Note that youtube compression does a far better job than last time; this is partially due to the reduced noise, but it also helps that I upscaled the video to 200% before uploading.
High quality video can be downloaded here: [LINK http://www.cs.uu.nl/docs/vakken/magr/materials/dirreg_cuda.avi].
(L) [2016/06/10] [ost by zsolnai] [GPU VCM] Wayback!
Really amazing results indeed. Thanks for sharing! [SMILEY :)]
(L) [2016/06/10] [ost by atlas] [GPU VCM] Wayback!
Very fast, great work. Hard to get that caustic in the mirror; those are some difficult paths.
You mentioned you use a microfacet BRDF. I've been looking for an efficient parameterized microfacet implementation, something branchless if that's even possible. I'm new to the materials side of things; do you have a good resource for that?