Distribooted Ray Tracing

Name: Anonymous 2011-05-24 18:25

Does /prog/ think it's feasible to get more than 1 or 2 frames a second with some old desktop computers I've got sitting around?

Name: Anonymous 2011-05-24 18:26

64x48

Name: Anonymous 2011-05-24 19:24

depends on the hardware

Name: Anonymous 2011-05-24 21:14

A hodgepodge, really. I know you have to account for latency, of course, but do you think it would be an okay rough estimate to simply add the frequencies of each core of each machine together for a sort of total power? I've got two quad-core machines each around 2.7 GHz, and two older single-core machines, each with a clock speed of around 1.8 GHz.
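
As a back-of-the-envelope sketch of that sum (using only the specs listed above; this ignores latency, memory, and scheduling entirely):

```python
# Naive "total power" estimate by summing core frequencies, as
# described above. The machine list mirrors the specs in the post;
# this is a very rough upper bound, nothing more.
machines = [
    {"cores": 4, "ghz": 2.7},  # quad-core #1
    {"cores": 4, "ghz": 2.7},  # quad-core #2
    {"cores": 1, "ghz": 1.8},  # old single-core #1
    {"cores": 1, "ghz": 1.8},  # old single-core #2
]

total_ghz = round(sum(m["cores"] * m["ghz"] for m in machines), 1)
print(total_ghz)  # 25.2 aggregate GHz
```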

Name: Anonymous 2011-05-24 21:22

>>4
Won't do fuck all. Intel's Larrabee prototype was a 32-core monster with cores each running at 2GHz and it still couldn't perform very well at ray tracing.

Modern GPUs have thousands of shader cores running at around 1GHz and they don't fare too well either.

Ray tracing isn't FLOP limited, it's limited by memory latency and cache coherency.

John Carmack once analyzed the memory access patterns of ray tracing at a low level and concluded that it will NEVER be feasible for real-time graphics with graphical qualities that will exceed those attainable through traditional rasterization.

Instead, he found that ray-casting using deferred shading and scene composition is far more feasible. Note that ray-tracing != ray-casting.

Name: Anonymous 2011-05-25 5:56

>>5
I had to double-check that I was indeed still on /gorp/ after reading this.

Name: Anonymous 2011-05-25 7:48

>>5
> Ray tracing isn't FLOP limited, it's limited by memory latency and cache coherency.
In other words, the more computers (or CPU-RAM pairs) you have in your network, the better it works.

Name: Anonymous 2011-05-25 10:08

>>5
I'm pretty sure >>4,1 doesn't know or care about the difference between raytracing and raycasting. And I don't think he should. Carmack needs to be a pedant about that, but even Intel is using them interchangeably. (Guess which term they prefer.)

Name: Anonymous 2011-05-25 12:07

depends on the algorithm

Name: Anonymous 2011-05-26 2:22

>>1 & >>4 here

>>5
Isn't ray-casting just ray-tracing without the recursion?

I've been considering diving through thrift stores for old machines that I could put to the task, and if I understand your post correctly, a cloud of 20 of these would perform the task better than a cloud of 5 or 10 quad-core machines?

>>8
It's all interesting. More interesting than half the other threads right now.

Name: Anonymous 2011-05-26 7:13

>>7
IHBT

>>8
Fuck Intel.

>>10
Yes, ray-casting is just ray-tracing with only one collision per ray. No bouncing of rays off of surfaces to compute inter-reflection and radiance. You can use deferred shading and global illumination techniques to better approximate inter-reflection and radiance with MUCH better performance.

And no, a cloud of 20 wouldn't necessarily perform the task better. Ray-tracing does not scale at all. You are completely misunderstanding what I mean when I say it's limited by memory latency and cache coherency. Adding more RAM doesn't improve memory latency. Adding more machines doesn't improve memory latency. Even adding faster RAM mostly buys you bandwidth, not latency, so it barely helps either. The only thing you can really do is increase the size of the on-die cache on the CPU, and there are limitations even there.

When you think about how ray-tracing works, you cast a ray out into the scene and find where it first intersects the geometry. This pulls a whole bunch of model data into the CPU's cache, and when it finds a collision, the data for the intersected surface is left hot in the cache. But when you recurse and bounce the ray off of it, the ray goes in a completely different direction and everything in the cache ends up getting thrown out. And then it happens again, and again! Extremely poor cache locality.

With ray-casting, you cast your ray, get an intersection, leave the model hot in the cache, and then cast the ray for the next pixel over from the current, and there's a good chance it will end up intersecting with the same surface as the previous ray, and therefore the cache doesn't have to be invalidated and reloaded with different data as often.
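
A toy sketch of that locality difference, with a single hard-coded sphere (every number here is made up for illustration, not anyone's real renderer):

```python
import math

# Adjacent primary rays hit the same surface (ray-casting), while a
# reflected ray (ray-tracing) shoots off in a new direction.

CENTER, RADIUS = (0.0, 0.0, -3.0), 1.0

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def first_hit(origin, direction):
    """Distance t to the sphere along a normalized direction, or None."""
    oc = tuple(o - c for o, c in zip(origin, CENTER))
    b = sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - RADIUS * RADIUS
    disc = b * b - c
    if disc < 0.0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 0.0 else None

eye = (0.0, 0.0, 0.0)

# Ray-casting: neighbouring pixels give nearly parallel rays, and both
# intersect the same sphere -- the geometry stays hot in the cache.
t0 = first_hit(eye, (0.0, 0.0, -1.0))
t1 = first_hit(eye, normalize((0.01, 0.0, -1.0)))
assert t0 is not None and t1 is not None

# Ray-tracing: bounce the first ray off the hit point. The reflected
# ray heads somewhere else entirely and (here) hits nothing, so it
# would pull completely different scene data through the cache.
p = tuple(e + t0 * d for e, d in zip(eye, (0.0, 0.0, -1.0)))
n = normalize(tuple(pi - ci for pi, ci in zip(p, CENTER)))
d = (0.0, 0.0, -1.0)
dn = sum(a * b for a, b in zip(d, n))
reflected = tuple(di - 2.0 * dn * ni for di, ni in zip(d, n))
print(first_hit(p, reflected))  # None -- the bounce escapes the scene
```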

If you aren't really familiar with the intricacies and performance characteristics of memory and cache coherency on modern CPU architectures, then I suggest you read ``What Every Programmer Should Know About Memory'':

http://www.akkadia.org/drepper/cpumemory.pdf

Yes, you could design specialized hardware that is better at ray-tracing than a general-purpose computer, but that's a big waste of silicon. You will get better-quality real-time graphics and be able to render a lot more per frame if you invest that silicon into conventional GPUs and use rasterization or ray-casting in conjunction with deferred shading and scene composition.

Name: Anonymous 2011-05-26 10:07

Best thread on /prague/.

Name: Anonymous 2011-05-26 15:50

>>12
there was some other interesting thread lately (the one with [OPINION] in the title)

Name: Anonymous 2011-05-26 17:07

OP here.

>>11
Thanks. That makes a lot more sense now.

Do you think it would improve performance to do ray-tracing in a sort of breadth-first manner? You could implement an algorithm that does each level of recursive firing in succession (cast a ray for every pixel, then go back and compute all the reflected rays, etc.), keeping the data in the cache relevant more of the time, if I understand what you're saying. Thanks for the recommended reading. It's next on my list.
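
Something like this, maybe (intersect() is a toy stand-in, not a real API):

```python
# Breadth-first ("wavefront") sketch: keep whole generations of rays
# in flat lists and process each generation in one pass over the
# scene, instead of recursing per ray.

def intersect(ray):
    """Toy rule: even-numbered rays hit and spawn one bounce."""
    return ray % 2 == 0

def wavefront_trace(primary_rays, max_depth=3):
    wave = list(primary_rays)   # generation 0: one ray per pixel
    sizes = []
    for _ in range(max_depth):
        if not wave:
            break
        sizes.append(len(wave))
        # The whole generation is intersected together, so scene data
        # is streamed through the cache once per wave, not per ray.
        hits = [r for r in wave if intersect(r)]
        # Next generation: one reflected ray per hit (simplified).
        wave = [r + 1 for r in hits]
    return sizes

print(wavefront_trace(range(8)))  # [8, 4]: 8 primaries, 4 bounces
```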

Also, even if it isn't a very elegant solution, supposing you had n identical machines, each having its own copy of the scene in memory, wouldn't you perform at least n times faster, so long as each machine worked on different portions of the image?

Name: Anonymous 2011-05-26 17:08

>>14
> Also, even if it isn't a very elegant solution, supposing you had n identical machines, each having its own copy of the scene in memory, wouldn't you perform at least n times faster, so long as each machine worked on different portions of the image?
http://en.wikipedia.org/wiki/Embarrassingly_parallel

Name: Anonymous 2011-05-26 17:44

>>11
> Ray-tracing does not scale at all.
What the fuck are you talking about? Across multiple machines, it scales linearly. Tile the image into 32x32 blocks and render every n-th tile on each machine.

If you're doing video, just divide by frames (or even scenes for longer stuff).
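
The tiling is a one-liner, basically (image size and machine count are made-up numbers):

```python
# 32x32 tiles dealt round-robin to n machines, each rendering its
# share independently with its own copy of the scene.

TILE = 32

def tiles_for_machine(width, height, n_machines, machine_id):
    """Round-robin: machine k takes every n-th tile (top-left corners)."""
    tiles_x = (width + TILE - 1) // TILE
    tiles_y = (height + TILE - 1) // TILE
    all_tiles = [(tx * TILE, ty * TILE)
                 for ty in range(tiles_y) for tx in range(tiles_x)]
    return all_tiles[machine_id::n_machines]

# 640x480 -> 20x15 = 300 tiles; 4 machines get 75 each, with no
# shared state between them while rendering.
assignments = [tiles_for_machine(640, 480, 4, k) for k in range(4)]
print([len(a) for a in assignments])  # [75, 75, 75, 75]
```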

> Adding more machines doesn't improve memory latency.
This is the most retarded thing I've read today. Adding more cores doesn't improve frequency, why do multithreaded compute-bound programs run faster when you add more cores? Adding more hard disks doesn't improve their individual capacities, why can you store more data?

Name: Anonymous 2011-05-26 17:59

Despite being latency bound, it turns out the newest CPUs still smoke everything else. Improvements such as moving the memory controller on-die and larger, better out-of-order execution engines really do pay off. Hyper-Threading also helps a lot. In benchmarks, within a CPU family with the same memory type and speed, results usually correlate linearly with frequency (both for ray-tracing and compression).

I'm talking out of my ass here, but I suspect that if you pay for electricity and plan on doing this for months, anything less than a Conroe is a net waste (better to buy a newer processor), and Nehalem is easily 25% faster at the same frequency.

In any case you should test it yourself, and watch power usage too. Having a machine wasting 200W just to get 5% more throughput is probably a bad idea.

Name: >>7 2011-05-26 18:03

>>16
What I said. But YIHBT

Name: Anonymous 2011-05-26 21:12

Name: Anonymous 2011-05-26 21:22

>>19
What about destructible environments? With sparse voxels you can't cut down trees, do terraforming, or simulate water and fire. Rebuilding the octree takes a good amount of time.
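
To make the rebuild cost concrete, here's a toy sparse-voxel octree: every edited voxel has to be re-inserted, touching O(log size) nodes on the way down. Purely illustrative; no real engine works this simply.

```python
class Node:
    __slots__ = ("children", "solid")
    def __init__(self):
        self.children = [None] * 8  # empty octants stay None (sparse)
        self.solid = False

def insert(root, x, y, z, size):
    """Mark the unit voxel at integer (x, y, z) solid; size is a power of 2."""
    node = root
    while size > 1:
        size //= 2
        octant = (x >= size) | ((y >= size) << 1) | ((z >= size) << 2)
        if x >= size: x -= size
        if y >= size: y -= size
        if z >= size: z -= size
        if node.children[octant] is None:
            node.children[octant] = Node()
        node = node.children[octant]
    node.solid = True

def solid(root, x, y, z, size):
    node = root
    while size > 1 and node is not None:
        size //= 2
        octant = (x >= size) | ((y >= size) << 1) | ((z >= size) << 2)
        if x >= size: x -= size
        if y >= size: y -= size
        if z >= size: z -= size
        node = node.children[octant]
    return node is not None and node.solid

root = Node()
insert(root, 5, 0, 3, 8)  # one voxel: log2(8) = 3 levels allocated
print(solid(root, 5, 0, 3, 8), solid(root, 0, 0, 0, 8))  # True False
```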

Name: Anonymous 2011-05-26 21:26

>>20
Yes you can. You underestimate what modern GPUs are capable of.

Name: Anonymous 2011-05-26 21:33

>>21
it's fuckin sparse. there is no data about internals. how is it better than triangles?

Name: Anonymous 2011-05-26 21:41

Name: Anonymous 2011-05-26 21:42

>>22
You can optionally store model internals; it's not hard to augment the structure to do so for destructible models.

Name: Anonymous 2011-05-26 21:43

[b]SEEEEEEEE ZZZZZZZZUUUUUUUUUUUER[b/]

Name: Anonymous 2011-05-26 21:44

<b>wtf?<b>

Name: Anonymous 2011-05-26 21:44

<b> /kills self </b>

Name: Anonymous 2011-05-26 21:46

>>25-27
Out.

Name: Anonymous 2011-05-26 22:51

