Slashdot
Add Another Core for Faster Graphics
Posted by
ScuttleMonkey
on Tue Aug 29, 2006 05:22 AM
from the ray-tracing-still-fun dept.
Dzonatas writes "Need a reason for extra cores inside your box? How about faster graphics? Unlike traditional faster GPUs, raytraced graphics scale with extra cores. Brett Thomas writes in his article Parallel Worlds on Bit-Tech, 'But rather than working on that advancement, most of the commercial graphics industry has been intent on pushing raster-based graphics as far as they could go. Research has been slow in raytracing, whereas raster graphic research has continued to be milked for every approximate drop it closely resembles being worth. Of course, it is to be expected that current technology be pushed, and it was a bit of a pipe dream to think that the whole industry should redesign itself over raytracing.' A report by Intel about Ray Tracing shows that a single 3.2GHz P4 is capable of 100 million raysegs, which gives a comfortable 30fps. Intel further states 450 million raysegs is when it gets 'interesting.' Also, quad cores are slated to be available around the turn of the year. Would octacores bring us dual screen or separate right/left real-time raytraced 3D?"
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
need a reason (Score:5, Funny)
Gaming (Score:5, Interesting)
http://graphics.cs.uni-sb.de/~morfiel/oasen/ [uni-sb.de]
Re:Gaming (Score:5, Informative)
It just looks good as well: http://graphics.cs.uni-sb.de/~woop/rpu/rpu.html [uni-sb.de]
It's been done... (Score:5, Interesting)
Put it on the GPU (Score:5, Interesting)
Take a look at the proceedings from any graphics conference in the last three or four years, and you will see several papers which involve ray-tracing on a GPU. Actually, not so many recently, because it's been done to death. The most impressive one I saw was at Eurographics in 2004 running non-linear ray tracing. As the rays advanced, their direction was adjusted based on the gravity of objects in the scene. The demo (rendered in realtime) showed a black hole moving in front of a stellar scene and all of the light effects this caused.
It's not JUST FP that's the issue (Score:5, Interesting)
It's not just the sheer number of FP calculations that can be the problem. Once you get away from the first (or perhaps even second) level of rays, you end up losing coherence between neighbouring rays which causes memory page/cache thrashing. This is not a nice thing on a GPU.
Re:Put it on the GPU (Score:5, Interesting)
But the GPU is interesting for raytracing. As it moves closer towards a giant floating point vector machine, the motivating application will become raytracing. So at the moment a 7800 GTX can push 280 GFLOPS. That is 2800 cycles per ray for a single frame. (BTW, Intel's figures in the article are bullshit. 100 million rays at 30fps = 3 billion rays per second. Roughly one ray per cycle on average. They are counting a huge number of rays that have been optimised out of the scene, e.g. shadows, or interpolated from previous frames using a cache.)
The raw horsepower is getting there on the card, but at the moment the communication soaks up all of the time. Raytracing is the poster-child problem for parallelisation - assuming that you have random access (readable) global memory. If you need to partition the memory into the compute nodes it begins to get harder. On a GPU, building data structures to hold the information is the bottleneck, and it drops the speed by factors of 100s or 1000s. Nvidia and ATI have given the general-purpose community hints that they will improve performance in reading data structures, so this particular roadblock may disappear. A real scatter operation in the fragment shader would be nice, but you would have to gut the ROPs in order to do it. This may happen anyway, as the local-area operations that the ROPs compute could fold into fragment operations. To increase the write bandwidth in the card, the retirement logic needs to start retiring 'pages' of pixels anyway, over a much wider bus. Otherwise the number of feasible passes per pixel will always be capped by the speed at which the ROPs can retire the data.
So given how hard it would be to *efficiently* raytrace on a GPU - why bother when you can throw so much more raw horsepower at faking it with cheap raster techniques?
Not quite (Score:5, Insightful)
If this is still the case, then going from the current rendering techniques in games to raytracing would result in images with more realistic reflections and lighting but, due to performance tradeoffs, few reflective surfaces and light sources.
Besides, at the moment what games need the most is better AI and procedurally generated content, not yet another layer of eyecandy that requires gamers to upgrade their hardware (again).
Re:Not quite (Score:5, Interesting)
Considering a non-reflective ray traced world at 800x600 needs 480,000 rays to be cast to calculate an image, so 14,400,000 per second at 30fps, the claim of 450 million ray segments makes sense... that's about 31 per pixel at 800x600, which is a lot of reflections. Usually you'd limit the depth to a fairly low number, because 100-deep reflections don't add noticeable detail, especially in motion. That's a lot of room for both refractive and reflective objects to be in the scenes.
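For anyone who wants to check the per-pixel budget, here is a minimal sketch of the arithmetic. It assumes the 450 million figure is ray segments per second; the function name `ray_budget` is made up for illustration.

```python
def ray_budget(width, height, fps, segments_per_second):
    """Return (primary rays per frame, primary rays per second,
    ray segments available per pixel per frame)."""
    primary_per_frame = width * height          # one primary ray per pixel
    primary_per_second = primary_per_frame * fps
    segments_per_pixel = segments_per_second / primary_per_second
    return primary_per_frame, primary_per_second, segments_per_pixel


per_frame, per_second, per_pixel = ray_budget(800, 600, 30, 450_000_000)
print(per_frame, per_second, per_pixel)  # 480000 14400000 31.25
```

Anything above one segment per pixel is what is left over for shadow, reflection and refraction rays.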
Re:Not quite (Score:5, Interesting)
Re:Not quite (Score:5, Interesting)
There was a paper published a couple of years ago (at Eurographics?) about this. Each ray was independent, and would return a value at each intersection (i.e. you get the primary ray value quickly, and then refine it further with secondary, tertiary, etc. ray data). When a ray was no longer lined up with a pixel, it was interrupted and terminated. This meant that you got a fairly low quality image while moving quickly, but a much better one when you let the rays run longer. I found it particularly interesting, since it completely removed the concept of a frame; each pixel was updated independently when a better approximation of its correct value was ready, giving much more graceful degradation.
rabbit rabbit rabbit (Score:4, Informative)
"Oh, blast. Rabbit, I seem to have forgotten my pocketwatch. May I borrow yours?"
Rabbit: I'm late, I'm late, I'm late...
---
Anyway, if this technology becomes a reality in the next 3-5 years, and if I read the article right, the whole graphics architecture would change: there would only be a need for a super graphics processor, and less need for lots of memory and those graphics pipeline/shader thingies...
The reason they might want it in the CPU: why have a separate add-on GPU to handle the job when the CPU could do it alone by that time? You would then only need a "basic" video card that would just do the display.
Hmmm... could this be one of the reasons why ATI and AMD merged?
Quake 3: Raytraced (Score:4, Interesting)
http://graphics.cs.uni-sb.de/~sidapohl/egoshooter
Rumors are there's a q4 version on the way.
If you can't beat them, obviate them! (Score:4, Interesting)
If they're having trouble, for staffing or other reasons, producing good GPU designs, then it would be pretty clever of them to revolutionize the industry AND capitalize on their CPU strengths in a single move. More power to them, I say. (More power = about 120 watts, I'm guessing.)
Won't happen soon. (Score:5, Informative)
The main benefits of raytracing in games would be:
1) Shadows; they'd be Doom 3-like. Several games have full stencil shadows and that's just how raytraced ones would look: sharp and straight. The difference? Raytraced ones would take a ton more power and time to compute.
2) True reflection and refraction. We can "fake" this well enough - for example, see the Source engine's water, incorporating realtime Fresnel reflections and refractions. Though the "fake" refraction/reflection in Source's water isn't pixel-perfect, and is only distorted by a bump-map, it certainly looks great.
Honestly, considering the small gain in visual quality (although a major gain in accuracy) - it's like going after a fly with a bazooka. Sure, once we get to the point where there's enough processing power to deal with this well enough in realtime, it will happen - but don't expect it soon, and don't expect that huge a difference. Nicer reflections and refractions (which already look good today) and pixel-perfect shadows (looking just the same as stencil shadows in some newer games).
Re:Won't happen soon. (Score:5, Informative)
If you read the Intel paper that inspired TFA's author to write his ill-informed article, you'll see that raytracing scales better with scene complexity, and Intel did benchmarks to show that after about 1M triangles per scene, software raytracers will outperform hardware GPUs using triangle pipelines (e.g. OpenGL, DirectX, shaders).
Sure, once we get to the point where there's enough processing power to deal with this well enough in realtime, it will happen
The benchmarks in the Intel paper show that we are very close to that point right now.
30 fps - unlikely (Score:5, Interesting)
Ray tracing also suffers terribly from "jaggies". Edges look bad because rays can just miss an object and cause really bad stepping on the edges of objects. To eliminate jaggies and do anti-aliasing, you need to do sub-pixel rendering with jitter (slight randomness) to produce an average value for the pixel. So you might have to trace 4 or more rays in a pixel for acceptable anti-aliasing. Effects like focal length, fog, bump mapping etc. cause things to get even more complex. Most pictures rendered with high quality on Blender, POVRay etc. would take minutes if not hours even on a fast / dual core processor.
The only way you'd get 30fps is if you cut your ray trace depth to 1 or 2, used a couple of lights, cut the screen res down and forgot about fixing jaggies. It would look terrible. Oh, and you'd still have to find time for all the other things that apps and games must do.
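To make the anti-aliasing cost concrete, here is a rough sketch of jittered sub-pixel sampling. The scene (a single disc tested by `inside_disc`) and the function names are invented for illustration; a real tracer would fire a full ray per sample instead of a point-in-disc test.

```python
import random


def inside_disc(x, y, cx=4.0, cy=4.0, r=3.0):
    """Hypothetical scene: one disc. Stands in for 'this ray hit the object'."""
    return (x - cx) ** 2 + (y - cy) ** 2 <= r * r


def render(width, height, samples_per_axis=1, jitter=False, seed=0):
    """Return a grid of coverage values in [0, 1], one per pixel."""
    rng = random.Random(seed)
    n = samples_per_axis
    image = []
    for py in range(height):
        row = []
        for px in range(width):
            hits = 0
            for sy in range(n):
                for sx in range(n):
                    # Sample the centre of each sub-pixel cell, or a random
                    # point inside the cell when jitter is enabled.
                    ox = rng.random() if jitter else 0.5
                    oy = rng.random() if jitter else 0.5
                    x = px + (sx + ox) / n
                    y = py + (sy + oy) / n
                    hits += inside_disc(x, y)
            row.append(hits / (n * n))  # average of the samples
        image.append(row)
    return image


aliased = render(8, 8)                                  # 1 ray per pixel
smooth = render(8, 8, samples_per_axis=2, jitter=True)  # 4 jittered rays per pixel
```

With one sample every pixel is either fully on or fully off, which is the stepping on edges; with 2x2 jittered samples the edge pixels get intermediate values, at four times the ray count.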
Raytracing vs. Scanline for Realtime (Score:5, Informative)
All rendering algorithms boil down to a sorting problem, where all the geometry in the scene is sorted in the Z dimension per pixel or sample. Fundamentally, scanline algorithms and ray-tracing algorithms are the same. For primary rays, here's some simplified pseudocode:
foreach pixel in image
    trace ray through pixel
    shade frontmost geometry
The trace essentially sorts all the geometry along its path.
A scanline algorithm looks like this:
foreach geometry object in the scene
    foreach pixel geometry is in
        if geometry is in front of whatever is in the pixel already
            shade fragment of geometry in pixel
            replace pixel with new shaded fragment
As you can see, the only distinction is the order of the two loops. For ray-tracing, traversing the pixels is in the outer loop, and the geometry in the inner loop. For scanline rendering, it's the opposite. This has huge consequences in terms of cache coherency. With scanline methods, since the same object is being shaded in the inner loop, and neighboring fragments of the same object are being shaded, cache coherency tends to be extremely high. The same shader program is used, and the likelihood of the texture being accessed from cache is very good. The same can't be said for ray-tracing. You can shoot two almost identical rays but touch wildly different parts of the scene. Cache coherency relative to scanline rendering is abysmal.
This one performance side-effect of ray-tracing is the only reason we haven't seen any serious ray-tracing for realtime applications. Even in offline rendering, scanline rendering dominates even though software ray-tracing has been available from the beginning of CG. For ray-tracing to become viable, we need more than just more CPU cores. We need buses fast enough to feed all the cores in situations where we have an extremely high ratio of cache misses. Unfortunately, the speed gap between memory speeds and compute power seems to be increasing in recent years.
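The two loop orders from the pseudocode above can be sketched in runnable form. This is only a toy under stated assumptions: the scene is a made-up list of axis-aligned rectangles at constant depth, "shading" is just writing a label, and the names `SCENE`, `covers`, `ray_trace` and `scanline` are invented for illustration.

```python
# Hypothetical scene: (x0, y0, x1, y1, depth, label); smaller depth is nearer.
SCENE = [
    (0, 0, 6, 6, 5.0, "A"),
    (2, 2, 8, 8, 3.0, "B"),
    (4, 0, 7, 3, 1.0, "C"),
]
WIDTH, HEIGHT = 8, 8


def covers(rect, px, py):
    x0, y0, x1, y1, _, _ = rect
    return x0 <= px < x1 and y0 <= py < y1


def ray_trace(scene):
    """Pixels in the outer loop, geometry in the inner loop."""
    image = [[None] * WIDTH for _ in range(HEIGHT)]
    for py in range(HEIGHT):
        for px in range(WIDTH):
            nearest = None
            for rect in scene:  # the "trace": find the frontmost hit
                if covers(rect, px, py) and (nearest is None or rect[4] < nearest[4]):
                    nearest = rect
            if nearest is not None:
                image[py][px] = nearest[5]  # shade only the frontmost geometry
    return image


def scanline(scene):
    """Geometry in the outer loop, pixels in the inner loop, with a depth buffer."""
    image = [[None] * WIDTH for _ in range(HEIGHT)]
    depth = [[float("inf")] * WIDTH for _ in range(HEIGHT)]
    for x0, y0, x1, y1, z, label in scene:
        for py in range(max(y0, 0), min(y1, HEIGHT)):
            for px in range(max(x0, 0), min(x1, WIDTH)):
                if z < depth[py][px]:       # in front of what is already there?
                    depth[py][px] = z
                    image[py][px] = label   # replace pixel with the new fragment
    return image


assert ray_trace(SCENE) == scanline(SCENE)
```

Both produce the same image. The difference is the access pattern: `scanline` touches one object at a time across neighbouring pixels, while `ray_trace` may touch every object for every pixel unless an acceleration structure narrows the search.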
Film at 11 (Score:5, Insightful)
Extra, extra! This just in! Report from CPU vendor discovers that you should spend more money on your CPU and less on your graphics card!
Shocking, I tells ya. Shocking.
Lies, Damned Lies and RT Raytracing (Score:5, Informative)
1) Static Objects Only. The huge majority of computation time is spent traversing a spatial subdivision structure. It happens that K-d trees offer the best characteristics (typically, fewest primitives per leaf for a given memory limit). However, these are really heinous to dynamically update. You can cheaply re-create the tree with median partitioning, but your trees are crappy. You can do a much nicer SAH (surface area heuristic), but to do this per frame blows out your CPU budget.
2) Bandwidth. Even if you could update your subdivision structure very cheaply, that structure still needs to be propagated out to all the CPUs participating in the raytrace. For the 1.87 MTri model they list on page 6, their spatial structure was 127 MB. Say you have a bandwidth of 6 GB/s: it takes about 20ms just to transfer the structure (and there are other problems here). So your ceiling is about 50 fps before you trace your first ray.
3) Slower than a GPU. Even though they give you some little graph showing that raytracing (a static model, with static partitioning) beats a GPU at 1 MTri in the frame, this is very deceiving. The GPU pipeline works such that zillions of sub-pixel triangles simply can't get into pixel shaders fast enough, and force the pixel shader to be run many times extra. Double the resolution, however, and the GPU won't take a cycle longer... with raytracing, performance will halve. So they found a bottleneck in the GPU which is totally unrepresentative of a game in every single sense, and said LOOK! BETTER! (in theory).
4) Hey, Where's my Features? All the cool things about raytracing (nice shadows, refraction, implicit surfaces, reflection, subsurface scattering) all get tossed out the window to make it real-time! What's the point, then? Given all the pixel shader hacks invented to make a GPU frame look interesting, the quality that can be achieved in a real-time raytrace is sadly tame. Especially when you consider that quality is the supposed advantage of raytracing.
And c'mon. It's Gameplay that counts anyway :P
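For reference, the cheap median-partitioned rebuild mentioned in point 1 looks roughly like this. It is a sketch only: it splits 3D points (stand-ins for triangle centroids) rather than real triangles, it does not implement SAH, and the names `Node`, `build_median` and `leaves` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    axis: Optional[int] = None      # split axis (0=x, 1=y, 2=z) for inner nodes
    split: Optional[float] = None   # split position for inner nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    points: Optional[list] = None   # primitives stored in a leaf


def build_median(points, depth=0, leaf_size=2):
    """Build a k-d tree by splitting at the median, cycling through the axes."""
    if len(points) <= leaf_size:
        return Node(points=list(points))
    axis = depth % 3
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        axis=axis,
        split=ordered[mid][axis],
        left=build_median(ordered[:mid], depth + 1, leaf_size),
        right=build_median(ordered[mid:], depth + 1, leaf_size),
    )


def leaves(node):
    """Collect the primitive lists of all leaves, left to right."""
    if node.points is not None:
        return [node.points]
    return leaves(node.left) + leaves(node.right)


pts = [(x, (x * 7) % 5, (x * 3) % 4) for x in range(10)]
tree = build_median(pts)
```

The median split keeps the tree balanced and is quick to build, but it ignores where the empty space is, which is why such trees traverse worse than SAH-built ones.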
How many do I need (Score:5, Funny)
One core to rule them all
One core to find them
One core to bring them all
And in the darkness bind them
Re:How many do I need (Score:5, Funny)
One core to rule them all
One core to find them
One core to bring them all
And in the darkness bind them
You just installed Vista onto that rig, didn't you?
*Ducks*
Re:How many do I need (Score:5, Funny)
Yeah, but Asteroids will look AMAZING!
Re:How many do I need (Score:4, Insightful)
One core to find them
One core to bring them all
And in the darkness bind them
You must be talking about the one core that's part of the TPM.
"entirely vectors" (Score:5, Insightful)
No, ray tracing is all about searching databases for ray-object intersections. That's what GPUs can't do at all.
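A bare-bones illustration of that search, assuming a scene made only of spheres and a unit-length ray direction. The function names and the `(name, centre, radius)` tuple layout are invented for the example.

```python
import math


def hit_sphere(origin, direction, centre, radius):
    """Nearest positive distance along a unit-length ray to the sphere, or None."""
    oc = [o - c for o, c in zip(origin, centre)]
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c            # discriminant of t^2 + 2bt + c = 0
    if disc < 0:
        return None             # the ray misses the sphere
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-9:            # ignore hits behind the ray origin
            return t
    return None


def nearest_hit(origin, direction, spheres):
    """Linear search over every object for the closest intersection."""
    best = None
    for name, centre, radius in spheres:
        t = hit_sphere(origin, direction, centre, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, name)
    return best


spheres = [("far", (0, 0, 10), 1), ("near", (0, 0, 5), 1), ("off", (5, 0, 5), 1)]
print(nearest_hit((0, 0, 0), (0, 0, 1), spheres))  # (4.0, 'near')
```

The linear scan in `nearest_hit` is the "database search"; practical tracers replace it with a spatial index such as a k-d tree, which means data-dependent branching and scattered memory reads for every ray.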