Slashdot
Rambus Takes Another Shot At High-End Memory
Posted by
timothy
on Tue Jan 25, 2005 10:40 PM
from the memory-belongs-in-heads dept.
An anonymous reader writes "Tom's Hardware is running an article about Extreme Data Rate memory (XDR DRAM for short), which was developed by Rambus and has now entered mass production in Samsung's fabs. Right now, Rambus says the memory is only for high-bandwidth multimedia applications such as Sony's Cell processor, but the company ultimately hopes to push XDR into PCs and graphics cards by 2006. Time will tell if Rambus has learned from the mistakes it made with RDRAM a few years ago."
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
It will be awhile (Score:5, Interesting)
Re:It will be awhile (Score:5, Interesting)
Unfortunately, no one seems to be pushing for this despite the headaches it would remove. All you'd have to do is make your memory controller able to receive faster (like going from DDR333 to DDR400). Plus, with the memory not directly connected, memory makers would not only compete evenly (since the user wouldn't need to know the difference between DDR2 and XDR except speed and price), but they could add other things like an extra cache level in front of the memory just by replacing RAM. And it would mean that the computer you bought today would take the memory that was available 3 years from now. Right now SDRAM costs a FORTUNE. But if you had a computer that takes FBDIMMs, instead of paying $50 a stick for 256MB sticks, you could buy at the price of DDR today (say 512MB for $25 or whatever it is today).
Just think, you wouldn't need to buy new types of RAM for your PC every 2 years.
Parent
Re:It will be awhile (Score:5, Interesting)
And (since I think it's serial, instead of parallel like current RAM) it would SERIOUSLY decrease the pincounts of the Opteron and northbridges. Think if you could have quad channel memory in your desktop as an option. Right now the CPU would need THOUSANDS of pins to do that. But you might be able to do it with the current 939 pins on an Opteron if you used FBDIMMs.
Ah, dreams.
Parent
Re:It will be awhile (Score:5, Interesting)
IMHO, FBDIMM is just Intel's hedge against RAMBUS going bust. The point of RAMBUS was to reduce pincount per chip by reducing the width of the channel between the memory controller and the chip, and to decouple the notion of a memory controller controlling specific banks of memory. FBDIMMs are to solve the same problem, except RAMBUS is shipping already.
Reducing pincount is one important reason for FB-DIMM, but the real reason for it is to get out of the capacity/speed tradeoff game. See, many systems need lots of memory. However, with current DDR-400 or DDR2-667 you can only put two devices per channel. If you want more RAM than what fits in two devices, you have to reduce the speed. FB-DIMM gets around this problem by using point-to-point links between the devices.
Yes, this increases latency a little bit, but there really isn't any other practical way to increase speed without reducing capacity. However, FB-DIMM compensates for the increased latency by allowing many outstanding transactions on each channel; because of this, latency under high load is actually supposed to be lower than for traditional RAM tech with the same specs.
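The claim above (slightly higher unloaded latency, lower latency under load) can be illustrated with a toy model. This is only a sketch: the function name, the 50 ns and 60 ns service times, and the limit of 8 outstanding transactions are assumptions made up for illustration, not FB-DIMM or DDR figures.

```python
def average_completion_ns(num_requests, service_ns, max_outstanding):
    """Average completion time for a burst of requests all issued at t=0.

    Toy model (an assumption, not a real DRAM model): the channel serves
    requests in waves of `max_outstanding`, and each wave takes `service_ns`.
    """
    total = 0.0
    for i in range(num_requests):
        wave = i // max_outstanding          # which wave this request lands in
        total += (wave + 1) * service_ns     # it completes when its wave ends
    return total / num_requests


if __name__ == "__main__":
    # Hypothetical numbers: a 50 ns channel with one transaction in flight
    # versus a 60 ns channel that allows eight in flight.
    for burst in (1, 16):
        serial = average_completion_ns(burst, 50, 1)
        pipelined = average_completion_ns(burst, 60, 8)
        print(burst, serial, pipelined)
```

With a single request the 60 ns channel is slower, but with a burst of 16 the channel that allows more outstanding transactions finishes the average request much sooner in this model.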
Parent
if it increases latency, "no, thanks" (Score:4, Insightful)
What kills RAM nowadays in common scenarios is latency. Whenever there's a cache miss, or a mis-prediction makes you flush the CPU's pipeline and start again, what causes the CPU to stall is latency. You get to wait until that request is processed by the RAM controller, is actually delivered by the RAM, makes its way back through the RAM controller, and only then can you finally resume computing. That's latency, in a nutshell.
And it's already _the_ problem, and it's gotten steadily worse. A modern CPU has to wait as many cycles for a word from RAM as an ancient 8086 would have if you ran it with a HDD instead of RAM. It's _that_ bad.
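To put the stall in numbers, here is a minimal sketch that converts a memory latency into lost CPU cycles. The 100 ns latency and the two clock speeds are assumptions picked for illustration, not measurements of any particular system.

```python
def stall_cycles(memory_latency_ns, cpu_clock_mhz):
    """CPU cycles that pass while the core waits on one memory access."""
    # cycles = seconds * Hz = (ns * 1e-9) * (MHz * 1e6) = ns * MHz / 1000
    return memory_latency_ns * cpu_clock_mhz / 1000


if __name__ == "__main__":
    # Hypothetical: the same 100 ns round trip seen by a 300 MHz CPU
    # and by a 3000 MHz CPU.
    print(stall_cycles(100, 300))
    print(stall_cycles(100, 3000))
```

The same latency in nanoseconds costs ten times as many cycles on the CPU with ten times the clock, which is why the problem grows as clocks rise even if RAM itself does not get slower.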
That's why everyone is putting a ton of cache and/or inventing work-arounds like HyperThreading. And even those only work so far.
And again, it's only getting worse. DDR did increase bandwidth, but did bugger all for latency. Your average computer may well transfer two words per clock cycle with DDR, but it still has 3 cycles of CAS latency like SDR had. And DDR2 has made it even worse.
So FBDIMM's great big advantage is that it lets you have _more_ latency? Well, gee. That's as much of a solution as a kick in the head is a cure for a headache.
As I've said, "no, thanks." If Intel wants to go into fantasy land and add yet another abstraction layer just for the sake of extra latency, I'm starting to think Intel has plain old lost its marbles.
Parent
Re:It will be awhile (Score:5, Interesting)
Funny abstraction layers and everything being agnostic of everything else is a nice CS theoretician fantasy. In a CS theory utopia everything should be abstracted, or better yet virtualized. Any actual hardware or other implementation details should be buried 6 ft deep, under layers after layers of abstraction or better yet emulation.
The problem is that reality doesn't work that way. Every such abstraction layer, such as buffering and translating some generic RAM interface, costs time. Every single detail you play agnostic about runs you the risk of doing something extremely stupid and slow. (E.g., from another domain: I've seen entirely too many program implementations that, in the quest to abstract and ignore the database, end up with a flurry of connections just to save one stupid record.) Performance problems here we come.
The AMD 64 runs fast precisely because it has one _less_ level of abstraction and virtualization. Precisely because their CPU does _not_ play agnostic and let the north-bridge handle the actual RAM details. No, they know all about RAM, and they use it better that way.
So adding an abstraction layer right back (even one that moves the north-bridge onto the RAM stick) would solve... what? Shave some 10% off the performance? No, thanks.
Or you mention SRAM. Well, the only advantage to SRAM is that it's faster than DRAM. Adding an extra couple of cycles of latency to it would be just a bloody stupid way to get DRAM performance out of expensive SRAM. Over-priced under-performing solutions, here we come.
Wouldn't it be easier to just stick to DRAM _without_ extra abstraction layers to start with? You know, instead of then having to pay a mint for SRAM just to get back to where you started?
Not meant as a flame. Just a quick reflection on how the real world is that-a-way, and utopias with a dozen abstraction layers are in the exact opposite direction.
Parent
Latency (Score:4, Interesting)
Oh, I agree with your abstraction comment.
Putting faster things into an FBDIMM just won't do that much, because the speed is physically in the same spot. I did an extensive study of this back prior to 1990 and found these results, and the consolidation of L2 and even Northbridge onto the CPU shows that it's still valid, today. Main memory is going to be slow. Main memory is always going to be slow, because that's a side effect of being "big". Main memory is always going to be "big" as long as the appetite for bits exceeds what can fit onto one chip. Learn to live with it.
Incidentally DRAM latency grows beyond minimum the moment you multiplex row and column addresses. There is a Trcd(max) spec where access is purely row-limited, but in practice that's just about impossible - access is almost always limited by Column access. Trade speed for pins.
Beyond that, even SDR traded off latency for bandwidth, compared to EDO. (I've designed both.) I don't think DDR is that bad a deal, compared with SDR, though I haven't actually done a DDR design, myself. At the very least, DDR offers the half-cycle latency options, and the DDR designs have been architected to scale far higher in frequency than SDR ever was.
Parent
Why do we use DRAM in this day and age? (Score:3, Interesting)
Who needs a gig of RAM when you can have a gig of cache?
If they need swap space, they can always write back out directly to a disk-based swap file.
Re:Why do we use DRAM in this day and age? (Score:5, Informative)
That and the benefits of cache go DOWN as the size of the cache goes up. Past a MB or two the benefits would be lowered. Also, as the # of address lines goes up, the access gets slower. And finally, a bigger bottleneck is that "external memory" is external.
So unless you want to pay for a cpu with a GB of onboard "memory" in the form of SRAM.... the benefits won't be that high.
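The diminishing-returns argument can be sketched with the usual average memory access time formula (hit time plus miss rate times miss penalty). Every number in the table below is an invented assumption chosen to show the shape of the trade-off, not a measurement of any real cache.

```python
def amat_ns(hit_ns, misses_per_1000, miss_penalty_ns):
    """Average memory access time: hit time + miss rate * miss penalty."""
    return hit_ns + misses_per_1000 * miss_penalty_ns / 1000


# Hypothetical caches: (label, hit time in ns, misses per 1000 accesses).
# Bigger caches miss less often but take longer to answer a hit.
HYPOTHETICAL_CACHES = [
    ("256 KB", 3, 100),
    ("1 MB", 4, 50),
    ("4 MB", 6, 40),
    ("16 MB", 9, 35),
]

if __name__ == "__main__":
    for label, hit_ns, misses in HYPOTHETICAL_CACHES:
        # Assumed 100 ns penalty for going out to main memory.
        print(label, amat_ns(hit_ns, misses, 100))
```

Under these assumed figures the average access time improves going from 256 KB to 1 MB and then gets worse again, because the slower hits outweigh the small drop in misses.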
Tom
Parent
Re:Why do we use DRAM in this day and age? (Score:5, Interesting)
Parent
Re:Why do we use DRAM in this day and age? (Score:4, Insightful)
This means that for SRAM to be useful, it has to be paired with a lower-latency interconnect. Some apps would benefit tremendously from 128M of what would amount to an L3 cache, even to the point that the $400 or so extra it would cost might be worth it. It's clear however that the market doesn't consider that a worthwhile expenditure.
Although newer system architectures such as AMD's Opteron platform are moving to more closely-attached RAM, the engineering and manufacturing challenges involved in attaching memory as tightly as it is to a GPU have so far proven more expensive than the payoffs warrant. With improvements in manufacturing and interconnect technology, I'm sure we'll see ever-tighter CPU-memory integration. I doubt however the technology will move to SRAM or an SRAM-equivalent simply because the performance/heat trade-off isn't favorable. Saving a few ns of latency on the memory chips is peanuts compared to the 10s of ns of latency in the connection to the CPU, which is probably a much more tractable problem.
Parent
Re:Why do we use DRAM in this day and age? (Score:4, Funny)
This is like saying why paint your walls with off-white stuff when you can coat them in a layer of gold that resists tarnish?
Well, for one thing, it's greatly more expensive.
Parent
Re:Why do we use DRAM in this day and age? (Score:5, Interesting)
For a given area of silicon, you could have 1 gigabit of DRAM or 128 Megabit of SRAM. Is it worth that trade-off? One can make more chips, but making chips uses a lot of expensive and toxic chemicals, and fab time isn't free either.
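A quick sketch of what that density gap means for a gigabyte of memory. The per-die capacities are the ones quoted in the comment above; the 1 GB target and the helper name are assumptions for illustration.

```python
DRAM_MBIT_PER_DIE = 1024  # "1 gigabit of DRAM" from the comment above
SRAM_MBIT_PER_DIE = 128   # "128 Megabit of SRAM" for the same silicon area


def dies_needed(capacity_mbyte, mbit_per_die):
    """Number of equal-area dies needed to reach the given capacity."""
    capacity_mbit = capacity_mbyte * 8
    return -(-capacity_mbit // mbit_per_die)  # ceiling division


if __name__ == "__main__":
    dram = dies_needed(1024, DRAM_MBIT_PER_DIE)
    sram = dies_needed(1024, SRAM_MBIT_PER_DIE)
    print(dram, sram, sram // dram)
```

For 1 GB that works out to 8 dies of DRAM against 64 dies of SRAM, i.e. eight times the silicon area before any difference in cost per die is considered.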
Parent
Never mind (Score:5, Interesting)
Good marketing sense (Score:5, Interesting)
Pathetic attempt at FPing (Score:5, Interesting)
In the long run, if they can't significantly drop manufacturing prices to (let's say) 150% or even 200% of "regular" (by that date) RAM, the boost in speed a computer with "XDR DRAM" will get compared to (again, let's say) "PC800 RDRAM" will not be significant... and I'll bet (regular) people would rather choose 8 GB of "PC800 RDRAM" over 2 GB of "XDR DRAM" any time of the day.
Bottom line: they're either stuck with "speciality hardware" (like graphics cards or high-end servers) or they have to drop (manufacturing) prices rapidly if they want to keep selling.
latency? (Score:5, Insightful)
People seem to forget that the "Random" part of RAM is kinda crucial.
Tom
Re:latency? (Score:5, Informative)
Parent
Rambus seems to forget (Score:5, Insightful)
I have adamantly refused to purchase any system that would use their memory for years, and more to the point have made that decision for others that depend on me making that decision. That's a lot of computers over the years we're talking about. I am also far from alone.
submarine patents (Score:4, Informative)
It's a tough act for Rambus to carry out; on the one hand, they have to deal with a small group of manufacturers who have (reportedly) been trying to defraud them and put them out of business, on the other hand, they have to rely on that same small group of manufacturers for all of their future revenue, so aggravating them too much is probably also a bad idea.
Of course, it's also possible that the judge was Just Plain Wrong, and Rambus was just trying to get submarine patents in place while they were a member of JEDEC. I don't have the expertise to make that judgement.
Parent
Time Will Tell? (Score:5, Informative)
Well, Rambus has expanded their latest lawsuit blitz to include DDR2 patent claims [infoworld.com], so do you think they've learned?
Why does RAM suck so much? (Score:5, Interesting)
2. RAM changes too quickly. I buy RAM for one computer, it's only for that computer. No portability.
I get a hard drive, I can put that in my new system. I get a new mouse, can use that on my new system. Display? Yep. Graphics card? Most likely.
RAM? Not likely.
IMHO they need to standardize RAM like AGP or PCI-X. That way users feel more comfortable investing in it... you can upgrade and keep your RAM.
TFA has short memory (Score:4, Informative)
Actually, RDRAM was introduced around 1995, and was used by industry heavyweights such as SGI and Nintendo.
Re:The numbers don't lie! (Score:5, Insightful)
If you were dealing with slightly different steppings of the same CPU (I assume a P4?), it would be possible that you had two CPUs of the same clock speed, but the newer stepping was less efficient per clock. The P4s have been tweaked over time to be less and less efficient per clock, in order to facilitate higher clock speeds. RDRAM was popular with the very first generation of P4s, so it'd be logical that the benchmark you saw may have involved a newer core. That shouldn't explain a 20% speed difference, but it's an example of a small thing that may have contributed to making the memory system appear to be the determinant item in performance.
Parent
Re:The numbers don't lie! (Score:5, Interesting)
3.06 GHz Pentium 4, 512KB cache, 533MHz FSB, RDRAM
3.00 GHz Pentium 4, 1MB cache, 800MHz FSB, DDR400 RAM
The DDR system is only 86% as fast as the RDRAM system (the RDRAM system is 16% faster). This is despite the DDR system having been purchased almost two years later, and having more cache!
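The two percentages describe the same gap from opposite directions, which a short check makes explicit. The helper name is made up for illustration; the 0.86 figure is the one quoted above.

```python
def percent_faster(relative_speed):
    """How much faster the quicker system is, in percent.

    `relative_speed` is the slower system's speed as a fraction of the
    faster one (0.86 means "86% as fast").
    """
    return (1.0 / relative_speed - 1.0) * 100.0


if __name__ == "__main__":
    # 86% as fast, seen from the other side
    print(round(percent_faster(0.86), 1))
```

Being 86% as fast means the other system is 1/0.86 times as fast, which rounds to the 16% quoted.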
The DDR system does pull ahead for compositing tasks (by quite a bit - in some cases it's twice as fast). I assume this is due to the larger cache.
But ray tracing takes about 90% of my total render times, so it's far more important to optimize. I am disappointed that I can't buy hardware today with the same RAM performance as I got two years ago.
Parent