Rambus Takes Another Shot At High-End Memory

Posted by timothy on Tue Jan 25, 2005 10:40 PM
from the memory-belongs-in-heads dept.
An anonymous reader writes "Tom's Hardware is running an article about Extreme Data Rate memory (XDR DRAM for short), which was developed by Rambus and has now entered mass production in Samsung's fabs. Right now, Rambus says the memory is only for high-bandwidth multimedia applications such as Sony's Cell processor, but the company ultimately hopes to push XDR into PCs and graphics cards by 2006. Time will tell if Rambus has learned from the mistakes it made with RDRAM a few years ago."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • It will be awhile (Score:5, Interesting)

    by I_am_Rambi (536614) on Tuesday January 25 2005, @10:43PM (#11476752) Homepage
    before AMD might even think about accepting it. Since AMD now puts the memory controller on chip, AMD will have to see proof that it is faster. AMD will not go for DDR2 until it gets faster. Their reasoning: DDR2 adds cost and decreases performance. Without help from AMD, Rambus might be heading down the same track.
    • Re:It will be awhile (Score:5, Interesting)

      by MBCook (132727) <foobarsoft@foobarsoft.com> on Tuesday January 25 2005, @10:57PM (#11476870) Homepage
      All the more reason to move to FBDIMMs. AMD would put one memory controller on their chips, and it would work with SD, DDR, DDR2, Rambus, XDR, or anything else someone wants to put on. Makes things easy. Because the physical interface is constant and buffered, you don't get the problems of needing a different socket for every kind of RAM out there.

      Unfortunately, no one seems to be pushing for this despite the headaches it would remove. All you'd have to do is make your memory controller able to receive faster (like going from DDR333 to DDR400). Plus, with the memory not directly connected, memory makers would not only compete evenly (since the user wouldn't need to know the difference between DDR2 and XDR except speed and price), but they could add other things like an extra cache level in front of the memory just by replacing RAM. And it would mean that the computer you bought today would take the memory that was available 3 years from now. Right now SDRAM costs a FORTUNE. But if you had a computer that takes FBDIMMs, instead of paying $50 a stick for 256MB sticks, you could buy at the price of DDR today (say 512MB for $25 or whatever it is today).

      Just think, you wouldn't need to buy new types of RAM for your PC every 2 years.

      • Re:It will be awhile (Score:5, Interesting)

        by MBCook (132727) <foobarsoft@foobarsoft.com> on Tuesday January 25 2005, @11:01PM (#11476904) Homepage
        One other thing I forgot. With FBDIMMs it would be easy to replace your DRAM with SRAM (if prices dropped enough) because the refresh circuitry is on the DIMM. That means one less thing that the memory controller has to do, which means less complexity and less silicon (not that the refresh logic takes up a huge amount, but every little bit helps). When magnetic RAM comes along, you wouldn't need yet another memory controller.

        And (since I think it's serial, instead of parallel like current RAM) it would SERIOUSLY decrease the pincounts of the Opteron and northbridges. Think if you could have quad channel memory in your desktop as an option. Right now the CPU would need THOUSANDS of pins to do that. But you might be able to do it with the current 939 pins on an Opteron if you used FBDIMMs.

        Ah, dreams.

          • Re:It will be awhile (Score:5, Interesting)

            by joib (70841) on Wednesday January 26 2005, @01:33AM (#11477682)

            IMHO, FBDIMM is just Intel's hedge against RAMBUS going bust. The point of RAMBUS was to reduce pincount per chip by reducing the width of the channel between the memory controller and the chip, and to decouple the notion of a memory controller controlling specific banks of memory. FBDIMMs are to solve the same problem, except RAMBUS is shipping already.


            Reducing pincount is one important reason for FB-DIMM, but the real reason for it is to get out of the capacity/speed tradeoff game. See, many systems need lots of memory. However, with current DDR-400 or DDR2-667 you can only put two devices per channel. If you want more RAM than what fits in two devices, you have to reduce the speed. FB-DIMM gets around this problem by using point-to-point links between the devices.

            Yes, this increases latency a little bit, but there really isn't any other practical way to increase speed without reducing capacity. However, FB-DIMM compensates for the increased latency by allowing many outstanding transactions on each channel; because of this, latency under high load is actually supposed to be lower than for traditional RAM tech with the same specs.
            • by Moraelin (679338) on Wednesday January 26 2005, @06:03AM (#11478590) Journal
              As someone else already said, "people seem to forget what the R in RAM stands for".

              What kills RAM nowadays in common scenarios is latency. Whenever there's a cache miss, or a mis-prediction makes you flush the CPU's pipeline and start again, what causes the CPU to stall is latency. You get to wait until that request is processed by the RAM controller, is actually delivered by the RAM, makes its way back through the RAM controller, and only then you can finally resume computing. That's latency, in a nutshell.

              And it's already _the_ problem, and it's gotten steadily worse. A modern CPU has to wait as many cycles for a word from RAM as an ancient 8086 would have if you ran it with a HDD instead of RAM. It's _that_ bad.

              That's why everyone is putting a ton of cache and/or inventing work-arounds like HyperThreading. And even those only work so far.

              And again, it's only getting worse. DDR did increase bandwidth, but did buggerall for latency. Your average computer may well transfer two words per clock cycle with DDR, but still has 3 cycles CAS latency like SDR had. And DDR 2 has made it even worse.

              So FBDIMM's great big advantage is that it lets you have _more_ latency? Well, gee. That's as much of a solution as a kick in the head as a cure for headache.

              As I've said, "no, thanks." If Intel wants to go into fantasy land and add yet another abstraction layer just for the sake of extra latency, I'm starting to think Intel has plain old lost its marbles.
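
To put the latency complaint above in cycle terms, here is a rough sketch. The 3 GHz clock and the 50 ns round trip are illustrative assumptions (the 50 ns figure roughly matches the "10x of 5ns" estimate given elsewhere in this thread); neither number comes from this comment, and `stall_cycles` is a hypothetical helper name.

```python
def stall_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Core clock cycles that elapse while one memory request is outstanding.

    A clock of N GHz ticks N times per nanosecond, so the count is a product.
    """
    return latency_ns * clock_ghz


# Assumed numbers, for illustration only.
ASSUMED_LATENCY_NS = 50.0   # full round trip to main memory
ASSUMED_CLOCK_GHZ = 3.0     # a fast desktop CPU of the period

print(stall_cycles(ASSUMED_LATENCY_NS, ASSUMED_CLOCK_GHZ), "cycles per miss")
```

Under those assumptions every cache miss that cannot be hidden costs on the order of a hundred-plus cycles, which is the argument for large caches and tricks like HyperThreading.
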
      • Re:It will be awhile (Score:5, Interesting)

        by Moraelin (679338) on Wednesday January 26 2005, @06:19AM (#11478624) Journal
        One reason the AMD 64 works so well is precisely because they _reduced_ latency. That's basically the great advantage that the IMC (Integrated Memory Controller) offers.

        Funny abstraction layers and everything being agnostic of everything else is a nice CS theoretician fantasy. In a CS theory utopia everything should be abstracted, or better yet virtualized. Any actual hardware or other implementation details should be buried 6 ft deep, under layers after layers of abstraction or better yet emulation.

        The problem is that reality doesn't work that way. Every such abstraction layer, such as buffering and translating some generic RAM interface costs time. Every single detail you play agnostic about, runs you the risk of doing something extremely stupid and slow. (E.g., from another domain: I've seen entirely too many program implementations that, in the quest to abstract and ignore the database, end up with a flurry of connections just to save one stupid record.) Performance problems here we come.

        The AMD 64 runs fast precisely because it has one _less_ level of abstraction and virtualization. Precisely because their CPU does _not_ play agnostic and let the north-bridge handle the actual RAM details. No, they know all about RAM, and they use it better that way.

        So adding an abstraction layer right back (even if one that moves the north-bridge on the RAM stick) would solve... what? Shave some 10% out of the performance? No, thanks.

        Or you mention SRAM. Well, the only advantage to SRAM is that it's faster than DRAM. Adding an extra couple of cycles of latency to it would be just a bloody stupid way to get DRAM performance out of expensive SRAM. Over-priced under-performing solutions, here we come.

        Wouldn't it be easier to just stick to DRAM _without_ extra abstraction layers to start with? You know, instead of then having to pay a mint for SRAM just to get back to where you started?

        Not meant as a flame. Just a quick reflection on how the real world is that-a-way, and utopias with a dozen abstraction layers are in the exact opposite direction.
        • Latency (Score:4, Interesting)

          by dpilot (134227) on Wednesday January 26 2005, @08:27AM (#11479090) Homepage Journal
          There's another aspect of latency here that's being ignored. Here and elsewhere in this thread tree folks are talking about circuitry issues, like the memory controller, DRAM itself, DDR, etc. Those are all valid, but there's one more that's being neglected - wires, drivers, and receivers. By simply putting the DRAM somewhere away from the CPU/Northbridge, up on a DIMM socket, you take a big hit in latency. Even getting Zero-access DRAM wouldn't speed things up that much, because of the physical-related delays.

          Oh, I agree with your abstraction comment.

          Putting faster things into an FBDIMM just won't do that much, because the speed is physically in the same spot. I did an extensive study of this back prior to 1990 and found these results, and the consolidation of L2 and even Northbridge onto the CPU shows that it's still valid, today. Main memory is going to be slow. Main memory is always going to be slow, because that's a side effect of being "big". Main memory is always going to be "big" as long as the appetite for bits exceeds what can fit onto one chip. Learn to live with it.

          Incidentally DRAM latency grows beyond minimum the moment you multiplex row and column addresses. There is a Trcd(max) spec where access is purely row-limited, but in practice that's just about impossible - access is almost always limited by Column access. Trade speed for pins.

          Beyond that, even SDR traded off latency for bandwidth, compared to EDO. (I've designed both.) I don't think DDR is that bad a deal, compared with SDR, though I haven't actually done a DDR design, myself. At the very least, DDR offers the half-cycle latency options, and the DDR designs have been architected to scale far higher in frequency than SDR ever was.
  • SRAM is much faster, closer to the core of the CPU, and plentiful (if the chip manufacturers wanted it to be).

    Who needs a gig of RAM when you can have a gig of cache?

    If they need swap space, they can always write back out directly to a disk-based swap file.
    • ...SRAM is much more expensive to produce? It also takes more power and generates more heat.

      That and the benefits of cache go DOWN as the size of the cache goes up. Past a MB or two the benefits would be lowered. Also as the # of address lines goes up the access gets slower. And finally a bigger bottleneck is that "external memory" is external.

      So unless you want to pay for a cpu with a GB of onboard "memory" in the form of SRAM.... the benefits won't be that high.

      Tom
    • by be-fan (61476) on Tuesday January 25 2005, @10:54PM (#11476846)
      Because SRAM takes up 6 transistors per bit, while DRAM takes up 1 transistor per bit. The biggest mainstream CPUs run about 150m transistors, and that's only enough (if everything were cache) for about 3MB.
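
The arithmetic in the comment above can be checked with a short sketch. It uses only the two figures the comment gives (6 transistors per SRAM bit, a budget of about 150 million transistors) and ignores tags, decoders and every other overhead, so it is an upper bound rather than a real die estimate. The function names are made up for illustration.

```python
TRANSISTORS_PER_SRAM_BIT = 6        # 6T cell, as stated in the comment
TRANSISTOR_BUDGET = 150_000_000     # "about 150m transistors"


def sram_capacity_bytes(transistors: int,
                        per_bit: int = TRANSISTORS_PER_SRAM_BIT) -> int:
    """Bytes of SRAM that fit if every transistor were a cache cell."""
    bits = transistors // per_bit
    return bits // 8


def transistors_for_bytes(n_bytes: int,
                          per_bit: int = TRANSISTORS_PER_SRAM_BIT) -> int:
    """Transistors needed to hold n_bytes of SRAM, cells only."""
    return n_bytes * 8 * per_bit


capacity = sram_capacity_bytes(TRANSISTOR_BUDGET)
print(f"{capacity / 2**20:.2f} MiB from {TRANSISTOR_BUDGET:,} transistors")
print(f"{transistors_for_bytes(2**30):,} transistors for 1 GiB of SRAM")
```

The second figure is the reason "a gig of cache" is not on offer: it would take hundreds of times the whole transistor budget quoted above.
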
      • by ottffssent (18387) on Wednesday January 26 2005, @12:10AM (#11477292)
        On the other hand, I can buy quality 1GB DIMMs for $250. Divide by 4 (rough guess. SRAM at 6T should be 6x the price, but DIMMs have caps too. 4x the manufacturing costs seems reasonable, assuming the infrastructure were in place), and you've got 256M SRAM modules for $250. Obviously that's a bit on the spendy side for large capacity RAM, but clearly there's a market for faster DIMMs. Unfortunately, DRAM access time, at about 5ns, isn't the major component of memory latency, which even on the best systems runs 10x that. The market won't bear 4x the price for a 10% increase in speed.

        This means that for SRAM to be useful, it has to be paired with a lower-latency interconnect. Some apps would benefit tremendously from 128M of what would amount to an L3 cache, even to the point that the $400 or so extra it would cost might be worth it. It's clear however that the market doesn't consider that a worthwhile expenditure.

        Although newer system architectures such as AMD's Opteron platform are moving to more closely-attached RAM, the engineering and manufacturing challenges involved in attaching memory as tightly as it is to a GPU have so far proven more expensive than the payoffs warrant. With improvements in manufacturing and interconnect technology, I'm sure we'll see ever-tighter CPU-memory integration. I doubt however the technology will move to SRAM or an SRAM-equivalent simply because the performance/heat trade-off isn't favorable. Saving a few ns of latency on the memory chips is peanuts compared to the 10s of ns of latency in the connection to the CPU, which is probably a much more tractable problem.
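
The "10% increase in speed" estimate in the comment above follows from its own two numbers: roughly 5 ns of DRAM access time inside a total latency of about 10x that. A minimal sketch, assuming the best case where SRAM access time is zero (an idealization, not a measured figure); `latency_saving` is a hypothetical helper.

```python
def latency_saving(total_ns: float, dram_access_ns: float,
                   sram_access_ns: float = 0.0) -> float:
    """Fraction of total memory latency removed by swapping DRAM for SRAM.

    Only the on-chip access component changes; the interconnect and
    controller delay (total_ns - dram_access_ns) stays where it was.
    """
    new_total = total_ns - dram_access_ns + sram_access_ns
    return 1.0 - new_total / total_ns


DRAM_ACCESS_NS = 5.0                   # "about 5ns" from the comment
TOTAL_LATENCY_NS = 10 * DRAM_ACCESS_NS  # "runs 10x that"

saving = latency_saving(TOTAL_LATENCY_NS, DRAM_ACCESS_NS)
print(f"best-case latency reduction: {saving:.0%}")
```

Even with infinitely fast cells the saving is capped by the DRAM share of the total, which is why the comment points at the interconnect instead.
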
    • by Stevyn (691306) on Tuesday January 25 2005, @10:59PM (#11476887)
      Interesting?

      This is like saying why paint your walls with off-white stuff when you can coat them in a layer of gold that resists tarnish?

      Well, for one thing, it's greatly more expensive.
  • Never mind (Score:5, Interesting)

    by LittleLebowskiUrbanA (619114) on Tuesday January 25 2005, @10:45PM (#11476767) Homepage Journal
    if they plan on charging exorbitant prices for their memory again. I inherited a network full of fairly fast (2ghz) Dell boxes using RAMBUS. Sure is fun spending about $300 for a 512 upgrade. Of course you can only install this crap in pairs so there goes your slots.... Junk.. Rather buy a cheap new box than a memory upgrade using this overpriced crap.
  • Good marketing sense (Score:5, Interesting)

    by gbulmash (688770) * <semi_famousNO@SPAMyahoo.com> on Tuesday January 25 2005, @10:46PM (#11476778) Homepage Journal
    Smart plan not to try to make it main RAM. By going after multimedia applications like HDTV, video games, etc. they're targeting a market historically willing to pay a premium to get the best performance. I'll be really interested to see the graphic cards based on it and how they compare with the alternatives.
  • by tibike77 (611880) <tibikegamez.yahoo@com> on Tuesday January 25 2005, @10:47PM (#11476790) Journal
    Well, looks like they haven't learned much from their old mistakes, but are trying to avoid the consequences... smart move targeting heavy bandwidth apps for now.

    In the long run, if they can't significantly drop manufacture prices to (let's say) 150% or even 200% of "regular" (by that date) RAM, the boost in speed a computer with "XDR DRAM" will get compared to (again, let's say) "PC800 RDRAM" will be not significant... and I'll bet (regular) people would rather choose 8 GB of "PC800 RDRAM" over 2 GB of "XDR DRAM" any time of the day.

    Bottom line: they're either stuck with "speciality hardware" (like graphic cards or high-end servers) or they have to drop (manufacture) prices rapidly if they want to keep selling.
  • latency? (Score:5, Insightful)

    by tomstdenis (446163) <tomstdenisNO@SPAMgmail.com> on Tuesday January 25 2005, @10:56PM (#11476866) Homepage
    8GB/sec is good but not if the latency is higher than DDR.

    People seem to forget that the "Random" part of RAM is kinda crucial.

    Tom
    • Re:latency? (Score:5, Informative)

      by be-fan (61476) on Tuesday January 25 2005, @11:15PM (#11476984)
      Not necessarily. It depends on the application. In "streaming" applications (hint: 3D rendering like on a graphics card!) the latency doesn't matter nearly as much as bandwidth.
  • by onyxruby (118189) <(ten.tsacmoc) (ta) (yburxyno)> on Tuesday January 25 2005, @10:57PM (#11476873) Homepage
    Rambus seems to forget their attempt to shanghai the entire memory business through fraud a few years ago. Perhaps they should be reminded that the IT community has not. They should sell their IP and dissolve themselves to avoid losing their stockholders any more money.

    I have adamantly refused to purchase any system that would use their memory for years, and more to the point have made that decision for others that depend on me making that decision. That's a lot of computers over the years we're talking about. I am also far from alone.

          • submarine patents (Score:4, Informative)

            by Dink Paisy (823325) on Wednesday January 26 2005, @01:49AM (#11477745) Homepage
            According to the judge in the anti-Rambus case, Rambus did disclose their patents, and their intent to charge for them. He went so far as to say that he would have charged the manufacturers for conspiracy to put Rambus out of business in order to obtain their IP, except that he believed that that was outside the jurisdiction of the case he was trying. That finding is likely making it much easier for Rambus to make good on their patent claims.

            It's a tough act for Rambus to carry out; on the one hand, they have to deal with a small group of manufacturers who have (reportedly) been trying to defraud them and put them out of business, on the other hand, they have to rely on that same small group of manufacturers for all of their future revenue, so aggravating them too much is probably also a bad idea.

            Of course, it's also possible that the judge was Just Plain Wrong, and Rambus was just trying to get submarine patents in place while they were a member of JEDEC. I don't have the expertise to make that judgement.

  • Time Will Tell? (Score:5, Informative)

    by cacepi (100373) on Tuesday January 25 2005, @11:36PM (#11477098)
    Time will tell if Rambus has learned from the mistakes it made with RDRAM a few years ago.

    Well, Rambus has expanded their latest lawsuit blitz to include DDR2 patent claims [infoworld.com], so do you think they've learned?
  • by digitalgimpus (468277) on Tuesday January 25 2005, @11:42PM (#11477135) Homepage
    1. Fast RAM is still expensive.

    2. RAM changes too quickly. I buy RAM for one computer, it's only for that computer. No portability.

    I get a hard drive, I can put that in my new system. I get a new mouse, can use that on my new system. Display? Yep. Graphics card? Most likely.

    RAM? Not likely.

    IMHO they need to standardize RAM like AGP or PCI-X. That way users feel more comfortable investing in it... you can upgrade and keep your RAM.
  • TFA has short memory (Score:4, Informative)

    by arekusu (159916) on Wednesday January 26 2005, @01:18AM (#11477609) Homepage
    "The introduction of XDR however is reminiscent of RDRAM around 2000/2001. The technology provided significantly more speed than DDR and was promoted by industry heavyweights such as Samsung and Intel."

    Actually, RDRAM was introduced around 1995, and was used by industry heavyweights such as SGI and Nintendo.
    • That would be an unusual special case. First off, most (non realtime) 3D rendering isn't terribly bandwidth or latency sensitive. Assuming the CPU is fast enough that it isn't the main bottleneck, such apps will tend to be more sensitive to latency than to bandwidth. When tracing a ray, for example, one may need to access data from all over memory to do hit-testing, but not need very much information in total. So, the relatively poor latency characteristics of RDRAM don't really suggest a keen fantasticness for 3D rendering. And, considering that current single channel DDR400 has as much bandwidth as dual channel RDRAM did... Well, I'm just surprised that your app would have such a benefit. I'd suspect that there were other differences that caused such a difference in your benchmarks. Do you have any more specific information, such as what app you use, what sort of scene it was, and what the test systems were?

      If you were dealing with slightly different steppings of the same CPU (I assume a P4?) it would be possible that you had two CPUs of the same clock speed, but the newer stepping was less efficient per clock. The P4s have been tweaked over time to be less and less efficient per clock, in order to facilitate higher clock speeds. RDRAM was popular with the very first generation of P4s, so it'd be logical that the benchmark you saw may have been a newer core. That shouldn't explain a 20% speed difference, but it's an example of a small thing that may have contributed to making the memory system appear to be the determinant item in performance.
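
The bandwidth comparison in the comment above (single channel DDR400 versus dual channel RDRAM) can be sketched from standard peak figures. The comment does not say which RDRAM grade it means; PC800 with a 16-bit channel is assumed here. These are theoretical peaks, not sustained throughput, and the helper name is made up.

```python
def peak_bandwidth_mb_s(megatransfers_per_s: int, bus_width_bits: int,
                        channels: int = 1) -> int:
    """Theoretical peak bandwidth in MB/s: transfers x bytes per transfer."""
    bytes_per_transfer = bus_width_bits // 8
    return megatransfers_per_s * bytes_per_transfer * channels


# DDR400: 400 MT/s on a 64-bit channel.
ddr400_single = peak_bandwidth_mb_s(400, 64, channels=1)

# Assumed grade: PC800 RDRAM, 800 MT/s on a 16-bit channel, two channels.
rdram_pc800_dual = peak_bandwidth_mb_s(800, 16, channels=2)

print("DDR400, single channel:", ddr400_single, "MB/s")
print("PC800 RDRAM, dual channel:", rdram_pc800_dual, "MB/s")
```

Under that assumption the two configurations have the same peak, so raw bandwidth alone would not account for a large gap between the systems.
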
      • by captaineo (87164) on Wednesday January 26 2005, @12:39AM (#11477454)
        The test case was intensive ray tracing with Pixar's RenderMan on two systems:

        3.06 GHz Pentium 4, 512KB cache, 533MHz FSB, RDRAM
        3.00 GHz Pentium 4, 1MB cache, 800MHz FSB, DDR400 RAM

        The DDR system is only 86% as fast as the RDRAM system (the RDRAM system is 16% faster). This is despite the DDR system having been purchased almost two years later, and having more cache!

        The DDR system does pull ahead for compositing tasks (by quite a bit - in some cases it's twice as fast). I assume this is due to the larger cache.

        But ray tracing takes about 90% of my total render times, so it's far more important to optimize. I am disappointed that I can't buy hardware today with the same RAM performance as I got two years ago.
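
The two percentages quoted above ("86% as fast" and "16% faster") describe the same ratio from opposite directions. A small sketch of the conversion, using only the 86% figure from the comment; `relative_speedup` is a hypothetical helper.

```python
def relative_speedup(slower_as_fraction_of_faster: float) -> float:
    """How much faster the quick system is, given the slow one's relative speed.

    If the slow system runs at fraction f of the fast one, the fast one
    runs at 1/f of the slow one, i.e. (1/f - 1) faster.
    """
    return 1.0 / slower_as_fraction_of_faster - 1.0


ddr_relative_speed = 0.86  # DDR system is "only 86% as fast"
print(f"RDRAM system is {relative_speedup(ddr_relative_speed):.1%} faster")
```

Note the asymmetry: a 14% deficit one way is a roughly 16% advantage the other way, which is why the two numbers in the comment do not add up to 100.
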