Hardware Technology

The Impact of Memory Latency Explored 162

Posted by CmdrTaco
from the stuff-to-read dept.
EconolineCrush writes "Memory module manufacturers have been pushing high-end DIMMs for a while now, complete with fancy heat spreaders and claims of better performance through lower memory latencies. Lowering memory latencies is a good thing, of course, but low-latency modules typically cost twice as much as standard DIMMs. The Tech Report has explored the performance benefits of low-latency memory modules, and the results are enlightening. They could even save you some money."

  • The link seems to crash my Firefox...

    A bug to be reported, or what is happening?
  • by 0110011001110101 (881374) on Wednesday November 02, 2005 @12:04PM (#13932640) Journal
    FTFA - Lowering memory latencies is a good thing, of course, but low-latency modules typically cost twice as much as standard DIMMs.

    I'd have to say this is right on when applied to picking a woman to spend your life with... low-latency memory is a BAD BAD thing, and VERY expensive. My next time around, I'm going with the "CHEAPER", high-latency model that can't immediately recall everything I've ever said while arguing her point... Roses and jewelry can cost you over the long run, friends...

  • by phpm0nkey (768038) * on Wednesday November 02, 2005 @12:09PM (#13932699) Homepage
    I have no doubt that hardcore PC gamers will shell out the cash for these, regardless of the cost/performance ratio. Once you start paying $500+ for a graphics card, all rational decision making skills are lost.
    • A friend of a friend had this obsession with having the bleeding-edge gaming rig and would always upgrade his video card. I think his upgrade cycle compared to mine was something along the lines of 3:1. I had a TNT2 Ultra (hand-me-down), then grabbed a GeForce3 Ti200 (second-hand from a cousin for $100CDN), and just recently I bought a GeForce 6600GT when the 7000 series had just come out. I'm sure this guy has one of the top-of-the-line ATi cards or something in the $400USD+ range.
    • by timeOday (582209) on Wednesday November 02, 2005 @12:50PM (#13933103)
      I wouldn't buy a $500 card either but, sheesh, at least they're faster than the cheap ones. This low-latency memory is twice the price for a ~3% boost... I think not.
    • by Iriel (810009) on Wednesday November 02, 2005 @12:52PM (#13933128) Homepage
      This isn't all that funny. I mean, it does make me laugh, but it's far more true than humorous. I constantly get berated by the 'hardcore' gamers for not having the fastest CPU/RAM/GPU/HD when I can still run a lot of games just as well as anyone else. The problem with hardcore gaming equipment is that it has become something like MTV selling you 'cool'.

      Guess what? That wicked dual-core CPU actually runs games slower than its single core cousin. That brand-spankin' new video card that cost you $400 (or more)? I pay that much once every several years on my video card. The difference is that I don't care if I squeeze out my maximum frames per second because most people can't even detect the difference if the game didn't have an option to show the number in the corner of the screen like some veritable rating of their manhood (sorry for my gender bias on that). And that super ultra OHMYFUCKINGGODITMAKESMYEXPLODEITSSOFAST low-latency RAM is giving you a performance boost of 2% over what I've got now.

      I find it educational to read these reports so I can make educated purchasing choices. For that, I'm quite grateful. However, I find it kind of sad that the parent post is unsettlingly accurate in that the 'hardcore pc gamers' will shove this to the side for the ATI SXL 10G Super Elite XTRME Pro card next week. Witness what happens when PC gaming meets MTV-esque marketing.
      • > when I can still run a lot of games just as well as anyone else.

        is that right ? What FPS are you getting in Quake 4 at 1280x1024x32 with tri-linear and models/textures set to HIGH ?

        • by Iriel (810009) on Wednesday November 02, 2005 @01:12PM (#13933307) Homepage
          You seemed to have missed the point that 'a lot of games' does not mean 'all games', 'any games', or any derivative thereof. And honestly, the point of my post is that I'm willing to sacrifice some detail and put my settings at 75-80% instead of maxed-out if it'll save me from spending close to a thousand dollars a year in upgrades.
        • It's a bogus question - no real gamer that isn't a complete waste of oxygen and silicon plays at 1280x1024.
          Now if he had said 1600x1200, then maybe we could have taken the question seriously - but I assure you guys that he is just yanking your chains (and you fell for it!)

          That said, I second the guy that requested a roundup where they put 512M of the uber1337 memory with chrome heat-spreaders and blinkenlichten in one machine, and 2G of Crucial/Kingston budget line memory in the other - and do a full suite
      • "I constantly get berated by the 'hardcore' gamers for not having the fastest CPU/RAM/GPU/HD when I can still run a lot of games just as well as anyone else."

        Why are you associating with such people?

        "However, I find it kind of sad that the parent post is unsettlingly accurate in that the 'hardcore pc gamers' will shove this to the side for the ATI SXL 10G Super Elite XTRME Pro card next week. Witness what happens when PC gaming meets MTV-esque marketing."

        Sad? Who cares? Let people spend their money the

      • Guess what? That wicked dual-core CPU actually runs games slower than its single core cousin.

        Is this actually a true statement? I can't do any current testing since I don't have a reasonable 3D card in my machine, but I remember testing Quake 3 on my old dual Celeron machine with a TNT2 card. top showed Quake was using 95% or more of one CPU, and the X server was using 30% or more of the other CPU.

        I don't expect the numbers to be the same today, but shouldn't there be at least some slight increase i

    • by vmcto (833771) * on Wednesday November 02, 2005 @01:43PM (#13933594) Homepage Journal
      Hey don't knock gamers that spend tons of money on computer gear.

      It's thanks to them that the rest of us can get normal gear at such reasonable prices...
  • by Anonymous Coward on Wednesday November 02, 2005 @12:10PM (#13932719)
    http://anandtech.com/memory/showdoc.aspx?i=2392 [anandtech.com]

    You'll basically find that the performance of value memory is very on par with the high end stuff. You basically pay for the ability to overclock on a more consistent basis.
  • by Ed Almos (584864)
    I'm running Firefox 1.0.7 under Ubuntu. When I click on the link firefox exits, am I the only one having this problem?

    Ed Almos
  • Insightful article (Score:3, Informative)

    by mindaktiviti (630001) on Wednesday November 02, 2005 @12:15PM (#13932772)
    Although I didn't read all the text (about 50% of it), the benchmarks were what I was interested in, as well as the conclusion. So to sum it up:

    Memory with 2-2-2-5 timings at 400MHz t1 is the fastest, but it costs twice as much and the performance gains are almost nonexistent except in lower-resolution games (e.g. at 800x600 you may see an increase of 20 fps, which I think is a lot!), and of course the cost of the RAM in this case would not be justified, because putting that extra money into a better video card would be the better thing to do.

    Only if you're an overclocker is this worth it, at least from their benchmarking and perspective, which I'll accept.

    Oh yes, and that website also crashed my Firefox.
  • By the same logic as those cheesy insurance commercials, where you can afford the policy if you can afford a cup of coffee a day, if you can afford to spend 5 minutes reading slashdot each day, then you can afford not to drop twice the amount of money on ram for the 2% time savings it offers in most programs.
  • by Cr0w T. Trollbot (848674) on Wednesday November 02, 2005 @12:25PM (#13932865)
    After all, that's the main feature of Crucial Ballistix Tracer Memory [crucial.com]. I'm sure those LEDS must be worth at least 10 fps in Doom 3...

    Crow T. Trollbot

  • Ask a builder (Score:3, Insightful)

    by Dragoon412 (648209) on Wednesday November 02, 2005 @12:30PM (#13932901)
    Seriously, this has been known very well amongst the gaming PC builder crowd for a long time. Most of them, anyways; there's unfortunately still that level at which people know enough to put the PC together, but don't know enough to tell you what any of the numbers mean.

    The difference between, say, Corsair Value Select memory, and Corsair 1337 Ultra X2000 - the memory equipped with LCDs, heat spreaders, and a spoiler with metal-flake yellow paint that add at least 10 horsepower - is going to be absolutely unnoticeable in the real world. Even benchmark scores will show little to no improvement.

    Ricer RAM - you know, the PC equivalent of this crap [hsubaru.com] - is for overclocking. If you're not planning on overclocking it, you're paying too damned much.
    • Compared to the people who are willing to cool their CPUs with liquid nitrogen for five minutes and risk cracking the core, hyper-accelerated chip creep, shorting from rapid condensation, and a number of other potential issues, just to turn out a few extra 3dmarks (thankfully, Futuremark's killed off that crowd with ridiculous releases time and time again) or gain a few more MHz than the next hardcore overclocker down the line...paying near double the price for a 2-5% increase in performance (on a good day)
    • Re:Ask a builder (Score:4, Insightful)

      by theantipop (803016) on Wednesday November 02, 2005 @12:49PM (#13933102)
      What most people don't realize is that the only way to improve your performance at the top end of the performance spectrum is through a combination of small tweaks such as this. Sure spending twice the money for 103% of the performance sounds dumb, but when you combine that with small tweaks to your processor, graphics card and a 10,000rpm hard drive they add up.

      These products are not for people who want to achieve a useable level of performance and as such are not marketed at those crowds. They are for people who have already fast equipment but want more. I won't say this is a good or bad thing as it is simply a hobby for most of these people. Just like import tuners: they may drive funny-looking cars, but it's their choice of hobby.

      • Want in on a few secrets?
        If you have to use a stopwatch to tell which system is faster, they are the same speed.
        If you calculate that one system is 7% faster than another system, they are the same speed.
        If you have one system getting 127 frames per second, and another system getting 136 fps - they are the same speed.
        103% for memory tweaks, 102% for OCing the CPU, 104% for a different tweak all add up to : same speed.
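For what it's worth, here is a quick sketch of how those percentages stack. It assumes (optimistically) that the three tweaks are independent and simply multiply; the labels in the dictionary are illustrative names, not measurements from the article.

```python
from math import prod

# Speedup factors quoted in the comment above (103%, 102%, 104%).
# Assumption: the tweaks are independent, so their factors multiply.
tweaks = {"memory timings": 1.03, "CPU overclock": 1.02, "other tweak": 1.04}


def combined_speedup(factors):
    """Multiply independent speedup factors into one overall factor."""
    return prod(factors)


total = combined_speedup(tweaks.values())
print(f"combined speedup: {total:.4f}x ({(total - 1) * 100:.1f}% faster)")
```

Under that assumption the three tweaks together come to a bit over 9%, which is the same ballpark as the 127 vs 136 fps example.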

        There are two magnificent pieces of equipment that are going to make your computer faster :
        • Tell the NHRA that the two top-fuel dragsters that are 0.001 seconds apart are the same speed. Tell the baseball team that loses the World Series by 1 run that they're equally as good. Look, you may not understand that some people benchmark as a form of recreational competition, but it's reality whether you like it or not.
            • Two dragsters that are 0.001 seconds apart ARE the same speed.
            From run to run if they ran 20 runs and in all 20 runs the same car was 0.001 seconds faster each and every single time, letting the drivers swap cars a few times so each driver got 10 runs in each car, I would be willing to budge a little. But you and I know that that isn't the case. The machines are the same, but one lane had a little more rubber on the ground, or the driver was a little better (or a few lbs lighter, or had on his lucky shoes, or
  • by Anonymous Coward
    But they're doing this on an AMD-64 platform...
  • These tests underestimate the performance impact of latency because they are conducted using software optimized over the years for the high-latency realities of current-day memory architectures. CPU clock speeds have been outstripping RAM clock speeds for about 15 years. Software developers have spent years optimizing their code to mitigate the impacts of latency.

    In the short-run, these tests help a person decide whether to buy low-latency RAM. But they provide little long-term insight into how much fa

    • "Software developers have spent years optimizing their code to mitigate the impacts of latency."

      Really? MS hand-tunes the ASM code generated when they do a build of winword.exe? Maybe that's why OO.o is so slow?

      If I sound sarcastic, I suppose I am. With a few exceptions, almost every coder I've worked with in multiple jobs, has been of the 'throw CPU cycles' at the problem. I can count on one hand those who actually design for a HW architecture, since most of the coders these days are VBScript and Java
    • by Zathrus (232140) on Wednesday November 02, 2005 @01:05PM (#13933256) Homepage
      Sorry, I call BS on your entire post. The difference in latencies here is minuscule -- it's not like we're talking about having the CPU wait 2 clock cycles vs 30 clock cycles. It's closer to 13 vs 25 (not exact, but the magnitude of difference is close). That just doesn't matter that much -- the reality is that if you have a cache miss then you're looking at 20-30 cycles (or, more likely, 40-60 cycles) of stall while you fetch the data from main memory.

      The kind of changes you're talking about require vastly faster memory. Not the kind of latency differences being discussed here at all. Both of these are "high latency" compared to what would be needed for your theoretical redesign of the entire software stack. And even then, you just become utterly and completely screwed if you have to hit virtual memory, possibly more so than you are now because you've re-orchestrated everything around the idea that latency is a non issue.

      Oh, and latency is getting worse, not better, and has been for a long, long time. CPU speeds long ago outstripped the speeds of our fastest memory (well, fastest while still not costing absurd amounts of money...), and the newer memory formats (DDR, DDR2, DDR3, RDRAM, etc) have higher latencies in exchange for greater bandwidth.
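To put rough numbers on the cycle counts being argued about, here is a minimal sketch that converts a CAS latency into nanoseconds and then into CPU cycles. The 200 MHz memory clock corresponds to DDR400; the 2.4 GHz CPU clock is a hypothetical value chosen for illustration, not one taken from the article. It covers CAS latency only, not the full round trip through the memory controller.

```python
def cas_latency_ns(cas_cycles, mem_clock_mhz):
    """Convert a CAS latency in memory-clock cycles to nanoseconds."""
    return cas_cycles * 1000.0 / mem_clock_mhz


def cpu_cycles(latency_ns, cpu_clock_ghz):
    """Number of CPU cycles that elapse during latency_ns."""
    return latency_ns * cpu_clock_ghz


MEM_CLOCK_MHZ = 200.0  # DDR400 uses a 200 MHz clock (assumed module)
CPU_CLOCK_GHZ = 2.4    # hypothetical CPU clock, not from the thread

for cas in (2, 3):
    ns = cas_latency_ns(cas, MEM_CLOCK_MHZ)
    cycles = cpu_cycles(ns, CPU_CLOCK_GHZ)
    print(f"CL{cas}: {ns:.0f} ns, roughly {cycles:.0f} CPU cycles")
```

With these assumed clocks, the gap between CL2 and CL3 is 5 ns, or about a dozen CPU cycles per access that actually reaches DRAM.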
      • The kind of changes you're talking about require vastly faster memory. Not the kind of latency differences being discussed here at all. Both of these are "high latency" compared to what would be needed for your theoretical redesign of the entire software stack.

        Exactly!

        Oh, and latency is getting worse, not better, and has been for a long, long time.

        Very true. My first full-sized computer had an 8 MHz processor and 150 ns RAM in 1985. Now there's more than an 8:1 ratio between CPU and RAM clocks (and th

        • You seem to misunderstand the question this article is addressing. The question is "what performance benefits should I expect from buying low-latency RAM?". The question is not "Should all computers be designed for lower latency?" Me buying lower latency RAM doesn't make anyone design games or the majority of software for lower latency.
    • The software knows nothing about memory latency, the software only knows it needs to move a block of data from point A to point B. That Java/C/C++ Move_Memory function translates at the lowest level to machine code instructions which are implemented in the logic of the silicon. The coder or the compiler may optimize the ORDER of execution of the instructions, or use different instructions (such as BlockMoves) to speed things up, but the basic underlying machine instructions execute the same way every time (
      • In Other words -

        It's like putting 93 octane gas in an 87 octane tuned car. You waste your money and get nothing out of it but maybe a check engine gas light.
      • However, if you have algorithmically intensive software (spending lots of time in the same loops or crunching large amounts of data), it's worthwhile to instrument your code and see how you're doing for cache hits/misses. You might discover that by tweaking the inner-most loops or the size of the blocks you crunch, you can better fit the cache of the target processor.

        Word/Excel isn't going to bother, but a game might be worth stuffing a few versions of tweaked loops in that are selected by a loop invariant, or by f
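As an illustration of the "size of the blocks you crunch" idea, here is a sketch of a blocked (tiled) matrix transpose. It is written in Python only to show the loop structure; the cache benefit would show up in a compiled language such as C, and the default block size of 32 is an arbitrary assumption that you would tune against the target processor's cache.

```python
def transpose_blocked(src, n, block=32):
    """Transpose an n x n matrix stored row-major in a flat list.

    The matrix is visited in block x block tiles so that the source
    and destination lines touched by one tile can stay in cache.
    """
    dst = [0] * (n * n)
    for i0 in range(0, n, block):          # tile row start
        for j0 in range(0, n, block):      # tile column start
            for i in range(i0, min(i0 + block, n)):
                for j in range(j0, min(j0 + block, n)):
                    dst[j * n + i] = src[i * n + j]
    return dst


n = 5
m = list(range(n * n))
t = transpose_blocked(m, n, block=2)
```

The result is identical to a plain transpose; only the order in which memory is visited changes, which is the part a profiler's cache-miss counters would reflect.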
        • If you are doing a lot of FFTs on an x86 processor you probably have either the wrong processor (DSPs are much better at FFTs) or the wrong algorithm. Estimating things (or pre-calculating the common values and interpolating the rest) and using table lookups is much faster even with using the extra registers. Memory alignment on structures USED to be an issue, but not anymore. I did some research about 4-5 yrs ago that showed on a PowerPC processor alignment of structures made no difference, the tiny bit o
    • What does this mean? (Score:2, Interesting)

      by Flying pig (925874)
      All memory has an access time, and the further you get from the CPU the longer it is going to be. CPU registers have the shortest access time, with (nowadays) subnanosecond access. L1 cache comes next, then L2, then external RAM, then HDD, and finally the slow backing store represented nowadays by CD and DVD. This hierarchical memory architecture changes with time mostly in that the caches grow bigger, so the 640K of RAM from DOS days now fits into the cache of each processor in a Pentium D with room to spa
      • by ionpro (34327)
        Intel uses an inclusive cache architecture, so you actually don't get the 640K you were looking for, and even so it'd have to be backed by DRAM (AFAIK, that cache isn't programmer or even OS accessible). AMD uses an exclusive cache, so the L1 and L2 (and any L3) would all be additive in which data they could store.

        JOC, why don't you specify Athlon X2 4400+ or 4800+s? They all have 1MB L2 per core, as well.
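The inclusive vs exclusive distinction can be sketched as simple arithmetic. The cache sizes below are illustrative assumptions, not figures checked against any particular chip, and the model ignores the split between instruction and data caches.

```python
def unique_capacity_kib(l1_kib, l2_kib, exclusive):
    """Distinct data (in KiB) that L1 and L2 can hold between them.

    Inclusive: everything in L1 is duplicated in L2, so L2 is the limit.
    Exclusive: a line lives in only one level, so the sizes add up.
    """
    return l1_kib + l2_kib if exclusive else l2_kib


# Hypothetical per-core sizes, chosen only for illustration.
inclusive_total = unique_capacity_kib(16, 1024, exclusive=False)
exclusive_total = unique_capacity_kib(64, 1024, exclusive=True)
print(f"inclusive: {inclusive_total} KiB, exclusive: {exclusive_total} KiB")
```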
    • RAM not on the CPU chip has been and will continue to be a burden on performance. Low latency RAM will reduce this trend but not reverse it. The physical distance between the CPU and RAM adds to latency due to speed-of-light delays, and it's hard to avoid that.
  • What about cache? (Score:4, Interesting)

    by antifoidulus (807088) on Wednesday November 02, 2005 @12:35PM (#13932947) Homepage Journal
    Improvements in memory speed crawl compared to improvements in CPU speed; however, larger caches can mitigate this problem to a certain extent, so why is it that growth in cache size continues to crawl? The Apple G5 updates FINALLY gave us 1mb l2 cache per core (and of course the industry standard 64k L1 cache per core) and while the Intel/AMD world is slightly better in this regard, it's not by much. So why is it so hard to increase cache size (of course you will need good cache allocation/replacement policies to go with them)? I'm not trolling, I honestly want to know. I realize that the people that design these chips are a lot smarter than I, but so far I haven't really seen a good reason why they don't increase cache size.
    Also, outside of the HPC world, it seems very few programmers optimize their cache usage. Are there any tools(open source or otherwise) that can actually help you locate/fix inefficient uses of cache?
    • Why only a 64k L1 and 1mb L2? Why not a 1mb L1 and 4mb L2?
    • They've moved towards on die cache, and that makes it expensive. A typical CPU uses a hell of a lot of chip area for cache already. Why do you think P4 EE costs so much?
    • >> The Apple G5 updates FINALLY gave us 1mb l2 cache per core (and of course the industry standard 64k L1 cache per core) and while the Intel/AMD world is slightly better in this regard, it's not by much.

      Remind me again, how much L1 cache exactly does a Pentium 4 have? Wasn't it something like 8KB fast cache on the older ones and 16KB at half the speed on the newer ones?
    • Obvious reason - much more expensive (you can't get as good yields at manufacturing fab)
    • Re:What about cache? (Score:3, Informative)

      by harrkev (623093)
      For one simple reason -- die size. Cache eats up a lot of real estate. A 1MB (B as in byte) cache is 8 million bits. If the cache uses DRAM-style cells, that is at least 8 million transistors. If the cache is more like SRAM, then you can count on a lot more. This increases the size of the die, which both decreases the number of chips per wafer and increases the percentage of defective dies.

      So, the bottom line is that cache is the most expensive type of memory in a computer. Some methods have been made
    • Re:What about cache? (Score:3, Informative)

      by timeOday (582209)
      Because you get diminishing returns for more and more cache. At some point it's better to use all those transistors as a second core instead.
    • by Sycraft-fu (314770)
      Cache is SRAM, since SRAM is much faster. Ok, except that SRAM takes 6 transistors per bit to make. So for 1 megabyte of cache, that's 48 million transistors to implement. That's a major budget of silicon. As transistor count goes up, so does die size, heat, cost, failure rate, etc. So putting large caches on just isn't feasible. An 8MB cache would use more transistors than most processors today do in total between core and cache.

      Ok, you say, so move it off the chip. Well the problem is that part of the reas
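The transistor arithmetic above is easy to reproduce. This sketch assumes the 6-transistor SRAM cell mentioned in the comment and counts only the data array; tags, decoders and sense amplifiers are ignored, so real figures would be higher. It uses binary megabytes, which is why 1 MiB comes out at about 50 million rather than 48 million.

```python
BITS_PER_BYTE = 8
SRAM_TRANSISTORS_PER_BIT = 6  # 6T SRAM cell, as stated in the comment above
MIB = 1024 * 1024


def cache_transistors(size_bytes, per_bit=SRAM_TRANSISTORS_PER_BIT):
    """Transistors in the cache data array only (no tags or control logic)."""
    return size_bytes * BITS_PER_BYTE * per_bit


for mib in (1, 4, 8):
    millions = cache_transistors(mib * MIB) / 1e6
    print(f"{mib} MiB cache: about {millions:.0f} million transistors")
```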
    • It's very hard to do fast and large. Fast and small or slow and large are easy. Increasing the size of the cache also increases the delays in the circuits that manage the cache. Signals have to travel longer distances and drive more loads.
    • Price. Well, price and size, but mostly price.

      Cache isn't some magical thing. It's simply RAM. SRAM, usually, which is why it's so fast (don't have to waste power/time refreshing your contents). At the end of the day, it's just some very fast RAM. It sits between your CPU and the rest of your RAM, and uses its increased speed to "trick" the CPU into performing as if your main RAM is much faster than it is.

      In my computer arch course a while back, someone asked why, if cache is so fast, we don't just build co
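The "trick" can be put into the standard average memory access time formula: hit time plus miss rate times miss penalty. The sketch below uses made-up cycle counts and a made-up 2% miss rate, purely as assumptions, to show why shaving the miss penalty a little moves the average only slightly when most accesses hit the cache.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time: hit_time + miss_rate * miss_penalty.

    All times must be in the same unit (CPU cycles here).
    """
    return hit_time + miss_rate * miss_penalty


# Illustrative assumptions only, in CPU cycles.
HIT_TIME = 3
MISS_RATE = 0.02

standard_ram = amat(HIT_TIME, MISS_RATE, miss_penalty=200)
low_latency_ram = amat(HIT_TIME, MISS_RATE, miss_penalty=180)
print(f"standard: {standard_ram:.1f} cycles, low-latency: {low_latency_ram:.1f} cycles")
```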
    • The Apple G5 updates FINALLY gave us 1mb l2 cache per core

      What are you talking about? My G3 running at 450MHz has a 1MB L2 cache, and it has since 1999. Pentium Pros and various workstation/server class chips had multimegabyte caches a decade ago.

      The reason you've seen less cache is that it didn't make sense to have a slow CPU with a 4MB cache that had to dissipate 100+ watts to operate. On-die cache is expensive in terms of heat, die space, and clock speed.

      There's also the marketing factor, Intel would have
  • They could even save you some money
    How so? The article indicates that the benefits are marginal. How can this RAM save money?
  • If people are concerned about the speed of their memory, then having fast DDR SDRAM running on an equally fast FSB is what really makes a difference. This is especially true on P4 Celeron based systems where the L2 cache isn't huge and cache misses are common. While memory latency is important to consider, it isn't critical that your modules have the absolute fastest timings ever. I think that the importance of the other components that connect to your memory like the FSB are underestimated. You can have fa
  • The real issue ... (Score:4, Insightful)

    by TheCrig (3178) on Wednesday November 02, 2005 @12:48PM (#13933083) Homepage
    ... Is not memory performance as such, but system performance. If a 5 percent increase in system performance increases the cost of your system by 10 percent, you have to want it pretty badly or be on the edge of required performance or just be in a schoolyard comparison. But if it's reversed, and a 10 percent increase in system performance can be had for a 5 percent increase in system price, then if you can afford the 5 percent (say $100 for a $2000 system), go for it.
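The same comparison as a performance-per-dollar calculation, using the $2000 base system from the comment. Normalizing the base system's performance to 1.0 is an assumption made for convenience.

```python
def perf_per_dollar(perf, price):
    """Relative performance delivered per dollar spent."""
    return perf / price


BASE_PRICE = 2000  # the $2000 system from the comment above

base = perf_per_dollar(1.00, BASE_PRICE)
pricey = perf_per_dollar(1.05, BASE_PRICE * 1.10)   # +5% speed for +10% price
bargain = perf_per_dollar(1.10, BASE_PRICE * 1.05)  # +10% speed for +5% price

print(f"+5% perf / +10% price: {pricey / base:.3f}x the base value per dollar")
print(f"+10% perf / +5% price: {bargain / base:.3f}x the base value per dollar")
```

A ratio below 1.0 means each dollar buys less performance than it did on the base system, which is the case where you "have to want it pretty badly".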
    • And the real way to increase system performance is to focus on the current bottleneck. Just like optimizing code, you look for the weakest link in the chain and work on that first. I wish someone would put a performance review together where they really put things into perspective. Something like:
      1st best leveraged enhancement: Disk access- once you get something to such-and-such speed, then the next important step is:
      Bus speed. Once you get this to speed x:
      Video card.
      then RAM latency...
      (and of course C
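The "work on the bottleneck first" point is essentially Amdahl's law: the overall gain is limited by the fraction of run time the improved component accounts for. In the sketch below, the 10% share of time spent waiting on memory and the 1.3x improvement are hypothetical numbers picked for illustration, not measurements.

```python
def amdahl_speedup(fraction, component_speedup):
    """Overall speedup when `fraction` of run time is sped up by `component_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / component_speedup)


# Hypothetical: 10% of run time is memory stalls, and faster RAM
# makes that portion 1.3x quicker.
overall = amdahl_speedup(0.10, 1.3)
print(f"overall speedup: {overall:.3f}x")
```

With those assumed inputs the whole system gains only a couple of percent, which is why speeding up a component that is not the bottleneck rarely shows up in benchmarks.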
  • I mean did anyone seriously think that these memory latencies were going to have a great impact on anything that the most common users care about? I mean game performance is barely touched at all, which is another Duh! I think their conclusion is probably right, the people buying these things are the idiots who want to post how they have the ultimate system with great RAM and everything, where they probably only could afford 1 GB of their stuff, my performance and load times are better because I could aff
  • by Orp (6583) on Wednesday November 02, 2005 @02:49PM (#13934233) Homepage
    I do large 3D thunderstorm simulations. With some of the larger simulations I am integrating lots of things, contained in 3D floating point arrays, over 1 billion or more gridpoints (using distributed computing, such as a beowulf cluster made up of dual Xeons or an SGI Altix system). Each scientific calculation requires accessing floating point values stored in these arrays, doing some math, and updating another array.

    Memory latency and memory bandwidth both impact how long it takes my simulations to complete. Let's say it is the difference between a simulation taking a week vs. five days... this is significant to me and how much I can get done. With these heavy duty scientific models and such, you really can see a noticeable benefit with the fancier hardware, and clock speed is certainly not the only factor to consider by a long shot.
  • by frank249 (100528) on Wednesday November 02, 2005 @06:25PM (#13936133)
    Better latent than never.
