Hardware

Elevation Plays a Role In Memory Error Rates 190

alphadogg writes "With memory, as with real estate, location matters. A group of researchers from AMD and the Department of Energy's Los Alamos National Laboratory have found that the altitude at which SRAM resides can influence how many random errors the memory produces. In a field study of two high-performance computers, the researchers found that L2 and L3 caches had more transient errors on the supercomputer located at a higher altitude, compared with the one closer to sea level. They attributed the disparity largely to lower air pressure and higher cosmic ray-induced neutron strikes. Strangely, higher elevation even led to more errors within a rack of servers, the researchers found. Their tests showed that memory modules on the top of a server rack had 20 percent more transient errors than those closer to the bottom of the rack. However, it's not clear what causes this smaller-scale effect."
This discussion has been archived. No new comments can be posted.

  • Heat related? (Score:5, Insightful)

    by Anonymous Coward on Friday November 22, 2013 @12:00PM (#45491577)

    Top of the rack tends to get toasty, but is this too simple?

  • by ledow ( 319597 ) on Friday November 22, 2013 @12:34PM (#45492003) Homepage

    On Mount Everest, time slows by 0.00261261 seconds (2.6ms) compared to sea level.

    Every foot higher you go is 90 billionths of a second difference, if you want to check the maths for me. The problem is, we're not talking about a sea-level / Mount Everest communication here. The RAM chips are about a foot long at absolute maximum.
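    For anyone who does want to check the maths, here is a rough sketch. It assumes the usual weak-field approximation, in which two clocks separated by a height h differ in rate by a fraction of about g*h/c^2. The comment does not say over what period its figures accumulate, so the one-year baseline, the constant names, and the function names below are all assumptions made for illustration. The Everest height of 29,029 ft is simply what the comment's two figures imply (0.00261261 / 90e-9).

```python
# Weak-field approximation (an assumption, not from the thread):
# fractional clock-rate difference between two heights ~ g * h / c^2.

G_ACCEL = 9.80665                      # standard gravity, m/s^2
C = 299_792_458.0                      # speed of light, m/s
FOOT = 0.3048                          # metres per foot
EVEREST_FT = 29_029                    # height implied by the comment's own figures
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, chosen as an illustrative baseline


def fractional_shift(height_m: float) -> float:
    """Fractional rate difference between clocks separated by height_m."""
    return G_ACCEL * height_m / C**2


def offset_seconds(height_m: float, elapsed_s: float) -> float:
    """Accumulated clock offset after elapsed_s seconds at the given separation."""
    return fractional_shift(height_m) * elapsed_s


# Offset accumulated over one year for a one-foot separation.
per_foot_per_year = offset_seconds(FOOT, SECONDS_PER_YEAR)

# Offset accumulated over one year between sea level and the summit of Everest.
everest_per_year = offset_seconds(EVEREST_FT * FOOT, SECONDS_PER_YEAR)

# How many years it takes a one-foot separation to accumulate 90 billionths of a second.
years_for_90ns = 90e-9 / per_foot_per_year

print(f"per foot, per year:  {per_foot_per_year:.3e} s")
print(f"Everest, per year:   {everest_per_year:.3e} s")
print(f"years to reach 90ns: {years_for_90ns:.1f}")
```

    Under these assumptions the per-foot offset is about a nanosecond per year, so the 90-billionths figure only works out if it is accumulated over something like a human lifetime. In this approximation the higher clock runs slightly faster, not slower. Either way, across a foot-long memory module the effect is vanishingly small.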

    And these sorts of effects then suddenly skitter into insignificance compared to solar radiation, different pressures, different air make-ups, heat, etc.

    The fact is, we know that this effect exists. We know that time-slowing exists (GPS wouldn't work if we didn't compensate for such things). We know that solar radiation exists. But this single statistic barely bothers to eliminate memory manufacturer, operating voltage, or ambient temperature as a cause rather than these exotic causes.

    Chances are, they might just have had a batch of dodgy RAM chips from a single manufacturer more than ANYTHING else combined.

    And, even then, you'd need thousands of test sites / machines to even hint at the cause. But why bother? We know there would be an effect, we also know it wouldn't be this large or obvious and that, chances are, there's a much simpler explanation. The whole "top of the rack fails more often" hints at what complete and utter bullshit this is. That would be an effect we'd notice at sea level, and most likely things like ventilation and heating have orders of magnitude more to do with it.

  • Re:Heat related? (Score:3, Insightful)

    by Anonymous Coward on Friday November 22, 2013 @02:08PM (#45493001)

    Also, stack powered-off servers above an active one at the bottom of the rack to see whether it's shielding.
