US Supercomputer Uses Flash Storage Drives 72
angry tapir writes "The San Diego Supercomputer Center has built a high-performance computer with solid-state drives, which the center says could help solve science problems faster than systems with traditional hard drives. The flash drives will provide faster data throughput, which should help the supercomputer analyze data an 'order of magnitude faster' than hard drive-based supercomputers, according to Allan Snavely, associate director at SDSC. SDSC intends to use the HPC system — called Dash — to develop new cures for diseases and to understand the development of Earth."
Wow.. (Score:3, Funny)
Re: (Score:2)
"... intends to use the HPC system -- called Dash -- to develop new diseases for cures...
There, fixed that for ya.
Re: (Score:2)
Problems to solve with it: (Score:1)
Re: (Score:3, Informative)
You haven't been following that thread closely enough; you will note the new Patriot SSDs have a 10-year warranty, but of course a "supercomputer" wouldn't use those.
Other PCIe-based SSDs I have seen around claim up to a 50-year lifespan.
Damage.Inc here btw
Re: (Score:2, Informative)
Re: (Score:2)
They say they are using Intel SSDs via SATA; Intel's higher-end drives typically have a 2M-hour MTBF, almost double what most HDDs offer these days.
http://www.intel.com/design/flash/nand/extreme/index.htm [intel.com]
Re: (Score:1)
Re: (Score:2)
FusionIO SSDs are rated for up to 48 years of continuous use at 5 TB written/erased per day.
Not sure if Intel rates theirs that high.
Re: (Score:1)
Re: (Score:2)
Re: (Score:3, Informative)
But even if drives start to fail, they'll just replace them, as they do with any other supercomputer setup, so it's more a cost factor than a problem.
Re: (Score:1)
Re: (Score:2, Insightful)
"But that's okay, I'm sure English is your first/only language." That seems to be a really lame attempt to insult native English users. There's no grammatical rules against "problems to solve with it." Even "To problem solve with it" is acceptable because the rule against split infinitives is considered obsolete and old fashioned. English has amazing flexibility. It is the perl of human languages!
Re: (Score:1)
Re: (Score:2)
A 250 GB SSD can have over 2.5 PB (petabytes) written to it before it wears out.
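The arithmetic behind that claim is easy to check; a rough sketch (the 50 GB/day workload figure is an assumption on my part, not from the comment):

```python
# Rough write-endurance arithmetic for a 250 GB SSD rated for 2.5 PB
# of total writes (the figures from the comment above).
capacity_gb = 250
lifetime_writes_gb = 2.5 * 1024 * 1024   # 2.5 PB in GB (binary units)

# Equivalent number of full-drive writes over the drive's life:
full_drive_writes = lifetime_writes_gb / capacity_gb
print(round(full_drive_writes))          # ~10486

# At an assumed 50 GB written per day, years until the budget runs out:
years = lifetime_writes_gb / 50 / 365
print(round(years))                      # ~144
```

In other words, even a heavy desktop workload takes decades to exhaust that write budget; sustained server workloads are another matter.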
Re: (Score:1)
10 years from now, there are going to be people with 7 year old flash drives fretting about the fact that they wear out.
BS (Score:2)
FLASH is about read access time. Throughput can be gotten far cheaper with conventional drives and RAID1.
The rest is the usual nonsense for the press.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2, Informative)
FLASH is about read access time. Throughput can be gotten far cheaper with conventional drives and RAID1.
You mean RAID0 [wikipedia.org]. Note that you could do RAID0 with Flash drives and have both.
Re: (Score:2)
You mean throughput on sequential reads. What makes you assume that is the type of throughput they are measuring?
Re: (Score:2, Informative)
No, I'm Batman!
Re: (Score:2)
Yeah, I am burning karma, but it's there to be burnt.
Why did the OP just wholesale copy one of my posts from earlier in the thread? Including the comment at the end which states my user name from the forum my original post was in reference to.
Re: (Score:2)
No you're not!
Everybody knows that Dr. Sheldon Cooper is Batman!
Cost savings? (Score:5, Insightful)
"Hard drives are still the most cost-effective way of hanging on to data," Handy said. But for scientific research and financial services, the results are driven by speed, which makes SSDs makes worth the investment.
Why is the super computer ever being turned off? Why not just add more RAM?
SSD is cheaper than DDR (~$3/GB vs ~$8/GB), but also ~100 times slower.
Exactly. Just use RAM. (Score:2)
RAM + uninterruptible power supply, and you're done. The only thing you need storage for is loading apps and data to begin with.
Re: (Score:3, Interesting)
If the data sets they are working on are that big, writing out interim results and reading them back in is going to really hurt.
They're a supercomputing centre, so yes, the data sets are that big. And the users like taking copies of them and moving them around; there are even reasons for doing this that aren't linked to recovering from a crash (such as being able to rerun a simulation from part way through, rather than having to wade through the whole lot from the beginning).
Re: (Score:2, Interesting)
It could be a technical issue (i.e., they are targeting simplicity). Hooking up 1 TB of SSDs involves four SATA cables; hooking up an additional terabyte of RAM involves finding special widgets that hold as much RAM as possible, plus the parts to make them talk to the nodes.
Re: (Score:1)
If the "special I/O nodes" are connected using a 10Gb network, then the 4x3Gb SATA drives would let them saturate the network bandwidth.
Re: (Score:1)
That's 768 GB of RAM across all of the nodes whereas the SSDs are 1 TB per node.
Re: (Score:2)
Space, heat and electricity will all be factors in building pure RAM drives.
Re:Cost savings? (Score:5, Informative)
Space requirements.
The biggest DDR3 SO-DIMM modules I could find were 4 GB. They are 30 mm x 66.7 mm [valueram.com], and the standard allows for a minimum thickness of 1.35 mm [jedec.org].
You now have an absolute minimum size of 2,701.35 mm^3 (1.35 mm x 30 mm x 66.7 mm), or 675.3375 mm^3/GB. This is a very, very idealized minimum, by the way.
An Intel 2½" drive is 49,266.28 mm^3 (100.4 mm x 7 mm x 70.1 mm) [intel.com] and currently maxes out at 160 GB leaving you with 307.91425 mm^3/GB. That's 46% of the space that would be needed for DDR3 RAM. Add to that that Intel's 2nd generation SSDs are only using one side of the PCB, and you can expect the storage space requirements to be halved.
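The arithmetic above checks out; a quick reproduction:

```python
# Reproducing the volume-per-GB figures from the comment.
dimm_mm3 = 1.35 * 30 * 66.7              # one 4 GB DDR3 SO-DIMM
dimm_mm3_per_gb = dimm_mm3 / 4
print(round(dimm_mm3_per_gb, 4))         # 675.3375

ssd_mm3 = 100.4 * 7 * 70.1               # Intel 2.5" drive, 160 GB
ssd_mm3_per_gb = ssd_mm3 / 160
print(round(ssd_mm3_per_gb, 5))          # 307.91425

print(f"{ssd_mm3_per_gb / dimm_mm3_per_gb:.0%}")  # 46%
```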
Then there's the fact that the SSDs are directly replaceable. In other words, they don't need to rebuild the computer, buy super-special boards or anything like that - you can replace a hard drive with an SSD without having to spec out a new supercomputer.
In the end, if they wanted to replace the system with something that could provide 1 TB of RAM per node, they would need a VERY expensive system. Even with 8 GB modules, you would need to somehow fit 128 of them onto a board. I'd really love to see the mother- and daughter-boards involved with that.
Ultimately it doesn't just come down to the raw price or speed of the storage device (RAM vs SSD vs HDD vs tape), but also all the other factors involved, such as space, power, heat and the stuff you need to use it (i.e. a brand new supercomputer that can support 1 TB RAM/node vs the 48 GB at the moment).
Or to use a really bad car analogy, some company has found out that using a BMW M5 Touring Estate [bmw.co.uk] gives them faster deliveries than using a Ford Transit. Now you're suggesting that they should be delivering stuff via aeroplanes. Yes, it's much faster, but you need a brand new transportation structure built up around this, which you also need to factor into your cost assessments.
Re: (Score:1)
You have a couple of facts wrong.
They aren't maxing out the RAM slots on each node and they seem to be relying on these IO nodes to increase performance. I'd like to know how/why and this article doesn't explain anything.
DRAM/DDR drives aren't anything new; hooking up 1 TB of DDR would be expensive.
Re: (Score:2)
Long-running simulations can run completely awry if one of the DIMMs dies part-way in.
Being able to record snapshots for later reuse or verification helps ensure the correctness of the simulation.
Re: (Score:1)
Long-running simulations can run completely awry if one of the DIMMs dies part-way in.
Being able to record snapshots for later reuse or verification helps ensure the correctness of the simulation.
Sure, but mechanical disks make more sense for storing snapshots. They have 768GB of RAM and 4TB of SSDs in their cluster.
Re: (Score:2)
Sure, but mechanical disks make more sense for storing snapshots. They have 768GB of RAM and 4TB of SSDs in their cluster.
Perhaps. But 768GB would take a really long time to write to disk. Maybe they don't want to lose all that time? The fastest SSD setups I've seen have multi-GB/sec throughput.
If you can read and write 2GB per second, you can use flash as a sort of "slow RAM" - although I'm not saying they're doing that in this case.
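To put numbers on that: a rough sketch, where the ~100 MB/s single-disk figure is an assumption on my part:

```python
# Time to dump a 768 GB checkpoint at different sustained write speeds.
ram_gb = 768
for label, gb_per_s in [("single mechanical disk", 0.1),
                        ("fast SSD array", 2.0)]:
    minutes = ram_gb / gb_per_s / 60
    print(f"{label}: {minutes:.0f} min")
# single mechanical disk: 128 min
# fast SSD array: 6 min
```

Over two hours of lost compute per checkpoint versus a few minutes is the difference between checkpointing being practical and not.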
Re: (Score:2)
The SATA SSDs they are using only transfer data about twice as fast as a mechanical disk.
True. Too bad they aren't using ioDrives.
Re: (Score:2)
Re:Cost savings? (Score:4, Insightful)
There are plenty of reasons why supercomputers have to be shut down... besides the fact that even with generators and UPSes, facility outages are still a fact of life. What if there is a kernel vulnerability (insert 50 million ksplice replies here... yeah yeah yeah)? What if firmware needs to be updated to fix a problem? You can't just depend on RAM for storage. HPC jobs use input files that are tens of gigabytes and produce output files that can be multiple terabytes. The jobs can run for weeks at a time. In some cases it takes longer to transfer the data to another machine than it takes to generate/process the data. You can't just assume that the machine will stay up to protect that data.
Where and how is it used? (Score:3, Interesting)
TFA isn't particularly detailed, beyond saying SSDs are used on "4 special I/O nodes".
One obvious thing would be to use SSDs for the Lustre MDS while using SATA as usual for the OSSes. That could potentially help with the "why does ls -l take minutes" issue familiar to Lustre users on heavily loaded systems, while not noticeably increasing the cost of the storage system as a whole.
I won't be impressed (Score:1)
Re: (Score:2)
With SDXC going up to 104MB/sec and 2 TB, it's only a matter of time.
Re: (Score:2)
"up to"
Hehe... ;D
I'll take some of that 5.0gbit USB 3.0 as well, please.
Re: (Score:2)
I've gotten quite good speeds on SD cards. The usual problem is a lackluster USB interface, but they don't all connect via USB. There's nothing wrong with SD that eliminating the USB connection won't solve.
Hardware guys.... (Score:1)
--------
Boot time is O(1).
Disk speed should be irrelevant! (Score:2)
I think it was Amdahl who said that a "supercomputer is a machine which is fast enough to turn cpu-bound problems into io-bound problems", which means that disk speed could become a limiting factor.
I have trouble seeing how having SSD arrays can make a big difference though!
All current supercomputers have enough RAM to handle the entire problem set, simply because _all_ disks, including SSDs, are far slower than RAM.
A supercomputer, like those which are used by oil companies to do seismic processing, does n
Re: (Score:1)
I can only agree, but notice that they are talking about a very small HPC system (5.2 teraflops) and claim that they can significantly speed up data searches. There are certainly a few scenarios where you need to quickly and frequently search through a sparse, permanent dataset that is an order of magnitude too large for your RAM, and can benefit from the decreased latency of SSD storage.
However, for general purpose HPC systems the SSD is still a hard disk, and therefore way too slow for anything involving the computation. The ex
Re: (Score:2)
The article specifically talks about document searching. Some of the document-searching jobs we run here require >110 terabytes of memory (per job). On a cluster, that is pretty difficult to store in RAM.
A lot of supercomputers are clusters and those clusters typically don't have huge amounts of memory ... they are high on compute.
Re: (Score:2)
Re: (Score:2)
That's what you have SMART for. Just run smartd [sourceforge.net] or add smartctl [sourceforge.net] to your own scripts. Intel SSDs report the wear parameter in SMART attribute 233.
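For example, a minimal sketch of scraping that attribute out of `smartctl -A` output (the sample text below is illustrative, not captured from a real drive, and the attribute name varies by vendor):

```python
# Pull the normalized wear value of SMART attribute 233 out of
# `smartctl -A /dev/sdX` output. Sample text is illustrative only.
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   0
233 Media_Wearout_Indicator 0x0032   097   097   000    Old_age   0
"""

def wearout(smart_output):
    """Return the current normalized value of attribute 233
    (100 = new), or None if the drive doesn't report it."""
    for line in smart_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "233":
            return int(fields[3])    # the VALUE column
    return None

print(wearout(sample))  # 97
```

In practice you'd capture the output with subprocess and alert when the value approaches the THRESH column.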
SSDs and databases (Score:3, Interesting)
I've just gone through the process of setting up a pair of servers (HP DL380s) for Linux/Postgres. Our measurements show that the Intel X25-E SSDs beat regular 10k rpm SAS drives by a factor of about 12 for fdatasync() speed. This is important for a database system, as a transaction cannot COMMIT until the data has really, really hit permanent storage. [It's unsafe to use the regular disk's write cache, and personally, I don't trust a battery-backed write cache on the RAID controller much either.] So not having to wait for a mechanical seek is really useful. Read speeds are also better (10x lower latency), and the sustained throughput is about 2x as good.
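You can measure that fdatasync() cost yourself; a minimal sketch (file name and iteration count are arbitrary):

```python
# Time N synchronous writes - the pattern a database performs at COMMIT.
import os, tempfile, time

sync = getattr(os, "fdatasync", os.fsync)  # fdatasync where available

def time_commits(path, n=50, payload=b"x" * 4096):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(n):
            os.write(fd, payload)
            sync(fd)            # don't return until the data is durable
        return (time.perf_counter() - start) / n
    finally:
        os.close(fd)

with tempfile.TemporaryDirectory() as d:
    per_commit = time_commits(os.path.join(d, "wal"))
    print(f"{per_commit * 1000:.3f} ms per synced write")
```

On a mechanical disk with its write cache disabled, each sync costs at least a rotation; on an SSD it's sub-millisecond, which is the 12x the parent measured.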
So, yes, SSDs are a good idea for database loads, where the interaction is with the real world, and where once a transaction has completed, some other real-world process has happened. BUT, most supercomputer workloads are, in principle, re-startable (i.e. if you lose an hour's work due to a hardware failure, you can just re-run the simulation code, and throw away the intermediate state).
So, for simulations, the cost of data loss is an hour of re-work, not irretrievable information. Given that, we can get much better performance by storing everything in RAM, enabling all the write-caches, and sticking with standard SATA, provided that, every so often, the data is flushed out to disk. If something goes wrong, just revert to the last savepoint, which could be an hour ago, rather than having to be 10 ms ago.
[BTW, HP "don't support" SSDs in their servers, but the Intel SSD X25-E disks do work just fine. Though I did, unfortunately, have to buy some of HP's cheapest SAS drives ($250 each) just to obtain the mounting kits for the SSDs.]
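The savepoint scheme described above can be sketched like this (names are illustrative; the point is that a crash at any moment leaves the previous snapshot intact):

```python
# Periodic simulation savepoints: write to a temp file, force it to
# stable storage, then atomically swap it in over the old snapshot.
import os, pickle, tempfile

def save_checkpoint(state, path):
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())   # the one forced flush per checkpoint
    os.replace(tmp, path)      # atomic rename on POSIX filesystems

def load_checkpoint(path):
    with open(path, "rb") as f:
        return pickle.load(f)

with tempfile.TemporaryDirectory() as d:
    p = os.path.join(d, "savepoint.pkl")
    save_checkpoint({"step": 3600, "field": [1.0, 2.0]}, p)
    print(load_checkpoint(p))
```

Between checkpoints, all caches can stay enabled; durability is only paid for once per savepoint interval.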
Re: (Score:2)
The disk speed increase was enormous -- it really blew us away. What used to take between 3 and 4 hours can be done in about 8 minutes now.
IMHO using physical drives is much safer than using the SSDs, and to scale up all we do is add additional shelves.
Re: (Score:2)
Could you be more specific about what actually gave the improvement? Was it just something simple, e.g. RAID 5 -> RAID 6?
My main point though was that for supercomputer simulations (but not email or warehouse management), it's OK to risk data, and then just re-start the simulation from an hour ago. So why not just enable the disk write-caches or put the database on a RAMdisk? Without the safety requirements (such as no write-cache), the benefits of SSDs aren't needed.
BTW, I am very happy with the Intel SSDs.
Re: (Score:2)
There's not a direct comparison with RAID on an individual server and RAID on the EVA4400. Yes it's still a RAID5 (safety is most important to us) but the leveling aspect of the SAN provides additional performance. If I want to increase performance I add drives to the SAN and give them to that slice I've allocated for
Re: (Score:2)
Your RAID 5 partition allocated to this server (think of it like a slice of the whole pie) is smaller than the total SAN storage. In a "normal" single-server storage environment you probably allocate all space among the local drives. Each hardware RAID partition usually goes on separate drives, so that if you have 7 drives and need 2 partitions one of which is RAID-5, at least
Re: (Score:2)
Thanks for your informative posts. Sadly we can't afford a SAN anyway, though it might be a nice idea in future. What I don't understand is, how can a SAN improve the time for fdatasync() - i.e. for the data to be flushed to physical disk, and then control to return to the application? This is essential for database stuff.
As to my disinclination to trust battery-backed cache: if the power goes out, it means we have about 4 hours to get it back. If that also fails, we have data loss.
Re: (Score:2)