IBM's High Performance File System 208

Posted by Zonk on Friday March 10, 2006 @01:19PM from the but-who-will-use-it dept.

HoosierPeschke writes "BetaNews is running a story about IBM's new file system, General Parallel File System (GPFS). The short and skinny is that the new file system attained a 102 Gigabyte per second transfer rate. The size of the file system is also astonishing at 1.6 petabytes (petabyte == 1,024 terabytes). IBM has up a page with more information and specs on the system."

This discussion has been archived. No new comments can be posted.

Nothing new here. Move along. (Score:3, Informative)

by kperrier ( 115199 ) writes: on Friday March 10, 2006 @01:21PM (#14891656)

There is nothing new about GPFS. It's been around for years.

- Re:Nothing new here. Move along. (Score:2, Informative)
  
  by Mes ( 124637 ) writes:
  
  I was working on this 5 years ago, and I'm sure it's been around much longer than that.
- Re:Nothing new here. Move along. (Score:5, Informative)
  
  by MoxCamel ( 20484 ) * writes: on Friday March 10, 2006 @01:37PM (#14891812)
  
  Agreed. We've been using GPFS for 2 1/2 years. The long and short of it is that it's much more stable on AIX than it is on Linux. It's getting better on Linux, but it's still got a ways to go.
  Mox
  
  - Re:Nothing new here. Move along. (Score:2, Interesting)
    
    by Mostly a lurker ( 634878 ) writes:
    
    If it is available on AIX, does this mean that The SCO Group will claim IBM had no right to make it available for Linux in the first place? I am not joking (at least, not deliberately). If I understand tSCOg's derivative works theory, they claim that any code that has touched SYSV is automatically a UNIX derivative and under their control.
    - Re:Nothing new here. Move along. (Score:2, Funny)
      
      by soft_guy ( 534437 ) writes:
      
      SCO would claim you have to pay them for the air you breathe because Darl McBride farted once.
  - Re:Nothing new here. Move along. (Score:4, Funny)
    
    by einhverfr ( 238914 ) writes: <(chris.travers) (at) (gmail.com)> on Friday March 10, 2006 @05:02PM (#14893925) Homepage Journal
    
    I tried to install GPFS on Windows 98 and I keep getting GPFs... Is this supposed to happen?
    
- Re:Nothing new here. Move along. (Score:5, Informative)
  
  by Kadin2048 ( 468275 ) writes: <<ten.yxox> <ta> <nidak.todhsals>> on Friday March 10, 2006 @01:43PM (#14891872) Homepage Journal
  
  I think the "news" is the transfer rate, not the file system.
  
  According to this article [internetnews.com], the idea was just to see how fast a sustained transfer rate they could achieve. That rate was 102 GiB/s, which apparently is a record. The purpose of the project apparently has something to do with reducing the bottlenecking in parallel-computing interconnects. The machine they used, ASC Purple (a weapons-research system at Lawrence Livermore Labs) has about 10,000+ processors, so that's their obvious application.
  
  The filesystem itself doesn't seem to be anything new -- I have no idea why the poster fixated on that, since it's kind of a minor footnote in most of the articles I've read about this today.
  
Well.... (Score:2, Funny)

by GoMMiX ( 748510 ) writes:

At least someone can make a new filesystem... *cough* Microsoft *cough*
- Re:Well.... (Score:5, Funny)
  
  by ackthpt ( 218170 ) * writes: on Friday March 10, 2006 @01:42PM (#14891861) Homepage Journal
  
  At least someone can make a new filesystem... *cough* Microsoft *cough*
  Oh, come now. They just finished winning their latest legal round on FAT [slashdot.org]
  Give them a moment to catch their breath, will you?
  introducing OrigamiFS, you write it out on paper then fold it in half as many times as you can
  
  - Re:Well.... (Score:3, Informative)
    
    by Firewalker_Midnights ( 943814 ) writes:
    
    "introducing OrigamiFS, you write it out on paper then fold it in half as many times as you can"
    
    Apparently it can only be folded 12 times [pomonahistorical.org], at most. Unless M$ has created a new form of highly (unstable) foldable OS :D
    - Re:Well.... (Score:2)
      
      by ivan256 ( 17499 ) * writes:
      
      A breakthrough in stretchy paper technology would break that 12 fold barrier.
10 Tbytes? (Score:4, Funny)

by kyouteki ( 835576 ) writes: <kyouteki@gmai l . com> on Friday March 10, 2006 @01:23PM (#14891685) Homepage

But what kind of performance does this give on relatively small ( 10Tbytes) file systems? Petabyte arrays are still kind of out of reach for most.

- Re: 10 Tbytes? (Score:5, Funny)
  
  by KDan ( 90353 ) writes: on Friday March 10, 2006 @01:39PM (#14891835) Homepage
  
  You puny geekling. It's been years since I migrated my enormous collection of pr0n to my petabyte array...
  
  Running out of space too... maybe I should build a beowulf cluster of them.
  
  Daniel
  
  - Re: 10 Tbytes? (Score:2, Funny)
    
    by Anonymous Coward writes:
    
    I recently purchased a 1,024 TB SAN array to hold all my porn. Now there are *TWO* peta-files in my house!
  - Re: 10 Tbytes? (Score:3, Funny)
    
    by oliverk ( 82803 ) writes:
    
    You puny geekling. It's been years since I migrated my enormous collection of pr0n to my petabyte array...
    Great, but you only ever watch 7 minutes at a time! That's like 100 billion years of pr0n!!
- Re: 10 Tbytes? (Score:3, Interesting)
  
  by Tester ( 591 ) writes:
  
  They have 104 servers... that's almost 1 GB/s/server... that's a lot. And they have 4 RAID controllers per server, which means each RAID controller does around 250 MB/s (which is normal for a high-end RAID controller), and they are connected with a 10 Gb/s interconnect (probably InfiniBand or 10G Ethernet). So the whole thing is not that hard to do if you use your servers properly.
  
  But they have 1000 clients, so it's only 100 MB/s/client, so about 1 Gbps/client, so the clients are probably gigabit Ethernet... Otherwi
  - Re: 10 Tbytes? (Score:5, Informative)
    
    by Kadin2048 ( 468275 ) writes: <<ten.yxox> <ta> <nidak.todhsals>> on Friday March 10, 2006 @02:11PM (#14892100) Homepage Journal
    
    From the articles I've read, this was accomplished using (some subset of) ASC Purple, which is full of a lot of either custom or IBM-proprietary stuff (or else stuff that nobody but IBM seems to be using).
    
    According to the published/unclassified spec sheet [llnl.gov]:
    
    "Purple has 2 million gigabytes of storage from more than 11,000 Serial ATA and Fibre Channel disks. ... Each login node has eight 10-gigabytes-per-second network connections for parallel file transfer protocol and two 1-gigabyte-per-second network connections for network file systems and secure shell protocol. The system has a three-stage 1,536 port dual plane Federation switch interconnect ..."
    
    I think that it was this last thing, the Federation interconnect, that they were pushing the data over in this test, since it forms the backbone of the machine and links the storage nodes to the login node controllers, which then connect to the login nodes themselves (of which there are apparently over 1,400, according to this [llnl.gov]). I couldn't find much information on Federation, as it seems to only be used in a few systems, of which Purple is the most notable. One reference [sandia.gov] I found seems to put it at 1.49 GB/sec (11.92 Gbit/s) bandwidth, although it's not clear if that's "dual plane" Federation or not. 4X SDR Infiniband is around 10 Gbit/sec, IIRC, so Federation's a little faster.
    
    It does sound a little like it was a case of "hey, what can we do with $230M worth of hardware? I know, let's break some records." So they did. I'm not sure that there's anything there that anyone else couldn't do, with different technologies, given the same investment of capital -- it's just a matter of who else wants to, and has the capability.
    
- Re: 10 Tbytes? (Score:5, Insightful)
  
  by chris_eineke ( 634570 ) writes: on Friday March 10, 2006 @02:01PM (#14891994) Homepage Journal
  
  relatively small ( 10Tbytes) file systems
  
  Seagate recently released a 500GB hard-drive. It costs $431.99CAD. 2 of them makes 1 TB. 2000 makes 1 PB. (Yes, that's overly simplified because it doesn't take into account interconnection cost, cooling, hydro, &c.)
  
  2000 x 431.99 = $863,980CAD
  
  I don't think that that's a lot of money for a petabyte raid. Hell, you might even get a 20% discount. Now think back about 20 years. That sum of money could have bought you 1 GB - that is an order of magnitude less in hard drive space. But here is the kicker:
  Approx. 20 years down the road you will get at least two magnitudes more for the same amount of money (wo/ inflation). Why? Because approx. 30 years ago, that sum of money bought you 1 MB of space.
  
  Ray Kurzweil calls it the "Law of Accelerating Returns" [kurzweilai.net]. 20 years down the road I will call it my petaporn array. Or maybe better not [peta.org]. ;)
  
  - Re: 10 Tbytes? (Score:2)
    
    by kyouteki ( 835576 ) writes:
    
    Fair enough. But I made $21,000USD last year. I have 5Tb of storage in my house already (got some good deals on 250GB drives). I was speaking more from a more-storage-than-I-know-what-to-do-with perspective (though Bittorrent might make that point moot soon enough) than an enterprise perspective.
    - Re: 10 Tbytes? (Score:2)
      
      by chris_eineke ( 634570 ) writes:
      
      I was speaking more from a more-storage-than-I-know-what-to-do-with perspective
      
      Oh, don't worry. There is never enough storage(TM). Movie encoding quality will increase, games will get more immersive (maybe movies too), more detail, more of this, more of that. If transmission speeds increase, quality will go up; if quality goes up, transmission speeds will have to increase. Mix in new technologies at any point and the more-storage-than-I-know-what-to-do-with-dept. won't close anytime soon ;)
  - Try six orders of magnitude (Score:3, Informative)
    
    by irritating environme ( 529534 ) writes:
    
    Unless I forgot, a single order of magnitude is 10x, not 1000x.
    
    Peta = 1 000 Tera = 1 000 000 Giga = 1 000 000 000 Mega
  - Re: 10 Tbytes? (Score:2)
    
    by xquark ( 649804 ) writes:
    
    why use the term array anymore? I'm guessing it will just be a
    petabyte hdd, an all in one enclosed unit.
    
    what might be called an array in the future might be a zettabyte
    array or something similar in size.
    
    Arash
  - Re: 10 Tbytes? (Score:2)
    
    by mfrank ( 649656 ) writes:
    
    Looking at my hardware purchases since 1990, I'd say that hard drive density per dollar goes up an order of magnitude every 4 years. So:
    
    2010: 5 terabyte
    2014: 50 terabyte
    2018: 0.5 petabyte
    
    The main problem is, what are people going to use them for?
- We need a common benchmark (Score:5, Funny)
  
  by Linker3000 ( 626634 ) writes: on Friday March 10, 2006 @02:12PM (#14892109) Journal
  
  Typical porn movies per hour (TPMH)??
  
  - Re:We need a common benchmark (Score:2)
    
    by sunwukong ( 412560 ) writes:
    
    How does fast forwarding through the "dialogue" affect the benchmark?
*NIX Integration (Score:2, Interesting)

by Anonymous Coward writes:

Are there open source drivers for this FS that can perhaps be integrated into Linux or the *BSD projects?
- Translation: (Score:2, Funny)
  
  by PFI_Optix ( 936301 ) writes:
  
  "That's nice, but will Linux run it?"
  - Re:Translation: (Score:3, Informative)
    
    by slackaddict ( 950042 ) writes:
    
    Yes:
    GPFS supports the current releases of AIX 5L and selected releases of Red Hat and SUSE LINUX Enterprise Server distributions. See the GPFS FAQ1 for a current list of tested machines and also tested Linux distribution levels.
  - Re:Translation: (Score:2)
    
    by PFI_Optix ( 936301 ) writes:
    
    Oh wow someone with mod points needs a sense of humor.
Can I use it? (Score:4, Interesting)

by ShieldW0lf ( 601553 ) writes: on Friday March 10, 2006 @01:24PM (#14891696) Journal

Is this stuff available in a fashion where we might see it ported for use on standard x86 hardware? Is it GPL'd? I want this in my living room!

Fast Stuff (Score:4, Funny)

by britneysimpson ( 960285 ) writes: on Friday March 10, 2006 @01:27PM (#14891719) Homepage Journal

Wow that's fast stuff, plus with the ability to slow light to save energy IBM should have some great new systems coming out!

So what about JFS? (Score:2)

by kalpol ( 714519 ) writes:

What's going to happen to JFS? I was hoping it would get some attention - I am using it for my multimedia server and have been very pleased with the way it handles large DVD image files, not to mention power failures.
- Re:So what about JFS? (Score:4, Informative)
  
  by Tester ( 591 ) writes: <olivier@crete.ocrete@ca> on Friday March 10, 2006 @01:36PM (#14891802) Homepage
  
  GPFS is a cluster file system.. it's in a completely different category.
  
- You *like* JFS? (Score:2)
  
  by Junta ( 36770 ) writes:
  
  I tried JFS, and it handled power interruptions very poorly.
  
  Essentially, I liked philosophically that the act of mounting and journal replay are separated; it really makes sense. Journal replay should be more an fsck option, thought that was neat. And when you mount read-only, you *mean* read-only, no journal replay or anything even on a 'dirty' filesystem.
  
  However, I found all too frequently that after power failures, it would replay the journal and think everything was fine, until a few hours of usage lat
Bad Article Title (Score:5, Funny)

by Anonymous Coward writes: on Friday March 10, 2006 @01:28PM (#14891741)

I thought this article was going to be about IBM's HPFS from OS/2.

I'm Surprised (Score:5, Funny)

by Nom du Keyboard ( 633989 ) writes: on Friday March 10, 2006 @01:28PM (#14891742)

I'm surprised that the content industries (read **AA) let them release this. After all, everyone knows that the only reason for large amounts of writable storage is to store stolen content and deprive artists of their just rewards. All things considered, I'm also surprised that IBM doesn't have to close a non-existent Analogue Hole, nor implement a Broadcast Flag to prevent the storage of infringing materials.
That aside, how do I get one for my TiVo?

- Re:I'm Surprised (Score:2)
  
  by jZnat ( 793348 ) * writes:
  
  Well, IBM is quite a mammoth of a company, so they wouldn't take that sort of shit from Double A.
  - Re:I'm Surprised (Score:2)
    
    by MindStalker ( 22827 ) writes:
    
    Shall I slap you with a trout again???
    - Re:I'm Surprised (Score:2)
      
      by Nom du Keyboard ( 633989 ) writes:
      
      Shall I slap you with a trout again???
      I'd rather have you fetch me a shrubbery.
since the /. blurb doesn't explain it... (Score:5, Informative)

by frankie ( 91710 ) writes: on Friday March 10, 2006 @01:31PM (#14891766) Journal
...let's see if I can, never having heard of GPFS before 10 minutes ago:
- GPFS is not new; GPFS 1.0 dates to 1998
- IBM is touting its latest point update, v2.3
- analogy: desktop PC is to BlueGene as RAID is to GPFS cluster
It's basically data striping across 1000 disks. I suppose the hard part is coordinating all of that parallelism.

So, could someone who actually knows this stuff tell me how well I did?
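
To make the striping idea above concrete, here is a minimal sketch of round-robin block striping in Python. This is not IBM's implementation or GPFS's actual on-disk layout; the function names, the 1 MiB block size, and the disk count are illustrative assumptions only.

```python
def stripe_location(offset, block_size, num_disks):
    """Map a file byte offset to (disk index, block index on that disk, offset in block).

    Hypothetical helper: plain round-robin striping, not GPFS's real layout.
    """
    block = offset // block_size          # which logical block of the file
    disk = block % num_disks              # blocks are dealt out to disks in turn
    block_on_disk = block // num_disks    # how many earlier blocks landed on this disk
    return disk, block_on_disk, offset % block_size


def disks_touched(offset, length, block_size, num_disks):
    """Return the sorted disk indices that a read of `length` bytes (length > 0) hits."""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return sorted({block % num_disks for block in range(first, last + 1)})


if __name__ == "__main__":
    BLOCK = 1024 * 1024   # assumed 1 MiB stripe unit
    DISKS = 1000          # the "1000 disks" figure from the post above
    print(stripe_location(5 * BLOCK + 17, BLOCK, DISKS))
    print(len(disks_touched(0, 2000 * BLOCK, BLOCK, DISKS)))
```

Under these assumptions a single large sequential read lands on every disk, which is where the aggregate bandwidth would come from. The coordination problem mentioned above (locking, metadata, node failures) is not modelled here at all.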
- Re:since the /. blurb doesn't explain it... (Score:5, Funny)
  
  by Amouth ( 879122 ) writes: on Friday March 10, 2006 @01:37PM (#14891811)
  
  root@ibm$rm - r *
  
  humm that was quick
  
  - Re:since the /. blurb doesn't explain it... (Score:2, Funny)
    
    by dow ( 7718 ) writes:
    
    Shouldn't that be:
    
    root@ibm# rm -rf *
    
    And as always on storage/bandwidth topics: the pr0n/ogg/divx potential of that thing... *sorry*
- I only know what IBM have published (Score:2)
  
  by jd ( 1658 ) writes:
  
  But you sound right to me. Having said that, I would have absolutely no objection to IBM porting support for ultra-parallel RAID to Linux. In fact, there are probably a number of areas in the kernel that they could use their experience in parallel architectures to tighten up on.
  Since GPFS is basically RAID on speed, it should be easy for IBM to write a wrapper for Linux that would allow you to read/write GPFS, without needing to port GPFS per-se. As IBM sells Linux-based machines, being able to access GPFS
  - Available now. (Score:2)
    
    by Kadin2048 ( 468275 ) writes:
    
    GPFS (apparently -- I know only of what I've learned in the last few hours) is available for Linux, from IBM, right now.
    
    Some people further up in the discussion have warned however that it's not as stable on Linux as it is on AIX, which is really its native platform.
    
    From IBM's page on GPFS [ibm.com]:
    
    "GPFS is available as:
    * GPFS for AIX 5L on POWER(TM)
    * GPFS for Linux on IBM AMD processor-based servers and
    IBM eServer
  - Re:I only know what IBM have published (Score:2)
    
    by Shisha ( 145964 ) writes:
    
    But you sound right to me. Having said that, I would have absolutely no objection to IBM porting support for ultra-parallel RAID to Linux. In fact, there are probably a number of areas in the kernel that they could use their experience in parallel architectures to tighten up on.
    
    NOOOO!!!! You've just finally provided SCO with the evidence it needed! Filesystems were used in UNIX and SCO owns everything UNIX related. Now they know that IBM could maybe consider integrating, ehm, UNIX technologies, we mean UNIX
- Re:since the /. blurb doesn't explain it... (Score:2)
  
  by cluening ( 6626 ) writes:
  
  I don't profess to actually know much about gpfs, but I do use it on a daily basis. But, I can say that you are mostly right. Two additions: GPFS's original name was MMFS, putting it much older than 1998 (I believe). 2.3 is indeed the latest release, but we've been using it for around a year now and are up to patch level 10.
  
  When I started playing with gpfs on our linux machines about a year ago, I got pretty angry at it pretty often (mostly because we, with other people, were making it do things that the
- Re:since the /. blurb doesn't explain it... (Score:3, Informative)
  
  by TRS-80 ( 15569 ) writes:
  
  You missed the fact that GPFS is non-Free (tm):
  The prices for GPFS for AIX 5L, GPFS for Linux on POWER, and GPFS for Linux on Multiplatform are based on the number of processors active on the server where GPFS is installed.
- Most of all... (Score:3)
  
  by drinkypoo ( 153816 ) writes:
  
  ...the title of the story submission is INCREDIBLY STUPID. Why? Because the Filesystem in OS/2 is called HPFS, which stands for "High Performance File System". Anyone who knows more than what they read this week knows this already, and was expecting an article on HPFS from the title (until they saw the blurb.)
  
  Further evidence that "editor" is a misnomer 'round these parts.
  - Re:Most of all... (Score:3)
    
    by arodland ( 127775 ) writes:
    
    Not to mention that the summary text is really nothing more than some numbers taken out of context with absolutely no practical meaning. Really, what the fuck, Zonk? This is your idea of a "story"? It's more like a fourth grader's idea of a "science report".
- Re:since the /. blurb doesn't explain it... (Score:3, Informative)
  
  by flaming-opus ( 8186 ) writes:
  
  yeah, except substitute 1000 disks with 10,000 disks. They almost certainly are striping across a bunch of mid-range IBM raids, each with ~100 disks, and probably getting around 1-2 GB/s.
  
  It's also striping across many machines in a cluster. Each of those nodes maxes out at 'only' 15 GB/s of I/O, so they wire up all the nodes to a bunch of fibre channel cards, and plug them all into the raids, to distribute the I/O access to the nodes. GPFS also lets you do the I/O over the cluster interconnect, but then you
GPFS Information and links (Score:5, Informative)

by Anonymous Coward writes: on Friday March 10, 2006 @01:39PM (#14891825)

GPFS FAQ - http://publib.boulder.ibm.com/infocenter/clresctr/index.jsp?topic=/com.ibm.cluster.gpfs.doc/gpfs_faqs/gpfs_faqs.html [ibm.com]
GPFS Whitepaper - http://www-03.ibm.com/servers/eserver/pseries/software/whitepapers/gpfsprimer.pdf [ibm.com]
"GPFS is a cluster file system providing normal application interfaces, and has been available on AIX® operating system-based clusters since 1998 and Linux operating system-based clusters since 2001. GPFS distinguishes itself from other cluster file systems by providing concurrent, high-speed file access to applications executing on multiple nodes in an AIX 5L cluster, a Linux cluster or a heterogeneous cluster of AIX 5L and Linux machines. The processors supporting this cluster may be a mixture of IBM System p5(TM), p5 and pSeries® machines, IBM BladeCenter(TM) or IBM xSeries® machines based on Intel® or AMD processors. GPFS supports the current releases of AIX 5L and selected releases of Red Hat and SUSE LINUX Enterprise Server distributions. See the GPFS FAQ1 for a current list of tested machines and also tested Linux distribution levels. It is possible to run GPFS on compatible machines from other hardware vendors, but you should contact your IBM sales representative for details.
GPFS for AIX 5L and GPFS for Linux are derived from the same programming source and differ principally in adapting to the different hardware and operating system environments. The functionality of the two products is identical. GPFS V2.3 allows AIX 5L and Linux nodes, including Linux nodes on different machine architectures, to exist in the same cluster with shared access to the same GPFS file system. A cluster is a managed collection of computers which are connected via a network and share access to storage. Storage may be shared directly using storage networking capabilities provided by a storage vendor or by using IBM supplied capabilities which simulate a storage area network (SAN) over an IP network.
GPFS V2.3 is enhanced over previous releases of GPFS by introducing the capability to share data between clusters. This means that a cluster with proper authority can mount and directly access data owned by another cluster. It is possible to create clusters which own no data and are created for the sole purpose of accessing data owned by other clusters. The data transport uses either GPFS SAN simulation capabilities over a general network or SAN extension hardware.
GPFS V2.3 also adds new facilities in support of disaster recovery, recoverability and scaling. See the product publications for details2."

GPFS is not new (Score:3, Informative)

by flaming-opus ( 8186 ) writes: on Friday March 10, 2006 @01:39PM (#14891830)

GPFS is one of the more entrenched parallel cluster filesystems available. (others include the classic vax cluster fs, Tru64 cfs, redhat gfs, adic stornext, lustre, Sanergy, polyserve, others) GPFS has been running on IBM's high performance clusters for a decade or more. I've used it, and it's as robust as any of the others I listed above.

I'll caution everyone that you can get 100GB/s of throughput, only if you have a hundred million dollar collection of computers and disks like Livermore has.

- Google File System (Score:2)
  
  by TubeSteak ( 669689 ) writes:
  
  We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients.
  ...
  ... The largest cluster to date provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients.
  If it's scalable, there's no reas
  - Re:Google File System (Score:3, Insightful)
    
    by InsaneGeek ( 175763 ) writes:
    
    If not... what's the key difference between the two?
    
    When you care about throughput as well as capacity.
  - Re:Google File System (Score:2)
    
    by Wesley Felter ( 138342 ) writes:
    
    GoogleFS does not support Unix semantics, so if you mounted it (which you can't anyway) some apps would not behave correctly. Also, GoogleFS uses smart storage server nodes, while GPFS runs on a block-based SAN. Also, you can buy GPFS.
So will this mean cheaper storage costs (Score:3, Interesting)

by zenst ( 558964 ) writes: on Friday March 10, 2006 @01:41PM (#14891851) Homepage Journal

Will this mean that you can share storage more easily? Maybe. It certainly seems to reduce Sharks/ESS to an expensive interface for attaching discs (but then again, they're just a load of discs with an AIX box or two and SSA adapters to connect the discs anyhow).

Given that the management and maintenance of discs will be more integrated and distributable with this, I can't help but think that OS features, and the trend (rightly so) towards resilience, ease and resource sharing, are approaching what Plan 9 set out to be.

The more we move on, the more we seem to get towards the Lego-type approach to IT, where you can just buy another box of bricks and add on and keep your older bricks instead of throwing the whole lot out and/or hacksawing the end of a brick off and gluing it onto the side of....

Storage-wise this is a nice step forwards. Having worked on AIX and its many filesystems and management tools, with their ease of getting the job done and the option to get clever if you wish (you choose, you are not forced), this looks funky, albeit it's RAID for SANs in a way.

What I really want is a FS that will propagate automatically and resiliently in a way that accommodates network diversity, and I still come down to wanting what is, to all intents, a filesystem sat on a database sat on a p2p network. Alas, atm performance would suck, at least today, but you know how long code takes to get right and how fast hardware moves. Remember, a lot of code in Windows XP has origins from when it was written on a humble 386 CPU, if not lower.

What this does show is how network/storage interfaces have moved forward and I/O requests don't hammer CPUs as much as they used to. Getting there :).

- Re:So will this mean cheaper storage costs (Score:2)
  
  by Apparition-X ( 617975 ) writes:
  
  Will this mean that you can share storage more easily, maybe. It certainly seems to reduce sharks/ESS into an expensive interface for attaching discs.
  
  Well, it is one thing to send output from thousands of nodes to thousands of nodes, and achieve very high performance, as seems to be the case here. It is another to send output from a single node to a single node as you describe, where one node is a storage controller (ESS, whatever) and the other is a large database server that is probably not running a f
  - Re:So will this mean cheaper storage costs (Score:2)
    
    by zenst ( 558964 ) writes:
    
    ESS/Sharks have multi-pathed access to discs (at least if configured correctly) and are generally resiliently cabled to what are, in effect, AIX boxes inside the unit (generally two). As such they're already multi-noded internally and just present a single node with regards to storage. You are right with regards to caching, but (they probably have CACHEFS running on the Shark/ESS's AIX boxes :) that can be accommodated and indeed negated with regards to throughput of access if the base FS more than keeps up with the app, or localised c
Tech details (Score:3, Informative)

by MasterC ( 70492 ) writes: <cmlburnett@gm[ ].com ['ail' in gap]> on Friday March 10, 2006 @01:48PM (#14891910) Homepage

The article, as usual for news stories, is lacking juicy tech details. Here's some I found:

The article says 102 GB/s transfer. This PDF about the ASC Purple says they have 11,000 SATA & fiber channel disks (amongst other neat stats). So cursory math says that's about 10 MB/s from each disk.

My question is how useful is that transfer? Pulling in at 102 GB/s is fast and all, but if you can't consume it then it's just ego boosting. What kind of useful data transfer can you do on it? Surely it's for parallel processing (ASC = Advanced Simulation & Computing) of some kind so can this parallel app handle 102 GB/s collectively?
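
The cursory math above can be checked in a few lines. This sketch assumes decimal units (1 GB = 10^9 bytes) and that all 11,000 disks share the load evenly, neither of which the article confirms; the 104-server figure comes from another comment in this thread, and the `per_unit_mb_per_s` helper name is made up for illustration.

```python
GB = 10**9  # decimal gigabyte, an assumption about the article's units
MB = 10**6  # decimal megabyte


def per_unit_mb_per_s(total_gb_per_s, units):
    """Average MB/s each unit must sustain to reach the aggregate rate."""
    return total_gb_per_s * GB / units / MB


if __name__ == "__main__":
    # 102 GB/s spread evenly over the 11,000 disks quoted for ASC Purple
    print(f"per disk:   {per_unit_mb_per_s(102, 11_000):.2f} MB/s")
    # Same aggregate rate spread over the 104 servers mentioned elsewhere in the thread
    print(f"per server: {per_unit_mb_per_s(102, 104):.2f} MB/s")
```

If those assumptions hold, no single disk has to work very hard; the interesting part would be aggregating that many modest streams, not raw per-disk speed.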

- Re:Tech details (Score:2)
  
  by MasterC ( 70492 ) writes:
  
  Crap. Teach me for not scouring my preview before submitting. Here's the PDF I intended to link to:
  
  http://www.llnl.gov/asc/platforms/purple/sc2005-purple.pdf [llnl.gov]
- Re:Tech details (Score:3, Insightful)
  
  by Helios1182 ( 629010 ) writes:
  
  I think your last sentence hit it. There are groups producing huge amounts of data that needs to be stored then processed. What is the point in having 10,000 CPUs crunching numbers only to have the system I/O bound by the hard disk? Memory is still a couple orders of magnitude behind hard drives in size so they have to cache data on the disk at some point.
- Re:Tech details (Score:2)
  
  by AchilleTalon ( 540925 ) writes:
  
  I'm pretty much sure this filesystem capacity can be consumed easily by pushing some nifty simulations an order of magnitude or two. So, that's not just ego boosting.
  Just to give an idea, when the LHC goes into operation at CERN, it will produce data at a rate equivalent to the whole EU telephony/data network capacity. And this is only part of the story, since you have to analyze data, compute, compare, etc. You need to be able to move it fast between processors.
  Imagine nuclear weapons simulations, hurrican
- Function of Purple (Score:3, Informative)
  
  by Kadin2048 ( 468275 ) writes:
  
  The intended purpose of ASC Purple is nuclear weapons simulations.
  
  Since they can't actually do tests, either aboveground or below, by treaty anymore, they do simulations instead. I assume these have something to do with modeling how radioactive decay affects the weapons' usability and yield over time (since I don't think they're really in the business of designing new toys, but who knows really), so that you know that a bomb is going to go "pop" instead of "fizzle" when you want it to.
  
  I'd imagine that those
unit correction (Score:2, Informative)

by psbrogna ( 611644 ) writes:

petabyte !== 1,024 terabytes

petabyte == 1,000 terabytes

ref: http://en.wikipedia.org/wiki/Petabyte [wikipedia.org]

Kibibytes is just so much more fun to say. Especially when it leads to "kibbles & bits."
- SCREW THAT!!! ;-) (Score:4, Insightful)
  
  by Ossifer ( 703813 ) writes: on Friday March 10, 2006 @02:12PM (#14892115)
  Do you even read your own links?
  
  the exact number in common practice could be either one of the following:
  1,000,000,000,000,000 bytes -- 1000^5, or 10^15.
  
  1,125,899,906,842,624 bytes -- 1024^5, or 2^50.
  
  Real geeks use powers of two; powers of ten were only introduced for marketing purposes, which real geeks eschew.
  - Re:SCREW THAT!!! ;-) (Score:2)
    
    by hswerdfe ( 569925 ) writes:
    
    base 10 comes from SI units.
    base 2 was used because it was easier to count with.
    and some jerkwad decided that 1024 is close enough.
- I speak for thousands of nerds when I say (Score:2)
  
  by geekoid ( 135745 ) writes:
  
  Fuck you.
binary prefixes (Score:5, Insightful)

by Lord Ender ( 156273 ) writes: on Friday March 10, 2006 @02:03PM (#14892017) Homepage

The submitter and editors need to learn their numeric prefixes. Come on! This web site is supposed to be for people who understand computer technology!

A petabyte == 1000 terrabytes
A pebibyte == 1024 terrabytes

Please see the NIST definition page:
http://physics.nist.gov/cuu/Units/binary.html [nist.gov]

- Re:binary prefixes (Score:2)
  
  by Surt ( 22457 ) writes:
  
  The baffling question is: did the article submitter get it wrong, slashdot get it wrong, IBM get it wrong, did you mean 1024 tebibytes, or possibly 1024 terabytes, or .... ?????
  
  The plethora of SI prefixes gets more and more confusing. And remember, not everyone has adopted, or is in any way bound to adopt, the NIST convention. After all, megabyte = 1024 kilobyte was around and in use long before NIST got into the act!
  - Re:binary prefixes (Score:2)
    
    by hswerdfe ( 569925 ) writes:
    
    yes well
    1000 Meter = 1 KiloMeter
    
    was around long before 'megabyte = 1024 kilobyte'
    the prefixes mean something.
    just cause some CS tard decided that 1024 is close enough, and that base 2 is easier than base 10
    does not make it correct
    - Re:binary prefixes (Score:2)
      
      by Surt ( 22457 ) writes:
      
      Likewise, just because some self righteous officiant at the BWM thought that doing everything in multiples of 10 would be a great idea doesn't make it 'right' either. No side is 'right' on this issue, it's all opinion.
- Re:binary prefixes (Score:5, Informative)
  
  by Richard Steiner ( 1585 ) writes: <rsteiner@visi.com> on Friday March 10, 2006 @02:30PM (#14892300) Homepage Journal
  
  The new SI prefixes are nice and all, but there are three or four decades of prior usage that have to be unlearned before some of us will use them intuitively. Or at all. :-)
  
  Context-sensitive conversion of SI prefixes isn't all that difficult. Really. It's commonly understood that data is stored in powers of 2, and the subject is only relevant if (1) you're a sales type, or (2) you are being overly pedantic about an unwanted and unneeded SI standard.
  
- Re:binary prefixes (Score:2)
  
  by slavemowgli ( 585321 ) writes:
  
  Ouch. First of all, it's "tera", not "terra"; and second, 1 pebibyte = 1024 tebibyte, not 1024 terabyte (and if you think that that's a difference that doesn't matter, why are you complaining about the confusion of pebibyte and petabyte?).
- Re:binary prefixes (Score:2)
  
  by Kjella ( 173770 ) writes:
  
  A petabyte == 1000 terrabytes
  A pebibyte == 1024 terrabytes
  
  ROFL. Not to mention "terra" is Earth, "tera" is SI. One of many issues with the new names is that they sound like complete and utter crap. I'll never ever move away from mega-, giga- and terabytes. I abbreviate them correctly with the i's (MiB,GiB,TiB) and for anal people I'd specify it as "decimal *-byte" and "binary *-byte" or just give it in raw bytes. But those names.... OMG what nerd came up with those?
  - Re:binary prefixes (Score:2)
    
    by duerra ( 684053 ) * writes:
    
    But those names.... OMG what nerd came up with those?
    
    Yeah, I agree. "bi"? Maybe the guy was confused.
    
    It would have been so much easier just to change the "a" to an "i" - petibyte instead of petabyte.
  - Re:binary prefixes (Score:2)
    
    by Lord Ender ( 156273 ) writes:
    
    You're right. I misspelled "tera." That doesn't change my point at all. But thanks for pointing that out.
1.6 petabytes isn't that big a deal (Score:5, Informative)

by jm91509 ( 161085 ) writes: on Friday March 10, 2006 @02:05PM (#14892040) Homepage

ZFS from Sun is 128-bit. According to this guy [sun.com]
that's a whole load of data:

"Although we'd all like Moore's Law to continue forever, quantum mechanics imposes some fundamental limits on the computation rate and information capacity of any physical device. In particular, it has been shown that 1 kilogram of matter confined to 1 liter of space can perform at most 1051 operations per second on at most 1031 bits of information [see Seth Lloyd, "Ultimate physical limits to computation." Nature 406, 1047-1054 (2000)]. A fully-populated 128-bit storage pool would contain 2^128 blocks = 2^137 bytes = 2^140 bits; therefore the minimum mass required to hold the bits would be (2^140 bits) / (10^31 bits/kg) = 136 billion kg.

That's a lot of gear."
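
The division in the quote is easy to redo. Below is a minimal Python sketch, assuming the quote's own figures: 2^140 bits for a fully populated pool, and reading the "1031" as 10^31 bits per kilogram. The variable names are illustrative.

```python
# Figures taken from the quoted passage; variable names are illustrative.
BITS_IN_FULL_POOL = 2**140   # 2^128 blocks = 2^137 bytes = 2^140 bits
BITS_PER_KG = 10**31         # quoted storage bound per kilogram (assumed 10^31)

# Minimum mass needed to hold every bit of a full 128-bit pool.
mass_kg = BITS_IN_FULL_POOL / BITS_PER_KG
print(f"{mass_kg:.3e} kg")   # 1.394e+11 kg
```

Done this way the quotient comes out at roughly 1.39 x 10^11 kg, so on the order of 140 billion kg and in the same ballpark as the 136 billion kg given in the quote.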

- No, the limits are much higher than that (Score:5, Informative)
  
  by FreeUser ( 11483 ) writes: on Friday March 10, 2006 @02:50PM (#14892544)
  
  "Although we'd all like Moore's Law to continue forever, quantum mechanics imposes some fundamental limits on the computation rate and information capacity of any physical device. In particular, it has been shown that 1 kilogram of matter confined to 1 liter of space can perform at most 1051 operations per second on at most 1031 bits of information
  
  Um, no, that's wrong.
  
  Bremermann's Limit [wikipedia.org] is the maximum computational speed in the physical universe (as defined by relativity and quantum mechanical limitations) and is approximately 2 x 10^47 bits per second per gram (or, for those who prefer sexagesimal [jean.nu], one jezend [jean.nu], 60^11, bits per second per gram).
  
  Bousso's covariant entropy bound [elyseum.com], also called the holographic bound, is a theoretical refinement on the Bekenstein Bound [wikipedia.org] that may define the limit of how compactly information may be stored, based on current understanding of quantum mechanical limits, and is theorized to be equal to approximately one yezend [jean.nu] (60^37, or ~10^66) bits of information contained in a space enclosed by a spherical surface of 1 sq. cm.
  
  Given this, 1 kg of matter can perform approximately 2 x 10^50 bit operations per second, in a space much smaller than 1 liter. Of course, other physical constraints (non-quantum related) probably limit us to a couple of orders of magnitude less computation, in a couple of orders of magnitude more space, but what those limits might be is very speculative.
  
- Re:1.6 petabytes isn't that big a deal (Score:3, Informative)
  
  by rkww ( 675767 ) writes:
  
  1051 operations per second on at most 1031 bits
  
  That'll be 10^51 and 10^31...
Comparisons to other Parallel/Clustered FS? (Score:2)

by soldack ( 48581 ) writes:

It would be nice to see comparisons to RedHat/Sistina's GFS [redhat.com], Lustre [lustre.org] (backed by HP), and others listed here [yolinux.com].

Also, how does this compare to clustered storage that is not run on the hosts themselves, like NetApp's new Spinnaker-based clustering? You also have folks like Isilon [isilon.com], Panasas [panasas.com], and Terrascale [terrascale.com].

Anybody have any good data on this?
-Ack
- Re:Comparisons to other Parallel/Clustered FS? (Score:2)
  
  by Rheingold ( 2741 ) writes:
  
  For that matter, how does it compare with Tivoli TotalStorage SAN Filesystem [ibm.com], which seems to be another shared-storage filesystem from IBM/Tivoli? Trying to read IBM's descriptions is an exercise in marketing-fluff cryptography.
- Re:Comparisons to other Parallel/Clustered FS? (Score:2)
  
  by hitchhacker ( 122525 ) writes:
  
  don't forget about the Parallel Virtual File System (PVFS) [clemson.edu]
  
  -metric
I don't get it (Score:2, Funny)

by TheSkyIsPurple ( 901118 ) writes:

So, the important question: How many Libraries of Congress is that per second?
Well, ... (Score:2, Funny)

by wasatched ( 726651 ) writes:

... 1.6 PB ought to be enough for anybody.
Bad Experience with GPFS (Score:4, Interesting)

by localman ( 111171 ) writes: on Friday March 10, 2006 @02:31PM (#14892318) Homepage

We used GPFS in our production environment for about 9 months in 2004/2005. We chose it specifically because it allowed several machines to share the file system (like NFS) but with file locking. It was also supposed to be very fault tolerant with no single point of failure. We set it up using a fiberchannel SAN.

Unfortunately we had a lot of problems with it. For one, performance was quite bad in certain cases... doing an ls in a large directory would take a very long time. Doing finds would take a very long time. Once you had a specific file you wanted, opening and reading it was reasonable (though all disk ops were still on the slow side), but multi file operations lagged on the level of 10s of seconds or more. I think it was having to issue network checks to every machine in the set for each file or something.

Also, the CPU usage was very high across all our machines, primarily from lock manager communications. It really taxed the system. And perhaps worst of all, it would cause crashes sometimes. A single machine in the set would die (usually a GPFS assert), and though that didn't break the set permanently, a multi-minute freeze on all disk reads would take place until the set determined the machine was unavailable. We spoke with IBM about all this stuff... provided debugging output and everything, we used the latest patches. But we never got the issues resolved. It was a very rough few months indeed. I probably averaged 4 hours sleep per night.

When I say "slow" what am I comparing it to? In the end we switched to NFS and we came up with a somewhat clever way to avoid the need for file locking. NFS used the same SAN hardware, but had a single point of failure: the head server. We doubled up there with warm failover. The load on all servers dropped dramatically (I'm talking from ~40 load to ~.1 load). Disk operations were orders of magnitude faster. And we've not had a single NFS related lockup or failure in the past year and a half *knocks on wood*.

Anyways -- GPFS probably has some good uses. But I would not recommend it for a very high-volume (lots of files, lots of traffic) mission critical situation. Unless they've made some major improvements.

Cheers.

- Re:Bad Experience with GPFS (Score:2)
  
  by isj ( 453011 ) writes:
  
  From my limited knowledge of GPFS my guess is that GPFS is slow at metadata operations (opening files, listing directories, updating last-changed-date, ..) but lightning fast for I/O once you have the file open.
1.6 petabytes is overkill ... (Score:2, Funny)

by DrJimbo ( 594231 ) writes:

640 terabytes should be enough for anybody.
The marketing geniuses at IBM strike again!! (Score:2)

by The_REAL_DZA ( 731082 ) writes:

Surely I'm not the only one who sees "GPF" and thinks "General Protection Fault"?!

First it was OS2 (OS 2 ? does the "2" stand for 2nd rate? Is it your 2nd attempt? Is it just a big piece of "#2"?) and now it's this. Don't get me wrong, I think their products are great, but I really think they'd have a hard time marketing air on the Moon!
(Slightly more) seriously, IBM could stand to hire the same marketing folks the beer companies hire...Especially since their markets overlap so much.
- Re:The marketing geniuses at IBM strike again!! (Score:2)
  
  by Mostly a lurker ( 634878 ) writes:
  
  ... they'd have a hard time marketing air on the Moon!
  The question is, given the somewhat limited short term potential client base, would they try?
- Re:The marketing geniuses at IBM strike again!! (Score:2)
  
  by Rheingold ( 2741 ) writes:
  
  I ran into this the other day trying to search for discussions of it... GPFs is overwhelmingly used as the plural for General Protection Fault...
my gpfs problem (Score:3, Informative)

by krismon ( 205376 ) writes: on Friday March 10, 2006 @03:22PM (#14892902)

We ran GPFS for about 10 months. It's great for its primary purpose, and it was pretty stable on Linux, though we had a crash or two... but the biggest problem we ran across was with a large number of files. We had > 150 million small files in 10000 directories, and gpfs couldn't handle the load. I'm sure with a smaller number of files, our experience would have been very different. Waiting 10 minutes for an ls in a directory wasn't really what I considered fun. :)

NTFS (Score:3, Informative)

by Jaime2 ( 824950 ) writes: on Friday March 10, 2006 @03:34PM (#14893025)

NTFS has supported 16 exabytes since 1993. That's about 10,000 times larger than this new system. I'm not saying that NTFS is great or that IBM's accomplishment is small. But the submitter really shouldn't have said that a 1.6 petabyte filesystem is anything to write home about. Most likely every modern filesystem is at least 64-bit (16 exabytes).
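
The "about 10,000" figure can be sanity-checked with a few lines of Python. This is a rough sketch that assumes the 64-bit limit means 2^64 bytes and tries both readings of "1.6 petabytes"; the variable names are illustrative.

```python
# Size limit of a 64-bit byte-addressed filesystem (assumed to be 2^64 bytes).
LIMIT_64BIT = 2**64

# The 1.6 PB figure under both unit conventions.
SIZE_BINARY = 1.6 * 2**50    # 1.6 "petabytes" read as powers of 1024
SIZE_DECIMAL = 1.6 * 10**15  # 1.6 petabytes read as powers of 1000

ratio_binary = LIMIT_64BIT / SIZE_BINARY
ratio_decimal = LIMIT_64BIT / SIZE_DECIMAL

print(round(ratio_binary))   # 10240
print(round(ratio_decimal))  # 11529
```

Either way the ratio lands a little above 10,000, so the estimate holds regardless of which prefix convention is used.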

- Re:NTFS (Score:2)
  
  by SecurityGuy ( 217807 ) writes:
  
  There's a large difference between a filesystem format that can support a lot of data and having a collection of spinning disks that comprise a 1.6 petabyte filesystem.
- Re:NTFS (Score:2)
  
  by Arimus ( 198136 ) writes:
  
  There is a world of difference between the technical file size limit (ie based on the bit size) and the actual maximum anyone has used... 1.6 petabytes while being well short of the theoretical limits of a 64-bit address system is still an impressive feat...
It has to be said.... (Score:2, Redundant)

by Nonillion ( 266505 ) writes:

1.6 petabytes ought to be enough for anyone......
- Re:How many (Score:2)
  
  by RandoX ( 828285 ) writes:
  
  Don't forget GFS.
- Re:Darn! (Score:2)
  
  by Richard Steiner ( 1585 ) writes:
  
  Me too, but HPFS was a Microsoft filesystem, anyway. So why did they drop it in favor of crap like FAT32 and NTFS? :-)
- Chuck Norris (Score:3, Funny)
  
  by City Jim 3000 ( 726294 ) writes:
  
  Chuck Norris penis is so big that 1.6 petabyte can only store 4 seconds of Chuck Norris porn.
  - Re:Chuck Norris (Score:4, Funny)
    
    by geekoid ( 135745 ) writes: <dadinportland@@@yahoo...com> on Friday March 10, 2006 @03:15PM (#14892822) Homepage Journal
    
    Which is more than any of us deserve to see.
    
- - Re:1 petabyte = 1000 terabytes, not 1024. (Score:2)
    
    by The_REAL_DZA ( 731082 ) writes:
    
    Yeah, it's not like 24 terabytes is a significant amount of storage... Let's just all say "about a thousand terabytes" and leave it at that.
