


PVFS2 - a High-Performance Parallel File System
neillm78 writes "As part of the development team, we're announcing PVFS2 version 1.0 here in Pittsburgh at the SC2004 conference! PVFS2 is a GPL/LGPL-based parallel file system for cluster-based applications. It logically groups any number of storage servers into a coherent file system for use by client nodes, specifically tailored to handle efficient access to large shared files. PVFS2 supports access via an MPI-IO interface for high-performance parallel applications, but you can still mount it like a regular GNU/Linux file system for traditional serial applications and management. The PVFS2 project is conducted jointly between The Parallel Architecture Research Laboratory at Clemson University and The Mathematics and Computer Science Division at Argonne National Laboratory. Please feel free to give it a try!"
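For readers wondering what "logically groups any number of storage servers into a coherent file system" can mean in practice, here is a minimal sketch of round-robin striping. It is an illustration under stated assumptions, not PVFS2's actual distribution code: the names `locate` and `split_request`, the 64 KiB strip size, and the server counts are all hypothetical.

```python
STRIP_SIZE = 64 * 1024  # hypothetical strip size in bytes, chosen for illustration


def locate(offset, num_servers, strip_size=STRIP_SIZE):
    """Map a logical file offset to (server index, offset in that server's local data)."""
    if offset < 0 or num_servers < 1 or strip_size < 1:
        raise ValueError("offset must be >= 0; num_servers and strip_size must be >= 1")
    strip = offset // strip_size          # which strip of the logical file
    server = strip % num_servers          # strips are dealt out round-robin
    local_strip = strip // num_servers    # how many strips this server already holds
    return server, local_strip * strip_size + offset % strip_size


def split_request(offset, length, num_servers, strip_size=STRIP_SIZE):
    """Break a contiguous byte range into (server, local_offset, size) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        server, local = locate(offset, num_servers, strip_size)
        # Never cross a strip boundary or run past the end of the request.
        size = min(strip_size - offset % strip_size, end - offset)
        pieces.append((server, local, size))
        offset += size
    return pieces
```

Under these assumptions, a request that spans several strips is split across several servers, so the pieces can be serviced independently.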
Been following it for a while... (Score:3, Informative)
I plan on evaluating PVFS2 for our new clusters along with Lustre and GFS, although I have heard nothing about the latter two operating over the MPI-ROMIO subsystem (which would definitely offer a performance increase).
Re:Been following it for a while... (Score:2)
Re:Been following it for a while... (Score:5, Informative)
This is like "Distributed NFS". Although that description does it a huge injustice, it should help to get the point across.
Re:Been following it for a while... (Score:3, Interesting)
That would mean it should compete on the level of OpenAFS, Intermezzo and CODA for fault-tolerant network filesystems -- except it would have internode locking, which the others don't at the moment.
That would also mean it doesn't directly compete at the same level as GFS (which is targeted at configurations of servers connected by a SAN or similar).
Is this project set on integrating with the mainline kernel? What has/will happen on
Re:Been following it for a while... (Score:2, Interesting)
That's an interesting thought, but at no time have we ever thought of ourselves as a replacement for those file systems. The ones you mention are general purpose file systems whereas PVFS2 is meant to be a fast file system for parallel applications.
except it would have internode locking which the others don
Re:Been following it for a while... (Score:2)
Yes, I realize that now. Everything except for the last paragraph of my post was speculation. The last paragraph, which I wrote after reviewing the web site a bit, was there to correct those speculations.
"I'm not sure what you mean here. We have no loc
It's Linux! (Score:3, Insightful)
That said, nice job! I love to see the capabilities of Linux expanded in new directions like this. Cool work. I wish I had time to work on cool projects [sourceforge.net] like that.
Re:It's Linux! (Score:1, Troll)
-Neill;
Re:It's Linux! (Score:2)
Sorry. I'm just a nitpicky, pedantic bastard.
Having said that, I skimmed through the info on the project website, and it looks like some interesting stuff. At the
I hope the meta-data performance improved... (Score:3, Interesting)
Has the meta-data server been sped up at all, or made distributed with some kind of coherency-synchronization backend?
Re:I hope the meta-data performance improved... (Score:2, Informative)
From the PVFS2 Guide [pvfs.org]:
The new design has a number of important features, including:
* modular networking and storage subsystems,
* powerful request format for structured non-contiguous accesses,
* flexible and extensible data distribution modules,
* distributed metadata,
* stateless servers and clients (no locking subsystem),
* explicit concurrency support,
* tunable
Re:I hope the meta-data performance improved... (Score:3, Informative)
> * distributed metadata,
> * stateless servers and clients (no locking subsystem),
Just to clarify... while we have distributed metadata, we don't have *replicated* metadata. At least, not yet.
If you have multiple metadata servers they will do load balancing. If you are working with lots and lots of small files, having a couple of metadata servers might alleviate a possible bottleneck.
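One way to picture load balancing across several metadata servers is to hash each file name to a server. This is a hypothetical sketch only: nothing here says how PVFS2 actually assigns metadata, and `pick_metadata_server`, `load_per_server`, and the server names are made up for illustration.

```python
import hashlib
from collections import Counter


def pick_metadata_server(path, servers):
    """Deterministically choose one server for a path by hashing the path."""
    if not servers:
        raise ValueError("need at least one metadata server")
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer and reduce it modulo
    # the number of servers, so the same path always maps to the same server.
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def load_per_server(paths, servers):
    """Count how many paths land on each server."""
    return Counter(pick_metadata_server(p, servers) for p in paths)
```

With many small files, a scheme like this spreads metadata operations over the available servers instead of sending them all to one. It does not replicate anything, so losing a server still loses that server's share of the metadata.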
Oh. Neato. Well, I could have looked... heh... (Score:1)
Parallel Architecture Research Lab (Score:2)
This is exciting and all, but the really important thing about PARL is that they were the only ones at Clemson willing to host our site [clemson.edu].
</SELF-PLUG>
I know this is for large clusters..... (Score:1)
Re:I know this is for large clusters..... (Score:4, Informative)
Redundancy ? (Score:2)
Does anyone use this for big, transparent file storage networks?
I've been looking for something better than "a bunch of nfs servers with some code to redirect each client to his storage". This is a pain to manage, and it has lots-'n-lots of points of failure...
I've noticed that the metadata is not in a single node anymore, but it's not replicated yet either. I could live with this reliability problem if it could give me the transparency to just add a server when nee
Re:Redundancy ? (Score:1)
PVFS2's real sweet spot is for scratch space for scientific applications -- writing out checkpoints, reading in datasets.
I don't know if I'd call what PVFS2 has a "reliability problem". If you've got money, hardware-based failover solutions exist today and work well with PVFS2 (think heartbeat). In the not-so-distant future we've got people wor
Re:Redundancy ? (Score:2, Interesting)
Re:Redundancy ? (Score:1)
Re:Missing feature: Undeletion facilities (Score:2)
There's nothing about the recycle bin idea that makes directory structures disappear, except that that's how it's implemented on some systems.
Supporting undelete in your filesystem can be a huge pain, and can stop you from doing many more interesting and useful things.
And yes, accidents do happen.