Data Storage Software Linux

Linux Filesystems Benchmarked

smatt-man writes "Over at Linux Gazette they ran some tests on popular Linux filesystems (ext2, ext3, jfs, reiserfs, xfs) and the results may surprise you."
This discussion has been archived. No new comments can be posted.

  • by scrytch ( 9198 ) <chuck@myrealbox.com> on Tuesday May 11, 2004 @10:10AM (#9116199)
    Duplicate, spelling errors, and nothing but the short submission. Google is relaunching its blogger service -- tell me again what slashdot provides over it?
  • by Trailer Trash ( 60756 ) on Tuesday May 11, 2004 @10:12AM (#9116232) Homepage
    And the reason is that you used JPEGs. JPEGs are for photographs; use GIF for images such as this. The text won't end up unreadably blurry and you'll save tons of disk space/bandwidth.
  • by eddy ( 18759 ) on Tuesday May 11, 2004 @10:19AM (#9116302) Homepage Journal

    >Use gif for images such as this.

    No, use PNG.

    If you're going to do it, do it right. Using GIFs is half-assed.

  • While they do measure stuff like access times in ms, they don't mention recovery times (fsck), which are measured in ms for reiserfs and minutes for ext2. Nor do they mention reinstallation times (measured in hours), which come up for ext2 a lot more often than for the journalling filesystems :-)
  • by codepunk ( 167897 ) on Tuesday May 11, 2004 @10:23AM (#9116331)
    Personally, I couldn't care less which file system is fastest. The most important aspect of a file system is how stable it is with my important data. All the speed in the world means absolutely nothing if the file system is not stable. ext3 has never ever let me down, so I intend to stick with it, at least until RedHat releases their version of GFS.
  • Re:ext3 slowness (Score:3, Insightful)

    by Speare ( 84249 ) on Tuesday May 11, 2004 @10:29AM (#9116393) Homepage Journal

    Personally, I see this as a mild security benefit. If I delete something, I want it GONE. It's not as good as an idle-time thread that 11-pass nukes unallocated sectors at random, but it'll do for a start.

  • by foolip ( 588195 ) on Tuesday May 11, 2004 @10:31AM (#9116410) Homepage
    What I'd like is a file system for which there is actually a defrag tool. Sure, ext2/3 may try to reduce fragmentation as much as possible, but when it happens (as is likely when you have a near-full disk) you've got little or no way of fixing it. Actually, there is a defrag tool for ext2 (not ext3), but my experiences with it are not the best -- it took forever and I don't know that it actually did anything to the disk (fsck didn't report a 0% fragmentation level afterwards, anyway).
  • Re:Slightly OT (Score:2, Insightful)

    by Der Krazy Kraut ( 650544 ) on Tuesday May 11, 2004 @10:34AM (#9116433)
    The backwards compatibility is really not an issue anymore. Modern filesystems have been supported by all major distributions for years now. And if everything else fails, you can always use Knoppix [knoppix.de], which can access pretty much everything.
  • I've never understood why they don't move to ReiserFS, at least for new installations.

    Because for most uses, it's not the best option. So, if you're going to junk ext2 compatibility, why would you go to Reiser?
  • by ValourX ( 677178 ) on Tuesday May 11, 2004 @10:58AM (#9116693) Homepage

    Also, it's mountable from FreeBSD. Reiser, XFS and JFS are not.

    This may seem trivial until you have a dual boot system with FreeBSD and GNU/Linux and you want to transfer your /home dir or whatever.

  • by Gribflex ( 177733 ) on Tuesday May 11, 2004 @11:07AM (#9116781) Homepage
    Why is it that every benchmarking article contains the words "The results may surprise you?"

    Have any of you ever been surprised?
  • Re:Bad graphs (Score:2, Insightful)

    by MrBlue VT ( 245806 ) on Tuesday May 11, 2004 @11:08AM (#9116790) Homepage
    Arg, he also needs to just pick one color for each filesystem. I mean, he must want to torture his audience by constantly switching the colors and then making the legend so tiny and mucked up by JPEG artifacts, that one can't tell which bar goes to which filesystem.
  • One word - inertia (Score:3, Insightful)

    by Jeppe Salvesen ( 101622 ) on Tuesday May 11, 2004 @11:16AM (#9116873)
    You know - if it works, don't fix it!

    Ext3 is stable and there are a lot of useful tools available for it.

    If, for the end user, the difference is marginal, why bother to make things more difficult than necessary for yourself?

    Or maybe they've received unusually many bug reports for ReiserFS and thus concluded it's not stable enough for them to push it. After all, they want to be associated with (amongst other things) reliability.
  • Re:Slightly OT (Score:4, Insightful)

    by Rik van Riel ( 4968 ) on Tuesday May 11, 2004 @11:39AM (#9117088) Homepage
    Ext3 has most of its metadata (inodes, block group descriptors, etc) in fixed places on disk and e2fsck has a decade of testing in cleaning up the non-journaled ext2, so it's probably better tested than any of the fscks for journaling filesystems.
  • by kfg ( 145172 ) on Tuesday May 11, 2004 @11:39AM (#9117093)
    Tests such as these will always make things clear as mud. Engineering is always a matter of compromise. Trade offs must be made.

    Do you want a car that goes really, really fast, or do you want a car that gets good mileage and has a really big back seat? (You can always lie about having run out of gas.)

    Neither car is "best" until you define its intended use, and they both make lousy hammers. I canna change the laws of physics.

    Different engineers have different ideas, different goals and different ways of going about things. Thus their output will vary in performance across a range of parameters. Pick the tool that complements your primary need, then put up with the compromises that inherently entails. It's the best you can do, and yes, even FAT16 may be the "winner" for certain functions.

  • by ztane ( 713026 ) on Tuesday May 11, 2004 @11:44AM (#9117130)
    Well, not all of us have temp on root partition. Of course, certain OSes tend to force that...
  • by Sxooter ( 29722 ) on Tuesday May 11, 2004 @11:47AM (#9117158)
    If you are using IDE drives with the write cache enabled, then NO journaling file system is safe on your system. This is because IDE drives with write cache respond immediately to requests for fsync with true, whether they've written the data out or not.

    If your data is important, either turn off the write cache on your IDE drive or buy a SCSI drive, which won't lie about fsync. This is a problem for both Linux and BSD.

    Later model IDE drivers and drives may be able to work properly with cache enabled, but not now. There are ongoing discussions on BOTH kernel hackers lists, BSD and Linux, about what to do, and no solution in sight.
  • by perlchild ( 582235 ) on Tuesday May 11, 2004 @12:12PM (#9117433)
    No mention of data=writeback, or any other optimisation tweaks, however. Kinda sad. The article is nice, the graphs are.... Err too much of a good thing?
    And basically the results just reiterate the design imperatives of each filesystem(how unsurprising!)

    - ext2 predates them all
    - ext3 is a low-impact, let's reuse what we know as much as possible, kinda file system
    - reiser's b-trees reflect its "once we put the data in, how do we find it again" orientation
    - XFS was, at least at one point, designed for "media" files (think renderfarms), a.k.a. LARGE files; many of the benchmarks it won were on such files, although its design was also influenced by large-scale server needs (a renderfarm is a large-scale server cluster, right?)
    - JFS was influenced by large-scale server needs (databases) but tempered by the needs of OS/2 and other systems, resulting in a filesystem that's a bit more nimble than XFS but less handy with huge files (normal, since databases try to use raw I/O if necessary on huge files, unlike render clusters)

    I think this demonstrates the implications of early design imperatives on long-term software trends. XFS and JFS were developed for other platforms and ported to Linux, yet notice how they can't really change their strengths (good thing too!).

    Anyone try the same benchmark on the 2.6 kernel just to contrast it? Wouldn't the new IO-system help to mitigate those weird ext2/ext3 slowdowns the article mentions, but doesn't explain?
  • by Anonymous Coward on Tuesday May 11, 2004 @12:52PM (#9117958)

    >Animation? Or do you not class this as a good feature?

    In an image format? No, I don't.

    JPEG JFIF doesn't support animation either, but we never hear people bringing that up?

    If you want animation, use an animated format. There are many. If you believe GIF is the best animated format available to you, then use that for animation. That still doesn't make it any better as an image format; it just proves the point that it's a complete hack (and a bad one).

  • Re:The conclusion (Score:3, Insightful)

    by broothal ( 186066 ) <christian@fabel.dk> on Tuesday May 11, 2004 @01:37PM (#9118550) Homepage Journal
    This question has been answered over and over.

    I must have missed the memo

    And the answer is: never.

    Please give me a link to a /. editor saying that.

    It would be neither legal nor ethical for Slashdot to mirror/cache content for articles posted on slashdot.

    So you're saying that freecache and google are illegal?

    Many sites rely on banner ad revenue. Caching content would deprive those sites of the revenue generated by traffic. Plus there is the whole copyright issue.

    If you don't cache the images, the banners will still show (as google does it).

  • by WoodstockJeff ( 568111 ) on Tuesday May 11, 2004 @01:49PM (#9118670) Homepage
    Having read the article, it would have been nice if the bar graphs had been consistent... but that's not the problem. As mentioned by others, a very important criterion for non-home users is damage tolerance and, to an equal extent, the lack of any tendency for the driver to damage the file system (aka "stability"). And, in this day of databases, the ability to handle large files is increasingly important.

    I'm rapidly approaching the point where I need support for file sizes greater than 2GB. Quite frankly, most of what I've found about file sizes and file systems is 2 to 4 years old... Everyone's too concerned with speed!

  • by aggieben ( 620937 ) <aggiebenNO@SPAMgmail.com> on Tuesday May 11, 2004 @01:54PM (#9118724) Homepage Journal
    I'd like to see a set of benchmarks regarding stability and fidelity of the various filesystems. Which ones are the most durable? Which ones get corruption the most, and what are their corruption/data-loss ratios? Performance isn't the end of the world for me....but losing data *is*.
  • by Anonymous Coward on Tuesday May 11, 2004 @02:09PM (#9118883)
    The point of the benchmark is "what fs is better". There's been *years* of development in 2.6, while 2.4 has not been touched for the sake of stability. For a benchmark, IMHO 2.6 is much better. And 2.6, BTW, is working rock solid, with uptimes of more than 100 and 200 days on some boxes. The OSDL people ran a 2-week-long database test under 2.6. The result was: zero errors. And that was a four-CPU server.
  • jackass article (Score:5, Insightful)

    by jusdisgi ( 617863 ) on Tuesday May 11, 2004 @02:34PM (#9119109)

    Wow...I'm really surprised that I don't see anyone else around here bashing this "benchmarking" as totally ridiculous. Get it together, people! I mean, how does a group of folks that typically pride themselves on shredding the foolish articles that come by miss these beauties:

    1) This guy goes out with the stated goal of evaluating real-world performance...so he starts by throwing out all real benchmarks. Of course, those tools are designed by experts to try to represent real-world performance, but who cares, right? Instead, our jackass throws together a bundle of random operations and times them. No thought is apparent in the choices of the operations, and no discussion is given as to why the choices were made.

    2) The conclusions are drawn by simply adding the times of all the tests together. If you haven't figured out why this is dumb as a rock, let me explain: test #1 took 23-40 seconds, while test #2 took .02-.04 seconds. So, in his conclusion, test #1 was weighted 1000 times as heavily as test #2. I don't know about you all, but I for one don't feel that touching speed is 1000 times as important as finding speed.

    3) Even if he had normalized all the times so that the mean in each test was the same and then added those, he would still be wrong...various tests ought to be weighted differently, because real-world usage doesn't do all of these things the same amount. That said, the weight given to the tests needs to be well thought out and planned, rather than arbitrarily assigned (accidentally) without paying any attention. Interestingly enough, this sort of purposeful weighting of tests is exactly the sort of thing done by the real benchmarking tools that this idiot threw away.

    4) Perhaps this one isn't as important...but this guy can't make a graph to save his life. Half the bar graphs put time on X and the other half put time on Y. Graphs that obviously should be bar graphs are made into dot-line ones. The text is blurry and you can't tell the colors in the key.

    Anyway, I still don't get why everybody around here seems to have missed all this...it was painfully obvious to me when I just took a cursory glance at it.
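
To make points 2 and 3 above concrete, here is a minimal Python sketch. The timings are invented for illustration (only their magnitudes echo the 23-40 s and 0.02-0.04 s ranges mentioned above), and the filesystem names `fs_a`/`fs_b` and the weights are hypothetical, not taken from the article.

```python
# Illustrative only: timings, filesystem names, and weights are invented.
from statistics import mean

# Seconds per test for two hypothetical filesystems (lower is better).
times = {
    "fs_a": {"touch": 24.0, "find": 0.04},
    "fs_b": {"touch": 30.0, "find": 0.02},
}
tests = ["touch", "find"]


def raw_total(fs):
    """Naive score: add raw seconds, so the longest-running test dominates."""
    return sum(times[fs][t] for t in tests)


def normalized_score(fs, weights):
    """Divide each time by that test's mean across filesystems, then weight."""
    score = 0.0
    for t in tests:
        test_mean = mean(times[other][t] for other in times)
        score += weights[t] * times[fs][t] / test_mean
    return score


equal = {"touch": 1.0, "find": 1.0}
find_heavy = {"touch": 1.0, "find": 3.0}  # assumes lookups matter more

for fs in times:
    print(fs,
          round(raw_total(fs), 2),
          round(normalized_score(fs, equal), 3),
          round(normalized_score(fs, find_heavy), 3))
```

With these made-up numbers the raw sum favors `fs_a`, because the touch test swamps everything else, while the normalized scores favor `fs_b`. Which weighting is right depends on the workload, which is the whole argument.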

  • by GooberToo ( 74388 ) on Tuesday May 11, 2004 @03:09PM (#9119463)
    All of the file systems fared fairly well when finding 10,000 files in a single directory, the only exception being XFS, which took twice as long.

    I'm surprised that XFS did so poorly here. I do know they had a bug at one point in time, which may explain such a score; however, I thought it had long been addressed. Worse, I thought I remembered reading that XFS used a btree to track file and directory names. Please correct as needed. If this is true, it would appear to be a bug rather than normal performance. Any XFS experts care to fill in the blanks here?

    I should also offer that XFS's big claims are stability, reliability, big and huge file support, speed when accessing big and huge files, and excellent concurrent file access abilities, which is why SCSI is the preferred media for XFS. Basically, if you plan on managing big and huge files, or medium to huge files with large amounts of concurrent activity via SCSI, then XFS should be one of your target FS.

    Then, you have excellent snapshot, backup and recovery utilities as well as quota and realtime access support. All of which make XFS an excellent journalled FS.
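
For anyone who wants to repeat something like the 10,000-files-in-one-directory test on their own filesystem, here is a rough Python sketch. It is not the article's script: the function name `time_lookups` and the file naming scheme are made up, the directory is created wherever the system's temp location lives, and with a warm cache it mostly measures the kernel's name cache rather than the on-disk directory structure.

```python
# Sketch: time name lookups in one large directory (not the article's script).
import os
import tempfile
import time


def time_lookups(n_files=10_000):
    """Create n_files empty files in one directory, then look each up by name."""
    with tempfile.TemporaryDirectory() as d:
        names = [f"file{i:05d}" for i in range(n_files)]
        for name in names:
            open(os.path.join(d, name), "w").close()

        start = time.perf_counter()
        found = sum(os.path.exists(os.path.join(d, name)) for name in names)
        elapsed = time.perf_counter() - start
    return found, elapsed


found, elapsed = time_lookups()
print(f"found {found} files in {elapsed:.4f} s")
```

To compare filesystems, the temp directory would have to sit on the filesystem under test (for example via the `dir=` argument of `tempfile.TemporaryDirectory`), and caches would need to be cold for the numbers to say anything about the on-disk layout.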

  • by idiotnot ( 302133 ) <sean@757.org> on Tuesday May 11, 2004 @03:10PM (#9119473) Homepage Journal
    Ugh. HFS+ is not a Unix filesystem. It's a Macintosh filesystem, introduced in OS8, that's been modified so it can play nice with a Unix-like OS. Darwin/OSX also use old, slow, reliable Berkeley FFS, the filesystem on which ext2 was patterned. HFS+ isn't an ideal Unix filesystem, because it doesn't have case-sensitivity by default, and it has features that would be wasted by typical Unix. The reasons it's there in OSX are backward compatibility and speed. Some Carbon applications refuse to run on an FFS partition, and OS8/9 can't read FFS. The FFS code in Darwin is also very old now (most of it is late 80's tech), and the performance shows. The nifty things that the real BSDs have done to speed it up (soft dependencies, etc.) haven't found their way into the weird Mach amalgam that is XNU.

    Write support of HFS+ works under GNU/Linux.
  • by PGillingwater ( 72739 ) on Tuesday May 11, 2004 @03:37PM (#9119724) Homepage
    For those who are concerned about loss of permissions, filename case, long names, etc., may I offer a simple solution -- wrap your files in .tar or .cpio archives.

    If you need "live" access to the files, then simply create a loopback ext2/3 file system (which can also be encrypted), which is stored in a single large file in the FAT partition. Mount it on a loopback device, and your other problems are moot.
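
A small Python sketch of the archive idea, using the standard `tarfile` module: the archive itself records mode bits and exact, case-preserved names, so they survive regardless of what the filesystem holding the archive can represent. The entry name `Home/ReadMe.TXT` and the mode `0o640` are made-up examples, and the archive is built in memory only to keep the sketch self-contained.

```python
# Sketch: a tar archive stores permissions and exact names in its own headers.
import io
import tarfile


def build_archive():
    """Return the bytes of a tar archive holding one (hypothetical) entry."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"hello\n"
        info = tarfile.TarInfo(name="Home/ReadMe.TXT")  # mixed case on purpose
        info.size = len(data)
        info.mode = 0o640                                # rw-r-----
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_archive(raw):
    """Return (name, octal mode) pairs as stored in the archive."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        return [(m.name, oct(m.mode)) for m in tar.getmembers()]


print(list_archive(build_archive()))  # [('Home/ReadMe.TXT', '0o640')]
```

Writing those bytes to a file on the FAT partition only requires the archive's own filename to be FAT-friendly; everything inside keeps its metadata.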
  • by Sxooter ( 29722 ) on Tuesday May 11, 2004 @04:38PM (#9120335)
    I don't think you're understanding what I'm saying. Here's a brief time line based explanation.

    The OS writes to the journal what it's going to do.
    It calls fsync to make sure the journal is on the disk.
    The disk says "oh yeah, it's there."
    The file system then writes the actual data, and fsyncs that. The disk, again, tells it that it wrote the data out. The file system then marks that part of the journal as having been written.

    However, the disk hasn't actually written out its cache yet. It lied to the OS / file system and said it had, but it hasn't, it's busy doing something else. Poof, the power goes out.

    Now, the journal doesn't have our data, we've already cleared it out, and the file system, which is supposed to have been coherent because we fsynced it, is not, and it is now corrupted.

    I have reproduced this behavior a dozen or more times on IDE-based systems. The only way to stop it is to tell the drive to stop using its write cache.

    Now, SCSI drives don't lie about fsync. At least none of the ones I've tested have done that. So, when the file system asks the disk to fsync and the disk reports back that it has fsynced, the data really is on the drive. And we can now roll forward the journal pointer and proceed with the knowledge that our data is coherent on the drive.

    You can prove this to yourself. Set up a server on IDE and another on SCSI. Install PostgreSQL, running on a journaled file system like ext3, and then run the following commands on each:

    pgbench -i -s 10
    pgbench -c 100 -t 100000

    Now, in the middle of the test on both machines, pull the plug.

    When you restart the machines, the SCSI one will come right up, database coherent and no problem. The IDE one will fail, at least every so often, or as in my experience, nearly every time.
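
The timeline above can be sketched as a toy Python model. This is not real disk I/O: the `Disk` class, its methods, and the block names are all hypothetical, and the order in which the lying drive happens to write back its cache is an assumption chosen to show the failure.

```python
# Toy model of the timeline above; no real disk I/O happens here.
class Disk:
    def __init__(self, honest_flush):
        self.honest_flush = honest_flush
        self.cache = {}     # volatile write cache
        self.platter = {}   # what survives a power cut

    def write(self, key, value):
        self.cache[key] = value

    def flush(self):
        """Acknowledge a flush; only an honest disk really empties its cache."""
        if self.honest_flush:
            self._persist(list(self.cache))
        return True         # a lying disk says "done" regardless

    def lazy_writeback(self, keys):
        """Firmware persists whichever cached blocks it likes, in any order."""
        self._persist([k for k in keys if k in self.cache])

    def _persist(self, keys):
        for k in keys:
            self.platter[k] = self.cache.pop(k)

    def power_cut(self):
        self.cache.clear()  # unwritten cache contents vanish


def crash_after_commit(honest_flush):
    disk = Disk(honest_flush)
    disk.platter.update({"journal": None, "block7": "old"})
    disk.write("journal", ("block7", "new"))  # 1. log the intent
    disk.flush()
    disk.write("block7", "new")               # 2. write the actual data
    disk.flush()
    disk.write("journal", None)               # 3. retire the journal entry
    disk.flush()
    disk.lazy_writeback(["journal"])          # assumed: journal sector goes first
    disk.power_cut()
    return disk.platter


print("honest:", crash_after_commit(True))
print("lying: ", crash_after_commit(False))
```

In the honest case the platter ends up with the new data. In the lying case the journal is empty and `block7` still holds the old contents, so there is nothing left to replay even though every flush was acknowledged.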
  • by sakyamuni ( 528502 ) on Tuesday May 11, 2004 @04:43PM (#9120381)
    They really should have benchmarked V4 not just V3

    With all due respect, as your website states:

    Reiser4 is in final testing, and will ship soon!
    and (on the download page):
    do not use reiser4 for production system, do not keep any important data on reiser4. It is experimental yet.

    I would say benchmarks need to be performed with released products. It doesn't help most users if Vendor X claims his vaporware beats all competitors. Now, I know this isn't the case with ReiserFS here -- it isn't vaporware -- but it isn't production-ready either according to Namesys. You're just unfortunate in that this benchmark was performed just before the release of your next version which would have performed better.

    On the other hand, any benchmarks published on the Web ought to be updated whenever a new version of a tested product is released -- add the results of the new version and keep the old one as well, for comparison purposes.
