Real-World Benchmarks of Ext4

Ashmash writes "Phoronix has put out a fresh series of benchmarks that show the real-world performance of the Ext4 file-system. They ran 19 tests on Fedora 10, swapping out the primary partition to test Ext3, Ext4, XFS, and ReiserFS. The Linux 2.6.27 kernel was used with the latest file-system support. In disk benchmarks like Bonnie++ Ext4 was a clear winner, but in the real-world tests the results were much tighter and XFS also took many wins. They conclude, though, that Ext4 is a nice upgrade over Ext3 due to its new features and not just improved performance in a few areas, but its lifespan may be short with btrfs coming soon."

  • ext2? (Score:5, Funny)

    by jon3k ( 691256 ) on Wednesday December 03, 2008 @11:42AM (#25975671)
    What, no ext2 comparison? seems like a pretty glaring omission.
    • Re: (Score:3, Funny)

      by Neotrantor ( 597070 )

      hey, why not a fat16 bench as well?

      • Re:ext2? (Score:5, Informative)

        by Hal_Porter ( 817932 ) on Wednesday December 03, 2008 @01:07PM (#25976863)

        You joke but fat with big clusters is pretty efficient for media applications. It's easy to get a good cache hit rate on the FAT cache.

        E.g., consider a FAT32 filesystem reading from a contiguous file. You have 512-byte sectors and 32K clusters. You have a one-sector buffer for FAT data in the filesystem and a cluster-sized buffer to read ahead data in the application. Each read of the FAT tells you where 128 clusters are. So you read a sector of FAT and then you can read 128 data clusters (4MB) before you need to do any metadata access. That's a very low overhead. There are no inodes to be updated, no atime and no bitmap, just the FAT, data clusters and directory entries. You need to update the time in the directory entry only once when you close the file if it has been written.
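
        The arithmetic above can be checked with a few lines of Python. This is only a sketch: the constant and function names are made up for illustration, and the 4-byte entry size is the standard FAT32 entry width rather than something stated in the comment.

```python
# Geometry taken from the comment: 512-byte sectors, 32 KiB clusters.
# FAT32 stores one 4-byte entry per cluster (assumed here).
SECTOR_BYTES = 512
CLUSTER_BYTES = 32 * 1024
FAT32_ENTRY_BYTES = 4


def data_bytes_per_fat_sector(sector_bytes=SECTOR_BYTES,
                              cluster_bytes=CLUSTER_BYTES,
                              entry_bytes=FAT32_ENTRY_BYTES):
    """Return (FAT entries per sector, bytes of file data those entries map)."""
    entries = sector_bytes // entry_bytes
    return entries, entries * cluster_bytes


entries, mapped = data_bytes_per_fat_sector()
print(f"{entries} entries per FAT sector, mapping {mapped // (1024 * 1024)} MiB of data")
```

        So for a contiguous file, one 512-byte metadata read covers 4 MiB of data reads.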

        A small amount of code can get read speeds that are close to the raw read speed of the device for a contiguous file because you spend a small amount of time on bookkeeping.

        Of course directories aren't indexed but lots of applications don't have vast numbers of files in a directory. In any case reading ahead as far as possible and doing a linear search is probably quicker than doing a B-tree access which involves moving the hard disk head around. Even fragmentation doesn't introduce much overhead, you'd just have to reload your FAT buffer from a different place in the FAT each time you cross into a new fragment. Traditionally inode based filesystems are worse because you have multiple levels of indirection to find the data blocks. Ext4 uses extents instead but a FAT implementation could keep track of extents in RAM so reading a big contiguous file would require FAT reads only until the filesystem works out that the file is one big extent.
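
        A rough sketch of the "keep track of extents in RAM" idea: walk the cluster chain once and collapse runs of consecutive clusters into (first cluster, run length) pairs. The function name and the sample chains are hypothetical, and a real driver would build this incrementally while following the FAT rather than from a ready-made list.

```python
def chain_to_extents(chain):
    """Collapse a list of cluster numbers into (first_cluster, run_length) extents."""
    extents = []
    for cluster in chain:
        if extents and cluster == extents[-1][0] + extents[-1][1]:
            # Cluster directly follows the current run: extend it.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Gap in the chain: a new fragment begins here.
            extents.append((cluster, 1))
    return extents


contiguous = list(range(100, 228))            # 128 consecutive clusters
fragmented = [100, 101, 102, 500, 501, 9, 10, 11, 12]

print(chain_to_extents(contiguous))
print(chain_to_extents(fragmented))
```

        A contiguous file collapses to a single extent, after which no further FAT reads are needed to locate its data.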

        If you have large directories it slows down a bit but you can always cache the mapping from files to clusters.

        Most people running inode based filesystems with atime updates on, the default, probably have a filesystem which is less efficient than FAT. On Windows, for example, NTFS is slower than FAT in the default config with atime updates on. Kind of embarrassing given that NTFS is a much more high-tech filesystem, with B-trees for directories and extent lists in the inodes to track data blocks.

        Of course NTFS has a logfile but that only protects metadata. FAT has a bit in the first FAT entry to mark the volume dirty, the idea is that you set it before you write and clear it when you're done. If you mount a filesystem with the bit set you need to do a chkdsk. Chkdsking means reading all the directories and making sure that the allocations match the FAT. It's slower than a logfile based chkdsk but it will fix corrupt metadata, mostly by freeing lost clusters. You wouldn't want to boot your OS from it, but it's good for playing MP3s or AVIs from. It's also incredibly widely supported - pretty much any PC OS, games console or media player will handle it. And it works for drives of up to 2TB. Really the only problem is 4GB max file size.
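
        A hedged sketch of the dirty-flag check. The mask values and the choice of FAT[1] are quoted from memory of Microsoft's FAT32 specification and should be verified before relying on them. Note the polarity assumed here: a set bit means "clean", so a driver clears it before writing and sets it again afterwards, which is the reverse of how the comment words it. The function names are made up for illustration.

```python
# Assumed FAT32 layout (from memory of the Microsoft FAT spec): flags live in
# the high bits of the second FAT entry, FAT[1].
FAT32_CLEAN_SHUTDOWN_MASK = 0x08000000  # 1 = volume was cleanly unmounted
FAT32_NO_HARD_ERROR_MASK = 0x04000000   # 1 = no disk I/O errors were recorded


def needs_chkdsk(fat1_entry):
    """True if the volume should be checked before it is trusted."""
    clean = bool(fat1_entry & FAT32_CLEAN_SHUTDOWN_MASK)
    no_errors = bool(fat1_entry & FAT32_NO_HARD_ERROR_MASK)
    return not (clean and no_errors)


def mark_dirty(fat1_entry):
    """Clear the clean-shutdown bit, as a driver would before its first write."""
    return fat1_entry & ~FAT32_CLEAN_SHUTDOWN_MASK & 0xFFFFFFFF


def mark_clean(fat1_entry):
    """Set the clean-shutdown bit again once all writes have been flushed."""
    return fat1_entry | FAT32_CLEAN_SHUTDOWN_MASK


entry = 0x0FFFFFFF                 # freshly formatted: both flags set
in_use = mark_dirty(entry)
print(needs_chkdsk(entry), needs_chkdsk(in_use), needs_chkdsk(mark_clean(in_use)))
```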

        • Re: (Score:2, Interesting)

          by Microlith ( 54737 )

          And it works for drives of up to 2TB. Really the only problem is 4GB max file size.

          I had forgotten about those two particular limitations of FAT32 and it struck me when you mentioned them that we're about to run across the first. I had personally hit the 4GB limit a while back but at the rate things are going, before 2009 is out we'll see disks released that a single FAT32 partition doesn't support.

          To think that when it was released with Windows95, people saw the 2TB limit and committed the first sin of com

          • Re:ext2? (Score:4, Interesting)

            by Hal_Porter ( 817932 ) on Wednesday December 03, 2008 @02:55PM (#25978297)

            You could extend FAT32 to handle bigger drives. The 2TB limit comes from the fact that the volume size is limited to 2^32 sectors. With a hard disk that's 2TB.

            One possibility would be to add a 32 bit volume size in clusters field to the FSInfo sector which already contains the first free cluster and the free cluster count. FAT32 has the upper 4 bits in a FAT entry marked as reserved which limits you to 2^28 clusters. You bump the filesystem version in the bootsector and use all of the bits in a FAT entry. With those two changes you could have 2^32 clusters which is 256TB with 64K clusters.
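
            The limits quoted above are plain powers of two and can be reproduced as follows. This is a sketch with made-up names, and it assumes 512-byte sectors, which the comment implies but does not state outright.

```python
TIB = 2 ** 40  # one tebibyte in bytes


def volume_limit_bytes(count_bits, unit_bytes):
    """Largest volume addressable with a count_bits-wide count of fixed-size units."""
    return (2 ** count_bits) * unit_bytes


# Current FAT32: 32-bit sector count, 512-byte sectors (assumed).
sector_limit = volume_limit_bytes(32, 512)
# Current FAT32: 28 usable bits per FAT entry, 64 KiB clusters.
cluster_limit_28 = volume_limit_bytes(28, 64 * 1024)
# Hypothetical extension from the comment: all 32 bits of a FAT entry usable.
cluster_limit_32 = volume_limit_bytes(32, 64 * 1024)

for label, limit in [("32-bit sector count", sector_limit),
                     ("28-bit cluster numbers", cluster_limit_28),
                     ("32-bit cluster numbers", cluster_limit_32)]:
    print(f"{label}: {limit // TIB} TiB")
```

            With today's format the sector count is the binding limit; once that field is widened, the cluster number width takes over.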

            It's the same with the 4GB limit. You could use one of the spare bytes in the directory entry to have more bits of filesize. Some DOSes do this and call it FAT+.

            Mind you, the reason people use FAT32 is because it is supported by everything. If you did either of these things you'd end up with a file system which wasn't supported by anyone. Old implementations would either corrupt volumes which were more than 2TB or had files bigger than 4GB or fail to mount them.

            Now Microsoft have something called exFAT, a completely new filesystem which is incompatible with FAT32 and patented so it's not really in their interests to keep adding features to FAT32 which is now more or less open. At least I don't think many people paid them royalties and they haven't sued anyone to get them.

        • Re: (Score:3, Interesting)

          by Anonymous Coward
          WTF man I can't believe I assembled and executed your signature ... damnit
          • Re: (Score:3, Informative)

            Can't you tell by reading it that it's a fork bomb? I mean, any idiot should be able to figure that out! :-P

            global _start
            mov eax, 2 ; sys_fork
            int 80h ; execute call
            jmp _start ; unconditional jump to _start

            Get it???

            (Disclaimer: I didn't really expect you to know what it did just by looking at it. I didn't. I had to look up the system call on Google. Just don't go executing code if you don't know what it does.)

    • Re: (Score:3, Insightful)

      by Hyppy ( 74366 )
      Modded funny, but ext2 still has its uses. Journalling can be a bit of a performance breaker in the right (wrong) circumstances.
      • Re: (Score:3, Insightful)

        by jgtg32a ( 1173373 )
        Yup, I use it for the /boot partition
      • In fact, journaling is a performance breaker in pretty much every i/o intensive scenario, such as database servers.
        Ext2 is still the preferred choice on servers here.

        • Re: (Score:3, Interesting)

          by Atti K. ( 1169503 )

          In fact, journaling is a performance breaker in pretty much every i/o intensive scenario, such as database servers. Ext2 is still the preferred choice on servers here.

          Ext2 itself is kind of a performance breaker :)

          Don't get me wrong, I like ext2/3, I use only ext3 on all my machines and other machines I install, it's the only Linux fs I really trust. (back then when Suse defaulted to install reiserfs, I always changed that :) But we have to admit that it's not the best-performing fs on Linux.

  • Begin discussion revolving around what you think btrfs sounds like... again
    butter file system
    butt file system
  • by Viol8 ( 599362 ) on Wednesday December 03, 2008 @11:45AM (#25975703) Homepage

    Admins tend to stick with what they know and ext4 is a natural progression from ext3. btrfs however hasn't even reached version 1.0 yet - and to be honest who is going to want to use a 1.0 release anyway on something as fundamental as a filesystem? Also its development is being done by an Oracle team, albeit FOSS, which may put a few people off.

    My prediction, for what it's worth, is that ext4 will be around for a LONG time.

    • by IceCreamGuy ( 904648 ) on Wednesday December 03, 2008 @11:49AM (#25975755) Homepage

      My prediction, for what it's worth, is that ext4 will be around for a LONG time.

      Like... how long? Longer than it takes to fsck an 80GB ext2 filesystem? Because that's a pretty long time.

    • by statusbar ( 314703 ) on Wednesday December 03, 2008 @11:53AM (#25975809) Homepage Journal

      I wonder if ext4 performs better with SSDs? Or if ext4 doesn't need an occasional fsck like ext3 does?


    • Re: (Score:3, Informative)

      by LWATCDR ( 28044 )

      What I don't understand is why more people are not using JFS?
      It has good performance, it is stable, and it supports very large file systems.

      Seems like a good choice for many people.

      • by Azar ( 56604 ) on Wednesday December 03, 2008 @12:37PM (#25976435) Homepage

        I was wondering why it was omitted from this article as well. I believe that it's because JFS just doesn't seem to have the mindshare of ext3, XFS, or even ReiserFS. After reading various filesystem comparisons, I chose it as my FS and I have been using it for over a year without a single issue. I don't have to worry about long fsck times at reboot, my CPU has less load when deleting or copying a large volume of files (virtual machines or CD/DVD isos usually), never had any file loss or corruption, and it seems to complete large file operations quicker than ext3 (what I used previously). There are some things that ext3 does better than JFS, but overall I prefer the advantages of JFS over the advantages of ext3.

        I know that if I were to rebuild an older computer that had lower specs, or perhaps a set-top box like MythTV that I wanted to be more power efficient, then JFS is going to be my choice for the filesystem.

        • How does JFS handle crashes (on desktop hardware)? I gave XFS a shot, only to find out it doesn't handle crashes well at all. Since then I'm on reiserfs, but with support lacking my next install will probably be using JFS

          • by MrNemesis ( 587188 ) on Wednesday December 03, 2008 @03:14PM (#25978531) Homepage Journal

            I've used JFS extensively for several years now, including power outages when the discs were under load - and I've never had anything fail to correctly fsck when the power came back up. fsck is also far, far faster than ext3 or XFS and CPU loading is *way* lower than XFS, whilst maintaining comparable throughput. I don't like XFS for the simple reason that if you don't have bulletproof power you will be restoring from backup. And when you've been in the job for as little as a few years you soon come to realise that *nothing* is bulletproof. Even triple-UPSed, six diesel generators, four separate power lines and a five nines SLA with your data centre are no protection against a hung-over techie pulling the power cables from the wrong server.

            JFS also gets ACL access "for free"; if you use it like I use it, as the backend for a Samba server (~300 users) with a complicated ACL structure, JFS is much faster. Last time I fiddled with ACLs under ext3, I found they introduced a small, but perceptible, increase in filesystem latency - don't know if this has been fixed, but using ACLs under JFS incurs no penalties, at least as far as my testing goes [citation needed]

            IMHO JFS is one of the hidden gems of Linux - fast under varying workloads, robust, frags up nowhere near as badly as ext2/3 when getting full, incredibly simple... the only downsides I'm aware of are that you can't shrink it and the fact that no-one seems to "support" it.

  • by INeededALogin ( 771371 ) on Wednesday December 03, 2008 @11:49AM (#25975745) Journal
    and in the darkness... bind them.

    Seriously... one of the nice things about Windows, OSX, Solaris is that they get a new filesystem once every 5-10 years. The safest thing to do for Linux is to be a generation behind. I would not run ext4 until btrfs came out. Why be the admin that gets screwed with early bugs and future incompatibilities...
    • by fracai ( 796392 ) on Wednesday December 03, 2008 @11:57AM (#25975863)

      See, I already thought of that, so I run no fewer than 5 generations behind.

      I started at 1, but realized that soon this practice would become widespread and then I'd be back to being an early adopter. So I moved to 2 generations. But then a friend agreed with my plan and I saw that in not too very long I'd be an early adopter again with my 2 gen old system. Not this time! I skipped the 3 and 4 generation delay and went right to the 5 generation wait time. I figured it was the only way to be sure I wouldn't get hit by any bugs.

      Shoot, now the secret's out. Time to roll back my filesystem again.

    • by The MAZZTer ( 911996 ) on Wednesday December 03, 2008 @11:57AM (#25975865) Homepage

      Windows hasn't had a new filesystem that recently! NTFS was introduced in 1993. Windows has been using it for the past 15 years (if you only count the NT line, the 9x/ME line never even supported it IIRC, they just used the even older FAT32).

      Of course if you want to count the small extensions added on with each Windows version then your claim about Windows is correct. Still I wouldn't be surprised if Windows Vista filesystems mount inside NT 3.1. I should test this with a VM...

      • by El Lobo ( 994537 ) on Wednesday December 03, 2008 @12:07PM (#25976001)
        Sure, the NTFS used by NT4 and the one Vista uses share the same name (and are *somehow* compatible: Vista understands the old NTFS and NT4 can use the new one in a limited way), but A LOT has changed under the hood. The way security descriptors work, for instance, is completely different in new versions. Volume Shadow Copy, Hierarchical Storage Management, Junction Points, and other "extensions" are a HUGE step forward, and made the new NTFS in reality a new version, with the same old name.
        • by PitaBred ( 632671 ) <`gro.sndnyd.derbatip' `ta' `todhsals'> on Wednesday December 03, 2008 @01:00PM (#25976763) Homepage

          Junction points have been around since at least Win2K, possibly NT4. I know I used them on 2K personally, can't speak to NT4 though. And Volume Shadow Copy requires application cooperation, so it's not really that much of an improvement over standard mirroring unless you use copy-on-write, which is still not filesystem level. You need to have the apps aware of it.

          • by macraig ( 621737 )

            I'm using junctions in this Windows 2000 system right now this very moment, thanks to a third-party shell extension or two I found that makes using them practical. One of the things I use them for is to shift some of the "default" Windows file structure locations somewhere else, without having to tweak all that stuff in the Registry. Because junctions operate under the OS radar, Windows is none the wiser that C:\Documents and Settings\username\My Documents is really just a hard link to a directory in anot

            • I'm using junctions in this Windows 2000 system right now this very moment, thanks to a third-party shell extension or two I found that makes using them practical.

              To the extent they work and to the extent they aren't a half-assed implementation of what non-Windows users take for granted, junctions created using different methods (Windows tools, Sysinternals, etc.) all behave differently, so expect to be bitten soon enough once you step outside your Explorer window.

        • Volume Shadow Copy: sounds like LVM2 snapshots under Linux, and VMS had real versioning file restoration a long time ago.

          Hierarchical Storage Management: HSM as implemented by IBM and then ported to Linux

          Junction Points: sounds like hard links under Linux

        • They do actually implement the version number, but don't generally publish it, as numbers confuse people. So it's about the same (ish) timeline as ext2->3->4
      • by TheRaven64 ( 641858 ) on Wednesday December 03, 2008 @12:22PM (#25976193) Journal

        NTFS is the result of the requirements team taking too long to decide what a filesystem should do. The filesystem team on the original NT project couldn't wait for them to decide, so they produced something very simple that just stored mappings from names to two kinds of data, big and small values. Big values are stored as one or more disk blocks, small values are stored in the Master File Table (MFT).

        On top of this is a metadata layer, where certain types of name are interpreted in different ways. This was used to add ACLs, encryption, compression, copy-on-write semantics, and so on.

        The low-level format for a Vista NTFS partition is the same as for NT3.51, but each version of NT has added a few more types of metadata and interpretations of them, meaning that writing to a disk with an older version is likely to cause data corruption.

    • Re: (Score:3, Insightful)

      Seriously... one of the nice things about Windows, OSX, Solaris is that they get a new filesystem once every 5-10 years. The safest thing to do for Linux is to be a generation behind. I would not run ext4 until btrfs came out. Why be the admin that gets screwed with early bugs and future incompatibilities...

      I love your sense of panic.

      Anyone taking the use of ext4 seriously will set up test systems where the ONLY difference is the file system. And, then, beat the crap out of it and see how it performs.

      It's really pretty simple to validate this type of thing.

      • by Vancorps ( 746090 ) on Wednesday December 03, 2008 @01:25PM (#25977121)

        You think validating the integrity of a filesystem is easy??!?!?

        That's insane. First of all, you won't know how it performs unless you give it real world usage complete with disk failures. There are hundreds of file systems which can store data but how they handle problems is what separates most of them. Of course there are other distinctions but the failure mode scenarios are what most interest an admin, as failure is never a question of if, only a question of when. Simulating certain failure modes is exceedingly difficult to do.

  • by Ritz_Just_Ritz ( 883997 ) on Wednesday December 03, 2008 @11:51AM (#25975773)

    It really depends on what the larger distros choose to stick with as their default. To be honest, I'd still be using ext2 if Redhat hadn't made ext3 the default. While I'm sure that some applications depend on wringing that last few % of performance out of the spindles, it just doesn't matter THAT much for most applications.

    • by INeededALogin ( 771371 ) on Wednesday December 03, 2008 @12:02PM (#25975927) Journal
      it just doesn't matter THAT much for most applications

      well... run an fsck against ext2 and ext3 and tell me it doesn't matter. For an admin, speed, reliability, recoverability... are all major concerns. On Solaris, I love ZFS because of the functionality like snapshots and exports. I also got burned by the IDE/100% CPU driver bug on Sparc hardware. Admins need to be aware of what they are running and what limitations exist. I honestly don't give a damn about mp3 encoding speed, but the capabilities and maturity of a filesystem have to be considered.
      • Well, I'm going to assume the ext2/3 drivers are not buggy and then ask, why not just STOP fscking it if it bothers you that much?

        Back when I ran Linux you had automatic fscks at regular (reboot) intervals because no one actually trusted the code, but that was 12-13 years ago. I cannot possibly believe that by now it's not stable and safe enough to avoid checking it at a regular interval after clean shutdowns. And even then, it's not like it's going to happen that often on a server unless you reboot server nig

    • Re: (Score:3, Insightful)

      by Chris Burke ( 6130 )

      To be honest, I'd still be using ext2 if Redhat hadn't made ext3 the default.

      Well thank goodness RedHat saved you from yourself, then!

      It's not about performance, it's about journaling. Ext3 has it, ext2 doesn't, ergo by modern standards ext2 is crap. The only justification for using it was when the only journaling file systems for linux were unstable.

  • ReiserFS (Score:5, Funny)

    by thomasj ( 36355 ) on Wednesday December 03, 2008 @12:03PM (#25975941) Homepage
    ReiserFS used to be the killer FS, but now it seems like it is stuck. But I shall not be the judge of that, though there seems to be some truth buried in it somehow. And not to mention, the next release is probably more than a few years down the road.
  • by Nick Ives ( 317 ) on Wednesday December 03, 2008 @12:10PM (#25976027)

    I see no analysis as to why the filesystems perform the way they do. Why does XFS perform so well on a 4GB sequential read and so badly on an 8GB read? Why did they include CPU/gfx bound frame/sec benchmarks? In the few application benchmarks where there was more than a tiny fraction of a percent difference there's no discussion as to whether that difference is actually significant.

    Not at all enlightening.

  • by afidel ( 530433 ) on Wednesday December 03, 2008 @12:11PM (#25976045)
    WTF who measures things like MP3 compression time when testing a filesystem?!? As far as I can tell they only ran one real I/O test and that was the Intel IOMeter based fileserver test which showed EXT4 is really fast for that profile. I would have loved to have seen the DB profile run. Their other artificial tests could have been summed up by running the streaming media profile since they were just large contiguous reads and writes.
    • by ChrisA90278 ( 905188 ) on Wednesday December 03, 2008 @01:00PM (#25976761)

      who measures things like MP3 compression time when testing a filesystem?!?

      They were measuring what matters: "system performance". It could very well have been the case that the fastest file system, measured on a simple benchmark, gives the worst MP3 compression time. Let's say the reason the filesystem is fast is that it uses a huge amount of CPU time and RAM. So a RAM-based encrypted file system might be very fast, until you run an application on it.

      It's a reasonable test, and what it showed is that in the real world performance is about the same.

      • Re: (Score:3, Informative)

        by tknd ( 979052 )

        What about office apps? Design tools (like engineering design software)? Compilers, linkers, and interpreters? Package management (since this is linux)? File searches with the "find" command? Recursive directory copies?

        The benchmark suite chosen is not representative of "real world" usage. In the real world there are a variety of apps that work on a variety of data. Not just CPU bound applications.

    • Agreed, there's not really much point to many of the benches.

      True, all other things being equal if you have two identically performing filesystems but one ramps the CPU higher than the other for a given throughput, you'd expect the more conservative filesystem to finish the CPU-bound benchmark the quickest.

      However, this is no reason to make approx. half of your tests mostly CPU/memory/graphics bound where I/O performs a minimal role. Where are the rest of the IOMeter profiles? Where's the time to do 10,000

    • Re: (Score:3, Insightful)

      by syousef ( 465911 )

      WTF who measures things like MP3 compression time when testing a filesystem?!?

      Anyone interested in real world usage. Resource usage of the file system drivers while doing something processor intensive such as encoding is certainly of interest.

  • XFS (Score:3, Insightful)

    by conspirator57 ( 1123519 ) on Wednesday December 03, 2008 @12:16PM (#25976139)

    Given that XFS, a 15+ year old file system, is still a serious contender, one would think enough blood has been squeezed from this stone. What is left to us is application tuning and hardware improvements, possibly including filesystem management hardware. It seems to me that teaching application developers how to write their programs to best utilize the filesystem is more likely to yield better performance gains for the effort expended than trying to make a general purpose filesystem good at any flavor of IO that application developers naively throw at it. Simple rules: buffer your IO yourself, perform raw accesses in multiples of the sector/stripe size.
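
    A minimal sketch of the "buffer it yourself, write in block multiples" rule. The class name and the 4096-byte block size are assumptions for illustration; a real application would pick the size from its device or stripe geometry, and true raw or direct I/O also needs aligned memory buffers, which this sketch does not attempt.

```python
import io


class BlockAlignedWriter:
    """Collect small writes and pass them on in whole multiples of block_size."""

    def __init__(self, raw, block_size=4096):
        self.raw = raw
        self.block_size = block_size
        self.buffer = bytearray()
        self.write_sizes = []          # size of every write issued downstream

    def write(self, data):
        self.buffer += data
        full = len(self.buffer) // self.block_size * self.block_size
        if full:
            self._emit(bytes(self.buffer[:full]))
            del self.buffer[:full]

    def close(self):
        # Only the final tail may be shorter than one block.
        if self.buffer:
            self._emit(bytes(self.buffer))
            self.buffer.clear()

    def _emit(self, chunk):
        self.raw.write(chunk)
        self.write_sizes.append(len(chunk))


sink = io.BytesIO()                    # stands in for a file or device
writer = BlockAlignedWriter(sink, block_size=4096)
for _ in range(1000):
    writer.write(b"x" * 100)           # 1000 tiny application writes
writer.close()
print(len(writer.write_sizes), "downstream writes for 1000 application writes")
```

    The application still issues a thousand 100-byte writes, but everything below the buffer sees only block-sized requests plus one short tail.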

    • Well I think these tests play to XFS' advantages. XFS does well with larger files and when XFS was new there were not many applications for large files except maybe in Big Iron. These days larger and larger files like movies are more the norm.
    • Re:XFS (Score:4, Interesting)

      by sohp ( 22984 ) <snewton&io,com> on Wednesday December 03, 2008 @12:33PM (#25976385) Homepage

      Blood to squeeze? How about a new stone? Solid-state drives.

      SSDs. Yep, they will completely change the rules for filesystems. Decades of tricks and tweaking to deal with rotational latency and head movement have virtually zero application in SSDs. All the code for that will become worse than useless. It will have to be removed or at least turned off. Leaving it on will actually result in worse performance on SSDs.

      • Re: (Score:3, Interesting)

        hrm, i don't know that SSD has gained enough widespread adoption for a mainstream filesystem to be optimized for solid state rather than mechanical rotational storage. however, you do raise an interesting point. perhaps a new filesystem can be designed from the ground up optimized for SSD. whoever gets into this area of development right now will have a huge lead on competitors when SSD storage solutions finally achieve price parity with spinning media.

  • Useless. (Score:4, Insightful)

    by Steve Baker ( 3504 ) on Wednesday December 03, 2008 @12:17PM (#25976153) Homepage

    What's with the CPU/Video tests? How about some more random access pattern tests, DB/web/streaming media tests? How about showing CPU utilization in addition to I/O performance?

  • Which distros will include this as an option at install time? That is what I want to know.

  • aah Ext4 (Score:2, Funny)

    by noppy ( 1406485 )

    Time to get a new wife

  • This has got to be one of the worst tests I have ever seen. Their 'real world' tests were made on operations that take mostly processor power. Without getting into the completely pointless game testing, they looked at encoding, compression, and file encryption. Sorry but for me this tells practically nothing of the speed of a file system.
    I would have like to see :
    creating 4gb of large files
    deleting 4gb of large files
    copying 4gb of large files
    creating 4gb of small files
    deleting 4gb of small files
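
    The wish list above could be roughed out as a small timing harness. This is only a sketch: the function names are hypothetical, the demo sizes are tiny so it runs anywhere, and a real comparison would point `root` at a directory on the filesystem under test, use gigabytes of data, and drop caches between phases.

```python
import os
import shutil
import tempfile
import time


def timed(label, action):
    """Run action() and return (label, elapsed seconds)."""
    start = time.perf_counter()
    action()
    return label, time.perf_counter() - start


def create_files(directory, count, size):
    """Write count files of size bytes each, forcing each one to disk."""
    payload = os.urandom(size)
    for index in range(count):
        with open(os.path.join(directory, f"file{index:06d}"), "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())


def delete_files(directory):
    for name in os.listdir(directory):
        os.remove(os.path.join(directory, name))


def run_suite(root, count, size):
    """Time create, copy and delete for count files of size bytes under root."""
    src = os.path.join(root, "src")
    dst = os.path.join(root, "dst")
    os.mkdir(src)
    return [
        timed("create", lambda: create_files(src, count, size)),
        timed("copy", lambda: shutil.copytree(src, dst)),
        timed("delete", lambda: delete_files(src)),
    ]


with tempfile.TemporaryDirectory() as demo_root:
    # "Small files" case at toy scale; raise size and lower count for large files.
    for label, seconds in run_suite(demo_root, count=50, size=4096):
        print(f"{label}: {seconds:.4f}s")
```
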
  • by Rendus ( 2430 ) on Wednesday December 03, 2008 @12:59PM (#25976749)

    Remember, this is the site that's decided that Ubuntu 7.04 is twice as fast as any other version of Ubuntu. Take what they say with a good healthy dose of skepticism.

  • Ehm, those are still the benchmarks that filesystems are optimized for. Please try blogbench to make a filesystem really hurt, the way a file server would.

  • Their test system is a monster 8-core, dual CPU setup, with only 2 GB of RAM. (Hell, I've got 2 GB of RAM in my dinky single-core Athlon 2800+ desktop.)

    RAM is cheap and CPUs are expensive. Their system is not particularly representative since it seems to be biased in the other direction. Further, when the tests include reads and writes that are guaranteed to fill up all available RAM (sequential ops of 4 and 8 GB), the design is flawed because I/O to swap may contaminate the results.

    Maybe they can fill u
