Linux Filesystems Benchmarked
smatt-man writes "Over at Linux Gazette they ran some tests on popular Linux filesystems (ext2, ext3, jfs, reiserfs, xfs) and the results may surprise you."
'Tis a dupe (Score:5, Informative)
Dupe (Score:0, Informative)
Re:Not a clear winner (Score:5, Informative)
Jedidiah.
here's the link to the original (Score:4, Informative)
it's not slashdotted yet
The conclusion (Score:2, Informative)
Anyway, all rants aside, here's the conclusion from the tests (there were some graphs as well but I couldn't make sense of them anyway):
CONCLUSION
For those of you still reading, congrats! The conclusion is obvious from the "Total Time For All Benchmarks" test. The best journaling file system to choose based upon these results would be JFS, ReiserFS, or XFS, depending on your needs and what types of files you are dealing with. I was quite surprised how slow ext3 was overall, as many distributions use it as their default file system. Overall, one should choose a file system based upon the properties of the files they are dealing with for the best performance possible.
Re:Not a clear winner (Score:5, Informative)
Ext3 provides a safe low-pain entry into the world of journaled file systems. No need to re-partition or reformat. It offers reasonably good performance plus the benefits of journalling.
Re:Slightly OT (Score:5, Informative)
ext3 slowness (Score:5, Informative)
Re:Slightly OT (Score:3, Informative)
Use the mount option data=journal, see man mount for more information.
I do know that ReiserFS has some "features" that are unexpected and can be aggravated by power failure during a write.
But don't worry, Hans says it's designed that way, and your filesystem will be intact, even if
XFS is good, but cannot be shrunk. EXT3 can shrink, but I don't know about the others. I'm going to have to investigate JFS, which right now is the bastard stepchild, ignored by most......
Re:Slightly OT (Score:3, Informative)
Re:Slightly OT (Score:5, Informative)
No, in fact ext3 is one of the few that actually will journal data as well as metadata.
Mirror (be kind) (Score:5, Informative)
http://www.gutenpress.org/links/LG/102/piszcz.htm
Re:Your graphs are unreadable (Score:4, Informative)
Re:Your graphs are unreadable (Score:5, Informative)
>Web site accessibility (use image type supported by all major browsers)
All the "good features" of GIF are supported by PNG in all current browsers. You'd have to go back in time five years to find a browser that can't display a basic PNG. If you think otherwise, give me a link to one that matters that doesn't, and explain to me why, if it wasn't released/updated this year, using it isn't a security issue.
Since GIF doesn't support per-pixel-alpha to begin with, you lose nothing by using PNG for everything. After all, with GIF you didn't have the choice at all so there is no issue with simply "converting to PNG".
Score: PNG
>Bandwidth conservation
PNGs are always smaller where it matters (anything more complex than 1x1x1-images). In some not atypical cases a PNG can be 25% smaller than the corresponding GIF.
Score: PNG
PS. GIF-via-LZW is still encumbered in many countries.
More features, better standard, solid software, no licensing issues, smaller output == Winner: PNG
Re:Your graphs are unreadable (Score:5, Informative)
Almost Full Article Text (Score:3, Informative)
Benchmarking Filesystems
By Justin Piszcz
INTRO
I recently purchased a Western Digital 250GB/8MB/7200RPM drive and wondered which journaling file system I should use. I currently use ext2 on my other, smaller hard drives. Upon reboot or unclean shutdown, e2fsck takes a while even on drives of only 40 and 60 gigabytes, so I knew a journaling file system would be my best bet. The question is: which is the best? To determine this, I used common operations that Linux users may perform on a regular basis instead of benchmark tools such as Bonnie or IOzone; I wanted a "real life" benchmark. A quick analogy: Ethernet-over-power-lines gear may advertise 10 Mbps (1.25 MB/s), but in real-world tests peak speed is only 5 Mbps (625 KB/s). This is why I chose to run my own tests rather than use hard drive benchmarking tools.
SPECIFICATIONS
HARDWARE
COMPUTER: Dell Optiplex GX1
CPU: Pentium III 500MHZ
RAM: 768MB
SWAP: 1536MB
CONTROLLER: Promise ATA/100 TX - BIOS 14 - IN PCI SLOT #1
DRIVES USED: 1. Western Digital 250GB ATA/100 8MB CACHE 7200RPM
2. Maxtor 61.4GB ATA/66 2MB CACHE 5400RPM
DRIVE TESTED: The Western Digital 250GB.
SOFTWARE
LIBC VERSION: 2.3.2
KERNEL: linux-2.4.26
COMPILER USED: gcc-3.3.3
EXT2: e2fsprogs-1.35/sbin/mkfs.ext2
EXT3: e2fsprogs-1.35/sbin/mkfs.ext3
JFS: jfsutils-1.1.5/sbin/mkfs.jfs
REISERFS: reiserfsprogs-3.6.14/sbin/mkreiserfs
XFS: xfsprogs-2.5.6/sbin/mkfs.xfs
TESTS PERFORMED
001. Create 10,000 files with touch in a directory.
002. Run 'find' on that directory.
003. Remove the directory.
004. Create 10,000 directories with mkdir in a directory.
005. Run 'find' on that directory.
006. Remove the directory containing the 10,000 directories.
007. Copy kernel tarball from other disk to test disk.
008. Copy kernel tarball from test disk to other disk.
009. Untar kernel tarball on the same disk.
010. Tar kernel tarball on the same disk.
011. Remove kernel source tree.
012. Copy kernel tarball 10 times.
013. Create 1GB file from
014. Copy the 1GB file on the same disk.
015. Split a 10MB file into 1000 byte pieces.
016. Split a 10MB file into 1024 byte pieces.
017. Split a 10MB file into 2048 byte pieces.
018. Split a 10MB file into 4096 byte pieces.
019. Split a 10MB file into 8192 byte pieces.
020. Copy kernel source tree on the same disk.
021. Cat a 1GB file to
NOTE1: Between each test run, a 'sync' and 10 second sleep were performed.
NOTE2: Each file system was tested on a cleanly made file system.
NOTE3: All file systems were created using default options.
NOTE4: All tests were performed with the cron daemon killed and with 1 user logged in.
NOTE5: All tests were run 3 times and the average was taken; if any tests were questionable, they were re-run and checked against the previous average for consistency.
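The first few operations above can be sketched as a small shell harness. This is a reconstruction, not the author's actual script: the file count is scaled down from 10,000 and the sleep from 10 seconds so the sketch runs quickly.

```shell
# Reconstruction of the benchmark loop, not the author's actual script.
# N and the sleep are scaled down from the article's 10,000 files / 10 s.
N=100
DIR=bench_dir

run_test() {
    sync               # NOTE1: sync between runs
    sleep 1            # the article slept 10 seconds
    start=$(date +%s)
    "$@" > /dev/null
    end=$(date +%s)
    echo "$1 took $((end - start))s"
}

make_files() {
    # test 001: create many files with touch in a directory
    mkdir -p "$DIR"
    i=0
    while [ "$i" -lt "$N" ]; do
        touch "$DIR/file$i"
        i=$((i + 1))
    done
}

run_test make_files               # test 001
run_test find "$DIR" -type f      # test 002
run_test rm -rf "$DIR"            # test 003
```

Running each test 3 times and averaging, as NOTE5 describes, would just be a loop around `run_test`.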
CREATING THE FILESYSTEMS
(snipped: too many junk characters)
BENCHMARK SET 1 OF 4
In the first test, ReiserFS takes the lead, possibly due to its balanced B-Trees. (If the images are hard to read on your screen, here's a tarball containing larger versions of them.)
All of the file systems fared fairly well when finding 10,000 files in a single directory, the only exception being XFS, which took twice as long.
Both ext2 and ext3 seem to reap the benefit of removing large numbers of files faster than any other file system tested.
To make sure this graph was accurate, I re-benchmarked the ext2 file system and got nearly the same results. I was surprised to find how much of a performance hit both ext2 and ext3 take during this test.
Finding 10,000 files seemed to take about the same time on every file system except XFS; however, directories are definitely handled differently.
XFS, solid (Score:2, Informative)
Because of hardware/configuration issues, I had to hard-reboot the laptop countless times during the months it took for hardware support in the kernel to catch up. (It works pretty well now.)
I have never borked my filesystem.
Re:So why does RedHat/Fedora continue to push EXT3 (Score:4, Informative)
So, what I usually do when installing a new copy of Fedora or Redhat is to drop to a console, and use fdisk + mkfs.jfs manually. Then, when I get to the right page in the installer, I can simply set it to not reformat the partition but to use it as the "/" mount point, and voila -- my computer has JFS.
ext3 options (Score:5, Informative)
"data=writeback" mode does no data journaling, only metadata journaling, and you would probably see better performance here. However, you could lose data in the event of a power outage (no fun). The same applies to XFS and JFS: you could lose data because only metadata is being journaled, not the actual data.
"data=ordered" mode is in between: still no data journaling, but there are provisions that make it less likely to lose data in the case of a power problem. The way it orders the metadata journal against the data writes makes it a little slower than data=writeback, but also a little more secure than data=writeback if you get a power outage.
"data=journal" mode - this journals data and metadata, and with the exception of a few situations, is the slowest. The least likely to lose your data, but also much slower.
I am assuming, or at least it looks like, these tests were run with the default data=journal. So, to be fair, they should have been run in data=writeback, or maybe even all three modes. Again, all you have to do is specify it in /etc/fstab and reboot, no big deal.
It would probably be better to compare the ext3 in data=writeback mode.
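For reference, selecting a mode is a one-line change per filesystem in /etc/fstab; the device and mount point below are placeholders, not values from the article:

```
# /etc/fstab fragment: pick exactly one data= mode per ext3 filesystem
# (/dev/hda2 and /home are example values)
/dev/hda2  /home  ext3  defaults,data=writeback  0  2   # metadata journaling only
/dev/hda2  /home  ext3  defaults,data=ordered    0  2   # data flushed before metadata commits
/dev/hda2  /home  ext3  defaults,data=journal    0  2   # data and metadata both journaled
```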
Re:It works for mine! (Score:3, Informative)
I've personally had several friends tell me of their data loss with ReiserFS. No one's arguing that it's a horrible file system, only that it's still experimental.
The typical data loss situation is a power loss in the middle of a write. ReiserFS might be atomic in operation, but it still can't dodge hardware failure at that level.
I use ext3, and I've been happy. ReiserFS is definitely not an appropriate default partition type at this time.
-Erwos
Re:Defragmenting filesystem? (Score:3, Informative)
And even if someone does put their
Re:It works for mine! (Score:4, Informative)
My understanding is that journalled means the FS can't get into an inconsistent state and so does not need fsck-ing. It does not mean your data is safe from having the power pulled halfway through a write. If you want a super-safe home area you want some sort of logging FS, and those are all far slower than journalled ones (I think).
Re:Your graphs are unreadable (Score:2, Informative)
Re:Hrmmm (Score:3, Informative)
There is: gnuplot [gnuplot.info], an utterly wonderful little program. I use it all the time.
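As a sketch of what that looks like for benchmark results, the commands below write a data file and a matching gnuplot script; the timings and filenames are made up for illustration, not taken from the article, and the script is only written out here, not executed.

```shell
# Write illustrative benchmark timings (made-up numbers, not the article's data).
cat > times.dat <<'EOF'
# filesystem  total_seconds
ext2      410
reiserfs  420
jfs       430
xfs       450
ext3      540
EOF

# Write a minimal gnuplot script that draws them as a bar chart.
cat > plot.gp <<'EOF'
set title "Total time for all benchmarks"
set ylabel "seconds"
set style data histogram
set style fill solid
plot 'times.dat' using 2:xtic(1) notitle
EOF

# Render with: gnuplot -e "set term png; set output 'times.png'" plot.gp
echo "wrote times.dat and plot.gp"
```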
Re:The tests don't show everything (Score:3, Informative)
Re:Your graphs are unreadable (Score:5, Informative)
I did some too (Score:5, Informative)
* Ext2 is still overall the fastest but I think the margin is small enough that a journal is well worth it
* Ext3, ReiserFS, and XFS all perform similarly and almost up to ext2 except:
o XFS takes an abnormally long time to do a large rm even though it is very fast at a kernel `make clean`
o ReiserFS is significantly slower at the second make (from ccache)
* JFS is fairly slow overall
* Reiser4 is exceptionally fast at synthetic benchmarks like copying the system and untarring, but is very slow at the real-world debootstrap and kernel compiles.
* Though I didn't benchmark it, ReiserFS sometimes takes a second or two to mount and Reiser4 sometimes takes a second or two to unmount, while all other filesystems are instantaneous.
Whole thing available here [cmu.edu]
Re:Your graphs are unreadable (Score:2, Informative)
Re:Since article has been ./ed.... (Score:3, Informative)
Re:ext3 options (Score:4, Informative)
"I am assuming, or at least it looks like, these tests were run with the default data=journal - so to be fair, they should have been run in data=writeback, or maybe even all three modes. Again, all you have to is specify in /etc/fstab and reboot, no big deal."
And where do you get the idea that this is the default? According to mount(8):
What I really would have liked to see on this benchmark is ext3 on 2.6 with dir_index enabled. (Maybe this would also gain the benefit of the Orlov allocator? I haven't been paying attention to what's been backported.) ...In fact, I would have liked to see this whole thing on 2.6.
Re:Results questionable (Score:4, Informative)
2nd Opinion... (Score:2, Informative)
There is also some commentary and recommendations based on their results.
One more note about the tool: it's not well documented, but works well once configured. Note that you need a kernel that supports the filesystems to be tested (duh!), use Python 2.2, and the database schema is somewhere in the comments.
Re:Your graphs are unreadable (Score:2, Informative)
Re:Your graphs are unreadable (Score:1, Informative)
Not every graphics program can export to PNG, and the ones that do cannot always do it in a way that is smaller than their GIF export. GIF is a much older and better supported standard, which is why most people use it over PNG.
Re:Not a clear winner (Score:5, Informative)
Re:Your graphs are unreadable (Score:1, Informative)
Did you check transparency in Windows Explorer? As far as I know, it's not supported, unless you play some weird tricks. Or maybe you were making a joke, and I didn't get it.
PNG transparency works fine in IE as long as you don't try to do partial transparency. For simple on/off transparency (the same as what GIF offers), there are no problems with IE5 and up.
The article you linked to was describing a solution to getting partial transparency working.
Re:Speed means absolutely nothing (Score:3, Informative)
Reiserfs +IBM HD's = hair loss
Re:Best Filesystem for Production System (Score:3, Informative)
Please read this [oracle.com]
Just to show that it depends upon the application you need to run.
Now, you will not hear me say that you should not use ReiserFS, for a desktop it is probably the best choice, but for servers, please think again.
In addition to that, my own experiences with ReiserFS are mostly positive, especially on my 233 MHz laptop. But I also have a big system with a Promise SX6000 RAID controller, where I had partitions with ReiserFS, ext3, and JFS under Red Hat 9. Every time I did file operations using ReiserFS I got problems with the consistency of my RAID 5 array, so I scrapped ReiserFS and kept only ext3 and JFS, which give me no problems anymore.
Re:Your graphs are unreadable (Score:4, Informative)
2) Who cannot see PNGs? IE supports them, Opera supports them, Netscape/libpr0n [libpr0n.com]-based browsers all support PNGs. Hell, even Links 2 when run in X or svgalib supports PNG.
Re:So why does RedHat/Fedora continue to push EXT3 (Score:4, Informative)
You're making things unnecessarily complicated. At the "boot:" prompt, just type "linux jfs". The graphical installer will then offer it as an option. Works with reiserfs, too.
Re:Your graphs are unreadable (Score:5, Informative)
More precisely, the PNG needs to be in indexed mode (aka PNG8) for full transparency to work in IE. In The GIMP, click the "Image" menu, then "Mode", then "Indexed".
Some myths ("IE doesn't support PNG!!!") really die hard.
EXT3 and *BIG* filesystems (Score:2, Informative)
# df -k
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda6 4032092 548856 3278412 15%
/dev/sda1 505605 19171 460330 4%
/dev/sda7 41729164 31165820 8443572 79%
none 2000568 0 2000568 0%
/dev/sda3 10080520 6437952 3130500 68%
/dev/sda2 80636072 45306536 31233364 60%
/dev/sdb1 1600772128 1462009904 138762224 92%
/dev/sdc1 1600772128 760247416 840524712 48%
PS. No, it's not pr0n
Re:Not a clear winner (Score:3, Informative)
"Overall Ext3 was disappointingly slow surprisingly often."
I disagree. Besides, this test is obsolete: why did he use a 2.4 kernel?!
from: http://freshmeat.net/projects/linux/?branch_id=46339&release_id=160407 [freshmeat.net]
"Linux 2.6.6
...ext2 and ext3 filesystem performance was significantly improved.
"
[...]
And that's just from today's kernel release. What about all the changes between 2.4 and now?
Considering the convenience of backward compatibility, the fact that ext3 wasn't the worst in every category, and that it looks like it maybe uses less CPU than some, it seems like ext3 is the hands-down winner of the test, not the loser. ext3 did well in the tests that IMO represent everyday use. Who creates 10k files in a folder? I would have liked to see a Linux kernel COMPILE using each fs. That's something we all could appreciate.
Re:Defragmenting filesystem? (Score:3, Informative)
Why is it hard to understand that UNIX filesystems are designed so that fragmentation has minimal impact on performance?
These are multiuser operating systems designed to handle frequent requests from multiple users at any given time. Things *are* going to be strewn across the drive, but there is a reason there is no noticeable impact on performance.
UNIX filesystems are engineered to avoid appending old files and scattering data about in the same manner that MSDOS and Windows FAT filesystems do. These filesystems don't fill every single free "crack" on the drive the way MSDOS filesystems do. FAT filesystems are designed to write into the first available location, or "hole", often spanning several of these for a single file. This is what causes the "fragmentation" on a Microsoft filesystem. The clustering algorithms that UNIX/Linux machines use help to prevent the "fragmentation" that Windows users experience.
Bear in mind that FAT/FAT32 was based on a design that was optimized for writing small amounts of data to *floppy disks* and small-capacity drives with very limited amounts of space. Later in the life of DOS and Windows, the "fragmentation" issues became terrible, typically as drive capacities got larger and file sizes increased by a great degree. NTFS resolves many of these issues, but still carries a few of the FAT traits in spite of being a totally different filesystem (based on HPFS). Potentially, it still writes to tiny, empty blocks of free space, but its tree-based structure doesn't limit performance due to "fragmentation" like we experienced on DOS/Win9x.

However, I think that the biggest problem on Windows machines is the way drives are typically partitioned, more than anything else. Things get removed and installed frequently, to the same locations on the drive, with user-created data overlapping the locations of important system and swap files. NTFS, in most respects, doesn't actually need defrag. In fact, when I ran Windows 2000 for a few years, defrag provided almost no improvement, at least not to the same degree as it did on Windows 98. You can defrag all you like, but it's unlikely that even an NTFS partition will experience more than 3-5% total fragmentation.
I hope that is "logical enough" for you. I think that, perhaps, you need to ditch the old DOS/Windows "I MUST DEFRAG" mentality in order to really understand this. Filesystems (especially journalling types) have greatly changed since the days of DOS.
"worst chartmaking...ever" (Score:3, Informative)
Re:Best Filesystem for Production System (Score:3, Informative)
It looks like Oracle has gotten the same basic results as the PostgreSQL Global Development Group has. JFS and ext3 are generally the fastest under a database, while XFS and Reiser seem to be pretty slow.
The odd thing here is that most other tests show XFS and Reiser as the kings. But the kind of random access databases do seems to be better handled by JFS / ext3.
The problems with your RAID consistency, were these file system problems, or RAID level problems? If they're RAID level problems that would seem to point at your RAID controller, as no file system should be able to bonk a RAID array on the head, since it doesn't really have that kind of access to it.
Re:Not a clear winner (Score:4, Informative)
Reiserfs can at least be accessed [p-nand-q.com] under Windows.
My personal peeve with ReiserFS, though, is that I once had the main ReiserFS partition on my laptop completely destroyed by a simple power failure. Many files were in lost+found afterwards, but some had their contents destroyed. (And restoring files by looking at their contents is not fun for ~1000 files...) So I've kind of lost trust in it. ext3 may be slow, but I've never had a single problem with it. Seems very robust to me. So, reliability is more important than speed for me (whoever needs performant servers is of course entitled to a different opinion). XFS and JFS seemingly can't be accessed from Windows, so that is a minus for some.
Re:'Tis a dupe (Score:2, Informative)
large file support (Score:4, Informative)
Here's some real world information on the state of large file support in 2004. Filesystem driver support is the least of your worries -- almost any linux filesystem you can think of (except for maybe umsdos) supports over 2GB files at the kernel level. The Linux LFS [www.suse.de] page, dated April 2004, contains reasonably updated information on large file support in linux.
The bigger problem is that many userspace applications cannot yet read or write to the large files. This failure arises from non-use of the LFS API combined with just plain unfortunate use of a signed 32-bit int in the wrong place. So for instance mkisofs will reject all input files larger than 2GB in size, and cdrdao will simply abort at 2GB if you try to rip a DVD larger than 2GB in size. In some extreme cases there are programs that can't even handle large files off of the disk; one example is
wget http://mirror.linux.duke.edu/pub/fedora/linux/core/test/1.92/i386/iso/FC2-test3-i386-DVD.iso
which fails spectacularly on any x86 linux system (hint: the DVD is not really 84MB in size). In general, the "core" system utilities such as dd, cp, mv, cat are fully compatible with large files whereas third party applications are much more hit-or-miss.
Even today, by far the most practical solution to large file woes is to migrate to a 64-bit system, the most affordable of which is AMD64 by a long shot. I've been using an Athlon 64 system for the past few weeks and it has handled large files perfectly in all respects so far.
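A quick way to check kernel-level large-file handling on a given filesystem is to create a sparse file just past the signed-32-bit boundary and read its size back. This sketch uses only coreutils; the filename is arbitrary.

```shell
# Create a sparse file one byte past 2 GiB; almost no disk space is allocated.
dd if=/dev/zero of=bigfile bs=1 count=1 seek=2147483648 2>/dev/null

# 2147483649 bytes does not fit in a signed 32-bit int, so any tool that
# reports this size correctly is going through the large-file interfaces.
size=$(wc -c < bigfile)
echo "size: $size"

rm -f bigfile
```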
Re:The conclusion (Score:3, Informative)
Re:Not a clear winner (Score:1, Informative)
Re:Not a clear winner (Score:3, Informative)
I've been experimenting with ReiserFS on and off for the last 3 years or so. I've always shied away after a few failures.
My scorebook so far:
- Laptop
- Two various machines at previous work got open files in
- Mum's computer: had to travel 500km to fix a ReiserFS fuckup due to repeated power failures.
- Dad's laptop got a partition trashed by ReiserFS when he forgot to plug in his power cord and the battery ran out.
Reiserfs is the single most unstable piece of shit of a filesystem I've ever had to deal with. No, I'll not be using it again anytime soon.