The choice of filesystems on Linux is vast, but most people stick with their distribution’s default, which will most likely be ext3; you’re free to use ReiserFS, XFS, or something else entirely if you so desire. Things are about to change, though, with btrfs just around the corner. To bridge the gap between now and btrfs, ext3 has been updated to ext4, which adds interesting features such as extents, already in use in most other popular file systems. Phoronix decided it was time to do some performance testing on ext4.
After testing, Phoronix concludes that ext4 isn’t an enormous leap forward compared to other file systems, but that it did show a performance advantage over ext3 in pure disk benchmarks. As they state, though, that only helps if your job is to run disk benchmarks day-in, day-out (quote of the day, I must say). That is not to say ext4 isn’t worthwhile: there are more important things than raw performance when it comes to file systems.
EXT4 is clearly a significant improvement over EXT3 when it comes to the pure disk benchmarks; however, in the real world, better performance alone should not be used as a reason to replace your existing EXT3 or XFS partitions. In our tests that cater towards Linux desktop users and gamers, EXT4 didn’t deliver a sizable quantitative advantage. That’s not to say, though, that it’s not worth switching to EXT4. EXT4 is more scalable, more efficient through the use of extents, supports larger disk capacities, can handle twice the number of sub-directories, is capable of online defragmentation, and offers improved reliability via journal checksums. What perhaps is more important is that with the addition of these new features, performance hasn’t regressed. Also, when testing the EXT4 file-system, we didn’t run into any problems with stability, file corruption, or any other issues.
Ext4 has been available as an option on Fedora for a while now (append the ext4 boot parameter when booting Anaconda), but beyond that, no major distributions are shipping it yet. It will become part of the next Linux kernel release, so we may see wider adoption soon enough. Mostly, however, as Phoronix notes, adoption of ext4 will depend on how quickly btrfs becomes stable and well-tested.
I’d be really curious to see how the performance of whole-disk encryption is affected by filesystem choice. I have a 1.5 TB RAID5 (4x500GB SATA2) entirely encrypted using dm-crypt and dm-raid, with XFS on the RAID. I used to have the entire system encrypted, but the performance hit was too painful. It would be interesting to see if the filesystem choice would have much impact on that.
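Something like this quick Python sketch is what I have in mind for comparing volumes: point it at a directory on the encrypted mount and then at one on a plain partition and compare the numbers. The path is whatever mount point you want to test; it’s a rough probe, nowhere near a real benchmark like bonnie++ or fio.

#!/usr/bin/env python3
"""Rough sequential-write throughput probe for a given mount point."""
import os
import sys
import time

def write_throughput(directory, size_mb=1024, block_kb=1024):
    path = os.path.join(directory, "throughput_test.tmp")
    block = b"\0" * (block_kb * 1024)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.monotonic()
        written = 0
        while written < size_mb * 1024 * 1024:
            written += os.write(fd, block)
        os.fsync(fd)                         # make sure the data actually hit the disk
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return (written / (1024 * 1024)) / elapsed   # MB/s

if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"{write_throughput(target):.1f} MB/s sequential write on {target}")

For a read comparison you’d also have to drop the page cache first (echo 3 > /proc/sys/vm/drop_caches as root), otherwise you’re just measuring RAM.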
Phoronix is on the ball! With all their recent benchmarks I consider them one of the best places on the net to get good third party information. Keep up the good work
The disk I/O benchmarks were certainly interesting, and worth reading the article to see. But when they started throwing lzma, bzip2, lame, video games, and other processor-bound “benchmarks” at these filesystems, I went… “what”? They could have made something out of the video game benchmarks, maybe, by checking the time to start the game and load a level, rather than reporting the FPS. If the FPS is affected by the filesystem, it’s time to find a new gaming house, not a new filesystem!
Looks like ext4 closes the gap with XFS, or now beats it significantly, in all but that surprising 4GB random delete phase. Deletes used to be a real Achilles’ heel for XFS, and I know the Linux XFS devs put a lot of special effort into improving that situation. But since it is also slower than ext3 in this phase, I suspect something else is going on.
Beating XFS by 25-30% on the 8GB sequential read and 4GB sequential create phases is particularly notable, since large sequential reads and writes were major design goals for XFS.
And all that with the significant additional integrity guarantees that the default “data=ordered” mode (and, I believe, journal checksumming) provides over XFS. Impressive work by the ext4 guys, indeed.
Again no love for JFS. And yeah, really, what’s Unreal Tournament at 1280×1024 got to do with disk benchmarks?
If people wanted extents they’d be using a filesystem like JFS or XFS already.
I fail to see the need for ext4 to ‘bridge’ the gap until BTRFS arrives.
Someone had an itch to scratch, so be it.
Exactly... most of these benchmarkers are either Linux-ignorant or have an axe to grind. Interestingly, OSNews and LWN both failed to pick up my roundup of Linux file systems. I’m going to do some benchmarking of ext2/3/4, XFS, JFS, Reiser3, and Btrfs when kernel 2.6.28 goes stable.
JFS is an excellent file system. I wish it would get some developer attention, as I’m sure it could be improved quite a bit to take advantage of modern kernel features.
Would be something to read, but one aspect that annoys me about ext3 is when a user comes to me and says, “I have accidentally deleted some important files! Can you help me?”. Do you know if they changed that irritating zeroing of inode block pointers in ext4?
I have used ext3grep with so-so success, and some forensics tools from time to time. I know there are some apps that monitor and save the information needed to restore files, but I’m afraid to use them in a corporate environment and get hit by some obscure bug and/or instability issues.
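For what it’s worth, the basic ext3grep workflow can be scripted roughly like this. The flag names are from the ext3grep documentation as I remember them, so check --help on your version; /dev/sdXN is just a placeholder for the affected partition, which should be unmounted (or mounted read-only) before you touch it.

#!/usr/bin/env python3
"""Hedged sketch of driving ext3grep for undelete attempts."""
import subprocess

DEVICE = "/dev/sdXN"   # placeholder: the ext3 partition to scan, unmounted

def list_deleted_names(device=DEVICE):
    # Scans the partition and prints path names that look recoverable.
    subprocess.run(["ext3grep", device, "--dump-names"], check=True)

def restore_file(relative_path, device=DEVICE):
    # Attempts to restore one file into ./RESTORED_FILES/ under the cwd.
    subprocess.run(["ext3grep", device, "--restore-file", relative_path],
                   check=True)

if __name__ == "__main__":
    list_deleted_names()
    # restore_file("home/user/important.doc")   # hypothetical example path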
Benchmarks are always funny, but those benchmarks are not serious.
When you do benchmarks, you’ve got to put more effort into them.
Linux caches a lot in RAM and in swap. When you have 4GB of swap and 2GB of RAM, a lot of the changes you make are not committed when the test ends. How much has really been written to the disk depends on a lot of things.
They test too many things at the same time.
Linux caches a lot in swap?
With 2GB of RAM and only bonnie++ running, I should hope that swap would not be relevant! (Well, as long as the fs in use is not ZFS. 😉 ) At any rate, bonnie++ fsyncs at the appropriate times. I presume that the other disk benchmarks do so, as well. The other tests, like lzma, were not suitable as disk benchmarks to start with.
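To illustrate the page cache point, here’s a toy Python comparison (not bonnie++, just a sketch): time the same writes with and without an fsync before stopping the clock, and watch the numbers diverge.

#!/usr/bin/env python3
"""Toy demonstration of why disk benchmarks must fsync before timing."""
import os
import time

SIZE_MB = 256
BLOCK = b"\0" * (1024 * 1024)

def timed_write(path, do_fsync):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.monotonic()
    for _ in range(SIZE_MB):
        os.write(fd, BLOCK)      # without fsync, much of this just lands in the page cache
    if do_fsync:
        os.fsync(fd)             # force the data out to the disk
    elapsed = time.monotonic() - start
    os.close(fd)
    os.unlink(path)
    return elapsed

if __name__ == "__main__":
    print(f"without fsync: {timed_write('cache_test.tmp', False):.2f}s")
    print(f"with fsync:    {timed_write('cache_test.tmp', True):.2f}s")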
Umm…. You are aware the page cache != swap, right?
You can cache the contents of the swap file, but you -really- shouldn’t swap-out the contents of your cache…
– Gilboa
These filesystem benchmarks are always disappointing because they only use default options. XFS can achieve huge performance increases by increasing the log size, the inode size, and by adjusting the extent size.
Exactly the same beef I have with them. Many options can be set for the various different filesystems, each of them affecting performance. In a production server environment, the defaults are usually going to get changed for maximal performance on the desired workload.
“””
Exactly the same beef I have with them. Many options can be set for the various different filesystems, each of them affecting performance.
“””
Using the defaults for these benchmarks is reasonable and beneficial. One significant problem I’ve noticed with FOSS is a tendency to ship with really sucky defaults. Setting proper defaults is not given a high priority because FOSS users are assumed to be smart enough to modify them. (Hi PostgreSQL! Glad you’re doing better now!) If XFS ships with poor default settings, the devs deserve to be beaten over the head with benchmark losses until they notice the problem. This actually *benefits* users rather than just giving them the warm fuzzies with a reassuring published benchmark result.

Most people trust the defaults. And if data integrity is important, with good reason: the default config is the most thoroughly tested configuration. I recall a case a few years ago where actual data corruption could occur on ext3 filesystems. However, as it turned out, the bug only affected people who mounted their filesystems with “data=journal”. This turns on full data journaling, at significant expense to performance, and is thus only used by people who consider data integrity to be of paramount importance. Oops.
So, obviously, both the devs and the users must weigh the pros and cons of settings which potentially affect stability, even in nonobvious ways. Ext3/4 writes can likely be sped up with data=writeback. But I don’t actually run machines that way, since they would then only provide the same data integrity guarantees as XFS does… and because I’m a big believer in using defaults unless there is a clear need to vary from them. Mess with them and you essentially become your own beta tester.
XFS doesn’t ship with poor default settings. It ships with settings that aren’t optimal for a desktop system, because it is often used for large disk arrays. That doesn’t mean it isn’t a good desktop file system too, but distributions tend to use ext3 as the default filesystem and pretty much ignore all the others. It would be trivial to apply optimized desktop settings during install, but there aren’t many distributions that even offer XFS as an option during install.
You should post options to improve XFS performance.
“Filesystem performance tweaking with XFS on Linux”
http://everything2.com/index.pl?node_id=1479435
The article has been around for some time, but the information contained within is still quite valid. The author begins with a default XFS setup showing marginal performance and then applies various tweaks, experimenting with different values for the settings, finally resulting in drastic performance improvements.
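For the impatient, the kind of tweaks the article experiments with look roughly like this. The values are just examples to play with, /dev/sdX1 and /mnt/scratch are placeholders, mkfs.xfs will of course wipe the device, and the Python wrapper is only for illustration.

#!/usr/bin/env python3
"""Sketch of the sort of XFS tuning the article walks through.
WARNING: mkfs.xfs destroys whatever is on the target device."""
import subprocess

DEVICE = "/dev/sdX1"        # placeholder scratch partition
MOUNTPOINT = "/mnt/scratch"  # placeholder mount point

def make_tuned_xfs():
    # Larger log and larger inodes than the defaults.
    subprocess.run(["mkfs.xfs", "-f",
                    "-l", "size=128m",    # bigger journal
                    "-i", "size=512",     # bigger inodes
                    DEVICE], check=True)

def mount_tuned_xfs():
    # More/larger in-memory log buffers, and skip atime updates.
    subprocess.run(["mount",
                    "-o", "noatime,logbufs=8,logbsize=256k",
                    DEVICE, MOUNTPOINT], check=True)

if __name__ == "__main__":
    make_tuned_xfs()
    mount_tuned_xfs()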
I wonder when this will become an option for Enterprise Linux distros?
First, the hardware setup is pretty strange: they use an old single disk (160GB? are you really still using these?) and big processors (dual quad-core!) but very little RAM (2GB! for eight cores? that’s ridiculous)… That’s not a proper setup, IMHO.
Then, from the numbers, it looks like they didn’t run bonnie++ several times and average the results, but relied on a single run. However, bonnie++ notoriously gives very variable results from run to run; you have to run it with a very large file size, at least 8 times, for a reliable average.
I’d rather run benchmarks this way:
* a proper RAID array of, say, 6 or 8 drives of 500GB or bigger with a 3Ware or Areca controller;
* RAM sized to fit the CPU (1 or 2 GB per core);
* 8 runs with a file size 8x the RAM size.
Well, I’ll do it someday anyway.
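Something along these lines for the repetition part, with a pure-Python stand-in for the disk test (swap in a bonnie++ invocation if you prefer); the point is the methodology: a file much larger than RAM, at least 8 runs, and a mean plus standard deviation instead of a single number.

#!/usr/bin/env python3
"""Run a sequential-write test several times and average the results."""
import os
import statistics
import time

RUNS = 8
RAM_MB = 2048                 # adjust to the machine under test
FILE_MB = RAM_MB * 8          # file several times larger than RAM
BLOCK = b"\0" * (1024 * 1024)

def one_run(path="bench.tmp"):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.monotonic()
    for _ in range(FILE_MB):
        os.write(fd, BLOCK)
    os.fsync(fd)              # count only data that actually reached the disk
    elapsed = time.monotonic() - start
    os.close(fd)
    os.unlink(path)
    return FILE_MB / elapsed  # MB/s

if __name__ == "__main__":
    results = [one_run() for _ in range(RUNS)]
    print(f"mean: {statistics.mean(results):.1f} MB/s, "
          f"stdev: {statistics.stdev(results):.1f} MB/s over {RUNS} runs")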
Yeah. They shoulda used hardware more like most people have instead of that 160GB disk and 2GB of memory.