InternetNews talks to developers and vendors about the rise of Btrfs as a successor to Ext4. Though Ext4 adds extents, Chris Mason, the Btrfs lead developer, noted that BTRFS adds a number of other features beyond that, among them snapshotting, online file consistency checks, and the ability to perform fast incremental backups. BTRFS (pronounced “better FS”) is currently under development in an effort led by Oracle engineer Chris Mason. With the support of Intel, Red Hat, HP, and IBM, BTRFS could become the engine that brings next-generation filesystem capabilities to Linux.
Does anyone know if this new file system has the feature that ext2 had, namely the recovery of unlinked files?
Once it hits 1.0, this file system will make you breakfast if you want it to. It’s going to be amazing. With huge collaboration from vendors and distributions like Red Hat, this is going to be the future of the Linux FS. I for one welcome it; it’s about time ZFS had some competition (though they have a ways to go).
Even when it reaches v1.0, BTRFS will be far behind ZFS.
Just look at the features.
Also there are no current plans for transparent compression on BTRFS, and I consider that a must for many cases.
Not to mention that the ZFS syntax is dead simple.
Not so for BTRFS.
I run ZFS on Linux via FUSE (and have for a long time now!), and I’m very happy with it.
Oh, I never said it was going to be as good as ZFS (I love ZFS with a passion), only that it will be nice to have some competition for ZFS and NTFS (yes, NTFS is a very good file system; I don’t care that it’s an MS product, it’s still good).
Even when it reaches v1.0, BTRFS will be far behind ZFS. Just look at the features.
Sure, it won’t implement all the “advanced” features, but it will implement all the features people care about: multiple device management, RAID-Z-style redundancy, self-healing, cheap snapshots. And other features (like I/O priority) don’t even need to be implemented by btrfs, because they are already implemented in the generic block layer.
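Why “cheap snapshots” fall out of copy-on-write can be sketched in a few lines. This is a toy model of the CoW idea behind btrfs/ZFS snapshots, not either filesystem’s actual on-disk logic; all names here are illustrative.

```python
class CowFS:
    """Toy copy-on-write store: a snapshot is just a new reference to the
    current tree; entries are shared until one side modifies a file, at
    which point only that entry diverges. This is why CoW snapshots are
    nearly free to create, no matter how much data they cover."""

    def __init__(self):
        self.live = {}        # filename -> content (treated as immutable)
        self.snapshots = {}   # label -> frozen tree reference

    def write(self, name, content):
        # Copy the tree, not the data: unchanged entries keep sharing
        # the same content objects with any existing snapshots.
        self.live = dict(self.live)
        self.live[name] = content

    def snapshot(self, label):
        # Creation cost is just remembering the current tree reference.
        self.snapshots[label] = self.live
```

For example, writing `a` after taking snapshot `s1` leaves the snapshot’s view of `a` untouched while the live tree moves on.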
Also there are no current plans for transparent compression on BTRFS
Transparent compression on btrfs was merged two days ago – http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable-stan…
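For readers wondering what “transparent compression” means in practice: the filesystem compresses on write and decompresses on read, and applications never see the difference. Here is a toy sketch of the idea at the file level (function names are mine; the merged btrfs code actually compresses with zlib at the extent level inside the filesystem, and skips blocks that don’t shrink):

```python
import zlib

def write_compressed(path, data):
    """Compress on the way to disk; the caller still hands over plain bytes."""
    with open(path, "wb") as f:
        f.write(zlib.compress(data))

def read_compressed(path):
    """Decompress on the way back; the caller still sees plain bytes."""
    with open(path, "rb") as f:
        return zlib.decompress(f.read())
```

The round trip is invisible to the application, which is the whole point: only the bytes on disk change.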
“Transparent compression on btrfs was merged two days ago – http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable-stan…”
Thanks for that info!
I didn’t see that because I’m syncing to the stable branch.
IO priority would still be better implemented in the filesystem itself if you value performance. That way, the scheduler can do its work better, since it’s aware of the state of the filesystem and its structures and can reorder things better than a generic layer that knows nothing about them and just gets a list of requests.
Also there are no current plans for transparent compression on BTRFS, and I consider that a must for many cases.
You are behind the times.
http://www.mail-archive.com/[email protected]/msg00887.ht…
Probably not, but you’ll have snapshots, and probably ways to roll back in time.
You should think of btrfs more as a database than a filesystem, although that’s not entirely true either.
It is a terrible shame that ZFS is even better, and also more mature, but won’t get into the linux kernel.
Right on! It’s exactly these things that make the Linux community look childish and ignorant of other proven technologies. Until they get over themselves and start implementing what their users (not only developers!) want, Linux will always remain a niche, at least in the desktop arena.
Then again, this happens to every OS to some degree. I still can’t understand why Microsoft took a perfectly good-working TCP/IP stack from BSD, botched it up with inefficient, proprietary code, and implemented that mess in Vista…
ZFS is licensed under the CDDL. This license is GPL-incompatible, and apparently deliberately designed to be so: some of Sun’s folks were worried about losing their technology to Linux, and some developers threatened to quit if their work were put under the GPL (by far the most used FOSS license; refer to http://www.dwheeler.com/essays/gpl-compatible.html).
You can find a reference in this video linked from
http://lists.debian.org/debian-devel-announce/2006/09/msg00002.html
Linux developers cannot possibly just take the code and ignore the licensing incompatibility. So they are being prudent and implementing things from scratch in Linux, taking advantage of existing Linux kernel APIs, including the VFS and the block layer. Yes, it is unfortunate that license incompatibility between free and open source projects prevents code from being reused, but when vendors do it deliberately, not much can be done about it.
There is also the little question about who actually invented ZFS. Sun and NetApp are in a lawsuit over the patents on the file system right now. Even if Sun licensed ZFS as GPL, it would be pretty dumb to put it into the kernel until the legal questions are settled.
But BTRFS developers *are* implementing what their users want. Look at all the companies involved. Linux has a different user base than Solaris and, as such, different priorities. In the Solaris world, staying with the same OS version for 10 years is not unusual. In the Linux world, even staying with the same OS version for 4 years is. Neither OS is wrong for its focus. They just fill different niches.
Just a note on Windows TCP/IP: It’s not related to any BSD or other networking stack out there. There has never been any BSD code in the networking stack in any eternally released and commonly used versions of Windows.
Interesting information. I believe you mean “externally” instead of “eternally”, right? Any idea how this myth (that Windows contained BSD networking code) came about?
Most of the command-line network tools that come with Windows were derived from BSD versions. So utilities like ping, traceroute, nslookup, telnet, and so forth. That’s the only trace of any BSD code in the network stack of current Windows versions.
Apparently, these were originally provided by a company called Spider Systems, and were ported by Microsoft to use the Winsock API instead of the BSD sockets API.
That same company also provided the initial TCP/IP stack used in Windows NT 3.0. Apparently, this stack was basically parts of the BSD TCP/IP stack, ported to run on top of an abstraction layer. Microsoft incorporated that into NT 3.0, because they decided to include TCP/IP support far too late to develop their own.
They did develop their own before NT 3.5 was released, and that one was in use up to Windows XP. Windows Vista contains yet another new TCP/IP stack.
It’s not really a myth – one version of Windows NT certainly did contain BSD network code, but it’s since been replaced. Twice. There might be some remnants of it, since there’s little point in throwing out working, well-tested code, but there’s no way to know without looking at the source code.
Yeah.. there’s a pretty detailed description of it at http://www.kuro5hin.org/?op=displaystory;sid=2001/6/19/05641/7357.
The basic gist is that in pre-release versions of NT, some BSD code may have been used in boot-strapping the networking effort. This code was accessed through a wrapper which is now long-disused (and probably not even present anymore) and was entirely replaced in NT 3.5, the first externally (eternally, even) released NT, by a significantly different Microsoft stack.
Some of the userland network tools are ported BSD code (the command-line ftp client being an example), so maybe that’s the source of this rumor.
Two different implementations. ZFS is one huge piece of software that does everything. Much of what ZFS does is already implemented in a modular way in the Linux kernel.
The architectures are completely different.
If Oracle was trying to create a huge monolithic system then you’d have reason to complain…but I welcome options like this.
Hey FS man! Can you spare a brother 32 bits?
All these fancy features, and still not the one feature I’ve been waiting to see for decades: a type field in the inode.
Every file has a type, yet with the exception of executables we add archaic .xyz extensions to the names of our files, start them off with an archaic “#!/usr/bin/crazy” shebang line or Emacs-style -*- stuff -*- markers, and depend on MIME magic to figure the type out by content. Isn’t it time for something a little more reliable? I for one would really like to get out of the 1980s.
A file type field isn’t any more reliable than a magic number or an extension. And who is going to decide what the file type field should be? That’s right, the same programs that would decide what the extension or magic number would be.
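To make that point concrete, here is a small sketch showing that both common detection methods, the name’s extension and “magic” bytes at the start of the content, are just lookups against conventions somebody chose. The magic table below is a tiny illustrative sample, not a real magic database:

```python
import mimetypes

# A few well-known magic prefixes, purely for illustration.
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"%PDF-": "application/pdf",
    b"\x1f\x8b": "application/gzip",
}

def guess_by_extension(name):
    """Guess the type from the filename alone."""
    mime, _ = mimetypes.guess_type(name)
    return mime

def guess_by_magic(data):
    """Guess the type from the leading bytes of the content."""
    for prefix, mime in MAGIC.items():
        if data.startswith(prefix):
            return mime
    return None
```

Either way, whatever wrote the file decided what extension or leading bytes it got, which is exactly the situation an inode type field would be in.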
What about storing mimetype in the extended attributes? I think this kind of stuff is being worked on by freedesktop.org.
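It is: the freedesktop.org shared-mime-info work proposes recording the type in a `user.mime_type` extended attribute. A minimal sketch of the idea, assuming a Linux filesystem with user-xattr support (the attribute name is from that proposal; the fallback logic is mine):

```python
import mimetypes
import os

ATTR = "user.mime_type"  # attribute name proposed by freedesktop.org

def set_mime(path, mime):
    """Record the MIME type in an extended attribute; returns False if
    the filesystem does not support user xattrs."""
    try:
        os.setxattr(path, ATTR, mime.encode())
        return True
    except OSError:
        return False

def get_mime(path):
    """Prefer the stored attribute; fall back to guessing from the name."""
    try:
        return os.getxattr(path, ATTR).decode()
    except OSError:
        return mimetypes.guess_type(path)[0]
```

Note that `os.setxattr`/`os.getxattr` are Linux-only, and the attribute is lost if the file is copied by a tool that doesn’t preserve xattrs, which is one practical weakness of this approach.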
It is funny that some Linux people think that preventing silent corruption is not important when discussing ZFS. Now that BTRFS will support that feature, I promise you that soon these same people will say it’s the most important thing since sliced bread.
Some would say they are funny. Others would call them just plain dumb. I think they are funny. Soon SEGEDUNUM too will change his mind and praise prevention of silent corruption; but not until Linux has that feature. Until then, detection and prevention of silent corruption is not important.
Another article about Linux filesystems and BtrFS, yet more comments about ZFS :-).
Well, it is and it isn’t. Silent corruption is generally the result of problems elsewhere, and in the case of Solaris, usually it’s the pretty arcane and old device drivers. Unless you’re prepared to use ZFS to find out what went wrong and to fix it, the frenzied excitement over ZFS detecting ‘silent corruption’ is pretty laughable.
In the case of silent corruption as a result of hardware, well, you aren’t going to recover anything if you don’t have redundancy, regardless of whether you run ZFS or not. In any event, you want to solve the issue rather than detecting it and thinking you’re brilliant, and either the hardware gets fixed or you move to something else. Either way, it’s a hardware issue and drives in particular need to get better and do their own data integrity checks. That issue will not change with a new filesystem.
I hope not, because it isn’t.
I think you’re funny. If ZFS (or BtrFS) detects silent corruption then you need to do something about it. If you can’t, and in most cases you can’t really fix it because it’s an esoteric hardware or driver issue, then the feature is essentially useless. You will forever be firefighting and without redundancy you will lose data regardless.
In many ways, it’s a feature more useful to pass on to kernel developers and hardware manufacturers, because they’re the ones who can do something about it. I suppose those who can make best use of it will be those with the better development community ;-).
Thanks for mentioning me by name, and in capitals no less :-). No, I’m afraid I’m not going to do that for reasons I have described ;-).
I’m sorry, but while some form of silent corruption detection is nice, your problems have only just started. I feel for you that silent corruption detection just hasn’t generated the level of excitement intended for ZFS, and hence Solaris, but there you are.
Well, it is and it isn’t. Silent corruption is generally the result of problems elsewhere, and in the case of Solaris, usually it’s the pretty arcane and old device drivers. Unless you’re prepared to use ZFS to find out what went wrong and to fix it, the frenzied excitement over ZFS detecting ‘silent corruption’ is pretty laughable.
Well… detecting corruption is always useful. It means there’s an issue somewhere, most likely hardware, and if it is indeed a hardware issue, it’s good to know about it before it gets worse. ZFS’s ability to sometimes fix the corruption is good, but I consider the warning about malfunctions the more useful part.
On the other hand, if your system finds corruption, fixes it, and you are just glad it got fixed and continue whatever you were doing without checking what the issue is… then you are quite ignorant and at risk of losing part or all of the data on the disk. Trusting it to get automatically fixed without your intervention is foolish.
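For those wondering what that warning mechanism actually looks like: it is block-level checksumming, where a checksum is computed at write time and re-verified on every read. Here is a toy model of the idea (CRC32 as a stand-in; ZFS uses fletcher or SHA-256 and btrfs CRC32C, and real filesystems store checksums apart from the data they cover so both can’t be corrupted together unnoticed):

```python
import zlib

class ChecksummedStore:
    """Toy model of filesystem-style block checksumming: every block is
    stored with a checksum computed at write time, and every read
    re-verifies it, so corruption is reported instead of silently
    handed to the application."""

    def __init__(self):
        self.blocks = {}   # block number -> (data, checksum)

    def write(self, n, data):
        self.blocks[n] = (data, zlib.crc32(data))

    def read(self, n):
        data, stored = self.blocks[n]
        if zlib.crc32(data) != stored:
            raise IOError(f"silent corruption detected in block {n}")
        return data

    def corrupt(self, n):
        # Simulate a bit flip the drive never reports.
        data, stored = self.blocks[n]
        self.blocks[n] = (bytes([data[0] ^ 0x01]) + data[1:], stored)
```

With redundancy, the filesystem can go one step further and rewrite the bad block from a good copy; without it, you at least know which data not to trust.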
Please elaborate how you go about fixing something you can’t even detect?
Again if you have no clue it is happening how do you go about fixing it?
Again if the end user can’t detect corruption since it is “silent” how does one inform the developers or hardware guys?
The only place it hasn’t generated any excitement is in your head. Everyone else, including BTRFS developers and Gentoo developers, took notice and is doing something about it. Meanwhile you are still trumpeting a very flawed viewpoint.
http://storagemojo.com/2007/09/19/cerns-data-corruption-research/
CERN did silent data corruption research and found it to be a major problem.
Here is the link to the paper:
http://indico.cern.ch/getFile.py/access?contribId=3&sessionId=0&res…
The odds are that you can’t fix it anyway, so you’re screwed either way – unless you have redundancy regardless. If you have redundancy then that’s the only way you will be OK. Either that or you use an OS with decent device drivers or change your hardware ;-).
You use something that works to start off with, or you switch mighty quick. Whatever happens, you’re screwed without redundancy no matter what silent corruption you think you’ve detected, or what you can’t detect either.
Users can’t inform the developers or hardware guys, because they won’t know what has been detected at all and won’t be able to give them anything. It will only be useful as a troubleshooting tool for developers to work through something that is reproducible.
Yer, it’s created a moderate amount of excitement – amongst developers. They will use it to recreate scenarios where corruption has taken place, find out what has happened in various device drivers and/or inform hardware manufacturers what has been going wrong with their hardware. Users will feel the indirect benefits of it but they will carry on as they have always done because they can’t do much about it.
Good for them. Did they actually go through 20TB of their own data and find out how much corruption they had? Did they actually fix anything, or did they just produce some numbers?
Don’t beat around the bush; answer the question.
How do you know to change your hardware if you can’t detect the problem in the first place? Wait for lots of data to get corrupted over days and months? When that happens, your backups get corrupted too, because you didn’t know you had corruption, it being “silent” and all.
Answer the question. If you can’t, then say so. How do you know it works to start off with if you can’t detect any corruption that might be happening?
Again no real answer. How do you know something bad is happening if you have no way to detect it?
Most people have no idea how to fix their cars. But car manufacturers put diagnostic information on dashboards so end users can know something is wrong and take it to get fixed.
Your silly argument is that since most people can’t fix their own cars, there should be no sensors or on-board diagnostics in production cars. Information symbols like low-coolant and check-engine lights, trouble codes, etc. that tell people something is wrong shouldn’t exist in the field. Furthermore, only car manufacturers should use such systems, and only while developing cars, because that is the only place errors happen.
People with real-world experience realize that hardware can go bad over time while in use, especially disks. Not all issues are bugs in hardware; bugs in firmware and drivers matter too. If you have ever done system-level software development, you would know that some bugs are damn near impossible to reproduce in development environments and only happen in corner-case scenarios in the field. In such cases, having an indication that something bad is happening can be invaluable.
Linux’s development cycle wouldn’t include beta/unstable releases for people outside the core developers if all bugs could be found on the handful of configurations developers have access to. Bugs in Linux would never exist after the final release phase, because all permutations would have been tried in beta. If you believe that, I have some really bad news for you about Santa Claus.
Wrong: those features are for end users. If it were only for developers, it would only exist in debug builds. But it is an advertised feature for end users. Oracle is developing BTRFS; they obviously saw the need and are funding a completely new file system. If existing technologies sufficed, why would they spend the money? Unlike what you say, it isn’t some independent OSS developer’s dream project.
Go ahead and show me proof that BTRFS’ checksums are only meant for developer debugging.
Yes, and they developed all sorts of application-level error checking because no suitable OS feature existed. If you had read the PDF, you would have known not to ask what was already in the paper. Goes to show you are not really here to provide a meaningful discussion on filesystems and data integrity. Just like you troll on every Sun and ZFS related topic with your silly theory on data corruption.
Your warped view of the perils of silent data corruption notwithstanding, most sane people realize the need to detect it. I am glad those people are actually doing something to fix it (BTRFS).
The bottom line is that this discussion is about whether filesystems should have measures to detect corruption. It has nothing to do with ZFS specifically, especially since BTRFS is advertising the same capabilities for Linux. Your contention is that they shouldn’t. Yet you can’t seem to answer a simple question.
How do you know you have a problem if no mechanism exists to detect it?
This is a ‘fanboy’ way of thinking which isn’t specific to Linux..
I agree it’s quite annoying though.