According to kerneltrap: “The future of Reiser4 was raised on the lkml, with the filesystem’s creator, Hans Reiser, awaiting his May 7th trial. Concerns that the filesystem wasn’t being maintained were laid to rest when Andrew Morton stated, ‘the namesys engineers continue to maintain reiser4 and I continue to receive patches for it.’”
Am I the only one who feels that Linux *badly* needs a new filesystem, and that it is a shame that Reiser4, having been declared stable years ago, still isn’t officially supported?
When I saw that SUSE had dropped Reiserfs (3) as its default filesystem I had the shock of my life.
Needs – sure, any new file system is potentially helpful. But badly? Not so much. There’s a lot of choice in that area.
IMHO, it is a matter of choosing the lesser evil, which in my case is ReiserFS 3.
But when you think of filesystems like ZFS you can’t help wondering.
Isn’t zfs being ported?
I don’t think so. There would be never-ending licensing issues.
Let Sun do all the talking….
http://www.sun.com/emrkt/campaign_docs/expertexchange/knowledge/sol…
There is a FUSE-based port: http://zfs-on-fuse.blogspot.com/
And while a kernel module isn’t likely at the moment, if you’ve ever used ntfs-3g, you know a FUSE-based FS actually works quite well.
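Mounting through FUSE looks just like any other mount. A minimal ntfs-3g example (the device and mount point are placeholders, adjust for your setup):

    # mount an NTFS partition read/write through FUSE
    ntfs-3g /dev/sda1 /mnt/windows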
“Reiser4, having been declared stable years ago”
Well, if it was stable a few years ago, it quickly became less and less stable after that. I gave up on Reiser4… oh, probably in early 2006, because it was trashing my data. I haven’t used it since, though, so reliability has hopefully improved.
ext4 is that new filesystem. Hans is trying to sell Namesys to pay his court costs. It is sad that this is how things are working out, but that’s what happens sometimes.
http://lwn.net/Articles/187336/ ext4
That could be fine with me, provided it doesn’t take another decade:
“As reiser4 shows, getting a truly new filesystem into the kernel isn’t necessarily an easy thing to do. It may well not happen before large numbers of users start running into the current limits of ext3. So the current set of enhancements will probably find its way in – though what the resulting filesystem will be called is still not entirely clear.”
My experience is that users don’t want to replace their favorite filesystem.
Am I the only one who feels that Linux *badly* needs a new filesystem, and that it is a shame that Reiser4, having been declared stable years ago, still isn’t officially supported?
When I saw that SUSE had dropped Reiserfs (3) as its default filesystem I had the shock of my life.
A new filesystem would be nice but it’s not that important. And we’ll have ext4 soon anyway.
Reiser4 would have been merged by now if it wasn’t for Hans Reiser being so uncooperative. Instead of fixing the problems people pointed out for him, he started arguing, which led to numerous flamewars.
I’ve used Reiser4, and while it’s faster at some things it’s also slower at others, and it has an overall higher CPU usage. It’s not some kind of filesystem revolution as some people like to think.
Reiser4 would have been merged by now if it wasn’t for Hans Reiser being so uncooperative. Instead of fixing the problems people pointed out for him, he started arguing, which led to numerous flamewars.
That’s not the full story. Reiser4 had its integration problems, but many people came up with an awful lot of issues that just weren’t real because, for whatever reason, they didn’t want to see it merged. The biggest is the non-issue of Reiser4’s plugin architecture, which is purely a matter for Reiser4 itself.
Quite frankly, ext3 and ext4 are just nowhere near the next-generation filesystems needed to push things on further, either for the desktop or the server, and the discussions around ZFS only start to throw that into sharp relief. The fact that ext4 was pushed into service because of ext3’s problems on large volumes, when filesystems like JFS and XFS were built for that job, is a joke. I did read some very vague comments on the XFS codebase’s quality, but they didn’t amount to an awful lot.
By ZFS, I mean ZFS purely as a filesystem, not some of the non-filesystem stuff it actually does.
Linux does badly need a new filesystem, but unfortunately Reiser4 is barking up the wrong tree and ext4 isn’t really new at all. The design considerations for filesystems are changing, and the lead time for filesystem development is large, about 3-5 years. Before that time, mainstream computers will have various kinds of non-volatile, solid-state cache layers between main memory and the mass storage device, and many portable computers will only use solid-state media for mass storage.
Reiser4 is a filesystem designed for maximum spatial locality of reference. In other words, it’s optimized for read throughput on devices with large seek times, on the assumption that writes can occur asynchronously with respect to the caller. Larger caches and mass storage media with much faster random access latency drastically reduce the relevance of this design consideration.
Other modern filesystem approaches such as ZFS and even NTFS assume that most reads will be satisfied from a cache, and therefore I/O to the backing store will be write-heavy. Accordingly they optimize for write throughput. They also don’t relocate data areas very aggressively, which cuts down on write cycles. This approach is more in line with the trends in storage hardware.
ext4 is a band-aid for ext3 in order to support larger volumes (possibly comprising many physical disks). There are some other minor enhancements here and there. It will not be forward compatible with existing ext3 filesystems, but it should more or less replace ext3 for new filesystems. Because it isn’t very ambitious, it shouldn’t be very long at all until ext4 hits the mainline. This is an example of a new kind of development model for the Linux kernel driven by the needs of commercial vendors. IBM needed larger volumes on Linux, so they made the necessary changes to ext3. They didn’t try to reinvent the filesystem, and they found the kernel maintainers very receptive to this approach.
The most pressing problem for filesystems is consistency and recovery. As disks get bigger, the rate of I/O failures increases along with the time necessary to recover from them. There are two projects for the Linux kernel that are attempting to solve this problem. Chunkfs is a solution that seems destined for the client and entry-level server markets. It is a filesystem that is actually composed of many smaller filesystems, each of which has its own clean/dirty flag and can be recovered independently of the other chunks. The underlying filesystem could be any kind, but it needs to be ported to support chunkfs. This directly attacks the problem of recovery time without attempting to improve consistency.
The other approach, which seems more appropriate for the enterprise space, is doublefs. This is intended to be a response to ZFS without the write latency drawbacks associated with traditional copy-on-write filesystems. The idea is to defer updating the filesystem to a more convenient (idle) time and meanwhile use the out-of-place log as the active store. Reading from the log is slower than reading from the in-place filesystem, but new writes will be cached and the filesystem should be updated by the time these pages are evicted from the cache. You pay a reasonably small penalty in space-efficiency in order to maintain 100% consistency and fast recovery without paying a penalty in write latency. Basically, it’s the perfect marriage of a COW filesystem with a log-structured filesystem, hence the name doublefs.
But like I said, the lead time on new filesystem development is long. I think that an ext2-based chunkfs patch is circulating, but I’m not sure about the current status of doublefs.
Extremely good post, thanks.
When I saw that SUSE had dropped Reiserfs (3) as its default filesystem I had the shock of my life.
I wouldn’t be so shocked. I don’t know which filesystem SUSE switched to, but personally I’ve switched from reiser3 to ext3, which is a really nice FS when the dir_index option is enabled. Not to mention it’s tried and tested.
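If you want dir_index on an existing ext3 volume, something like this should do it (the device name is just an example; check the tune2fs man page):

    # enable hashed b-tree directory indexing
    tune2fs -O dir_index /dev/sda1
    # rebuild the indexes for existing directories (run this unmounted)
    e2fsck -fD /dev/sda1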
There are of course other options as well, with xfs and jfs probably being the most popular ones.
About Linux *badly* needing a new filesystem: I think the Linux desktop “badly” needs philosophies and applications that make use of modern filesystem features more than it needs fancy but unused new concepts for storage.
Honestly, I use JFS on all of my Linux systems and have had great luck with it. I figure if it’s good enough for our IBM p5’s here at work, it’s good enough for my laptop. I don’t have anything against ReiserFS/Reiser4, but I don’t think that Linux is hurting for a new FS.
I think what the OP was thinking of was something along the lines of the BeOS FS or WinFS.
Well, actually the Linux version is a port of the OS/2 JFS, not the AIX JFS. See http://jfs.sourceforge.net/project/pub/faq.txt (at the end)
I am using Reiser 3 on Debian for a very simple reason:
I am using LVM, and it is the only FS I have found that will resize hot. I can grow the FS without unmounting it.
FYI, ext2/3 resizes on the fly too, using ext2online.
Been using it for a long time here in a large enterprise.
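The sequence is roughly this, assuming an LVM setup (the volume names are made up; newer distributions fold the online case into resize2fs):

    # grow the logical volume, then the mounted ext3 filesystem
    lvextend -L +10G /dev/vg0/home
    ext2online /dev/vg0/home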
In fact, some kernel hackers have stated that the main issue of ZFS vs. Linux is a userspace tools interface problem – ZFS has a single very nice tool with a nice interface that puts all your disks into your hands, while Linux has way too many tools (lvm, fdisk, fsck…)
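To see the contrast (pool and disk names are placeholders):

    # ZFS: pooling and filesystems from one tool family
    zpool create tank mirror disk1 disk2
    zfs create tank/home

    # Linux: a separate tool for each layer
    pvcreate /dev/sdb1 /dev/sdc1
    vgcreate vg0 /dev/sdb1 /dev/sdc1
    lvcreate -L 100G -n home vg0
    mkfs.ext3 /dev/vg0/home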
So you haven’t found ext2/ext3/xfs?
I thought you couldn’t grow XFS volumes with LVM. I know you can’t with EVMS.
IIRC xfs volumes _have_ to be mounted when growing them. A bit strange, but it definitely works.
As usual, you can’t make the filesystem smaller, mounted or not.
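Right: xfs_growfs operates on the mount point of a live filesystem, not on the raw device (the names here are examples):

    # grow the logical volume first, then the mounted XFS filesystem
    lvextend -L +10G /dev/vg0/data
    xfs_growfs /data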
Red Hat ships resize2fs and officially supports it in RHEL5, FYI.
ext2online is indeed possible on ext2/3 volumes… HOWEVER, you MUST have created your filesystems with the following options/features:
-O resize_inode
-E resize
Check the mke2fs man pages for it. Defaults on most current distributions use these options and allow online resizing up to 1024 * the initial size.
If you want to use it on, e.g., a filesystem created on pre-3.0 Debian, you’re in trouble… but generally, all newly created ones support it.
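Concretely, creation looks something like this (the device name is an example; see the mke2fs man page):

    # reserve the resize inode for online growth (the usual default)
    mke2fs -j -O resize_inode /dev/sda1
    # or cap the maximum online-resize size explicitly, in blocks
    mke2fs -j -E resize=4194304 /dev/sda1
    # check an existing filesystem for the feature
    tune2fs -l /dev/sda1 | grep features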
He sounds guilty to me just going off this: http://en.wikipedia.org/wiki/Hans_Reiser#Nina_Reiser.27s_disappeara…
If you want to know why many people aren’t that enthusiastic about Reiser4’s design:
http://lkml.org/lkml/2007/4/23/319
For all I can say, it sure is a killer filesystem
The sole vision of Linus was his very practical approach of working together on a big project in the manner of many tiny but reality-proven little steps.
But there are revolutionary tasks, like changing a paradigm, and then little steps do not work.
The decades-old paradigm of a traditional filesystem has accustomed us to thinking that a fragmented namespace, with all its difficulties, is normal, and that all the tools to work around these difficulties are a must on every filesystem.
Hans wanted to be not only a visionary genius but also financially successful. Therefore his big effort was to optimize his new filesystem (you could hear him say something like: “only the most efficient filesystem will survive”).
But that is not the case! And worse, he directed attention to the less important issue. The very bad situation is now:
1. Nobody but two developers at Namesys can handle the code
2. Nobody gives attention to the real thing:
one unified namespace!
If the community wants to get ahead, the task would be to give some room, and much more attention, to revolutionary, mind-breaking possibilities!
Maybe someone will figure out that porting the Haiku FS (formerly OpenBFS) is a really good idea… I mean, someone other than the SkyOS guys… =]
While some like to mention BFS often as a possible Linux FS candidate due to its attributes, I personally think BFS is vastly overrated, and I use it in anger all day long, as well as NTFS on Windows.
BFS has ridiculous limits in these days of near-TB-sized drives: I’m allowed 4x 32GB partitions on a drive, whoopee. Its performance also suffers when pushed to do really big jobs, and the Tracker problems with it make it look silly except for simple jobs. NTFS by comparison is orders of magnitude faster and always does what I want it to do quickly and efficiently. I would expect any of the Linux FSes to do the same or better.
Perhaps BeOS needs to use one of the Linux FSes with the attributes added on in some fashion, but then, as a previous poster mentioned, disk drives are moving to solid state, probably before Haiku is really done. ZFS sounds intriguing though.
“BFS has ridiculous limits in these days of near-TB-sized drives,”
Here’s another funny BFS bottleneck: lots of files in a directory. Try untarring something like the X.org sources and watch performance slow to a crawl.
Performance also drops to a crawl if you have a lot of files with a lot of attributes on a partition. Put a few thousand emails on a disk for a fun experience (fun: 30 seconds to *delete* a file).
Of course, this was a year ago or so, maybe this has been fixed by now.
No, it is not funny; it is still like that in 5.03 std, and I know of no repair patches for it. I often use net+ve, and it fills up its cache folder with 2000+ crap files and then dies (again & again), and most of these are redundant copies. When I wanted to see what some of the images were, more than 500 files copied to a new folder and BFS appeared to lose files. It doesn’t really lose them, it just makes it look like that (thanks to multithreading). Try renaming them so all JPGs get an extension and it usually doesn’t complete more than 500. And managing folders with 1000s of file items is an experience not to be recommended.
If I did put 100K files on a partition, I would seriously expect BFS to get confused and trash some of them; it has in the past.
And don’t get me started on the use of floating point for desktop coordinates: every 11 files usually end up in 10 places, double-stacking some of them as the coordinates round the wrong way.
What’s so funny is the Scott Hacker book that talks of multimedia nirvana with TB-sized media capability, while BeOS seems to use 64-bit addresses with bubble sorts. Of course BFS was written in the days of giant 1GB or smaller drives, but still, all of these issues could have been seen back then.
Still, I use it though; I just wonder how it would rock with some really modern OS/FS parts. I sure hope the Haiku FS is much improved.
That hasn’t been the case on any of the hardware I’ve run BeOS on and I’ve run it on machines as slow as Pentium 100 / 32MB RAM.
Sounds like you may have had a corrupted partition or some sort of hardware problem. My ~/mail folder currently has just under thirty five thousand files and I’ve never seen the performance issues you mention.
Eh? Old versions of the mkbfs command had limitations on the volume size they could create, but that’s not inherent to BFS – I’m posting from an R5 system running on a 120GB partition.
That aside, I’m genuinely curious what you’re using BFS for that requires volume sizes greater than a TB.
Tracker problems, such as?
This looks like it could be a nice filesystem, once it’s complete and thoroughly debugged:
http://www.nilfs.org/en/
Regarding other filesystems… The one with the best overall performance and stability (in my experience) is ext3, but it’s got nasty limits and lacks some of the nicer features of other filesystems. XFS is a good alternative as of right now, but it’s bloody terrible when you have to search through a lot of small files, and its volume and file-size limits are actually lower than those of NTFS, last I checked.
I did use JFS for a while. It performed better than ext3 for most things, but it had a tendency to lose random files (and I mean random, not files that were being written to) irrecoverably when crashes occurred, and on several occasions I’ve seen the fsck program total partitions (and boot sectors!) for no discernible reason.
There’s also SpadFS; the latest patch set was released a couple of weeks ago.
Unfortunately it seems there’s not a lot of interest in file systems of late.
The chunking sounds interesting. I have some ideas about how to make really safe, enterprise-level, highly protected storage, but it’ll take some work to implement, not to mention some machine clustering for HA capabilities.