The next stable update of the Linux kernel will bring advances in file system event monitoring, the Xtensa architecture, and a set of system calls that allows users to load another kernel from the currently executing Linux kernel. While the 2.6.13-rc releases are currently being tested, the stable version is expected to be released in the next few weeks.
no need to reboot anymore, not even after a kernel recompile… after all rebooting is about adding new hardware.
hotplug? modprobe?
You can already add new hardware in Linux without rebooting. Maybe we won’t need to reboot after a kernel upgrade (clearly not with this release, but probably later)…
It depends on what kind of hardware. If you add new memory, or processors, you still need to reboot.
It would really be great if you could upgrade the kernel without a reboot. This would make it much easier to handle kernel related security fixes without disturbing the production.
Even if the kernel could replace itself with a newer version, I imagine that the kernel subsystems would have to shut down and kill all the running processes during the switch. Could anyone confirm this?
The result would be that you don’t have to wait for the BIOS tests or the bootloader, but the kernel would still have to go through its halt/boot stages.
Yeah, that’s how I see it.
Exactly right. Seamlessly starting a new kernel without losing the state of running tasks is pretty complicated (lots of state) and is not really the problem kexec is trying to solve.
Kexec does some cunning tricks to get the machine in a state the kernel initialisation code is expecting, then jump to the entry point of the new kernel. It’s like doing a reboot but much faster (particularly on big hardware where it takes a long time to go through the initialisation).
Another thing it can do is jump directly into a new kernel on panic. Why would you want to do this? Crashdumps: you jump straight into a new kernel that’s untainted by whatever caused the panic, which can then reinitialise the disk hardware and write the contents of memory out to disk. This should be much more reliable than previous crashdump mechanisms that relied on drivers *in the kernel that panicked*.
Kexec is cool :)
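For readers who have not used it, the two modes described above (a fast reboot, and a jump into a fresh kernel on panic) correspond roughly to the kexec-tools command lines sketched below. This is only an illustration: it builds the argument lists and never runs them, the kernel and initrd paths and the helper names are hypothetical, and the options should be checked against the kexec(8) man page for the version in use.

```python
def kexec_load_args(kernel, initrd=None, cmdline=None, on_panic=False):
    """Build the argv for staging a kernel with kexec-tools (nothing is executed)."""
    # -l stages a kernel for a later "kexec -e";
    # -p stages a crash kernel that is entered if the running kernel panics.
    args = ["kexec", "-p" if on_panic else "-l", kernel]
    if initrd is not None:
        args.append("--initrd=" + initrd)
    if cmdline is not None:
        args.append("--append=" + cmdline)
    return args


def kexec_exec_args():
    """Build the argv that jumps into the previously staged kernel."""
    return ["kexec", "-e"]


# Hypothetical paths, for illustration only.
fast_reboot = [
    kexec_load_args("/boot/vmlinuz-new", "/boot/initrd-new", "root=/dev/sda1"),
    kexec_exec_args(),
]
crash_setup = [kexec_load_args("/boot/vmlinuz-crash", on_panic=True)]

for argv in fast_reboot + crash_setup:
    print(" ".join(argv))
```

Note that the second step of the fast reboot still ends every running process, which matches the point made above: kexec skips the firmware and bootloader, not the kernel's own shutdown and startup.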
“Even if the kernel could replace itself with a newer version, I imagine that the kernel subsystems would have to shutdown and kill all the running processes during the switch. Could anyone confirm this?”
I’m not so sure there is any real demand (read: enthusiasts don’t count) for swapping kernels on a single, uninterruptible host. More likely, the enterprise customer would like to get a second system up and running (and tested) with the upgraded kernel and then instantaneously fail over their services to the upgraded host.
Kexec doesn’t save enough state to do this, but Xen does, and Ian Pratt proved at OLS that Linux on Xen can fail over a busy Apache server in under 200 ms. Now, notice I said a single “host,” not a single physical machine. There is definitely demand for failing over between virtual hosts on the same physical machine.
Actually, the Xen example at OLS was that of *migrating* an Apache server to another host. The server virtual machine remains the same so you can’t use it for kernel upgrades in the server (although you could upgrade the “host” kernel and Xen itself on the target machine before migrating).
It takes a while to migrate its state, but the actual time when the server is not active (on some host) is about 200 ms. A Quake 3 server migrates with about 60 ms downtime :)
You are thinking of “hotpluggable” hardware like USB or Firewire stuff. But some hardware is not designed to be added/removed while the computer is running as doing so could mess up the motherboard or confuse the BIOS. IDE devices, monitors, serial devices, PS/2 devices, parallel port devices, PCI cards and RAM are just a few examples of things you should not alter while the computer is powered on.
PCI hotplug has nothing to do with kexec, and is already implemented (still experimental).
$ grep HOTPLUG /usr/src/linux-2.6.11.4-21.8/.config
CONFIG_HOTPLUG=y
CONFIG_HOTPLUG_PCI_PCIE=m
# CONFIG_HOTPLUG_PCI_PCIE_POLL_EVENT_MODE is not set
# PCI Hotplug Support
CONFIG_HOTPLUG_PCI=m
CONFIG_HOTPLUG_PCI_FAKE=m
CONFIG_HOTPLUG_PCI_COMPAQ=m
CONFIG_HOTPLUG_PCI_COMPAQ_NVRAM=y
CONFIG_HOTPLUG_PCI_IBM=m
CONFIG_HOTPLUG_PCI_ACPI=m
CONFIG_HOTPLUG_PCI_ACPI_IBM=m
CONFIG_HOTPLUG_PCI_CPCI=y
CONFIG_HOTPLUG_PCI_CPCI_ZT5550=m
CONFIG_HOTPLUG_PCI_CPCI_GENERIC=m
CONFIG_HOTPLUG_PCI_SHPC=m
# CONFIG_HOTPLUG_PCI_SHPC_POLL_EVENT_MODE is not set
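A listing like the one above can be read mechanically: `=y` means built in, `=m` means built as a module, and a `# CONFIG_... is not set` line means disabled. The following is a minimal sketch of such a reader, not a full Kconfig parser; the function name is made up for illustration, and the sample text is a shortened copy of the output above.

```python
def parse_kernel_config(text):
    """Map each option name to its value as written, or 'n' if marked 'is not set'."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Disabled options are recorded as comments in .config.
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
        # Anything else (section comments, blank lines) is ignored.
    return options


sample = """\
CONFIG_HOTPLUG=y
CONFIG_HOTPLUG_PCI_PCIE=m
# CONFIG_HOTPLUG_PCI_PCIE_POLL_EVENT_MODE is not set
# PCI Hotplug Support
CONFIG_HOTPLUG_PCI=m
"""

config = parse_kernel_config(sample)
modules = sorted(name for name, value in config.items() if value == "m")
print(modules)
```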
I’m going to focus on Linux; the same should be possible with any modern OS, though the pain involved may be higher.
> You are thinking of “hotpluggable” hardware like USB or Firewire stuff. But some hardware is not designed to be added/removed while the computer is running as doing so could mess up the motherboard or confuse the BIOS.
The BIOS isn’t used after initial boot except for power management and hot-key combinations for laptops. All those pieces of hardware can’t be removed…the rest can, but don’t depend on the BIOS, so there is no ‘confusing the BIOS’ if any other hardware is yanked.
[I debugged and replaced system BIOSes at runtime in a former job.]
> IDE devices, monitors, serial devices, PS/2 devices, parallel port devices, PCI cards and RAM are just a few examples of things you should not alter while the computer is powered on.
Easiest things first…
PCI cards: From the current stable Linux kernel (2.6.12.5, Documentation/pci.txt):
“remove – Pointer to a function which gets called whenever a device being handled by this driver is removed (either during deregistration of the driver or when it’s manually pulled out of a hot-pluggable slot). This function always gets called from process context, so it can sleep.”
Parallel / Serial ports: If it’s on a PCI card…
The harder ones…
Processors: CPU1 and later … can be disabled and removed in a live system. Don’t know about CPU0. Probably can, though I couldn’t find any reference in the kernel docs and yanking CPU1 would be a substantially harder trick to pull off.
RAM: Can’t find a current reference. There was some work done on hot-add and removal of system memory.
Monitors: As in displays or hardware monitoring devices? If displays, just unload what is using the monitor and pull the hardware…or even easier just yank the cord. If hardware monitoring devices…same deal (yank a drive with SMART support, and SMART can’t monitor the device…but won’t crash when it goes away). Some hardware monitors can’t be removed since they are part of the system chipset, though that has more to do with hardware design and not the capabilities of the OS.
PCI, memory and CPU hotplugging still requires hardware that’s capable of it, though (i.e. high-end hardware).
> PCI, memory and CPU hotplugging still requires hardware that’s capable of it, though (i.e. high-end hardware).
You’re correct in that the hardware must be capable of being swapped out. I was replying to this, though:
> You are thinking of “hotpluggable” hardware like USB or Firewire stuff. But some hardware is not designed to be added/removed while the computer is running as doing so could mess up the motherboard or confuse the BIOS. IDE devices, monitors, serial devices, PS/2 devices, parallel port devices, PCI cards and RAM are just a few examples of things you should not alter while the computer is powered on.
> [I debugged and replaced system BIOSes at runtime in a former job.]
Was it Smokey’s BIOS, Jacky-X? :)
Unless this is some joke I don’t get… I doubt he’s kidding. The BIOS really isn’t required, even in DOS:
Some dude reports swapping the BIOS on two *different* running computers here: http://srwofb.murrell.co.nz/?id=140
…forgot to add; IDE devices. I frequently unmount and yank them using common PC hardware. If you want to just yank a drive without unmounting it manually, that takes additional hardware or configuration but it’s not uncommon.
That’s a silly argument. You can’t, let’s say, replace your power supply in this way.
> That’s a silly argument. You can’t, let’s say, replace your power supply in this way.
Swapping out the power supply doesn’t require OS support.
You replied to this, though:
> hotplug? modprobe?
> You can already add new hardware in linux without rebooting. Maybe we won’t need to reboot after a kernel upgrade ( clearly not with this release, but probably later )…
…so, I’m wondering what your objection is or what you found silly.
Can anyone highlight the issues with Reiser4? It is fast and it works brilliantly on my box, but I keep hearing there are issues with it.
Mystilleef: I have had some random kernel panics while unmounting a reiser4 partition. After moving to XFS (starting from kernel 2.6.10) I haven’t had any problems whatsoever.
That’s rather unusual. I am using Reiser4 and it is by far the fastest and most reliable filesystem I’ve ever tried. Even the performance difference between Reiser3 and Reiser4 is quite significant.
http://kerneltrap.org/mailarchive/1/message/82278/thread
After reading the link to the mailing list provided by anonymous earlier in this thread, it turns out the issue with Reiser4 is that it duplicates some of the functionality in Linux’ VFS.
With all due respect to the kernel developers, that is a blatantly LAME excuse for an issue. Are they trying to tell me there aren’t any subsystems in the kernel that duplicate functionality?
I smell rivalry and politics.
Well, I smell stupidity, around here…
Just because some things need fixing doesn’t mean the kernel developers should add code that will need even more fixing later. Duplicating functionality is BAD.
Yes, it kills billions of kittens every year.
Whilst he put it in a slightly abrasive way, the gp had a reasonable point regarding code; if it can be refactored to fit better with the kernel then it would be good to do that before adding it.
The Reiser4 code will in any case be checked in without its “file-as-a-directory” mode enabled – which is a shame because it was a neat trick but unfortunately a flawed one.
The core developers are pushing the code in a direction where all Linux filesystems will be able to benefit from some of the Reiser4 high level features, leaving filesystems themselves purely (mostly) to worry about on disk data layout. This will be good for the kernel as a whole, although it’s a shame to hold up getting the Reiser4 functionality in there.
With all due respect to the reiser team, they cannot just decide to rebuild the kernel VFS layer’s functionality and get away with it.
Nobody is stopping you (or them) from patching your kernel in order to get reiser4 support (or even releasing pre-patched kernels).
> With all due respect to the reiser team, they cannot just decide to rebuild the kernel VFS layer’s functionality and get away with it.
Actually they can and that is exactly what Linus and other kernel hackers want. The problem is that no one wants to rewrite the VFS to include functionality that Reiser4 offers. It seems that Hans would rather have a completely separate system that does not use the Linux VFS, or at least he doesn’t seem to be keen on modifying the current VFS to fit Reiser4’s needs.
Well, I don’t know about anyone else, but the lack of xattr support is a very big minus for me. I think I’ll stick with ext3.
-bytecoder
Basically, Reiser4 wants to get its tendrils everywhere. It wants to replace the VFS layer, to the exclusion of all other filesystems, and Hans Reiser refuses to issue patches that remove the most invasive parts of Reiser4 because he is so much smarter than the Linux kernel developers.
Another problem is that it has been revealed on LKML (by the Namesys employee who conducted them) that the Reiser4 benchmarks on the Namesys site were faked. Hans knows this, responded to the allegation, said he would correct the matter, and then did nothing. The faked benchmarks are still there, a year later.
The numbers you see are, to my knowledge, correct. But Hans was told by Nikita that in some phases of Mongo Reiser4 performed *absolutely* abysmally compared to EXT3. Hans told him to remove those numbers from the published benchmarks. And there the matter sits to this day. Don’t believe anything Hans Reiser says.
> Basically, Reiser4 wants to get its tendrils everywhere. (1) It wants to replace the VFS layer, to the exclusion of all other filesystems, and Hans Reiser refuses to issue patches that remove the most invasive parts of Reiser4 (2) because he is so much smarter than the Linux kernel developers.
(1) Do you have proof, because it sounds like one side’s story and somewhat hysterical at that…
(2) Well, frankly, if you write an FS, that makes you a kernel developer. Also, he sells a commercial product, and maybe he doesn’t want to fork the codebase. But really: never ascribe to malice where ignorance will do; I’d lay my bet on pride or, hell, a difference of opinion.
> the Reiser4 benchmarks on the Namesys site were faked….The numbers you see are, to my knowledge, correct. But Hans was told by Nikita that in some phases of Mongo Reiser4 performed *absolutely* abysmally compared to EXT3. Hans told him to remove those numbers from the published benchmarks.
That’s not lying; that’s selectively telling the truth. As long as what they do say holds up then that is all that is required. Otherwise it just means that you have to be critical and thorough and frankly that’s what you have to do all the time. After all, when did a Linux preacher tell newbies that off-the-shelf consumer-grade software is really nonexistent?
CaptainPinko,
It was all pretty clearly laid out on LKML. Although I currently use EXT3, as I’m in a conservative phase these days, I like reiserfs v3. It’s extremely fast, particularly wrt metadata operations. (e.g. slocate -u or find / -name BlahBlah ) But I was pretty shocked and dismayed when I witnessed Hans’ attitude and behavior. He actually admitted on the list that he had, indeed, instructed Nikita to suppress any unfavorable numbers, and mumbled something about making a notation about the matter on the benchmark page. (Under the circumstances, he had little choice.) But to this day, the benchmark page is unchanged.
Perhaps that would not be so bad except that every time he pushes for Reiser4 inclusion in the vanilla kernel, he trots out the benchmarks and says that Reiser4 should be included because its performance blows the doors off of every other filesystem and he has the benchmarks to prove it.
But the real reason it has not been merged is that it is, indeed, extremely invasive. It does want to replace the Linux VFS. And there are some very hard questions that Hans can’t answer about the consequences of the “file as directory” paradigm; Issues that have been discussed on LKML before, and to which no one seems to have a solution.
He could drop the plugins and a couple of other features (and, perhaps, his ego) and get it in. But he refuses. And I will agree that pride likely has a lot to do with that.
It’s a shame, really, because Reiser4 has some exciting features.
But I have to agree with the kernel guys. Currently Reiser4 is a round peg trying to fit into a square hole. Long term maintainability is more important than trendy new features.
ldouglas,
Allow me to make my point of view clear: I think elegance of design AND implementation is a huge statement on correctness of a solution.
With respect to Reiser4, I think it’s very good.
I think the real issue is that if one takes away file as a directory, plugins and other such features, one doesn’t have Reiser4; instead one has the same stuff that’s being tossed around. Those features are exactly what make Reiser4 compelling; they make extending the file system much easier. I think Hans is bang on when he said open source is more than a license, it’s a way to code: one should expect and facilitate rapid jump-in and jump-out coding. Namesys seems to have achieved that goal. It’s merely not to everyone’s liking, but people haven’t really offered a better solution; they have merely stated that it shouldn’t be done Hans’ way because of legacy issues. Whether the legacy issues themselves are worthwhile hasn’t truly been evaluated.
Currently, the approach being taken is to keep the file system super simple and then have a VFS. I think that’s generally daft. Over and over, other people have tried to do just this and the performance is weak, MS being the most recent victim, though that’s partly due to legacy as well. In the end the VFS has to work intimately with the FS: you’re either going to need a whole whack load of hooks for performance reasons or pass some sort of performance hints. But why throw in such hacks, which are ultimately messages? We all know, “Message passing is bad!”
>We all know, “Message passing is bad!”
This is what Linus, for example, preaches on the topic of kernel structure, but especially given L4’s achievements it is a bad over-generalization.
Ahh, Reiser4… a great idea, however mismanaged and misimplemented. The whole thing stinks. No, the filesystem doesn’t stink (that much), it’s the business/politics that stinks. Hans was envisioning the merge of Reiser4 into the 2.6.x tree as a simple process that wouldn’t require any touching of the code. If the patch works, then why not merge it, damn it!!
The fact is that Reiser4 was not written with a focus on leveraging the functionality already within the Linux kernel. This is fine, no one was holding a gun to Hans’ head saying he had to fit his (not Linux exclusive) FS nicely into the Linux puzzle. But the kernel maintainers want everything to fit nicely together, they want every feature that could be elegantly pushed to a lower level (where other subsystems can use them in the future) to be appropriately positioned.
There are people willing to do this work in the case of Reiser4. There are tons of people who really want Reiser4 to be in mainline. However, Hans and probably others at Namesys oppose these modifications for purely business reasons. They know that more customers will sign on to their commercial licensing provisions once the core code is in mainline, and hence they want the merge to happen ASAP. To them, the merge is long overdue, and they are losing business for every kernel development cycle that does not result in Reiser4 being merged into mainline. Hans, in the end, knows that his implementation works, and he does not care how it could be best implemented with respect to the Linux kernel.
This is what separates many community development processes from proprietary ones. People actually care more about doing things right than merely getting it working as fast as possible. Stopgap solutions are extremely detrimental in community projects. If Reiser4 were to be merged into mainline in its current state, there would be little to no future motivation to improve the extent to which reuse of Reiser4 functionality is possible. If you want Reiser4 now, you can apply the patch and get on with your life. If Reiser4 is to be merged into mainline, Hans and his buddies at Namesys need to understand the goals of the Linux kernel project first.
The way I see it, Namesys needs the support of the kernel development community more than the Linux userbase needs Reiser4. In my experience with Reiser4 on multiple machines, it has some visible areas of regression, a nasty tendency to make audio skip under load, and a fragile coherency relative to other filesystems (including Reiser 3.x). I have been using ext3 with dir_index enabled on my new filesystems and I’ve been much happier.
Not sure what it has to do with the original post, but anyways. I have had many problems with Reiser4 on various desktop and laptop systems when they have not been cleanly powered off (say, hard power-off after an omni-powerful X11 crash). The file system trees seem to get corrupted easily and can sometimes not be restored at all. My experience with Reiser v 3.6 and even ext3 after similar crashes has (surprisingly) been much better.
But why bring this up, will the new kernel bring better Reiser4 stability/performance?
The article stated that there were some issues with Reiser4 which prevented it from being accepted in the mainline kernel. I was not aware of these issues until I read the link to the mailing list provided earlier in the thread.
That’s why the topic was brought up.
“With all due respect to the kernel developers, that is a blatantly LAME excuse for an issue. Are they trying to tell me there aren’t any subsystems in the kernel that duplicate functionality?”
agreed, sort it out.
As much as I find the kernel upgrade with no booting attractive, what about some more administration-related features?
Is there now support for I/O nice setting per process? Can users renice their own processes as much as they like (having 0 as a maximum priority)?
> Is there now support for I/O nice setting per process?
From TFA:
“…this release, which also contains an enhancement to the Complete Fair Queuing disk I/O scheduler, which permits separate processes to have different I/O priorities, similar to ‘nice’ levels for CPU prioritization, Morton said.”
> Can users renice their own processes as much as they like (having 0 as a maximum priority)?
I think the command is io-renice. I know at one point (April 2004) it was not merged along with the rest of CFQ, but I’m not sure if it’s in now.
> As much as I find the kernel upgrade with no booting attractive, what about some more administration-related features?
> Is there now support for I/O nice setting per process? Can users renice their own processes as much as they like (having 0 as a maximum priority)?
Yes, there are several other enhancements, including the new CFQ 3 scheduler, which supports I/O priorities:
http://lwn.net/Articles/149479/
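To make the "similar to nice levels" comparison concrete: in the Linux kernel headers an I/O priority is a single integer that packs a scheduling class (realtime, best-effort, idle) together with a level from 0 (highest) to 7 (lowest). The sketch below only reproduces that packing, which is the value handed to the ioprio_set(2) system call; it does not make the call, and the Python helper names are made up for illustration.

```python
# Constants as defined in the kernel's ioprio header.
IOPRIO_CLASS_SHIFT = 13
IOPRIO_CLASS_RT = 1    # realtime
IOPRIO_CLASS_BE = 2    # best-effort (the default class)
IOPRIO_CLASS_IDLE = 3  # only served when the disk is otherwise idle


def ioprio_value(io_class, level):
    """Pack a class and a 0..7 level the way IOPRIO_PRIO_VALUE does."""
    if io_class not in (IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE):
        raise ValueError("unknown I/O scheduling class")
    if not 0 <= level <= 7:
        raise ValueError("level must be between 0 (highest) and 7 (lowest)")
    return (io_class << IOPRIO_CLASS_SHIFT) | level


def ioprio_split(value):
    """Unpack a priority value into (class, level)."""
    return value >> IOPRIO_CLASS_SHIFT, value & ((1 << IOPRIO_CLASS_SHIFT) - 1)


default_like = ioprio_value(IOPRIO_CLASS_BE, 4)
background = ioprio_value(IOPRIO_CLASS_BE, 7)
print(default_like, background, ioprio_split(background))
```

From the shell, the same class and level pair is what the `ionice` utility takes as its arguments.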
Those were a few _short_ weeks:
http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.13
What are the advances in file system event monitoring about? Will it finally be possible to monitor a whole fs at once (in contrast to watching each and every directory on the fs to achieve the same effect)? If this is the case, implementing searches à la BeOS will at last be feasible under Linux.
Under inotify, you’d be expected to monitor every directory, however it’s (supposed to be) a pretty efficient interface so this should be doable.
Nautilus is (I believe) able to use inotify already, as is the Kat desktop search tool (like Spotlight for KDE). It’s kinda nice that the support for this is out there already!
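Since an inotify watch covers one directory rather than a whole tree, a tool that wants to follow an entire file system first has to enumerate every directory and register a watch on each. A minimal sketch of that enumeration step follows; it does not call inotify itself, and the function name is hypothetical.

```python
import os


def directories_to_watch(root):
    """Return root and every directory below it; each one needs its own watch."""
    # os.walk does not follow symbolic links by default, so each real
    # directory is visited once.
    return [dirpath for dirpath, _dirnames, _filenames in os.walk(root)]


if __name__ == "__main__":
    watch_list = directories_to_watch(".")
    print(len(watch_list), "watches would be needed for the current directory")
```

The length of this list is the number of watches the tool consumes, which is worth comparing against the per-user limit in /proc/sys/fs/inotify/max_user_watches before indexing a large tree.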
Read the mail archives. First Reiser4 was not allowed in because it changed VFS, later it was not allowed b/c it did not change VFS.
Although there has been some agreement and it has been fixed in Reiser4 code:
07/07/2005
additional patch for 2.6.12-mm2
Changes: first attempt to address lkml complains about reiser4’s VFS duplication
ftp://ftp.namesys.com/pub/reiser4-for-2.6/README
This has nothing to do with issues with Reiser4, it’s just politics. It’s evident after reading the mailing list that there is bad blood and rivalry between the Linux FS hackers.
You are correct that there is bad blood. But the kernel developer community is not going to keep an FS out of the kernel over bad blood.
The bad blood is caused by Hans’ attitude, and more importantly, the fact that he doesn’t think that Reiser4 should be subject to the peer review that every other inclusion into the kernel is subject to. (Rather like Eric Raymond’s campaign for CML2; it didn’t get very far either.)
If he could just let go of his ego for a bit, discuss the inclusion, make the requested changes, and basically cooperate, he could get Reiser4 in. The VFS layer could then be gradually, generically, updated to provide the features that Reiser4, in all its full glory, needs and we would all benefit. Other file systems could then have some of those features, too, and have them implemented in a standardized way.
But I guess it’s important to Hans, from a business standpoint, that Reiser4 be the *only* fs with those features. I don’t see that Hans’ business position is the community’s problem.
Reiser4 has some compelling features. But when these features get included into the kernel it needs to be done right, because we will be living with the consequences long after Hans’ business interests have moved on to Reiser5 and Namesys doesn’t care about Reiser4 any more.
If what you postulate happens, what will be Reiser4’s differentiating factor as compared to other filesystems? And why does Hans have to write code for or maintain Linux’ VFS?
His efforts should be directed towards Reiser FS, and not the Linux VFS nor the enhancement of other filesystems via Reiser4.
Denying Reiser4 into the mainline kernel because it supposedly duplicates some functionality in the VFS is just retarded. That’s all I’m saying.
I don’t give a damn whether Hans has an ego or not. I give a damn about not giving a technology as innovative as Reiser4 a chance in Linux as a result of envy, jealousy, rivalry, and sentimental bullcock rather than technical flaws.
For a project as large as Linux, there damn well are going to be a lot of subsystems that duplicate functionality.
In the land of hackers, logic should be the corner stone of operations. Leave sentiments and emotional outrage for the politicians. I could care less if Hans has an ego, I probably have two myself.
> If what you postulate happens, what will be Reiser4’s differentiating factor as compared to other filesystems?
This is where our viewpoints diverge. You are looking at Namesys’s business requirements. I am looking at the long-term consequences for the community.
Are you volunteering to take over Reiser4 as maintainer after Namesys loses interest in it? If so, and if you could guarantee a suitable timeframe, that would no doubt help get it into mainline.
Are you interested?
I don’t know who gave you the idea that I was a kernel hacker. The only people who can maintain Reiser and Reiser4 are the Reiser team. Random Joe Clueless cannot delve into the source code and automatically start maintaining it.
Who told you that once Reiser4 enters the kernel’s stable branch it will be maintained by hackers other than the Reiser4 team? No dude, it doesn’t work that way.
I’ll take that as a “no”.
So if Reiser4 is incorporated into mainline, and people start using it, the Linux Kernel developers are basically at the mercy of Namesys, because they have users that are dependent upon code that only Namesys can maintain? Wonderful. I think you are getting close to understanding the problem.
All that glitters is not gold, and often comes with a price. Namesys wants it to come with a price so that they can leverage it later. No thanks!
There go your unfounded conspiracy theories. I don’t have time for all that nonsense. It will shock you to know that Namesys still maintains ReiserFS version 3. I’d like to know who is paying any price for using it.
JFS maintained by IBM is in the kernel. XFS maintained by SGI is in the kernel, but Reiser4 for some reason has evil Namesys behind it so it shouldn’t go into the kernel. And oh, only Namesys can maintain it. Well duh!
Pulease! Look for a better theory.
> It will shock you to know that Namesys still maintains ReseirFS version 3.
Well, yeah. That is their current, stable filesystem. I should hope that they are still maintaining it.
It also does not suffer from the problems that Reiser4 has and can well be maintained by non-Namesys personnel if necessary.
Granted, the mainline kernel has a high barrier to entry wrt code maintainability. The distros’ standards are, for practical reasons, somewhat lower.
Have any of the distros thought that Reiser4 features were important enough to warrant its inclusion? I’m going to guess that maybe Gentoo might have?
> > But in most of the changesets on the bkbits site you can go back over 2
> > years and not see anything from namesys people. Nearly all of the fixes
> > commited in the past 2-3 years are from SuSe.
With Chris Mason’s name attached? Chris wrote the journaling support for R3
and worked for SUSE for a while (he may still?). I also remember seeing quite
a few patches run though the reiser mailing list for comment…
Namesys’s business model is to be paid by corporations to include Reiser filesystems in their products *without* acknowledging Namesys. So the people who have systems based on Reiser3 will be paying their licenses to Namesys regardless of the new Reiser4 filesystem.
No, it’s far from obvious.
As I read it, many kernel hackers have serious qualms about ReiserFS 4, and don’t want it in the kernel until those qualms are fixed. They will, after all, have to support that code once it’s in, especially given Hans Reiser’s predisposition to abandoning code after a while, such as Reiser 3.
The kernel devs, including Linus, have told Hans Reiser what they dislike and what he needs to change to get it in the kernel, but he seems unwilling because he believes there is a conspiracy to compromise his artistic vision.
The primary concerns are:
1) Non-Posix semantics for its own sake. In particular, files are directories, sort of. Except that in those file-directories are files which are not directories. So files in file-directories are non-directory files. So we now have a distinction not merely between files and directories, but between file-directories and non-directory files within file-directories.
That’s not a semantic improvement over Posix handling of extended attributes. It’s a mess.
First thing to do: Files-as-directories-sorta has to go. Implement Posix Extended Attribute semantics.
2) The layering violation caused by ReiserFS 4 plugins. Much of the functionality implemented by these plugins could be implemented at the VFS layer for use by all filesystems. That they have not been has left ReiserFS 4 with a second VFS-style layer beneath the first, only useful for ReiserFS. This is bad design and incorrect layering.
Second thing to do: Add a plugin architecture to VFS, then reimplement semantic-extending plugins in the VFS layer.
3) Separate out the “core” filesystem with semantic plugins and files-as-directories-sometimes removed.
Third thing to do: Nothing. This could and would be merged straight away, independent of (1) and (2), provided Hans Reiser this time provides an fsck that repairs more often than it screws your filesystem (fsck.reiserfs3 I’m looking at you…).
Martin
Gypsumfantastic,
I pretty much agree with all that you say but will make one correction/addition:
Linus actually likes the file as directory idea very much. But he has never seen an implementation that did not have major problems as inherent features. Reiserfs is no exception. Hans did not even realize it. Or maybe he did and was trying to sweep the fact under the rug; you can never tell, with that guy, whether he was really unaware of a problem, or was just hoping no one would notice. Peer review is *especially* important in the case of patches that come from Namesys.
> 2) The layering violation caused by ReiserFS 4 plugins. Much of the functionality implemented by these plugins could be implemented at the VFS layer for use by all filesystems. That they have not been has left ReiserFS 4 with a second VFS-style layer beneath the first, only useful for ReiserFS. This is bad design and incorrect layering.
Yes, but let us be honest, this will run dog slow; there need to be more hooks, or a hinting system which file systems have to support.
The whole Reiser4 debacle is really sad. Here we have a very modern and innovative file system that is perhaps a decade ahead of anything closed software companies like MS have to offer, and people are complaining that it is not compatible with the VFS.
Guess what: if you rip out all the cool stuff like plugins, files as directories and so on, you have killed what makes reiser4 great.
One of the most compelling features of reiser4 is the capability to do atomic updates over multiple files. But there is just no way to do something like this over VFS. So let’s just rip it out and use it like ext2 or fat32.
Linux will always be about the lowest common denominator. Really sad. Perhaps Hans should join one of the new, innovative kernel efforts such as Dragonfly BSD.
There are no “atomic updates” in Reiser4. Please read LKML; it was discussed so many times already.
> there are no “atomic updates” in Reiser4
please read LKML it was discussed so many times
already.
Could you elaborate on that? Is that yet another lie out of Namesys of which I was unaware? I guess I shouldn’t be surprised; Namesys has about as much credibility with me as Mindcraft these days.
People are free to apply the Reiser4 patches. Distros are free to apply the Reiser4 patches. Your comment about Linux being the lowest common denominator, I trust, was not intended as a troll. But it does come off that way.
For Reiser4 to get included in the Vanilla kernel will require that it be maintainable and meet certain standards.
It does have some compelling features. But they are not *that* compelling. i.e., not compelling enough for the Linux kernel to sacrifice long-term maintainability for.
I doubt Dragonfly would include it as a standard feature, either. But if they do, well… that’s their problem.
OTOH, if you put all the cool stuff into the VFS then all filesystems can do this stuff. Reiser4 remains a strong contender because of the *speed* with which it handles the underlying storage.
Files as a directory has semantic problems that need resolving – you can hardlink to a file. You can’t hardlink (safely) to a directory because it’s possible to create link cycles that violate system integrity guarantees. For this reason the file-as-directory feature needs further modification – Hans admitted he hadn’t spotted the potential for problems here.
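The link-cycle concern can be illustrated with a toy model. If directories could be hard-linked freely, the namespace would stop being a tree and become a general graph, and anything that walks it recursively would have to detect loops. The sketch below assumes a made-up representation (a dict from directory name to the directories linked inside it) and uses a standard depth-first search; it says nothing about how Reiser4 or the VFS actually store links.

```python
def has_cycle(links):
    """Return True if following directory links can lead back to a directory already being visited."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(node):
        state[node] = IN_PROGRESS
        for child in links.get(node, ()):
            child_state = state.get(child, UNSEEN)
            if child_state == IN_PROGRESS:
                return True  # reached a directory we are still inside of
            if child_state == UNSEEN and visit(child):
                return True
        state[node] = DONE
        return False

    return any(state.get(node, UNSEEN) == UNSEEN and visit(node)
               for node in list(links))


# An ordinary tree: / contains a, a contains b.
tree = {"/": ["a"], "a": ["b"], "b": []}
# The same layout after hard-linking a inside b.
looped = {"/": ["a"], "a": ["b"], "b": ["a"]}

print(has_cycle(tree), has_cycle(looped))
```

With plain files as link targets this cannot happen, because a file has no children to lead back anywhere; once a file is also a directory, that guarantee is gone.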
Trouble with Dfly BSD is that though it’s a great system, it’s BSD licensed. That’s fine but it breaks Namesys’ business model: they charge people to use the code in proprietary products (for this reason, external contributors must sign away rights to their code). If the code were BSD licensed anyway, that’s a threat to this licensing model.
I guess this is potentially another reason Hans does not want to integrate the code into the VFS layer: it takes away the value add of the core Reiser4 filesystem. I think Hans deserves to profit from his hard work but I think it would be incorrect to accept filesystems into the kernel for that reason alone.
from Nikita Danilov:
First off, my comment was in response to the message http://marc.theaimsgroup.com/?l=linux-kernel&m=111945793205480&w=2 claiming that reiser4 was “atomic”.
As it stands, currently reiser4 cannot guarantee that a single write(2) call is atomic.
Think of a system with X bytes of physical memory and 9*X bytes of swap.
User does:
    buf = malloc(10 * X);
    memset(buf, 42, 10 * X);
    write(reiser4_fd, buf, 10 * X);
As far as I know, current reiser4 code cannot guarantee that such a write is always handled as a single transaction. It is _possible_ to implement such a guarantee, I believe, but this requires a lot of work.
Thank you Anonymous (IP: 194.206.123.—).
Yet another lie out of Namesys about the fundamental capabilities of Reiser4. I wonder how many more there are left to find. It would actually be kind of fun, like an easter egg hunt, if so many people did not take Hans’ and Namesys’s claims at face value.
Does Reiser4 really not even give that level of atomicity? One of the advanced features Reiser4 can (is supposed to) do is atomic updates of *multiple* files. Can it really not atomically update a single file? I thought Posix semantics demanded write(2) calls to be atomic?