Finally something really interesting to talk about. If you’ve used UNIX or any of its derivatives, you’ve probably wondered why there’s /bin, /sbin, /usr/bin, /usr/sbin in the file system. You may even have a rationalisation for the existence of each and every one of these directories. The thing is, though – all these rationalisations were thought up after these directories were created. As it turns out, the real reasoning is pretty damn straightforward.
I’ve never made a secret of the fact that I absolutely detest the UNIX directory structure. The names are non-descriptive and often entirely arbitrary, they require a book to properly understand, and everyone seems to have their own ideas about what goes where. And heck, does it show – even among Linux distributions there’s no consistency about what goes where.
It’s a total and utter nightmare that even defiled my beloved BeOS.
Solutions so far tend to be just layers upon layers upon layers to hide the mess of the directory structure. Mac OS X is especially egregious in this regard – open a terminal and check the directory listing at root – it’s an even bigger mess than regular UNIX. Load up a finder window, and you’re looking at an entirely different directory structure.
This is like buying a beautiful car, only to realise the engine is made of cake and rotting broccoli.
Whenever I complained about the UNIX directory structure in the past, I (and those who agreed with me) were always pummelled with explanations on why it makes sense, why it’s best to spread stuff across /bin, /sbin, /usr/bin, /usr/sbin, and so on. The funny thing always was – these explanations were never particularly consistent with one another.
Late last week, I came across a link at HackerNews giving some intriguing insight into how /bin, /sbin, /usr/bin, and /usr/sbin came to be. Many of you will be surprised to learn that no, there is no divine plan behind all this separation.
Back on November 30 2010, David Collier wondered on the BusyBox mailing list why “kill is in /bin and killall in /usr/bin”. He “[doesn’t] have a grip on what might be the logic for that”. Rob Landley replied, and offered an interesting insight into all this.
The issue was that when Ken Thompson and Dennis Ritchie upgraded from a PDP-7 (on which they created UNIX in 1969) to a PDP-11 in 1971, they were confronted with not one 1.5MB hard drive, but two. Amazingly, they now had an insane amount of megabytes (3 of ‘m). The first disk contained the operating system, while the second one contained all the user stuff. This second disk, with all the user stuff, was mounted at /usr (/home was invented later).
At some point, the operating system grew too big for the first disk, and had to spill over to the second disk. As a result, Thompson and Ritchie replicated the system directory structure (/bin, /sbin, /lib, /tmp, and so on) on this second disk under /usr. When they got a third disk, they moved all the user stuff from /usr to the third disk, mounted under /home.
This forced them to come up with a number of rules, such as that a command like mount couldn’t be installed in /usr/bin, since mount was needed to mount the second disk (/usr) in the first place.
“The /bin vs /usr/bin split (and all the others) is an artifact of this, a 1970’s implementation detail that got carried forward for decades by bureaucrats who never question _why_ they’re doing things,” Landley explains, “It stopped making any sense before Linux was ever invented, for multiple reasons.”
He then goes on to detail these reasons. First, Linux already has a temporary root file system (the initrd/initramfs) that takes care of the ‘this file is needed before that one’ problem. Second, shared libraries solved the issues caused by static linking (which was the norm back when UNIX was created). Third, hard drive space hit the 100MB mark back in 1990, so small disks are no longer an issue.
“Standards bureaucracies like the Linux Foundation (which consumed the Free Standards Group in its ever-growing accretion disk years ago) happily document and add to this sort of complexity without ever trying to understand why it was there in the first place,” he adds, “‘Ken and Dennis leaked their OS into the equivalent of home because an RK05 disk pack on the PDP-11 was too small’ goes whoosh over their heads.”
In a way, this feels like vindication. All those silly rationalisations people are bandying about – they’re all made up after the fact, for reasons that haven’t made sense in at least 30 years – and heck, that never made any sense in the Linux world at all.
Arguing that the UNIX directory structure is a horrible, horrible mess that defiles an otherwise elegant system is like trying to convince a floor tile to flip over. People are so used to their knee-jerk responses about how it all supposedly makes sense, they often refuse to even think about redesigning it for the modern age. Since the geek is a proud and stubborn creature, there’s little to no chance of this ever changing in my lifetime.
“I’m pretty sure the busybox install just puts binaries wherever other versions of those binaries have historically gone. There’s no actual REASON for any of it anymore. Personally, I symlink /bin /sbin and /lib to their /usr equivalents on systems I put together,” Landley, currently working on embedded Linux, concludes, “Embedded guys try to understand and simplify…”
I’m with the embedded guys – I really like to understand things like this!
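For reference, the symlink arrangement Landley describes boils down to roughly the sketch below (my reconstruction, not his script, and definitely not something to run on a live system):

# merge the top-level directories into /usr, leaving compatibility symlinks behind
for dir in bin sbin lib; do
  cp -a /$dir/. /usr/$dir/   # copy anything that only lived in the old directory
  rm -rf /$dir               # drop the old directory
  ln -s usr/$dir /$dir       # relative symlink: /bin -> /usr/bin, and so on
done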
This also vindicates my first thought when I started using *nix – that “/usr” looks and sounds like “user”. That name never made sense to me.
/usr _IS_ (mostly) for user stuff, it just mostly became obsolete with the advent of /home. FreeBSD actually puts home in /usr/home but links it to /home for compatibility.
It makes sense to have separation between root and user, even Windows has been set up this way since … forever. Third party software belongs outside the ‘core’ of the OS, in my opinion. In Linux the lines are very blurry and not so intuitive but that does not make it unjustified.
/users
/system
/programs (or /Apps, in this day and age).
More is not needed at root.
(although I personally would like to see a /settings as well, as I argued in my proposal: http://www.osnews.com/story/19711/The_Utopia_of_Program_Management/… ).
What if your boot process forces you to put the kernel and base module on a separate volume? (e.g. you keep most of your OS on a RAID-5/6 volume, but most bootloaders have trouble loading from anything other than a plain or RAID-1 volume).
Also, where do you put site-local overrides for distribution maintainer tools (e.g. a custom version of some libraries)?
Then you fix the bloody bootloader.
Wherever you want. Seriously, you can create your own directory structure and put whatever you want there, and then you can add this to the users’ PATH ahead of /bin. There is absolutely no reason for every Linux distribution coming with a predefined place for something which is actually a very rare occurrence.
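A minimal sketch of that approach, with a made-up /site tree:

mkdir -p /site/bin                  # site-local override area; the name is arbitrary
cp my-patched-tool /site/bin/       # hypothetical locally built replacement binary
export PATH=/site/bin:$PATH         # e.g. in /etc/profile, so it wins over /bin and /usr/bin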
Easier said than done. There’s a good reason why bootloaders have this limitation – they are meant to be simple. Higher-level raid volumes can have complicated geometries and be spread over a host of interfaces and buses. Trying to support all but the most trivial configurations will result in a serious set of problems in the bootloader stage (e.g. having to duplicate the entire volume+FS layer of the OS in a rather limited space) with little to no gain.
Lots of systems I manage have site-local overrides and having them in custom locations, rather than a well defined one will result in more confusion than order. Arguably, the situation now isn’t much better and could do with some improvement, but it’s always about finding the proper mix of freedom and regulation.
saso,
“Lots of systems I manage have site-local overrides and having them in custom locations, rather than a well defined one will result in more confusion than order. Arguably, the situation now isn’t much better and could do with some improvement, but it’s always about finding the proper mix of freedom and regulation.”
That seems to be the critical issue, doesn’t it? After all, if you want to store different applications on different file-stores, they need to go in different mount points.
But in my opinion this is not the best solution, we only resort to it because linux doesn’t give us other options.
What we need is the ability to store different applications in the same directory on different file-stores. By keeping storage and organization separate, we’d be free to use the best organizational structure for us without overloading it for storage based requirements.
All programs would use the exact same directory structure whether they’re distro-specific or user-specific, and whether they’re installed on the local system or the office lan, etc.
Unionfs technologies enable these kinds of setups. While they’ve been around for many years as kernel patches, the linux admins steadfastly refuse to merge them into the stock kernel.
The best one could do to create a unified directory structure without a unionfs is to use the separate mount points as we do today, and then create a script to symlink all the apps individually into a common namespace.
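Such a script can be tiny; something along these lines (every mount point here is made up):

# link each application's binaries from several file-stores into one shared namespace
mkdir -p /apps/bin
for f in /mnt/local-apps/*/bin/* /mnt/lan-apps/*/bin/*; do
  ln -sf "$f" /apps/bin/"$(basename "$f")"
done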
I rather like /boot mounted to separate hardware. I tend to drop it on an SD card since pretty much every machine now has an SD reader. If the machine is dual-boot, I can pull the SD and leave the HDD MBR booting to a secondary OS. If the machine is encrypted, I can pull the SD and leave the machine without any boot partition; given that the boot partition is normally unencrypted, that also means not having an unencrypted partition sitting on the HDD that’s supposed to be fully encrypted.
There is also grief with Win7/TrueCrypt and Debian/LVM-encrypted dual-boot. TrueCrypt won’t see the Debian bootup but will kick off the Win7 bootup. Debian’s boot loader won’t see TrueCrypt/Win7 but will see Debian’s encrypted LVM partitions. Rather than wait for the developers to “fix the boot loader” I simply boot Debian from an SD card and use the BIOS menu to select the HDD boot when I want the Win7 environment.
I think the biggest pain in my years of dual-boot systems has been working around an OS that imposes a single boot loader on the drive instead of allowing a boot partition from any mountable media. Heck, Windows even gave grief when dual-booting NT/Win98, let alone NT plus a non-MS OS, and you can always count on Windows overwriting the boot sector with its own loader, so you’d best have GRUB/LILO handy on removable media or you’ll be fiddling with SuperGrub trying to get back into and fix the proper multi-boot menu loader after Windows kindly screws it up for you. (Because in Microsoft’s world, no one would ever dual-boot a non-MS OS, so they may as well just kill off whatever the user had there already.)
A /boot partition or something similar would also be necessary if you want it mounted read-only.
The problem with dumping files into a less ‘structured’ hierarchy is that you limit versatility and customization options. Is it better to have 500+ executables in one directory rather than ORGANIZING (ie compartmentalizing) them?
I think the best solution lies somewhere in between over simplifying and over complicating.
That looks a lot like:
Documents and Settings
Windows
Program Files
Just sayin’
Windows 7 is closer:
\Users
\Windows
\Program files
\Program files(x86)
And the Windows directory layout is a paragon of logical thought?
Why is the ‘hosts’ file buried in WINDOWS/System32/Drivers/etc?
‘System32’? Of a 64-bit OS?
I’ve been using a variety of operating systems since DEC DOS V8 on the same PDP-11 as K&R: VMS, i/OS, through to the likes of Android and iOS.
I have to say that the LSB version of ‘where things are put’ works for me.
Finally, I’d really like to shake the hand of whoever thought it would be neat to default some critical directories (eg Program Files in windows) to contain a SPACE character.
I’d give them a ‘Glasgow Handshake’.
Frankly every OS has its oddities.
You can blame programs that hardcode paths like System32. Yes, in 64-bit Windows, System32 holds the 64-bit system files and SysWOW64 holds the 32-bit system files. But it wasn’t a good technical design decision, it had to do with crappy 3rd party software. Since Microsoft is in the business of actually allowing 3rd party software and drivers to work consistently, they have to make compromises like this (and this one isn’t a problem, just an oddity).
Mac OS X Since 10.0 :
/Users
/System
/Applications
Yeah, and those that actually have to dig through file locations occasionally find the “windows way” a nightmare! I need to find where a file is on Linux? There are a ton of ways. I need the actual location of an executable? Use ‘which <program>’ and it’ll show you the full path. You need to know what package a file belongs to? dpkg -S <filename> (at least on Debian based distributions.)
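For instance, on a Debian-ish box (package names may differ per release):

$ which perl
/usr/bin/perl
$ dpkg -S /usr/bin/perl
perl-base: /usr/bin/perl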
The Windows way is a mess, because while you have Program Files or Program Files (x86), not all installers default to put things there, and even then you can change the default so not all systems will have files in the same place…. Then you have the fact that some installers want to put the company name in there, so you end up getting a huge mess. A good example is Steam. C:\Program Files (x86)\Steam\SteamApps\common\rainbow six 3 gold\Mods\ is where I have to install mods to. Not all mod installers detect where it is installed, so I manually have to drop files in there.
Far easier to have /usr/games/doom/ and throw the files in there, wouldn’t you say?
Just because mod creators are lazy doesn’t mean that we all should be.
99% of all installers let you modify the installation path … just saying.
Yeah, but doesn’t that kind of negate having a simplified path? This was my point, if you manually have to track down where the data goes, then there is failure on the part of the installer in the first place.
I have re-read your original comment 3 times.
Pretty much every installer I use puts all its files under the installation directory. Modern programs put their user settings in \Users\<username>\AppData.
If it has the same “bitness” as the installation then it goes in Program Files, if it is 32bit … Program Files(x86). This seems pretty straight forward to me.
I don’t really get what the problem is.
The shit thing on 64-bit Windows is for compatibility: using System32 for 64-bit programs and SysWOW64 for 32-bit programs.
You wouldn’t say that if you weren’t so lucky to be using an English version of Windows. Over here in Norway, half the programs install to “C:\Program Files” and the other half to “C:\Programfiler”. Depends on whether the guy writing the installer used the proper API to get the directory or just hardcoded the path.
Not really the fault of the system though.
Is that a bad thing? Not if you ask me.
Well, then take a look under them. And you’ll see how some of us take the linux “nightmare” anytime over the above.
Looks like Mac OS X mixed with OpenStep to me. I actually prefer /Users to /home, and love having GUI apps installed in /Applications on OS X, but moving /bin /sbin and so forth is just going to cause problems with shell scripts and other apps that expect things to be in certain places.
Add /config, /logs and /services to that (aka /srv, what /var was renamed to some time ago).
You don’t want service data (databases, uploads, VMs, repositories…) to sit with the program data. Those subvolumes/partitions/disks have different backup policies, mount options and SELinux policies. Merging the two would be a security issue. It would also make it impossible to use an alternative medium like a SAN or NAS.
As for config, you don’t want it spread across the system. /etc is a terrible name, but the concept is good.
/logs is quite obvious (it fits in none of the categories you described above, and it would be stupid to force it into one).
I’d actually include /opt or /optional or /thirdparty.. whatever name makes sense.
You have /system stuff that is core OS/distro.
You have /programs stuff that is user application level stuff not core but still delivered through the distribution.
You have /opt stuff that is completely third party; not delivered through the distribution.
and of course /home for your users and maybe /root cause root should still be segregated.
Between tarball installs and third-party distro package installs, it’s nice to have an /opt tree to dump them under instead of mashing them in with system/apps managed by the distribution repositories. An example would be Splunk installing under /opt because its installer is provided by the company directly, not through a distribution repository.
Thankfully more people are catching on to the fact that perhaps there’s a bit of a legacy mess here. I think Fedora 17 has the right idea here:
http://fedoraproject.org/wiki/Features/UsrMove
That’s the best approach I’ve heard suggested that isn’t something alienating like GoboLinux.
So why to /usr/bin and not to /bin then?
I’ve answered this here: http://www.osnews.com/permalink?505180
But in case you also want to hear it from more official people: http://www.freedesktop.org/wiki/Software/systemd/TheCaseForTheUsrMe…
It simply makes more sense to do it in /usr; that way it is possible to contain everything of that nature on its own separate file system that can then be snapshotted, shared and mounted however you need it. Symlinking /bin, /sbin, etc. into /usr gives you additional compatibility as a freebie, since everything will exist in both locations.
Sounds good, but the name itself is totally obsolete. It could be renamed from /usr to /system or something, but I guess legacy weight is too big for it.
Don’t break things if you don’t have to. You can always present it in a nicer way in the GUI, but don’t break things if you don’t have to..
You can see problems with it already, with scripts hardcoding /usr/bin and so on. Normally you don’t have to change anything since it already works; it’s just that at some point the legacy clutter becomes too numerous, obscuring the big picture.
Yes, and we who are Windows users know this well. Glad to see we’re not the only ones dealing with it. One thing we all seem to have in common is that backwards compatibility = the bane of our existence.
Sounds good, but the name itself is totally obsolete. It could be renamed from /usr to /system or something, but I guess legacy weight is too big for it.
Why? What is wrong with Unified System Resources? 😀
That was my thought as well, and it was answered pretty well in the wiki FAQ linked above a couple of days ago. Apparently, someone changed the explanation.
The old explanation was that everyone else that had fixed this issue had moved everything into /usr, so to be compatible with all of them (mostly Solaris), they were going to follow suit.
That made sense to me.
While I approve of Fedora 17’s idea, I still think that GoboLinux has the best idea thus far. Why make things cryptic and messy on purpose? In GoboLinux you know the programs are stored in /Programs.
tuaris,
Thank you so much for the pointer to GoboLinux, it’s a great concept!! In my own distro I faced many similar challenges when I tried to modernize the legacy paths since so many applications hard code them.
I found using symbolic links worked, but that’s very unclean when the namespace is cluttered with both the old and new paths. I see that GoboLinux solves this with a new kernel feature called GoboHide, which provides an element of backwards compatibility while cleaning directory listings.
http://gobolinux.org/index.php?page=doc/articles/gobohide
Personally, I think K&R made a mistake when they decided that names starting with a dot should be hidden. It would have been much more flexible to have the hidden flag as an attribute you could set on any file or folder. GoboHide would then not be necessary.
Fedora 17 is actually the reason for the LWN article, the Hacker News article, etc.
In a way that makes even less sense. Why not just put everything in /?
Maybe it’s “made up after the fact”, but bringing order into chaos is laudable goal. It’s too bad the Linux folks (and SysV folks in general) never “got it”.
Just look at the hier(7) man page on FreeBSD to see a hierarchy that makes sense:
* / is for the OS, what’s needed to boot
* /usr is for the OS, what’s needed after boot (can be NFS mounted)
* /usr/local is for third-party apps installed by the user
Nice and clean, and makes sense. bin directories are for normal user apps. sbin directories are for system admin apps.
But, FreeBSD has a clear separation between “OS” and “third-party apps”, which Linux doesn’t have.
The unfortunate side-effect of this change is that the /usr/bin directory is going to be absolutely *HUGE*, as every single application installed ends up there (from OS utils, to Xorg, to KDE, to Firefox, to Apache, etc, etc, etc). And there won’t be many sub-directories.
It’s bad enough already, but this is just going to make it worse.
The way I had always seen it explained across all *nix systems was that /bin and /sbin were for core files that were required to boot (throw /boot in there too) anything else is ‘userland’. For the record, Debian and almost every other Linux distribution is the same as what you’ve shown there with FreeBSD. /usr/local is for source compiled (Third party) software. Whereas anything that is packaged and distributed with the distribution (handled by the package manager) goes into the proper /usr/bin, /usr/sbin with the data going into /usr/share/<packagename>. Debian’s packaging guidelines are pretty strict about this.
For the record Thom, I had always heard that this was the way it was meant: that /bin and /sbin had run out of space, and /usr was for mounting other drives (or doing network mounts later on). That’s the true beauty of the Unix file system hierarchy.
AND /bin and /sbin had statically linked executables, so that you can’t corrupt the essential part of the system by installing a malicious .so (still true on the BSDs but no longer true on Linux, where the ideology is to have everything possible dynamically linked, in spite of the fact that vendors’ insufficiently-tested .so’s break mission-critical apps much too frequently).
The part about statically linked executables isn’t true these days on all Linux/Unix systems; the point is simply that those binaries are built without dependencies on /usr.
Then again in Linux you just have an initrd these days anyway.
Except that in the BSD’s /usr/local is for everything that comes as a package so it’s not like Linux.
I also like the way FreeBSD makes a separation between the base OS and the extra stuff. However, I think they should look at Fedora and try to come up with something similar that cleans up the filesystem while still maintaining the current elegance of this separation.
The side effect of merging everything into /usr (and it becoming huge) will bring to light the deeper filesystem design issues with Unix/Linux. The developers are taking things one step at a time, which in this world seems to be the only way to get from point A to point B.
I expect/hope Fedora will one day end up with a filesystem that takes the best parts of GoboLinux, FreeBSD, and OS X.
/system – everything that is the base operating system, with {/etc /var /usr /run} contained within.
/home – user data
/programs – programs that are not part of the base operating system, with each program having its shared libraries and stuff contained within its own directory. (disk space is cheap these days)
/settings – settings for installed programs
Fedora has the right idea in mind when they want to simplify the Linux hierarchy, but why put everything in /usr? Historically /usr was for the user directories, but we’ve got /home, which is now the de facto standard. Why not put everything back in /bin, /sbin, /lib, etc.? At least you wouldn’t need to type ‘cd /usr’ every time you need to go somewhere.
Because that would be more work and less compatible. By putting everything into /usr, you have a *single* directory to concern yourself with when it comes to mounting, sharing and snapshotting an image of the current operating system’s binaries.
By symlinking in /bin, /sbin, etc. from root into /usr, you also make sure that files automatically exist in both locations for free which actually increases compatibility with scripts and such that make assumptions about locations.
Doing the inverse of this means you have to not only deal with making sure /usr/* still exists in some form for compatibility reasons, you’re also producing more clutter in root and you’re adding to the overhead of attempting to manage a single image of the current OS state since they no longer could exist on the same separate file system.
Uhm, why would snapshotting / be any harder than snapshotting /usr? Snapshots happen at the filesystem level, and if everything else is a separate filesystem (/home, /var, yadda yadda) then / is a filesystem all to itself … so why would snapshotting it be hard?
And if you want to talk about polluting the / directory, have a look at all the virtual filesystems that need mountpoints. Just have a look at the output of “mount” on a Linux system these days. There’s a good 8 or so virtual filesystems that all need mountpoints in / (/proc, /sys, /dev, various tmpfs, etc).
You’re not understanding the problem domain here, this is more about maintaining a collection of machines that all share the same set of binaries at the same patch level, etc. Snapshotting and sharing / includes things such as /etc and potentially /var which would have to be excluded in a lot of cases since they’re not generic enough to want to share across machines. There’s plenty of situations where you will want to maintain the binaries separate from the configuration, such as hosting scenarios.
But hey, you know, no one working on Solaris or Fedora or anything has put any thought into this at all and they’re just doing it for kicks really.
I understand the problem domain perfectly. I support over 3000 diskless Linux stations as my day job. And every day I fight with the asinine Linux directory structure (coming from a FreeBSD background where things make sense).
We share out / via NFS, including /etc. And /usr, /var, /home are also shared via NFS. And other filesystems.
If you’re going to amalgamate directories, then leaving / with a bunch of empty directories and symlinks is not the way to do it.
I’ve read all the mailing list threads about this project, and all of the reasonings given boil down to “this is how we’re doing it, deal with it”. There’s no actual, good, solid, evidence-based reasons given.
Yes, Linux filesystem hierarchy is a mess. But this doesn’t do anything to help fix that.
No, you really don’t seem to understand this. You’ve yet to present a SINGLE compelling reason why this should not be done and how a merged /usr collection is somehow damaging. Instead, you present credentials similar to my own as if I’m supposed to bow to your authority and then counter with ‘nuh uh’. Thankfully no one has to listen to you.
Which Linux distributions are you referring to when you say that the file system hierarchy is a mess? I use mostly Debian, and from what I understand, it sounds like it’s just like FreeBSD’s. I’ve been trying to get a BSD system installed lately, but for one reason or another the install seems to fail (I was trying PC-BSD 9 recently under Linux KVM and it choked during the install process). But I have played a little bit with FreeBSD 8, and the only real difference I noticed was that the device names are totally illogical (coming from a Linux standpoint). ports is cool though.
User-installed applications get put into /usr, along with all the “OS” utilities. Some user-installed apps end up in /opt. Which goes where is completely random.
Every single configuration file for the OS utilities and user installed apps is jumbled together into /etc. And then split between /etc/default, /etc/<appname>, /etc/<appname>/conf.d-style directories, and a few other places.
Log files are scattered around /var/log, /var/run, /var/lib, /var/<appname>.
The Linux filesystem hierarchy is a mess.
That would make more sense, in that the Linux initrd/initramfs has taken over the “goal” of the / directory (enough filesystem and utilities to boot the OS), which kind of makes / redundant. It would make more sense to move everything from /usr to /, rather than the other direction.
I can just see in Fedora 18 or 19 where the directories directly off / will be empty, which just seems pointless.
If they’re going to “optimise” and “streamline” things, then at least do it right.
Of course, that would break just about every piece of software out there that expects to install to /usr. But since when has Fedora ever worried about not breaking things needlessly?
I don’t think it makes a big difference whether it is / or /usr, but I’ll add my two cents anyway:
Long run / for real files (Fedora 18,19, maybe 20):
/boot
/etc
/home
/usr
/var
that looks pretty tidy _to_me_, but I’ll admit it is not obvious what all the dirs do, but it can be explained in a few sentences if you are interested. People with no interest will never get any filesystem PERIOD
Cluttering that with runtime dirs and bin, games, include, lib, libexec, local, sbin, share, src and tmp really makes it a mess.
Everything else will be either not as flexible or will still need explaining.
Which would be mostly proprietary software which can’t be rebuilt. The bulk of open source software that I’ve seen can be fixed by telling the build system where to put it. I don’t see much of any reason why software shipped with a distribution would be affected much at all by this change.
Most closed source software that I’ve encountered doesn’t care where you run it from so long as it can find some libraries. A shell script wrapper to set LD_LIBRARY_PATH or whatever and off you go. It’s probably more likely that your binary only software will break due to an ABI change.
At least that’s been my experience (happy go lucky as it may seem).
I’m neither for or against it. IMO there’s bigger things to fret about.
The idea is that you can snapshot /usr separately from /var, /etc, /boot, etc.
If /boot, /var, /home, and /usr are separate filesystems, then it means that / is also a separate filesystem. Why would snapshotting one filesystem be easier than snapshotting another filesystem? And why wouldn’t you want to snapshot the application configuration files along with the applications (meaning, /etc along with apps under /)?
No, this is for when there is just one filesystem (“partition” is too technical for most users, and the technical users always get the sizes wrong).
With btrfs for example you can snapshot directories.
So for example when you do distribution updates/upgrades and do automatic snapshotting of the binaries in /usr.
There could be data in /var you don’t want to undo when you restore the binaries.
Something like that? Anyway, Fedora has a plan. Have a look at the site: “The merged directory /usr, containing almost the entire vendor-supplied operating system resources, offers us a number of new features regarding OS snapshotting and options for enterprise environments for network sharing or running multiple guests on one host. Most of this is much harder to accomplish, or even impossible, with the current arbitrary split of tools across multiple directories.”
http://www.freedesktop.org/wiki/Software/systemd/TheCaseForTheUsrMe…
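If /usr does live on its own btrfs subvolume, the upgrade case could look roughly like this (paths invented, shown only as a sketch):

# keep a read-only snapshot of /usr around as a rollback point before upgrading
btrfs subvolume snapshot -r /usr /snapshots/usr-before-upgrade
# ...run the distribution upgrade...
# if it goes wrong, the snapshot can be promoted back from a rescue environment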
Yeah, it may be impossible to do with the current split. But moving things to /usr instead of leaving them in / is just wrong.
I’ve read through the mailing list threads, the wiki articles etc. And all of their reasoning for moving to /usr applies just as well to moving to /.
That’s what I thought too. Consolidation makes sense, but why further the / root and /usr split, if that was the historic problem to begin with?
And if I have a look at my root directory:
bin cdrom hardy initrd.img.old lib64 media proc sbin sys usr vmlinuz.old
boot dev home lib lost+found mnt root selinux tftpboot var www
build etc initrd.img lib32 maverick opt run srv tmp vmlinuz
And then at my /usr:
bin etc games include lib lib32 lib64 local sbin share src
It’s clear which the appendix is. And if you were to move the /usr stuff onto main (nowadays have only one partition anyway), then there would hardly be additions.
But it also becomes apparent that “/games” is kind of redundant. There’s little point in having a separate binary directory for one type of application. That is just more historic cruft (namely, to shame those users who have games installed). Likewise, /include should actually be part of /src. And I’m not entirely sure about /share, but that should probably go into /var anyway.
http://wiki.debian.org/FilesystemHierarchyStandard
Maybe the filesystem isn’t used the way it was decades ago, but there are certain use-cases where the original multi-disk split still does make sense.
Btw: your example of kill/killall is particularly badly chosen. “kill” does the same thing on every *nix, but “killall” has wildly different meanings. Try “man killall” on Solaris and see for yourself that it’s not always a synonym for “kill `pgrep <name>`”.
Now I will agree that the names are somewhat cryptic – it takes a bit getting used to, but just like every other system developed largely by academia, there is a somewhat steeper learning curve, mostly because of jargon. But once you get past that, it starts to make sense (mostly).
There are use-cases where having a separate /bin, /usr/bin and possibly /usr/local/bin is beneficial – diskless network-booting machines spring to mind. A minimal kernel+initrd via PXE contains / and most everything else is loaded from NFS mounted at /usr. The /usr/local environment is provided for site-customized versions of stuff in /usr, making it easy to keep the original distro stuff separate.
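On such a diskless client the fstab tends to look something like this (server and export names invented):

# /etc/fstab on a hypothetical diskless workstation
nfsserver:/export/usr        /usr        nfs  ro,nolock  0 0
nfsserver:/export/usr-local  /usr/local  nfs  rw,nolock  0 0
nfsserver:/export/home       /home       nfs  rw,nolock  0 0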
Again, I’m not saying that the *original* reasons why Unix had those were very good reasons to begin with, but bad solutions can adapt to new problems and become good solutions to those.
Hence the development of pkill/pgrep. Much nicer to use than kill/killall.
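Typical usage, for those who haven’t switched yet:

$ pgrep -l sshd        # list PIDs and names of matching processes
$ pkill -HUP sshd      # send SIGHUP to every process named sshd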
One could even argue that the Solaris version is actually much truer to its name, since it does what it says on the cover: kill all active processes.
“Since the geek is a proud and stubborn creature…”
Does Mark Shuttleworth come to mind? Hint: unity, HUD etc
I think one of the next Fedora features is to rearrange the directory structure moving the entire thing to /usr, and it’s 50% done by now.
Anyone else looking forward to StaLi: http://sta.li/
Yep, cool stuff. If you like Stali, I think you’ll love this:
http://www.usenix.org/event/hotsec10/tech/slides/suzaki.pdf
While it’s hard to argue with why the inventors of UNIX did it a certain way, the proliferation of it is actually simple, and quite sane as well.
It goes back to the day when people used NFS extensively for workstations. Here’s what they were all used for:
/bin – Local disk, dynamically linked binaries (against libraries in /lib)
/sbin – Local disk, statically linked binaries
/usr – NFS share
/usr/bin – Shared binaries (apps..) dynamically linked against libs in /usr/lib
/usr/sbin – Statically linked binaries
Basically, /bin and /sbin contained what was needed for system maintenance, and enough programs to mount the NFS share.
Before you mounted the NFS share, /usr simply was not there.
It annoys me to no end that many Linux distros today put dynamically linked binaries in /sbin…
But, I digress.. The usefulness of this layout got outdated the moment cheap large-capacity harddisks entered the market.
It is annoying, supposedly Linux distributions no longer include static binaries for /bin and /sbin.
They were typically required on Unix systems for single-user mode.
Someone told me that static glibc has been broken for many years now.
No wonder Linux folks don’t understand the reasoning behind /{,s}bin and /usr/{,s}bin.
Statically linked just isn’t as useful anymore, with chroot and initrd on Linux you don’t really need it.
That is why it kinda got into disarray.
When your mission critical apps die because some *&*$)&^ vendor screws up testing on a revised shared object library, you damned well wish you were able to have a fully tested statically linked version.
Always install vendor apps in a jail, zone, LXC container or whatever it is called on your system, and you are done.
By vendor I meant RedHat.
Sorry, that solution doesn’t work for them ;-(
You’re installing patches into a mission-critical system without trying them on a test machine first?
About gobolinux:
http://www.gobolinux.org/
A really nice look at system directories…
On at least BSD, there is the hier(7) man page that documents the filesystems hierarchical layout.
/bin == statically linked binaries, standard unix programs.
/sbin == statically linked binaries, system utilities, typically intended system administrators.
/usr/bin == shared binaries, base programs for users.
/usr/sbin == shared binaries, base system utilities for users.
/usr/local/bin == shared binaries, 3rd party user programs.
/usr/local/sbin == shared binaries, 3rd party system utilities.
It’s not that difficult, and it’s pretty easy to understand.
http://www.openbsd.org/cgi-bin/man.cgi?query=hier&manpath=OpenBSD+C…
http://www.freebsd.org/cgi/man.cgi?query=hier&manpath=FreeBSD+9.0-R…
http://netbsd.gw.com/cgi-bin/man-cgi?hier++NetBSD-current
http://developer.apple.com/library/mac/documentation/Darwin/Referen…
By the way, Debian GNU/Linux and Arch Linux have ‘man hier’ as well.
I still think that anyone who has ever complained about being confused by the FHS has not done five minutes of research into the WHY of it. It makes perfectly logical sense. And is by far one of the cleanest file structures I’ve ever seen, and I’ve seen a lot of them.
Take the Amiga for example; it’s got a decent file structure for the most part: Libs, Devs, C, L, S, etc. Now anyone who has not used an Amiga would think “what the hell is L for?” Wait, I don’t even know right now. Can’t think of it, but it’s laid out in a fairly organized manner. Windows on the other hand is a horrible mess. Not because of Microsoft, but because of the third party applications. The Linux distributions (and FreeBSD) are good because the package management keeps things clean.
Not everything in /bin and /sbin is statically linked on all the BSDs. For instance FreeBSD and MidnightBSD have quite a few binaries with shared libraries in /bin. /rescue is supposed to be the safety net for this sort of thing.
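A quick way to see the difference for yourself (output abbreviated, and it varies by system):

$ file /rescue/ls      # FreeBSD's statically linked fallback tools
/rescue/ls: ELF 64-bit LSB executable, ... statically linked ...
$ file /bin/ls         # typically dynamically linked on Linux
/bin/ls: ELF 64-bit LSB executable, ... dynamically linked ...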
Well, first let’s recap that standard structures do not always come from a well thought out process at their beginning. Most of the time they are decided upon out of a need to make interoperability work once a “critical mass” is achieved in some activity, or when technical or practical knowledge demands them for safety reasons. The latter cases are not prevalent in what we are talking about, at least a priori.
So, it really does not matter how it started but how it will guarantee that its goals will be fulfilled.
Now, going back to the proposed merge of bin/* to usr/bin/*.
I know that the Fedora (Red Hat) people do a lot of the hard work to make a working and modern system of FOSS software (linux + gnu + projects), but frankly I think they should grow a more respectful attitude towards the FOSS community. Look how this “fix it now” posture created a lot of problems that took “forever” to be finally solved and wasted too much manpower. HAL and PulseAudio (ok, I know, first rushed by Ubuntu folks), just to name a few, are terrible examples of this.
We already crossed that point when “fix it now” was a must. Our systems already work reasonably. We already have the knowledge and already achieved the critical mass point. It is time to do things properly and have open discussions about possible solutions for our problems like they have on every other engineering field, be our concerns related to interoperability, safety or whatsoever it may be.
If anyone wonders, I really prefer a more structured approach like they have in the Debian guidelines and FreeBSD. I see this “dump everything in /usr/bin” as a rushed solution rather than a properly built one.
Really… Am I the only one who thinks that all of what Thom said is correct? I mean, the directory structure on Unix and Linux is shit. Really. That is the only word precise enough to describe it.
I mean, what if you made it like:
/apps
/user
/systems
How easy would it be to build on that? I mean, there is no good reason to keep a directory system like the current one, with virtual links for programs and the need for the /usr, /bin and so on stuff.
Yes, you are the only one
The symbolic links for everything that Fedora is doing are retarded. Every package-managed program goes into the proper /usr/bin, the data into /usr/share/<packagename>/ and documentation goes into /usr/share/doc/<packagename>; how hard is that? Then anything you compile yourself gets put into /usr/local/bin, with data going to /usr/local/share. Granted, you can change this in the configure scripts, but that is the default.
Makes total sense.
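The defaults of a typical autoconf build reflect exactly that split:

# self-compiled software lands in the /usr/local tree by default...
./configure                   # implies --prefix=/usr/local
make && sudo make install     # binaries -> /usr/local/bin, data -> /usr/local/share
# ...while distro packages are built with the prefix overridden:
./configure --prefix=/usr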
I’d say the only ones who have problems with the fs hierarchy in Linux are those whose only agenda is to look at the fs. The rest of us are pretty much fine.
Only the people who don’t read books ignore the difference.
It does make some sort of sense to split /bin and /sbin (or whatever you want to call them: e.g. /userbin vs. /systembin), so that normal users get /bin in their PATH and root gets /bin and /sbin in their PATH.
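In practice that split shows up in the default PATHs; on a Debian-style system they look roughly like this (values illustrative):

user$ echo $PATH
/usr/local/bin:/usr/bin:/bin
root# echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin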
/usr/bin and /usr/sbin make no sense now of course plus most systems don’t partition /usr separately either, so this “binaries won’t be available if a non-/ partition isn’t mounted” issue doesn’t really apply nowadays.
To be honest, /bin and /sbin make sense to me both from the user vs. system split and also that they are about the shortest name they can be that remains somewhat descriptive of their contents.
I’ve never been too concerned about file system paths to be honest – the fact that Linux doesn’t use ludicrously archaic drive letters for mount points already makes it far superior to Windows 🙂
So this is all fine; new installs can do the new stuff.
What i really dislike are the current rolling distributions that are basically screwing up folks who have /usr on a separate partition. Arch’s latest antics are to install modprobe on /usr/ instead of /
The funny part is my initrd builder somehow got reset and I ended up with a trashed system, since my ‘/’ mounted but I had no way to load up modules from ‘/’.
IMHO, FHS seems to be among the last things that hold highly volatile s***e like Fedora and Ubuntu together these days. Break that and what is there to be Unix-like anymore? Wonder if other distributions will follow.
It is somewhat funny that everyone is yelling about old stupid standards, when talking about a directory structure that even today makes sense … while using a QWERTY keyboard.
How do you know they aren’t using Dvorak or Colemak anyway?
For most users, the file hierarchy doesn’t matter in the slightest. The package manager stuffs files into the appropriate places and things like PATH/LIBPATH/MANPATH make it transparent to the end user. And that is assuming that you’re using a shell. It matters to GUI users even less since everything is presented in a tidy sort of way by the DE. So if you’re thinking about it and you’re not in one of the following categories, you’re thinking about it all wrong.
Developers, well, it doesn’t really benefit them but they just have to deal with it. Besides, it’s probably a lot easier for them to deal with a complex file hierarchy than it is for them to deal with an ever changing file hierarchy. This is something you’d notice as soon as the file hierarchy changes and things just stop working.
The people who need the complex hierarchy are systems administrators. Having binaries in separate places is useful because it allows the file system to be split up. This is something that we discovered historically, and sometimes it is still useful so we keep it. The utility ranges from a read-only OS partition to network shares. Heck, even Mac OS X has (or at least had) provisions for this. That’s why there are subdirectories under /Users that serve the same function as subdirectories under /System and (IIRC) there are provisions for network mounts too.
Remember, not everyone runs a completely up-to-date simple-minded system.
Not everybody even runs just one system: some of us have to deal with a flock of mis-matched servers (with a variety of distributions, yet), as dictated by the PHBs and bean counters, to which we don’t even have root…
Having separate /bin /sbin /usr/bin /usr/sbin folders is a very important thing. It separates critical system binaries from other computer applications.
Merging them is like saying let’s dump everything in C:\Program Files\ on Windows XP (the last Windows I ever used, back in 2004) into C:\Windows\.
It is plain stupid. Let’s not keep changing things just because it makes us look cool.
you are so… or you just play dumb?
http://www.mactech.com/articles/mactech/Vol.17/17.08/Aug01MacOSX/in…
About 3600 words for an introduction on a directory structure.
Thanks for proving my point.
And to use MacOSX on a day to day basis (I worked with 4 people that use macs), knowing the directory structure is not necessary.
You are an Apple hater. Mac OS X has a beautiful and elegant files and folders organization. But you hate Apple too much to see this.
regarding:
“The Unix-like directories are there because so much of Mac OS X (and the software that runs on OS X) requires these directories to exist. …
Apple does not encourage developers to use these directories, but provides them for compatibility with other Unix-based operating systems.”
http://www.mactech.com/articles/mactech/Vol.17/17.08/Aug01MacOSX/in…
http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard
From the first time I read that, it seemed simple enough to me.
You might think Gobo Linux brings order, but you aren’t taking into account that it carries the “legacy” structure too, it just hides it.
http://wiki.gobolinux.org/index.php?title=The_GoboLinux_Way#The_.22…
That’s not simplifying things, it’s just sweeping things under the carpet. Hiding complexity is not the same as removing complexity (though of course it has its place). If you actually want to understand the system you still have to understand it, and with a second system as well, now there is more to understand.
The Unix Filesystem Hierarchy Standard comes from a tangled history, it may be true, but it’s been morphed into something sensible and more importantly, standard. Yes, usages have evolved, but that’s only right and proper to put things to good use.
You try and reinvent a standard, you often just end up with two systems because you have to support the legacy. Which is only worse if not everyone sees it as “legacy”.
(Wayland is wonderful, I love it, but it won’t ever kill X, but with X running on top, it makes X actually better!)
This Fedora change is small, but to me pointless (or worse), not even a carpet sweeping exercise.
The Debian multi-arch change, now there is a file hierarchy change I can get behind. In fact, as someone who has done a bit of cross compiling for ARM, and runs a x64 system (thus i386 too), it’s very exciting. I hope it gets pushed out to not just other Linux distros, but to other *nix distros. Maybe even an addition to the standard, you never know!
jabjoe,
“You might think Gobo Linux brings order, but you aren’t taking into account that it carries the ‘legacy’ structure too, it just hides it.”
Of course you are correct, but solving that is a real conundrum. Unfortunately one is required to maintain legacy paths whenever compatibility is required since the old paths are hard coded in binaries and libraries throughout the linux software codebase. The *inability* to change them makes the status quo ideologically imperfect, even those who don’t mind the legacy paths should recognize this.
In my personal distro, I’ve given it some thought and come up with two possible compatibility solutions. Keep in mind, a motivating goal is to keep application-specific files together so that, as with GoboLinux, applications can be installed by untarring a single directory. A list of commands can be built using symlinks pointing to their application directories.
Solution A. Remove the dependency upon legacy paths by modifying the kernel and/or libc to treat file requests as “named resources” instead of absolute file system paths (see the toy sketch after the list below). This mapping could be stored in either a system-wide or application-specific named resource map. This way, when an application requests “/etc/resolv.conf” or “/usr/bin/perl” or “/dev/ttyS0”, it could be mapped to the actual file wherever it is located. The mapping could be optional and revert to previous behavior as needed.
This would provide two immediate benefits:
1. The dependency on fixed legacy fs paths is eliminated (or rather it’s internalized to the binaries/libraries).
2. It could provide excellent documentation about what the external dependencies are for any given application, and makes it trivial for administrators to change them per application without recompiling a thing.
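To make the idea concrete, a per-application resource map might look like this toy sketch (purely illustrative; none of these names or files exist anywhere today):

# /programs/someapp/resources.map  (hypothetical)
resolver = /config/network/resolv.conf
perl     = /programs/perl/bin/perl
serial0  = /dev/ttyS0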
Solution B. Run applications inside an automatic CHROOT environment to mimic the environment they expect.
Basically, the chroot contains a fake root directory layout which symlinks to the actual directory layout and can do so without polluting the actual directory layout with legacy paths.
chroot/hostfs — mount --move of /
chroot/bin/ -> hostfs/app/bin
chroot/sbin/ -> hostfs/app/bin
chroot/usr/bin/ -> hostfs/app/bin
chroot/etc/ -> hostfs/config
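In shell terms, setting that up might look roughly like this (all paths hypothetical, and using a recursive bind mount instead of mount --move):

mkdir -p /chroot/hostfs /chroot/usr
mount --rbind / /chroot/hostfs          # expose the real tree inside the chroot
ln -s hostfs/app/bin /chroot/bin        # legacy /bin resolves to the app's directory
ln -s hostfs/app/bin /chroot/sbin
ln -s ../hostfs/app/bin /chroot/usr/bin
ln -s hostfs/config /chroot/etc
chroot /chroot /bin/legacy-app          # the app now sees the layout it expects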
The problem with solution B, is that while it works in a compatible way, users of these apps will see the legacy paths instead of the hostfs paths.
Or Solution C, KISS, just stick with the standard as you are going to have to anyway.
It’s not unreasonable that things have hard-coded paths to the standard locations. The problem isn’t those following the standard, it’s those breaking it. Bend the standard, maybe campaign for revision. Evolution, not revolution.
jabjoe,
“It’s not unreasonable things have hard coded paths to the standard location. The problem isn’t those following the standard, it’s those breaking it. Bend the standard, maybe campaign for revision. Evolution not revolution.”
It is unreasonable when the paths don’t reflect the legitimate organizational requirements of your distro/system.
You’ll have to admit that for many, the hierarchy is full of legacy decisions and exceptional logic, which is a source of chaos. It’s not unreasonable to want the ability to clean that up moving forward.
Then the question has to be: why don’t the organizational requirements of the distro match the software that will be used with it? All bar a few of the other distros manage it…
I don’t believe this is the case for people who actually read what the standard is.
The origins maybe, but things have been bashed around to fit and have a sensible purpose. It is downright amazing that it is as simple and logical as it is after so much time and evolution. I read it and I could believe someone designed it to be like that today (with typing in mind, admittedly), not that it came from over 40 years of evolution. For computers especially, that is downright gobsmacking.
Imagine what it would be like if it was a stack of reinventions. The horror of 40 years of revolutions.
I’m sorry, but you aren’t, at least not from my point of view. You are creating yet another one, then hiding the standard one you must have, making everything worse for anyone who has to pop the hood.
jabjoe,
“Then the question has to be why doesn’t the organizational requirements of the distro match the software that will be used with it? All bar a few of the other distros manage it…”
That’s a good question, with open source software at least one can change the source code and recompile software specifically for the distro. For my distro, I seriously considered doing it, but it became too much work for something that would become an ongoing maintenance burden. Ideally 1) it wouldn’t be necessary to modify the source and recompile software in the first place, and 2) one ought to be able to create a singular RPM/DEB which would work across distros. So long as it’s necessary to hard code absolute paths in scripts/libraries/binaries, those packages impose legacy paths on the distros which use them. For example, Mint may not have wanted to use the same file system layout as Ubuntu, but since they’re using the Ubuntu repositories, they have no choice.
If we migrated to “named resources” instead of absolute paths over time, as I suggested, I think linux app organization would be better off and it would aid distro developers in organizing their file systems as they saw fit.
For the record, it’s not a lack of understanding about the current hierarchies, but rather the observation that those hierarchies are based around assumptions about storage which aren’t valid for all of us. In other words, it’s somewhat of an imposition of third party policy.
Yes, going against the grain is hard. The wood isn’t going to change its nature for you. Others aren’t going to use your indirection standard so you can ignore the existing standard. An invisible redirection standard is by definition less clear than the filesystem standard we have now.
I think you will find it very frustrating to go against the grain like this. You will just end up raging against others not doing as you ask for your project, blaming them for your difficulties. Save yourself, if you still can!
jabjoe,
I think you’re missing the point: it’s not about the difficulty, it’s about continually discussing and working on ways of making Linux better. You seem to indicate that it’s good enough for you, therefore others like me shouldn’t bother tackling it to suit our needs. But my opinion is that complacency might be holding Linux back. It’s not to discredit the achievements of the past by any means, but we also need to look forward. I’m not alone in wanting more consistency in the file system, and I’m glad others like GoboLinux are working on ways to try to clean things up a bit.
I don’t think the file hierarchy needs to be seen by most users. The UI guides them to stay in home and provides drives not mounts (though of course they are really mounts). They have UIs to configure things. Any users more advanced than that can understand the file hierarchy, it is not hard, very little to read. So who is this for?
The difficulty can not be dismissed. Cutting the wood how you want regardless of grain, just makes crappy furniture.
The difficulty is not in changing decades and decades of code (which you aren’t going to be able to do, so you will need a legacy tree). The difficulty is getting everyone to commit to such a large body of work for so little gain. They won’t agree to an all-new hierarchy, and they won’t agree to some indirection standard. Even if the world wasn’t what it is and you convinced the majority of developers to change, there would still be enough holdouts that you have to do a legacy tree, making your system more complex than what was there before, as it requires both trees and an understanding of how they interact.
We already have a mess like this with audio. OSS, ALSA and PA. Revolution doesn’t work. If the Linux audio guys had evolved OSS instead of made ALSA, they would have come to something like OSSv4 and we wouldn’t have this mess.
I can see you are determined and single minded when you have committed to something, like myself, but you are on the wrong track and I hope you don’t waste years on this. You have been calm and patient making your point, that bodes well, so maybe you can see what I’m saying and avoid this time/life sink.
jabjoe,
“I don’t think the file hierarchy needs to be seen by most users.”
Sometimes it’s difficult to avoid, but I am not going to disagree with you on this point in principle.
“Any users more advanced than that can understand the file hierarchy, it is not hard, very little to read. So who is this for?”
It’s not only about “understanding”, but about organizational consistency, clarity, and the simplicity of keeping application files together. Frankly, DOS applications were much easier to manage because they had this property. Of course external dependencies are sometimes a necessary evil, however it would be great if those external dependencies could be managed using simple & explicit resource dependency maps.
Of course what we have now works, and the repositories hide a lot of the complexity for most users. But that is just pushing the problem upstream. Install an app like Asterisk or FFmpeg from source and tell me dependency hell isn’t a problem under Linux. I think solutions are within reach, but we need to start with cleaning up our messy, inconsistent directory hierarchies.
“The difficulty can not be dismissed.”
I don’t think the solution needs to be as difficult as you imagine. Mine is just one solution, but there are other possibilities. I think it could be done transparently by adding a per-process resource map which would get loaded by the distro. This map could live anywhere, but in a GoboLinux hierarchy it would make sense to keep it in the application’s directory. This would allow the distro to abstract absolute hard-coded paths in binaries into distro-specific paths. As with GoboLinux today, this would also make managing/running multiple versions a breeze.
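To make that concrete, a purely hypothetical map might look something like this (the file name, keys and paths are all invented for illustration; whatever launches the process would be the thing that consults it):

    # /Programs/Ffmpeg/2.8/Resources/PathMap  (hypothetical)
    /etc/ffmpeg/        ->  /Programs/Ffmpeg/2.8/Settings/
    /usr/lib/ffmpeg/    ->  /Programs/Ffmpeg/2.8/Libraries/
    /usr/share/ffmpeg/  ->  /Programs/Ffmpeg/2.8/Shared/

Each line just says “when this binary asks for the path on the left, hand it the path on the right”, which is about as far as the idea needs to go for a distro to relocate an application without patching it.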
“You have been calm and patient making your point, that bodes well, so maybe you can see what I’m saying and avoid this time/life sink.”
I’m not getting a clear message on whether you are speaking so discouragingly because you don’t want others to make progress on this front, or if you merely think we’re too entrenched in compatibility issues to fix it.
Oh please don’t start saying shared object files aren’t needed any more. For each process running on your system, add up the sizes of all the shared files in use. Then add up the file sizes again, but only count each shared file once. Compare. The difference was something like 1GB last time I checked! That’s before you get into updating being cleaner with shared files. I’ve got very little time for the “we don’t need DLLs any more” camp. It’s easy to find what uses what forwards using ldd, and backwards (and forwards too, if you want) with repository info.
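If anyone wants to reproduce that back-of-the-envelope number, something like this will do it on a Linux box with GNU userland (rough sketch: it assumes the /proc/<pid>/maps files are readable for the processes you care about and it ignores paths containing spaces):

    # every shared object mapped by each process we can see, listed once per process
    for m in /proc/[0-9]*/maps; do
        awk '$6 ~ /\.so/ {print $6}' "$m" 2>/dev/null | sort -u
    done > /tmp/libs-per-process
    sort -u /tmp/libs-per-process > /tmp/libs-unique
    # sum the on-disk sizes of each list
    xargs -r stat -c %s 2>/dev/null < /tmp/libs-per-process | awk '{t+=$1} END {print "counted per process:", t, "bytes"}'
    xargs -r stat -c %s 2>/dev/null < /tmp/libs-unique | awk '{t+=$1} END {print "counted once each:", t, "bytes"}'

The gap between the two numbers is roughly what fully static linking would cost in disk, and loosely in memory, on that machine.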
Never touched asterisk, but I’ve built ffmpeg a few times over the years, I don’t remember anything about it being hard. Grab the source, do, “apt-get build-dep ffmpeg” and ./configure, make and make install. Easy. You can also keep re-running ./configure and install each dependency it fails on. Some projects have, either as part of ./configure or separately, a script to pull down the dependencies from a number of repository systems. Setting up a build environment in Linux is far far far far far far far easier than on Windows or RiscOS.
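For what it’s worth, the whole dance on a Debian-ish box is roughly this (a sketch; the exact source you grab and the package names depend on your release):

    sudo apt-get build-dep ffmpeg   # install the build dependencies the distro already knows about
    apt-get source ffmpeg           # or a tarball / git checkout straight from upstream
    cd ffmpeg-*/
    ./configure                     # will complain about anything still missing
    make
    sudo make install               # or checkinstall, so the package manager can track the files

Compare that with collecting SDKs and headers by hand on Windows and the point about build environments makes itself.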
And how are you going to convince them it’s a problem they need to solve? It’s not a problem for them…. Not everyone sees what you see as a problem, as a problem. Those people aren’t going to lift a finger to help.
I’m trying to save you time/pain. I don’t think for a moment you will get any further than Gobo, and it will be a painful experience. I guess that is discouraging, but the motivation is kindness. I’m sure it is at the very least an educational experience, reopening all those sealed cans of worms.
Anyway, I think we should call this; it has turned into a crazy thread and we aren’t getting anywhere with each other. Good luck, maybe I’m wrong and something will come out of it. I’m a happy Debian user and the only file hierarchy change I want right now is multiarch.
jabjoe,
“Oh please don’t start saying shared object files aren’t needed any more. For each process running on your system, add up the sizes of all the shared files in use”
I didn’t really say that, but I consider it a separate topic. I don’t consider shared libraries bad per se, but some kind of organization is needed to help keep things sane. It’s not a good thing to dump all libraries into global directories, even though that’s where they go today.
“Never touched asterisk, but I’ve built ffmpeg a few times over the years, I don’t remember anything about it being hard. Grab the source, do, ‘apt-get build-dep ffmpeg’ and ./configure, make and make install. Easy.”
I would always expect the distro repository sources to have already been customized for the distro; however, they are very frequently out of date. If you install a version directly from the developers, the onus is on you to fix it up to work with your distro. Ideally that extra work wouldn’t be necessary.
“And how are you going to convince them it’s a problem they need to solve? It’s not a problem for them…. Not everyone sees what you see as a problem, as a problem. Those people aren’t going to lift a finger to help.”
Not everyone needs to be convinced, just enough to make a sustainable project.
“I’m trying to save you time/pain.”
Constructive criticism is good, but to be perfectly candid some of these remarks read as though you are trying to speak down to us rather than to be helpful.
Unix being “elegant” is contentious at best.
Why do we have to deal with the sexagesimal anachronism when dealing with time? Why 60 minutes in one hour instead of 100? Wouldn’t 100 make more sense?
Why do we use the obsolete Gregorian calendar? Why aren’t all the months equal?
There was an attempt to modernize the calendar during the French Revolution. It lasted 12 years before being reverted back to the Gregorian calendar.
The answer is simple: we have built a culture on top of the Gregorian calendar. Changing it means reinventing almost 2000 years of culture for little benefit.
But then, the French Revolution still reinvented thousands of years of culture. Why did that happen despite the cost? Because it was worth it. The new system of government is way better than the old one. It has real benefits.
Any attempt to change a file system hierarchy is not worth the trouble. There is no benefit and lots of things to reinvent, starting with the file system hierarchy itself. For what benefit? Having /usr called /users? ln -s /usr /users. Pointless. It’s the exact same thing.
I’m not against progress, I’m all for it. But please, starting over for redoing the same thing is not progress. It’s a huge step back.
Hum, your summary is a bit biased:
– the French Revolution also invented the metric system, which stayed even though the calendar didn’t. Why one and not the other? I’m not sure there’s a logical reason.
– also remember that the French Revolution was followed by emperors, so the “benefits” weren’t so obvious; it took a long period of time before the “benefits” stuck..
I’m not sure I agree with you: what’s the difference between the FHS and the imperial system?
Both are obsolete, and the metric system was worth the pain IMHO, so maybe an FHS replacement like GoboLinux’s would be interesting.
Thom, what Rob said here was the exact opposite of your summary; shared libraries introduced a new problem that was not there before – they require bin and lib to match, so when splitting the system over two partitions you must, at the very least, make sure nothing in /bin depends on /usr/lib.
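A quick way to see whether a given system actually honours that rule (just a sketch; ldd output and the /bin vs /usr split vary between distros, and some have merged /usr entirely):

    # flag anything in /bin that is dynamically linked against a library under /usr/lib
    for f in /bin/*; do
        ldd "$f" 2>/dev/null | grep -q '/usr/lib' && echo "$f depends on /usr/lib"
    done

On a merged-/usr system every hit is harmless; on a system that really does mount /usr separately, each hit is a binary that cannot run before /usr is mounted.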
No idea how Linux folks have it, but /bin, /etc and /sbin are the base system in FreeBSD, while /usr/local/bin and /usr/local/etc are for things outside the base system – transmission, samba, etc.
Works pretty well for me.
Not only is the Unix filesystem naming ancient and stupid, the very concept of a filesystem is also ancient and stupid.
Computers should have databases of information, not filesystems.
Databases are slow.
And a filesystem is a database, optimized for data access in large units, many GBs in size.
Databases are optimized for data access in tiny units, such as strings, or single numbers. They aren’t good at huge units in hundreds of Mbytes.
They are not.
It’s not. Files are unstructured binary blobs. There is no way to query what’s inside them.
Databases can handle TBs of data too, not only GBs.
They are.
What info exactly do you expect to be able to query from a jpeg or mp3? Only the meta-data will have anything meaningful for a query, and yes, that sort of data makes sense in a database and often is put in one. But it’s also embedded in the file, because it’s tiny and that ensures it stays with the file.
You could view the filesystem as a database where the path system is the primary index. Find and grep, among other tools, can be used to query it. OK, it’s the command line, not SQL, but I wouldn’t be surprised if someone has written something to do it with SQL. Indexing scales better than a sequential scan, and there are tools that do exactly that for your files. But they are all in userland. The kernel need only provide the basics: the primary key and the data.
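As a trivial example of that kind of userland “query” over the path index (just a sketch using standard tools):

    # every *.conf under /etc that mentions "ntp", newest first
    find /etc -type f -name '*.conf' -exec grep -l ntp {} + 2>/dev/null | xargs -r ls -t

It’s not SQL, but it is a filter plus a sort over the “rows”, and none of it needs kernel support beyond plain file access.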
It’s the wrong tool for the job. You don’t store that in a database, you store it in a file on a filestore. Many databases just aren’t designed for storing GBs in a single column entry for a row. It’s just not what they are for.
Size. Author. Title. Date. Compression rate. Encoding rate. Decoding rate. Etc. There are many other attributes to query.
But it is put there by specialized software. Querying for metadata is not a standard feature of most filesystems, the way that, say, the POSIX file interface is.
But it is not relational.
These tools fail to return structured data, especially from non-text formats.
I never said anything about kernels.
Nope, it’s the right tool for the job. The various development problems we are having today are, to a large degree, due to the lack of databases.
These GBs that you speak of would be broken down into their individual parts if stored in a database, and they would be indexable, queryable, and discoverable by any program; they would support transactions, and they would allow programs to be notified of changes in the data store. All these capabilities are absent, more or less, from today’s data storage systems.
As I said before, metadata. Or maybe media information from something like ‘mediainfo’.
Nor should it be. Wrong level to have it.
Not quite, but it’s not really hierarchical either; not with links, it’s more of a network. Still, it’s easy to argue it is a database of a sort.
The case in question is metadata: the tool ‘extract’ writes the metadata out to stdout in a set structure. With that, plus find and grep, you could write a “query” that finds all jpegs that were taken with a certain camera. Of course this is a sequential search and an indexed one would be better. There are plenty of programs that build custom indexes/databases of metadata for exactly this kind of purpose, but you don’t want to index everything in every way just in case.
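As a sketch of that query (it assumes GNU libextractor’s extract command is installed and prints roughly one ‘keyword - value’ line per attribute; the camera string is just an example):

    # sequential search: every jpeg under ~/Pictures whose metadata mentions a Canon camera
    find ~/Pictures -iname '*.jpg' -exec sh -c 'extract "$1" 2>/dev/null | grep -qi "camera.*canon" && echo "$1"' _ {} \;

Slow, but it is a query, and swapping the sequential scan for one of those metadata indexers changes the performance, not the idea.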
Example please. Windows WMI certainly hasn’t convinced me.
What parts exactly? Not all file formats are like that; many are just blobs. What about a video: once you dismiss the metadata stuff, how are you going to store the stream? Binary blob blocks/chunks, just like a filesystem? On its own a chunk is likely meaningless and won’t have anything in it you can search. Then why bother? It just creates lots of needless vacuuming for no gain.
Many filesystems do offer transactions, and even with those that don’t, you can still do it. The SQLite site has a doc on how they do it generically: http://www.sqlite.org/atomiccommit.html
Pretty much every OS has a notification system for file or folder changes.
Linux’s is inotify, or even just the “select” call.
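For example, with the inotify-tools package installed (just an illustration; the path is arbitrary):

    # watch /etc recursively and print each modify/create/delete event as it happens
    inotifywait -m -r -e modify,create,delete /etc

Programs that want finer control use the inotify syscalls directly rather than the command-line wrapper.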
Found this, sounds like what you want:
http://en.wikipedia.org/wiki/Pick_operating_system
It’s not metadata. It’s data. And it’s not only media that have useful data.
Yes it should. There should be a standard for it, to allow applications to interoperate at data level.
You can’t run queries on a filesystem regarding the data inside the files, and therefore it is not a database.
Nope. The discussion is about information management, not metadata only.
The schema of the information output cannot be queried at runtime.
But not all jpegs taken on a certain afternoon, or within a specific time period, or with multiple cameras or users, or many other things.
If this functionality was supported out of the box, these programs would be redundant.
Almost every application has a layer of data input/output from/to files. This layer would be redundant if databases were supported out of the box.
Nope. All file formats have an internal structure, otherwise they couldn’t be read back after they were written.
As an array of frames, and each frame being an array of pixels.
Nope. One can search for a particular scene, employing pattern matching with another picture, or even a hand-drawn one, for example.
Many, but none of the major ones, as far as I know.
SQLite achieves transactions through various mechanisms, as described in that document. These mechanisms are built on the functionality provided by filesystems, but the filesystems themselves do not have the concept of a transaction.
In order to enable transactions in another application, one has to reimplement the same SQLite mechanism.
But these notification mechanisms are not compatible with each other, many OSes don’t even have such mechanisms, they don’t work over the internet, and applications cannot be notified about what exactly changed inside a file.
Not bad. It’s a step in the right direction.
I don’t think you have understood much of what I just said. I give up. Please try and program your crazy idea.
Just please ponder why Pick is in the dust bin of OS history and Unix not only survived but was endlessly copied.
jabjoe,
“I don’t think you have understood much of what I just said. I give up. Please try and program your crazy idea.”
Thanks for the support, haha. I offer you two pieces of advice for talking to people like me in the future: 1) try to talk to us as intelligent peers, nobody likes to be talked down to; 2) focus less on discouragement and more on constructive criticism, which might yield better ideas.
“Just please ponder why Pick is in the dust bin of OS history and Unix not only survived but was endlessly copied.”
Never used Pick. On a side note, Unix was backed by one of the biggest monopolies of its time, which might have a lot to do with it.
No. UNIX was not backed by a monopoly. You must be thinking of DOS.
AT&T was not allowed to go into the OS business. It was during this time that UNIX grew to acceptance.
“No. UNIX was not backed by a monopoly. You must be thinking of DOS.”
Actually I was talking about AT&T, who created Unix.
“AT&T was not allowed to go into the OS business. It was during this time that UNIX grew to acceptance.”
AT&T was the sole developer of a Unix-like OS from 1969 through the late 70s. I don’t know what AT&T was allowed to do as a result of the antitrust lawsuits; however, as far as I know, those were filed in 1974 and didn’t take effect until the 80s, by which time Unix was very well established. I’d guess that if it weren’t for the AT&T lawsuits, Unix might still be the predominant OS today. I’m just speculating though, and this was all before my time.
I’ve understood everything you said.
It’s ok, you don’t have any arguments, I understand.
Apparently, it’s not that crazy, since there are various implementations around.
For the same reason Betamax, a superior video format, died and VHS, an inferior format, prevailed.
For the same reason the Amiga, a superior computer, died and the PC, an inferior computer, prevailed.
For the same reason the MC68000, a superior CPU, died, while the 80x86, an inferior CPU, prevailed.
For the same reason LISP machines, a superior computer architecture, died, and other, inferior ones, prevailed.
For the same reason BeOS, a superior OS, failed, and other inferior OSes prevailed.
See? It’s not always the better product that succeeds.
And that’s why we have so many problems in IT.
Edit: oops, I got conversations mixed up and replied earlier to wrong post.
Anyways, it is interesting how two people can look at the same thing and come up with different opinions. But as long as we recognize that we all have different needs, then we should be able to get along.
But we all do have the same needs, in the end. That’s the problem. We all need to read data, convert them into native values, make computations on them, and convert them back to byte streams to save them on disk.
So much code is dedicated to saving and loading data…all this effort could have been saved for something better.
“As an array of frames, and each frame being an array of pixels.”
Um.. video isn’t like that. You CAN treat video that way, but it is too incomplete. For instance, a video frame is actually a collection of blocks, each block is compressed. Subsequent blocks use tables generated from the first block for further decoding…
And then there is audio… each audio track associated with a frame is blocked.. and there can be many separate audio tracks.. And for any video track, there can be multiple alternate video tracks.
Now we get to the additional packaging… The video may be encrypted as well… Some tracks encrypted, some not.
NOT GOOD FOR A DATABASE.
And trying to coerce hundreds of different formats into a database would make the database useless. Especially when most data won’t need the complexity.
I have worked with weather simulation data. A single run is 100GB or more (one week at low resolution). Some simulations are in 3D for just a few hours (also 100GB). Searches are made for intersections… (weather patterns)… yet the pattern is not something that can be described in SQL, which is really, really bad at it.
These searches are closer to what is used in gaming – A 3D mathematical intersection from different points of view….
Database queries are just really stupid at that. And really really slow.
JPollard,
“Um.. video isn’t like that. You CAN treat video that way, but it is too incomplete. For instance, a video frame is actually a collection of blocks, each block is compressed. Subsequent blocks use tables generated from the first block for further decoding…”
I agree that a typical database would not do a good job with pixel-level data. In theory it could be done, but the overhead would be insane. Some day a database might be bold enough to include video codecs, but not today’s databases. Even with codec support, I wouldn’t be very keen to use “SQL” to manipulate frame buffers… although it might make for an interesting challenge.
“Now we get to the additional packaging… The video may be encrypted as well… Some tracks encrypted, some not.”
Either the decryption key is available or it is not, and this would be true whether the streams are stored in a database or not, so I don’t really follow what you are saying.
I’ve seen lots of databases, from Oracle, MySQL, MS, Sybase…
None of them are anywhere near as fast as a filesystem.
Try searching blobs for information… VERY slow.
Try locating a blob given just a short name… nope. not gonna find it.
Try searching for all files of that name… Fairly quick at that… depending on how many indexes it has to go through.
Try maintaining metadata (ACLs, permissions, ownerships…)? Possible, but try searching it: REALLY slow.
How long does it take to recover? Databases have to replay their journals, which can take hours for a database of a couple of GB, especially if it is updated continuously.
Databases have their place. They are very good at non-structured small units of data. Relational databases suck at structured data though – they have to constantly rebuild the structure. SLOW.
Filesystems have been tried in databases (look at sqlfs for one). They can work. But they are really slow.
A filesystem is just an address system for different byte streams. Everything is a stream of bytes. It is very simple and very flexible. Simple is good.
It’s not simple. There is a huge hidden complexity behind it, evident in the huge amount of code dedicated to convert the byte streams to useful data inside a program.
How’s that any different from the code that decodes a jpeg or other on-disk files? Stick it in a userland lib. The less in kernel space the better. Plus often you don’t need much userland code (maybe for ALSA, but not OSS 😉 ). A framebuffer file is just that: you can even mmap it and use it raw!
It is simple. The way I see things, a Turing machine works on addresses, and certain address ranges map to certain things. Unix abstracts that so the address system is file paths and those address ranges are files. It then abstracts further by making devices map to files through a standard layout (pushing the odd settings out to ioctl), rather than through device-specific interfaces. As everything is done via the file interface, you don’t need a million different syscalls; you only need the file syscalls and a few others for what isn’t files (sleep, fork, etc). It makes everything standard and means you can plug pretty much anything together.
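That uniformity is easy to see from a shell, because the same file operations work on devices and kernel state as on ordinary files (the paths assume a typical Linux box and some need extra permissions):

    head -c 16 /dev/urandom | od -An -tx1   # read the kernel RNG with plain file reads
    grep 'model name' /proc/cpuinfo         # kernel state exposed through the file interface
    dd if=/dev/fb0 of=screen.raw            # the framebuffer mentioned above really is just another file

No special syscalls, no device-specific APIs: open, read, write and close cover all three.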