“It’s a question that crops up with depressing regularity: Why don’t Linux filesystems need to be defragmented? Here’s my attempt at answering once and for all. Rather than simply stumble through lots of dry technical explanations, I’m opting to consider that an ASCII picture is worth a thousand words.” Read the explanation here. Elsewhere, “Why is Linux Successful?”
If your disk is close to full, no amount of smart FS design will avoid fragmentation. The author fails to mention that very important fact.
Try to tell fanboyish Mac users that HFS+ can not entirely avoid fragmentation when the disk is close to full, and they’ll scream at you and deny it. Linux users are not nearly as bad, since they’re usually more computer-educated.
He doesn’t fail to mention it…
From the article: “Fragmentation thus only becomes an issue on Linux when a disk is so full that there just aren’t any gaps a large file can be put into without splitting it up. So long as the disk is less than about 80% full, this is unlikely to happen.”
Reading the fine article I find:
Fragmentation thus only becomes an issue on Linux when a disk is so full that there just aren’t any gaps a large file can be put into without splitting it up. So long as the disk is less than about 80% full, this is unlikely to happen.
Where did Mac come into the article at all? The article is discussing FAT as a common filesystem shared by Windows and Linux users. If the article had been written to compare HFS+ or NTFS with FAT, it would be similarly valid. There is no “omission of an important fact,” as has been well noted.
You MS fanboys are an insufferable lot.
I stand corrected about the omission …
But the comments about Macs comes from the fact that some fanboyish people believe it can be completely avoided.
For the record, I own three Macs and one PC. :) I was amused when you called me an MS fanboy.
I was worried OSNews may link this article (first saw it on digg, read the comments, wanted to puke).
“Here’s my attempt at answering once and for all.”
I hope so – it couldn’t possibly get any worse than this.
On the other hand, “Why is Linux Successful?” could have been a decent post on OSNews. I thought it was going to be the same licensing argument again, but it’s not. Very refreshing.
Care to explain what about it made you desire to spray your half digested food and stomach acid out your mouth?
It’s hard to rebut something that doesn’t even make an argument.
This is an (older) study benchmarking the effect of fragmentation on various filesystems. A freshly formatted file system may perform great, but what about 6 months later? It uses a tool that artificially “ages” a filesystem (randomly creating/deleting files).
http://www.informatik.uni-frankfurt.de/~loizides/reiserfs/index.htm…
Quoting
“The following graphs show that all file systems suffer of the aging. After 50000 creations the systems seem to reach their minimum performance. Best seen is the performance drop of up to 30 % when looking at the read or fread performance as it is not much influenced by lots of disc seeks as the rread performance. But also the random read performance significantly drops down to 50 to 70 %.”
So it seems to me all Linux filesystems could use a bit of freshening up after a while… But the defrag tools either don’t exist, or they exist only for a fee (as in the case of ReiserFS).
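For the curious, the kind of “aging” such a tool performs is easy to approximate with a throwaway script. This is only a rough sketch of the idea, not the actual benchmark tool; the file counts and sizes are made up:

#!/bin/bash
# Very rough "artificial aging" sketch: create and delete randomly sized
# files so that free space gets chopped up over time.
# Run it inside a scratch directory on the filesystem under test.
for i in $(seq 1 50000); do
    # create a file of random size between 4KB and ~4MB
    dd if=/dev/urandom of="junk.$i" bs=4k count=$(( (RANDOM % 1024) + 1 )) 2>/dev/null
    # delete one of the previously created files about half of the time
    if (( RANDOM % 2 == 0 )); then
        rm -f "junk.$(( (RANDOM % i) + 1 ))"
    fi
done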
It’s hard to read this article without finding glaring omissions that make it utterly pointless. Of course, NTFS is a big leap forward from the FAT filesystems. NTFS is a log-structured filesystem, which means that the on-disk format represents a time-sequential log of all data and metadata writes. During idle periods, the kernel processes the log, allocating blocks within its spatially optimized B+ tree and copying the data from the log to the new location.
Basically, NTFS uses the log to optimize for writes and the B+ tree to optimize for reads. The filesystem metadata is always consistent, and unflushed sections of the log can be flushed upon recovery from a failure. NTFS was designed with the premise that modern operating systems provide sophisticated page caching algorithms that hide read latency, thus it is more important to optimize for write throughput. It’s a pretty well-designed filesystem, except it has some over-zealous serialization semantics that slow it down considerably. These provisions are only appropriate for physical media that can be unmounted suddenly and without OS control, such as USB keys and floppy disks.
I guess this article is about as good as it could be without even considering the concept that filesystems are based on block I/O and not character I/O as the article vaguely suggests. For one thing, in FAT, NTFS, and most Linux filesystems, files are block-aligned. That is, they always start at the beginning of a block, and any space left at the end of the last block of the file remains unused. The most notable exceptions on Linux are ReiserFS and Reiser4, which implement tail-packing to store multiple small files (and the tails of large files) in the same block.
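To make the block-alignment point concrete, you can see the slack space for yourself on most Linux filesystems. The file name below is just an example:

echo -n "hello" > tiny.txt    # a 5-byte file
ls -l tiny.txt                # logical size: 5 bytes
du -B1 tiny.txt               # space actually allocated: one whole block (typically 4096 bytes)
stat -c '%s bytes of data, %b blocks of %B bytes allocated' tiny.txt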
Of course, Linux is not the be-all-end-all of filesystems, not by a long shot. Sun’s ZFS, IBM’s JFS2, and HP/Veritas’ JFS/VxFS are superior to anything available for Linux in most respects. All of these filesystems, in turn, have their own little problems. At this year’s Linux Filesystem Summit, representatives from the Linux kernel community and industry leaders discussed where Linux filesystems should be headed in the next five years. A particularly attractive option is to apply the concepts of log-structured filesystems to alleviate the write latency associated with copy-on-write (COW) filesystems such as ZFS.
EDIT: Oh, in related news, Mingming Cao submitted the first series of patches to the LKML that creates a fork of ext3 that will be called (you guessed it) ext4. This is a much-needed stop-gap solution to handle the huge data requirements of Linux in the enterprise, as well as other enhancements that break the on-disk format.
I think you’re referring to the Linux Filesystem *Workshop*, where indeed some very interesting discussion on the future of filesystems took place.
A great three piece series about the LFW on LWN.net: http://lwn.net/Articles/190222/
Also, I think it’s arguable whether the filesystems you listed are superior to *anything* available for Linux, especially since IBM’s JFS is actually available for Linux… http://en.wikipedia.org/wiki/JF
That’s exactly the meeting I’m referring to, and I know it was called a “workshop” in the LWN article. This was just one of many subsystem-specific conferences held prior to the Linux Kernel Summit, which is an invite-only conference held immediately before the Ottawa Linux Symposium. Most of the subsystem conferences are called summits or mini-summits.
The version of JFS in the Linux kernel is really old and not even remotely close to the sophistication of IBM’s flagship UNIX filesystem, which has been completely rewritten since they contributed JFS (v1) to the Linux kernel. This is a common misunderstanding made worse by the proliferation of even more journaling filesystems called JFS.
The Linux flavor of JFS is a very good choice for low-power mobile applications.
The version of JFS in the Linux kernel is really old and not even remotely close to the sophistication of IBM’s flagship UNIX filesystem, which has been completely rewritten since they contributed JFS (v1) to the Linux kernel.
The OpenJFS originates from the OS/2 JFS branch of the tree – you could say it is a port of an old port – hmmm, where is the port?
JFS2 on AIX is a _completely_ different filesystem than JFS on AIX, other than the similarity in name.
I don’t know how much difference there is today between JFS on Linux and JFS on AIX, but they have been enhanced and maintained by different groups of people for a long time.
Linux is a kernel which supports many file-system types. It also has its own homegrown file-systems: ext2, ext3 (ext2 + journaling), and soon ext4. Linux also currently supports all the file-systems you mentioned minus ZFS. This story propagates a very false simplification: Linux doesn’t have one FS, not even one flagship FS; it supports many FSes.
Linux does not have JFS2, which is the default filesystem on 64bit AIX.
Of course Linux supports multiple filesystems, just like everything else on Linux: plenty of designs, forks, and implementations (otherwise where would all those packages come from?). That’s why this article is so stupid, starting from the title.
Rather than simply stumble through lots of dry technical explanations, I’m opting to consider that an ASCII picture is worth a thousand words.
I know you have a strong desire to discredit the author, but I think he gave a pretty strong introduction into how this was an explanation for people who actually have to ask. Not for people who know how a filesystem works.
Block structure would definitely make up for “!!” being appended to most files; in fact, on a typical filesystem it would work for all but about 1 in 2048 cases. However, most file appends aren’t “!!” but are something larger than his entire example filesystem. Explaining things like blocks is just way too technical for Joe Ubuntu Newb, who only asked why there’s no defrag utility.
Frankly, if you’re going to argue with the guy, it can’t be because he’s technically wrong by way of omission, since omission was the whole point of the article.
My point was that Windows has a pretty capable filesystem in NTFS. The author is claiming that “Linux” is somehow superior in preventing filesystem fragmentation based on a rough depiction of how FAT works.
I don’t care if he gave a 100-page treatise on FAT, he’s not going to convince anyone that it’s a viable modern filesystem. A Linux kernel developer could code a filesystem superior to FAT32 in less than 2,000 lines of code (ext2 is only 6,000). My 1999 Honda Civic is more reliable than a 1974 AMC Gremlin, but I don’t need ASCII art to explain why.
If he set out to explain why ext2 and other Linux filesystems don’t have the fragmentation problems of FAT in a way that anyone can understand, he should have said that FAT was designed by Microsoft in the mid-80s to support 5 1/4″ floppies and 20MB hard disks, while ext2 was developed in 1993 based on the design of big UNIX systems. That pretty much sums it up.
Meanwhile, we at OSNews are intelligent people who like to talk about operating systems. While some people would simply say that NTFS is better than FAT, I decided to briefly explain how NTFS is more advanced than many people realize. I didn’t mean to argue or discredit, I intended to give a more astute readership a more technical perspective on a not-so-technical article. Maybe people will learn something.
So explain to me in simple terms why NTFS still needs to be defragmented occasionally on a disk that is only 40-50% full?
I understand that NTFS is much better than FAT32, but I also know that every once in a while I still have to defrag the laptop I use for work, or performance suffers.
Theoretical stuff and “artificial aging” aside, I’ll stand by actual observation. My ext3 filesystems just don’t get fragmented.
Here is an example from a server I’ve had running continuously since Aug 25, 2003. It has been running a manufacturing accounting package for about 30 users for coming up on 3 years:
/usr: 132121/1281696 files (2.7% non-contiguous), 1649461/2558343 blocks
I’ve done a spot check of a number of my servers running ext3 and none are over 3.1% fragmented.
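For anyone wondering where those numbers come from: a read-only e2fsck prints exactly that summary line. A hedged example – the device name is a placeholder, and the filesystem should ideally be unmounted or mounted read-only when you check:

e2fsck -fn /dev/sdXN    # -f forces the check, -n answers "no" to everything (read-only)
# the summary line at the end reports the same "x% non-contiguous" figure quoted above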
If I ever have a customer that needs to run a filesystem artificial aging application, I’ll be sure to check out Con Kolivas’ defragmenter. :)
Theoretical stuff and “artificial aging” aside, I’ll stand by actual observation. My ext3 filesystems just don’t get fragmented.
No reason to argue, I’ll assume that, under normal conditions, most Linux filesystems behave better than NTFS when it comes to fragmentation and thus don’t need defragmentation.
What about common worst-case scenarios? That would be files downloaded by BitTorrent and, perhaps, VMware image files.
A little experiment with a ~700MB file downloaded by Bittorrent (filesystem: ext3, 34% full, client: Azureus)
sudo filefrag “filename”
“filename”: 5567 extents found, perfection would be 6 extents
The few other files that I have around, downloaded by Bittorrent, show similar numbers. (Try it yourself). Unfortunately filefrag doesn’t work on NTFS and I can’t directly compare (I assume it will be bad there as well).
What this shows is that fragmentation can happen (not only under laboratory conditions), and a defragmentation tool would be welcome.
This is why Azureus has a preallocate option. I use it. At the expense of a little delay before downloading, you get a nicely linear slab of bytes.
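Preallocation helps because the filesystem can hand out one contiguous run up front instead of growing the file piecemeal as pieces arrive out of order. Azureus does this internally; the shell commands below only illustrate the same idea, and the file name and size are placeholders:

fallocate -l 700M download.iso    # reserve the whole file up front (needs a filesystem with preallocation support, e.g. ext4 or XFS)
# portable fallback: just write the file out once, sequentially
dd if=/dev/zero of=download.iso bs=1M count=700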
Here’s what I get on a 700MB bittorrent download with a 61% full ext3 filesystem:
ubuntu-6.06.1-alternate-amd64.iso: 199 extents found, perfection would be 6 extents
I find that fact uninteresting.
What I do find interesting is the result of copying that file with cp and then dd’ing each file to /dev/null and timing the result. The cache is flushed before each trial by dd’ing another 2GB file to /dev/null.
Original file: 11.25 seconds
Copied file: 14.34 seconds
I’ll attribute the fact that the copied file is actually slower to it being on a slower part of the disk.
But these results certainly do not support the idea of fragmentation being a significant problem even in this “worst case” example.
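For anyone who wants to repeat the measurement, the procedure described above boils down to something like this (file names are placeholders; the unrelated 2GB read is only there to push the test files out of the page cache):

cp original.iso copy.iso                        # make the copy to compare against
dd if=unrelated-2GB-file of=/dev/null bs=1M     # flush the page cache
time dd if=original.iso of=/dev/null bs=1M      # timed sequential read of the original
dd if=unrelated-2GB-file of=/dev/null bs=1M     # flush again before the second trial
time dd if=copy.iso of=/dev/null bs=1M          # timed sequential read of the copy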
So it seems to me all Linux filesystems could use a bit of freshening up after a while…
I used to wonder about this. Then I realized that backups pretty much render it a moot point and refresh the filesystem at the same time. But only if they are done regularly.
Tar up the contents of a filesystem, compress it, store it somewhere, and then reverse the process and write it back out to your disk. Fragmentation is back to [near] zero, your data is saved for now, and you may start accumulating more.
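As a sketch, that backup-and-restore “defrag” looks something like this (paths are placeholders; do it from a rescue environment or with the filesystem otherwise quiet, and only after verifying the archive):

cd /home && tar czpf /backup/home.tar.gz .    # archive the filesystem's contents
find /home -mindepth 1 -delete                # empty it (or recreate it with mkfs instead)
cd /home && tar xzpf /backup/home.tar.gz      # restore; files are written back largely contiguously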
# cd (to the one you want to defrag)
# wget http://ck.kolivas.org/apps/defrag/defrag-0.06/defrag
# chmod +x defrag
# sh defrag
This has worked for me so far and works with any Linux filesystem. I hope you find it a great tool, and may it benefit you as it does me.
I don’t get the second article’s point. They suggest that the BSDs’ failure(?) was caused by overcontrol, which leads to “fragmentation”. In my opinion, Linux is far more fragmented than BSD: there are too many Linux distros, while the BSDs number fewer than 10.
http://bbspot.com/News/2000/4/linux_distros.html
Linux has one universal kernel – Linux.
All GNU/Linux distributions are based on the same kernel.
This is what they mean.
Not really. Many distributions (especially the more popular commercial ones) ship highly modified kernels with backported features and custom patches. A RedHat 2.6.x kernel is most definitely not the same as a vanilla 2.6.x one.
Not that highly modified. During the 2.5 days, the RedHat 2.4 kernel was substantially different from the vanilla 2.4, though still binary compatible with it. With the release of 2.6 and the new development model, the distro kernels are much closer to vanilla, and custom patches are getting pushed upstream for inclusion.
In fact, Andrew Morton declared quite explicitly that if he felt that a vendor was substantially patching the kernel and not pushing the patches upstream, he would do everything in his power to thwart their efforts.
As far as backporting is concerned, though… backported from where??? With the 2.6 development model, what is there to backport from? -mm?
First, I reject the notion that the BSDs have failed. The broad definitions of “failure” that include BSD also include Linux.
The idea being expressed here is that Linus was able to bring a large group of developers together and keep them together. There have been no forks, no huge power struggles, and no alternative development communities. The Linux kernel project is a unified community endeavor that aims to produce a UNIX-like kernel combining performance and universality. Special interests do not get very far in the kernel community.
The proliferation of Linux distributions (about 10 of which are especially notable) is an example of the strength of the Linux culture. One can argue that some of these distros have a large overlap in target audience, but many of them, especially the smaller ones, take the same Linux kernel and accomplish a very different set of goals.
Linux has significant commercial prospects and this economy will continue to grow. Many of the Linux distributions are produced by commercial software vendors, all of which have equal opportunity to drive revenue based on their Linux offerings. It’s a free market (actually much more so than the proprietary OS market), and this will continue to drive investment in the Linux platform.
Besides, the many Linux distributions are predominantly composed of common code. The arguments that suggest lots of wasted development effort are largely, albeit not completely, misguided. Bug reports are generally passed upstream to the package maintainers, whose fixes are passed back downstream to the distributions. Many distributions even share common package repositories.
Do we need this many Linux distributions? Probably not. Is the abundance of Linux distributions hindering progress? Slightly, at most. Is it contributing to progress? Definitely.
Where do they get the idea that the BSDs have failed, when the BSDs are delivering as promised? OpenBSD is arguably the most secure OS, NetBSD one of the most portable OSes out there, and FreeBSD is easy to use on both the desktop and the server.
“If Mac and Windows didn’t suck, people would’ve used them,” DiBona said.
Well, maybe the counterpoint is that if Linux did not suck so much on the desktop, it would have a greater market share.
“Adoption of the Linux desktop is more likely in emerging markets where there is no legacy,” Heohndel said.
Gee, nice how he understands socio-political-economic issues so well… oh wait, he didn’t take those into consideration.
“We need to do whatever compromise is necessary to get full multimedia capability on Linux so non-technical users don’t dismiss us out of hand,” Raymond shouted.
A lot more than multimedia is needed there, pal. Way beyond multimedia, Linux users just don’t f**kin get it. 99% of people working on computers just want things to work the simplest way possible. For them, computers are a means to simplify life. They have no desire whatsoever to play around, tweak, etc., and to date that is all Linux is: a geek toy.
The only thing hurting Linux..is Linux users, they need to get over their insecurities and obsession with Windows and start over. More importantly, Linux development needs to be taken out of the hands of geeks and put in the hands of professionals. Fact is Linux did not even register until companies like Novell and Redhat started to make a serious attempt at putting out an OS that could compete.
Did you ever compare for example an Ubuntu desktop with a Windows desktop for the end user? The first one is much, much easier to handle – I can tell this from my own experience with a few friends & family who I’ve switched to Linux. Gnome is much more consistent and user friendly than Windows is imho. In our company, we’re currently test-driving Linux (Ubuntu) on the desktop, and our five participants are very impressed so far; although they had to re-learn a lot of things, they now think that it’s all very easy & intuitive. And for me as the admin, Linux is just so much easier to maintain.
Quote by el3ktro: Did you ever compare for example an Ubuntu desktop with a Windows desktop for the end user?
Yeah, multiple monitor support sucks hard, and you have to go to the command line to load Nvidia drivers.
The crappy GUI ‘control panel’ imitations don’t work well or give you access to all the features you need. Whenever you change something in one of the ‘control panels’ that the various distros or window managers put out, it overwrites the changes you’ve made manually. Honestly, you’re better off most of the time saying to hell with the crappy GUI configuration, I’ll just learn what and where to edit.
Quote by el3ktro: Gnome is much more consistent and user friendly than Windows is imho.
Configuration is still stored in a thousand little text files and things that should be autodetected still aren’t (or aren’t properly). Kernels sometimes need recompiling to support certain features the vendors may not have included in their kernel. A binary for one Linux distro can’t just be brought over to another Linux distro; it often has to be recompiled on the new machine. Redhat’s package management scheme was revolutionary for its time a million years ago when it was invented, and now various other distros have decided they’ve no need to improve it (they all do pretty much the same thing). Binaries are stored all over the freaking place, not in a nice orderly fashion like ‘Program Files’.
Quote by twenex: Since when are former employees of Transmeta (Linus Torvalds), Digital (Jon Hall), and present employees of Google (Andrew Morton) not “professionals”? If, on the other hand, by “professional” you mean the arrogant marketroids who gave us Windows hype and Windows software, no wonder Linux is doing so well.
I believe the term ‘academic professionals’ would apply to Linus and many of these people; they are people who make something that meets their needs and serves a purpose, and they can be proud of it. You should look up a few of his quotes and browse a few of his posts to learn he’s definitely not the type of professional ssa2204 was talking about. Though I do see your argument, I don’t believe any of them are currently up to the job of making Linux a mainstream desktop OS.
As for the others, I hope you’re not suggesting that just because a few people work for a company, their goals are the only focus of that company. If so, congratulations on your Linux fanboy status. FACT: if Linux is going to become marketable as a good desktop OS, the Linux community has to be more open and realize that there are a lot of glaring issues. I’m not saying Windows is perfect … e.g. I hate the Windows registry; its values SHOULD be stored as attributes in the folders of the programs that use them. However, it is no worse an operating system than Linux. It is easier to use, and it’s more expensive. If you have trouble maintaining Windows, get Firefox instead of IE and a decent AV/firewall.
“Yeah, multiple monitor support sucks hard, and you have to go to the command line to load Nvidia drivers.”
I agree with the first point: you can get multi-monitor running pretty well with Nvidia & ATI cards, but it’s still not as comfortable as it is in Windows. Speaking about Ubuntu: installing the Nvidia or ATI drivers involves installing one single package (you can do this graphically) and editing one single line in a config file (replace your current driver with “nvidia” or “fglrx”). Imho this is easier than in Windows, where you first have to go to the vendor site, find the right driver, download it, run it, install it, reboot, etc.
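As a concrete sketch of what that looks like on Ubuntu – the package name and the existing “nv” driver line are assumptions that vary by release and setup:

sudo apt-get install nvidia-glx                                              # the packaged binary driver
sudo sed -i 's/Driver[[:space:]]*"nv"/Driver "nvidia"/' /etc/X11/xorg.conf   # the one-line config edit
# then restart X (log out and back in)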
“Configuration is still stored in a thousand little text files and things that should be autodetected still aren’t (or aren’t properly). Kernels sometimes need recompiling to support certain features the vendors may not have included in their kernel. A binary for one Linux distro can’t just be brought over to another Linux distro; it often has to be recompiled on the new machine. Redhat’s package management scheme was revolutionary for its time a million years ago when it was invented, and now various other distros have decided they’ve no need to improve it (they all do pretty much the same thing). Binaries are stored all over the freaking place, not in a nice orderly fashion like ‘Program Files’.”
So you think that some huge binary monstrosity like the Windows registry is a better solution? If I want to remove a program _completely_ from my computer, this is often almost impossible in Windows. On Linux, I purge the package and then delete the program’s config dir or single config file in my home folder.
The RPM package system is crap imho, but it has gotten better. Debian’s apt is way better imho, and it’s far superior to anything that Windows can offer. If you know a program’s name, it’s way easier to install that program on Linux than it is in Windows, where you again have to browse the vendor’s homepage, download, probably unpack, install … On Linux, it’s a few mouse clicks to install any of the thousands of programs coming with the distro.
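To make the comparison concrete, here is roughly what that looks like with apt (the package name is just an example, and the per-user config path varies per program):

sudo apt-get install gimp            # install by name; dependencies are pulled in automatically
sudo apt-get remove --purge gimp     # remove the program and its system-wide configuration
rm -rf ~/.gimp*                      # optionally drop the per-user config dir as well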
The Linux filesystem layout is imho better than Windows’s; it’s just completely different from what you’re used to. Binaries are stored in one of the bin/ directories, libraries (“DLLs”) are stored in one of the lib/ directories, commercial packages go to /opt, runtime data of programs goes to /var, temporary files go to /tmp, etc. You have ALL the programs that belong to the OS itself in /bin, all programs which are for the root/admin only in /sbin, and all your user programs in /usr/bin – I don’t know, but this is pretty easy and pretty well thought out.
I don’t know what hardware you have, but on my machine, Ubuntu detects everything, including my camera, USB sticks & MP3 player, my printer, etc. This of course depends on the hardware, and on the vendor to support Linux or not.
The only thing hurting Linux..is Linux users, they need to get over their insecurities and obsession with Windows and start over. More importantly, Linux development needs to be taken out of the hands of geeks and put in the hands of professionals.
Since when are former employees of Transmeta (Linus Torvalds), Digital (Jon Hall), and present employees of Google (Andrew Morton) not “professionals”? If, on the other hand, by “professional” you mean the arrogant marketroids who gave us Windows hype and Windows software, no wonder Linux is doing so well.
Wow, I don’t understand why such a crappy article about “Linux defragmentation” gets so much attention…
I miss HPFS, lol.
http://www.lesbell.com.au/hpfsfaq.html
http://www.kdedevelopers.org/node/2270
This article is just misleading, stupid, and doesn’t talk about fragmentation at all. It talks about loading times, and the author doesn’t even realise what some kernel dev said some time ago: processors get faster, memory gets faster, but apps become I/O bound and disks are still the slowest part of the system.
I have SCSI disks, so my seek times are well under 10 ms; people using PATA or SATA disks are unlikely to get that.
And what he says about the Linux kernel’s abilities is wrong too.
Well, not worth it, but one comment has a link to a better understanding of the problem.
This is an entirely different subject though.
Time to upgrade for this guy. Any modern Windows installation should use NTFS by now.
He explains why he’s using FAT: “Since both Windows and Linux users make use of FAT filesystems, if only for USB flash drives, this is an important filesystem – unfortunately, it suffers badly from fragmentation.”
I would imagine he also chose FAT as an example because it’s relatively simple compared to filesystems like NTFS.
in the sense that it’s fragmented through the many poorly compatible distros. More standards like fd.o and LSB/autopackage are needed (and not only standards themselves, but wide-scale compliance to them).
yes, in a big way
While it’s true that you can lower the fragmentation with a smarter file system allocation policy (unlike FAT32 *and* NTFS), you shouldn’t believe anyone who tries to tell you something doesn’t need defragmentation tools.
That’s just not true, and shows a big misunderstanding of the subject, no matter how many ASCII graphics he throws at you :)
While I am no filesystem expert, I tend to credit the article for what it says: he never says that filesystems with a smart allocation policy don’t need defrag tools, he just says that they resist fragmentation better than FAT, and describes a situation where fragmentation would occur… I don’t know … seems okay to me… perhaps it is just my ignorance of the subject, though.
I agree with you alexandream – he set out to give us less-techy people an overview, and IMHO he succeeded. I personally found the simplified model of a filesystem was more useful than a complicated description of how filesystems work.
Unfortunately, his article wasn’t titled “Why FAT needs to be defragmented more often than other file systems” – I haven’t criticized his explanation, but the conclusion he draws in the title, and in the text as well, is just wrong.
Why does Windows need defragging, and why does pretty much every other operating system not need defragging?
Writing a simple to understand article about a technical subject is VERY difficult. The author should be commended on his efforts.
Case in point, when describing something in simple terms, you have to leave out certain details such as block size. You can’t then turn around and say he left out details and bash him for it, otherwise it wouldn’t be simple anymore. Seems like a lose-lose scenario when it comes to many posting here.
We should all strive to be able to write clear and concise articles like this one without having irrelevant details cluttering the main focus.
I wish Mr/Ms butters could edit for OSNews. We would definitely read more news and less gossip then.
Jarek P.