HFS+ lost a total of 28 files over the course of 6 years.
Most of the corrupted files are completely unreadable. The JPEGs typically decode partially, up to the point of failure, so if you’re lucky you may get most of the image except the bottom part. The raw .CR2 files usually turn out to be totally unreadable: either completely black or with a large color overlay across significant portions of the photo. Most of these shots are not so important, but a handful of them are. One of the CR2 files in particular is a very good picture of my son when he was a baby. I printed and framed that photo, so I am glad that I did not lose the original.
If you’re keeping all your files and backups on HFS+ volumes, you’re doing it wrong.
HFS+ is a weird vestigial pre-OS X leftover that, for some reason, Apple just does not replace. Apple tends to be relentless when it comes to moving on from past code, but HFS+ just refuses to die. As John Siracusa, long-time critic of HFS+, stated way back in 2011:
I would have certainly welcomed ZFS with open arms, but I was equally confident that Apple could create its own file system suited to its particular needs. That confidence remains, but the ZFS distraction may have added years to the timetable.
Three years later there is still nothing, and with Yosemite also shipping with HFS+, it will take another one to two years before we possibly see a new, modern, non-crappy filesystem for OS X. Decades from now, books will be written about this saga.
At least it’s not FAT, which I’m sure will still be used and shipped in products 50 years from now.
No it doesn’t. At least FAT is simple enough that you can recover it by reading the raw data off the disk… HFS+ is not; it’s a complex, stupid, strange and faulty FS.
HFS+ is the worst technology in OS X by a wide margin. It sucks, and it sucks really hard; I have suffered it several times. It cannot be used for serious work at all.
Two years ago I worked with HFS+ over iSCSI and it was hell, man: the HFS “catalog” got corrupted every time and I was unable to mount the FS again. And it was really hard to get the FS back into a “clean”, mountable state, because Disk Utility refused to repair it… so I had to repair it using Disk Warrior!!
Never had a problem with Linux and ext3 (using the same iSCSI storage box)… if the FS had a problem, fsck repaired it and I simply mounted it again. HFS+ is a SHAME, I hate it.
PS: ZFS is the best FS in the world, no doubt about it; I use it every day on my servers… but it’s crazy resource-hungry (CPU + memory). It’s the ideal FS for a Mac Pro or an iMac, but not an option for a MacBook: energy consumption would be too high. I think that’s the reason why Apple dropped it.
ZFS is also overcomplicated, has performance issues, and most of its advantages don’t apply when dealing with a single-disk system. It’s the wrong choice for a desktop filesystem, where you just want basic functionality, no surprises, and for the filesystem not to get in your way. I’m not sure what other filesystems exist in the BSD world, but a port of ext4 seems like it would be a good idea.
The issue I see would be support for the resource fork, which is probably why the ZFS implementation is stalling, but given the relative simplicity of ext4 I think it would be easier to add it there, and it would yield a more useful final product.
egarland,
We shouldn’t forget that ext file systems are relatively old as well. ZFS/Btrfs are more future-proof.
https://en.wikipedia.org/wiki/Ext4
ZFS is an enterprise filesystem that scales up to petabyte and larger installations, handling hundreds of disks. Single-disk installations are not the primary target.
Have you seen benchmarks comparing ZFS to BTRFS on many disks? 15 disks and more? ZFS blows BTRFS out of the water. BTRFS might be faster on a single disk (is it?), but ZFS scales way, way better up to really large installations. For instance, the IBM Sequoia supercomputer uses Lustre+ZFS because ext4 did not scale and did not provide data integrity, and that ZFS installation is >55 PB with 1 TB/s of bandwidth. Try that with BTRFS. 🙂
Regarding rumours that ZFS needs tons of RAM, they are not true. ZFS has a very efficient disk cache called the ARC, and disk caches love RAM. If you only have, say, 1 GB of RAM, then you will not be able to cache much, so read speed degrades to disk speed. But you do not need tons of RAM to use ZFS; the only thing that happens is that the RAM disk cache is penalized, so you get slower performance. I used ZFS on a 1 GB PC for a year without problems; it was my only computer, so I used it 24/7.
First of all: resource forks are implemented on all modern filesystems: on NTFS they are called Alternate Data Streams, on Linux they are called extended attributes, and they are supported on ZFS quite well. It basically means that a file can carry multiple streams of data: you can have a file that has tags, a description, icons and anything else you want in parallel with the actual contents. On Windows they were implemented in Windows 2000 and backported to NT in service pack 3 or 5.
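To make that concrete: on Linux (including ZFS on Linux), extended attributes are directly accessible from Python’s os module. A minimal sketch, with a made-up file name and attribute; HFS+ forks and NTFS ADS use different APIs, but the idea is the same:

    import os

    path = "photo.jpg"  # hypothetical file

    # Attach a small piece of side-band metadata to the file. On Linux,
    # user-controlled extended attributes live in the "user." namespace.
    os.setxattr(path, "user.comment", b"shot at the beach, June 2014")

    # Read it back, and list every attribute attached to the file.
    print(os.getxattr(path, "user.comment"))  # b'shot at the beach, June 2014'
    print(os.listxattr(path))                 # ['user.comment']

Note that Linux xattrs are size-limited, so they are closer to metadata tags than to a full second data stream like an NTFS ADS or an HFS+ resource fork.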
Secondly: most people think that the important ZFS features are the volume-management ones. Sure, they are cool, but not that important, since we already have them implemented in LVM, hardware RAID, CoreStorage, etc. The important ones are snapshots, data and metadata integrity assurance, the lack of fsck, insane limits, per-filesystem properties and extensibility. Those are things that OS X could really use, even if it excluded the zpool part with the volume management.
Third: ext4 is a terrible idea for a filesystem; it’s almost as bad as HFS+, since it’s ext2 + journaling (= ext3) + larger limits on file size and volume size. It doesn’t give you any of the advantages of ZFS, or WAFL for that matter. While the code and the on-disk data structures are cleaner and less error-prone, it does absolutely nothing to prevent data loss in the usual scenarios (bit rot and idiot at keyboard).
Lastly: ZFS is not complicated. It has a lot of features, but it’s not really complicated, it’s incredibly well documented by Sun/Oracle, and there are completely open-source implementations (one R/W implementation licensed under the CDDL and two R/O ones under GNU GPL v2+). It’s perfectly capable of serving a modern laptop. Yes, for server file-sharing usage you should have 8 GB of RAM for every 1 TB of storage, but that only applies to server usage patterns; otherwise it’s noticeably faster than HFS+ for desktop use. When people use CoreStorage for Fusion Drives, it makes me sick thinking how cool L2ARC on ZFS is. It’s like comparing a Ford Model T with the latest Tesla Model S: the only thing they have in common is the four wheels.
Forgot to answer this falsity too, so I address it here instead. Here goes:
ZFS is much simpler than ordinary filesystems; in fact, ZFS has been called a “rampant layering violation” by Linux developers.
Old filesystems like ext4, NTFS, etc. all have several layers: filesystem, RAID manager, volume manager and so on. This is the reason they cannot provide data integrity: errors might creep in when data passes from one domain to another. Within one domain the data might be reasonably well protected, but there is no protection when it crosses a boundary.
OTOH, ZFS is monolithic: it manages everything (filesystem, RAID, volume management), so the data never leaves ZFS’s control. You need end-to-end checksums that follow the data even as it crosses boundaries, which is exactly what ZFS does.
So this “rampant layering violation” is exactly what makes ZFS so good at data protection, which Linux developers fail to see when they try to FUD and shoot down everything Solaris has done. Ironically, BTRFS is modeled exactly like ZFS, despite the Linux developers having said that monolithic filesystems were a bad design.
So, because ZFS is monolithic it is badly designed and a rampant layering violation. When the Linux camp copies ZFS into BTRFS, then BTRFS is the best thing since sliced bread, and the Linux camp stops talking about “rampant layering violation”:
https://blogs.oracle.com/bonwick/en/entry/rampant_layering_violation
Also, because ZFS has control of everything, the ZFS devs could delete a lot of unnecessary duplicated code. There was a comparison of how many lines of code (LoC) each filesystem had, and ZFS had something like 60k LoC (I cannot remember exactly), whereas the RAID manager alone on Linux had something like 50k LoC; add the ext4 filesystem itself, the LVM volume manager, etc., and it ended up at around 250k LoC for the Linux ext4 stack. I cannot remember the exact numbers, but you can google for the link, or look at http://www.open-zfs.org and browse the source code.
My point is that ZFS was the smallest filesystem of all, by far. And if ext4 has five times as much source code, which solution is overcomplicated? And ext4 cannot even do half of the stuff ZFS does, despite using five times as much code.
So you are wrong on this too. Nothing was correct:
-ZFS is simpler than any other filesystem (read the link above by the main ZFS architect)
-ZFS is much faster than any other filesystem (it scales the crap out of everyone else when you go to large disk installations)
-And probably other stuff too.
Here’s the exact state of things (as of June 2014 in Illumos):
usr/src/uts/common/fs/zfs: 75,100 [kernel module]
usr/src/lib/libzfs: 14,872 [userland interface library]
usr/src/common/zfs: 1,912 [kernel & userland common portions]
usr/src/cmd/zfs: 5,782 [dataset manipulation command]
usr/src/cmd/zpool: 5,255 [pool manipulation command]
usr/src/cmd/zdb: 3,107 [ZFS debugger]
usr/src/grub/grub-0.97/stage2/*zfs*: 1,758 [minimal read-only implementation for GRUB]
grand total: 107,786
This excludes comments and empty lines with biasing on the side of excluding lines when in doubt (e.g. combined code + comment line), so it’s possible that this number is about 1% too low. Hope this helps.
Thanks for the research and for helping us with the numbers. The 60k LoC number I referred to was an old one. Anyway, ZFS today is about 108k LoC; I have always been amazed by how much goodness fits into 100k LoC! ZFS is really a killer filesystem: complete, scales large, safe, with some nice functionality (snapshots, etc.).
Do you know how many LoC BTRFS has? And ext4? It would be interesting to have up-to-date numbers. The numbers I saw were old, and back then ZFS was tiny compared to ext4. What is the state of ZFS today? Has it turned into an excessively bloated filesystem, or is it still lean and mean? If you do the ext4 LoC research, please consider that you also need LVM, etc. to get a complete storage solution, so please add LVM and the rest to the ext4 LoC as well.
I agree with you here, ZFS is too complex for 99.9% of desktops.
For a workstation like a Mac Pro or an iMac it can be a good choice, though… I mean, for the snapshot feature alone, ZFS is a killer solution for professional use.
Also the checksum verification could be very useful for a workstation.
But hey… for the typical Mac user ZFS is overkill, no doubt about it. Ext4 would be fine.
The content of this post does suggest something ‘sucks’.
We have a ton of content, used by our design team, that has exactly this implementation and we’ve not had a problem with it in the 2+ years we’ve had it running.
OMG, you had to repair a file system fault using a 3rd-party tool, with a GUI. Seriously, how did you cope? It sounds just unbearable.
Well… I know HFS+ very well, believe me; I’ve been working with it for more than 15 years, since OS 8.5 or so. And yeah, it works fine in the typical cases, but it’s not reliable when you push it to the limit. It’s not a serious filesystem; even plain old UFS is way better.
In the case that I mentioned before, the iSCSI connection dropped randomly (we ran it over Metro Mirror and we knew it), and HFS+ just went into an _unrecoverable_ state every time the connection dropped (a really unrecoverable state; you cannot fix it).
On the Linux machines we had exactly the same unreliable iSCSI connection… but with ext3 you just run fsck and the FS is in a mountable state again, in just a minute.
HFS+ was a mess, because Disk Utility refused to repair the FS and we had to use Disk Warrior every time, and sometimes even copy the files with Disk Warrior over to a new filesystem.
We opened a case with Apple to find out why Disk Utility refused even to touch the FS… and they said: “you have to do a backup and recreate the FS”, hahahaha (this is true, not a joke).
As I said before, HFS+ is a toy filesystem, not a Unix filesystem. Any Linux or BSD FS is 100 times better. I really don’t know why Apple doesn’t adopt ext3 or FreeBSD UFS. It doesn’t make sense to me.
I love these kinds of blanket statements. You have had problems, it seems. But I have never come across anything major with HFS+, at least not myself, and I have been using it since its inception back in the Mac OS 8 days.
That being said, as someone who makes a living repairing and supporting Macs and PCs, I have very rarely come up against corruption on client systems that could not be repaired without formatting, and even then it rarely resulted in loss of data. I can say the same for Windows NTFS/FAT, where similar problems have occurred. Nothing is perfect.
You are right about HFS+; I’m not saying that HFS+ is totally unusable (I use it every day and I really _LOVE_ OS X). I’m only saying: HFS+ is the worst possible option, far behind the filesystems used in almost any other Unix.
In my humble experience: ext3 is more reliable, UFS is more reliable, ext4 is more reliable and has more features, and XFS is much like ext4. All open-source technologies that Apple could adopt.
Why is Apple stuck with HFS+?! It doesn’t make any sense to me. I think it’s laziness; they just don’t care about technologies without marketing bling.
HFS is positively ancient. Over the years things have been hacked onto it. They still byte-swap data in the kernel to deal with the big-endianness of HFS.
Also, in my experience, corrupted data on HFS drives is far more difficult to recover than on NTFS or ext. When HFS blows up, it really does some damage.
Apple is very fond of using very simple solutions hidden under a layer of shininess, often little more than hacks. It is all dandy when everything is working, but when there is a problem things can often go wrong in a spectacular fashion.
I am not really sure about that either. I would say legacy but they have not been too concerned about supporting legacy stuff the past several years.
The rumor about ZFS going into OS X as default has been around for many years now so I think for now it is what it is. It is a hassle, for them and users, to change file systems so that could be a factor.
So it’s like the WinFS that never was… a shame for these major OSes not to have decent filesystems. BFS might have had its problems and quirks, but god damn, it had so many cool upsides I didn’t care! It revolutionized the possibilities for new applications.
I’ve had to recover files from a crashed hard drive imaged with ddrescue before. HFS+ has a lot to recommend it. In particular, it seems to strive for contiguity of files across the media, which hugely simplifies recovery. And technically a filesystem can’t be blamed for bit rot; that’s the hard drive/media. Filesystems (unjournaled ones) tend to lose whole files. Unless an FS implements error detection AND correction, bit rot will happen. Do any common filesystems currently implement the correction half of things? I’ve never actually looked.
Your point is fair when you compare it to unjournaled FSes.
I think the contiguity enforcement depends hugely on the FS driver and on software pre-allocation of space, and on whatever is necessary to maintain such contiguity (free-space consolidation on write?).
Would you recommend it for backups, as it is used in the Apple Time Capsule? (I usually have little problem recovering files from a FAT32 filesystem either, but I would absolutely not recommend it for backups.)
Btrfs and ZFS both implement checksums, i.e. they’ll detect if a file has been corrupted. Also, if you enable RAID mirroring or parity in Btrfs, it can correct these errors. I don’t know how ZFS handles these things, as I have no personal experience with it, but I’m pretty sure it does the same thing.
ZFS does have some self-heal capabilities if you have RAID or redundancy enabled (see https://en.wikipedia.org/wiki/ZFS#Data_integrity). At the very least it can detect corruption and notify you of it, and if you regularly scrub the filesystem it will either flag the errors or follow its normal self-heal path where possible.
I think it makes more sense for Apple to integrate checksumming with Time Machine and Versioning instead of implementing a RAID-based self-heal feature
ZFS has the exact same bit rot detection and self-healing capabilities, so it’ll detect bit rot, pick the correct copy when mirroring or recalculate when using raid-z, return the correct data to the app and rewrite & read check the original block on the device that returned incorrect data (that’s the self-healing part). On single-device pools there’s also the option of storing more than one redundant copy of user data (“zfs set copies={1..3} <filesystem>”), allowing to counteract bit rot even there, though without the ability to survive catastrophic device failure. ZFS metadata is always stored in more than one copy, so even in the case of non-recoverable user data loss, the filesystems stored on the pool will remain fully navigable with file names, permissions and extra attributes fully accessible. In cases where pool corruption is too widespread to allow successful operation, if the corruption happened relatively recently, it’s usually possible to return up to a few minutes back in time on the pool transaction history to try and get to a recoverable point in time (albeit with data loss possible, as blocks might have become reused).
I had problems with HFS+ starting with Mac OS 8.1, right when it was introduced…
The VFS layer of OS X should be quite similar (identical?) to BSD’s, so porting ZFS should have been trivial. It’s strange that they dropped the support.
At least UFS should be an alternative on OSX, inherited from NeXTSTEP.
FAT has the advantage of being very simple, tiny and easy to implement on disparate platforms, so it quickly became ubiquitous.
The VFS layer of OS X should be quite similar (identical?) to BSD’s, so porting ZFS should have been trivial.
Having just done that, I can tell you it is definitely not identical. There are some overall similarities, but Darwin went further with making the API modular. They also restrict what you are allowed to call, which is really frustrating.
http://openzfsonosx.org/
This is most probably not an HFS+ problem. Bit rot occurs on the physical media (the magnetic layer on the hard-disk platters). HFS+ just doesn’t contain any way of checking/correcting these errors, which are uncorrectable on the hardware side. The same can happen with ANY filesystem as long as you don’t do any sort of checksumming and redundancy. ZFS is only one example that does it right; btrfs and ReFS are two others.
You *need* data checksumming to detect those errors, and you *need* redundancy (multiple disks or at least large ECC chunks) to correct them.
Note also that RAID alone doesn’t help you in any way with these corruptions (a scrub can detect that data and parity disagree, but it can’t recover without checksums, since you have no way of knowing whether it is your data or your parity that is bad).
I suggest everyone read the paper called “Parity Lost and Parity Regained” (google it) to get some insight into why data integrity is so damn hard…
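In the meantime, you can at least approximate the detection half in userspace on any filesystem. A minimal sketch (the manifest filename and the command-line shape are made up): record a SHA-256 for every file once, then re-run it later and compare.

    import hashlib
    import json
    import os
    import sys

    MANIFEST = "checksums.json"  # hypothetical manifest file

    def sha256(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def build(root):
        # Record a checksum for every file under root.
        sums = {}
        for dirpath, _, files in os.walk(root):
            for name in files:
                p = os.path.join(dirpath, name)
                sums[p] = sha256(p)
        with open(MANIFEST, "w") as f:
            json.dump(sums, f, indent=2)

    def verify():
        # Re-hash everything and flag files whose contents silently changed.
        with open(MANIFEST) as f:
            sums = json.load(f)
        for p, old in sums.items():
            if os.path.exists(p) and sha256(p) != old:
                print("possible bit rot (or an edit):", p)

    if __name__ == "__main__":
        # e.g.  python scrub.py build ~/Photos   ...then, months later:   python scrub.py verify
        build(sys.argv[2]) if sys.argv[1] == "build" else verify()

This only gives you detection, of course; to actually repair a rotten file you still need a second, independent copy, which is the redundancy half of the argument above.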
Agreed. HFS+ has a lot of issues, particularly in the performance area, but this sounds more like a failing hard drive to me. Keeping backups isn’t enough; you should always have more than one if possible, and you also need to periodically pull out the backups and check them. Otherwise you can easily run into situations just like this one, whether you’re using HFS+, NTFS, BTRFS or any other filesystem. For once I’m in the awkward position of absolving HFS+ of blame, and we long-time Mac users know there’s plenty to yell about when it comes to that cursed filesystem. Nevertheless, I’d put this problem down to failing drives rather than a failing filesystem used on those drives.
Even a failing hard drive will tell you when it couldn’t recover your data (hard drives use checksums and error correction internally).
OS X must be ignoring these errors in order to produce corrupted files.
What messages do hard drives send about particular files being corrupted? I’m not aware of any….
The common bogeyman to invoke is “cosmic rays”: they shoot through space and hit your HD, flipping bits. No message. No way for filesystems without checksums, like HFS+, to automatically test for file integrity.
Bill Shooter of Bul,
Well, yeah, but it’s usually noticed after the fact, when the file is accessed, which can be years after the corruption happened.
I thought it was a bigger problem for computer RAM? “Cosmic rays” are probably a cliché; it may be due more to the fact that our digital computers are built on probabilistic physical processes intrinsic to all machines. The smaller the parts get, the more quantum physics gets in the way.
It’s statistically likely that all our PCs have produced several errors this past year (given how few of us are using ECC RAM):
http://techcrunch.com/2009/11/02/new-study-proves-that-ecc-memory-m…
What data these memory errors end up affecting is probably sheer luck. Presumably once in a while the RAM in a hard drive itself is also affected, such that corruption occurs even before the hard drive’s ECC gets calculated for the disk sector. This would be a good reason to use ECC RAM plus a checksumming file system.
Not files, but blocks: if a block is unreadable you will get a CRC error. HDs do not keep re-reading all the sectors on their platters (or the cells on an SSD), so, unluckily, sometimes you will discover the damage too late unless you use utilities that scan them frequently.
This is something we really must be proactive about: we must keep backups and syncs of the important stuff. I frequently mirror some directories between my computers and to servers, and also burn them to good DVD and Blu-ray media; I especially like DVD-RAM (it is a shame that the original cartridges are nowhere to be found any more, and drives that can use them are even scarcer). Also, good Blu-ray discs are said to be more resilient to scratches and to have a longer lifespan.
It is worrisome that lots of people trust their data to HDs and think that one extra copy on an external drive is enough to keep them safe from data loss.
A filesystem should detect and correct errors on the physical media, just as ECC RAM should detect and correct errors in RAM. So, yes, it is the responsibility of the filesystem to provide data integrity; no other component in the computer can provide it. For instance, RAM alone cannot provide data integrity, nor can the CPU, nor the disk drive (drives already have lots of error-correcting codes and disks still lose data, as we all know).
This is not really correct. ZFS is not “only one” example, it is the ONLY example. There are several research papers on ZFS and data corruption, and all conclude that ZFS does provide data integrity. CERN researchers write that even very expensive storage solutions fail to provide data integrity, which is why CERN has switched to ZFS for long-term storage (CERN concludes ZFS is safe). Read the Wikipedia ZFS article for links.
OTOH, Windows ReFS, which you mention as “doing it right”, does have checksums, but by default it only checksums the metadata; the file data is NOT checksummed unless you explicitly tell ReFS to enable integrity streams. And it does not suffice to just sprinkle checksums over a storage solution; you need end-to-end checksums, which is what ZFS uses. It is also not easy to get end-to-end checksums correct; for instance, the ZFS Wikipedia article explains:
http://en.wikipedia.org/wiki/ZFS#Data_integrity
“…In-flight data corruption or phantom reads/writes (the data written/read checksums correctly but is actually wrong) are undetectable by most filesystems as they store the checksum with the data…”
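A toy illustration of why the placement of the checksum matters. This is a purely conceptual sketch, not ZFS’s actual on-disk layout: a “phantom write” (the drive acknowledges a write that never reaches the platter) slips straight past a checksum stored inside the block it covers, but not past one stored in the parent block, Merkle-tree style:

    import hashlib

    def csum(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    old = b"old version of the block"
    new = b"new version of the block"

    # Scheme A: the checksum lives next to the data it covers.
    disk_a = {"data": old, "sum": csum(old)}
    # Phantom write: the new version never reaches the disk, so disk_a is unchanged.
    # On read, the stale block still matches its own embedded checksum,
    # so the problem is invisible:
    assert csum(disk_a["data"]) == disk_a["sum"]

    # Scheme B: the checksum of the child block lives in its parent.
    # (In ZFS both blocks are rewritten copy-on-write, and the uberblock
    # update is the commit point, so a committed parent can end up pointing
    # at a child whose write was silently dropped.)
    disk_b = {"child": old}
    parent = {"child_sum": csum(new)}
    # The stale child no longer matches what its parent expects, so the read
    # is flagged and a redundant copy can be tried instead:
    assert csum(disk_b["child"]) != parent["child_sum"]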
Yes, data integrity is damn hard, and until several independent researchers have shown a storage solution to be safe, I will not trust it. ZFS has been shown safe by several independent researchers (they inject artificial errors, and ZFS detects all of them, whereas other filesystems do not even detect some of them). CERN also writes: “checksums are not enough, you need end-to-end checksums (ZFS has a point)”, so they are switching to ZFS now.
With that said, I would not trust ReFS or BTRFS with my important data until they have been validated by external researchers to be safe against data corruption.
Just because you cannot break your own crypto does not prove it is safe; it needs independent review by computer science researchers. Read the links in the ZFS article for more information and read the papers there.
I’ve never really understood the need and motivation for the resource fork. It’s archaic, and leads to all kinds of problems. A well-known manifestation of one of these problems is the dot files that end up everywhere on non-HFS volumes mounted on a Mac.
HFS+ isn’t very robust, either.
I remember the ZFS announcement, and I was really excited about it. I also remember Sun making a (semi-)public statement about it, and then Apple refuting that statement just a few hours later. Nothing has been heard of the project since that incident.
ANY modern, solid filesystem would be better, but if it included volume management, I would be grateful.
Btrfs or ZFS would rock. Ext4 would be pretty nice.
Oh, not just on the non-HFS volumes. They’re there on HFS+ volumes as well. Turn on “show all files” in the Finder (a hidden preference), or do ls -a from the CLI, and you’ll see them even on your HFS+ volumes. Archive up a folder and they’ll be in there, even on a Mac-native disk.
Side question: do OS X and HFS+ even support true filesystem-level resource forks anymore, or do all the APIs just redirect to these dot files regardless of filesystem?
The dotfiles you see there are just regular Unix dotfiles, i.e. “hidden” files and directories.
AFAIK resource forks are still stored as a node.
ZFS without RAID is a bad idea. When your data gets corrupted, ZFS will either fix it with the good copy (from the non-corrupted part of the RAID) or turn off access so the corruption doesn’t spread.
I’ve had a number of single-disk ZFS pools do this. That’s OK for backups (think of a bad tape), but not good if it’s your primary drive.
I think Apple backed off from ZFS because of this. Maybe when laptops start shipping with two drives it’ll be a good choice.
By default it remounts read-only, but you can override this pretty easily. The thing is, with ZFS the filesystem knows about this through its checksums. With HFS, you wouldn’t even know it’s happening.
ZFS’s default behavior is definitely a much better way to handle this.
Here’s how to fix that:
You’re welcome.
Nobody in the fanboy camp is going to care about “ZFS filesystem” as a bullet point when they buy their Macs.
HFS+ is going to be kept on there as long as it is “good enough” ™
If anybody can find a way to make ZFS sexy for the mass-market, it’s Apple.
Making mundane features appear as gifts from God is what Apple does best. Remember how back when Puma was released, resizing windows in a manner that wasn’t completely unusably slow was touted as a “Feature”?
Indeed, but resizing windows is something that even a fool can see. ZFS being self-healing and offering top-notch data integrity isn’t exactly like that.
Disclaimer: I’m not a marketer, and even though I can easily see through bullshit, I’m quite incapable of producing it. As Apple employs top bullshitters, they might come up with something for ZFS (or any other decent FS out there).
They made Unix sexy. Apple is completely capable of making ZFS sexy.
Look at their site on Mavericks’ advanced technology:
http://www.apple.com/osx/advanced-technologies/
The first item on the list is timer coalescing. That’s just about the least sexy part of an operating system, but daaaamn do they make it sound tits.
Not only is Apple good at making mundane features appear as gifts from God, they’re very good at selling shit like it’s plated in gold. The pre-iOS 7 multitasking bar comes to mind, as does Ping. Though, as a general rule, if it’s shit, Apple usually axes it fairly quickly for something slightly less shit (like the iOS 7 multitasking “switcher” thing, and iTunes Radio).
Mac users will tell you about how much money they’ve saved by using DiskWarrior instead.
I think you are seriously underestimating Apple’s ability to market “esoteric” core OS technologies (e.g. Grand Central Dispatch, timer coalescing). They pulled off an OS X release with “zero” new features (Snow Leopard).
Snow Leopard introduced OpenCL…
Not to mention it was arguably the best OS Apple ever shipped.
The transition from mechanical to solid-state drives changes things at the lower level. At the upper level, Apple pushed autosave and a backup system that is easy enough that average people actually use it.
It seems like the advantage of changing filesystems now might not last long before Apple would want to move to some considerably different paradigm.
A completely different paradigm? It seems like files, folders and a WIMP interface are going to stick around for years to come. Somehow I can’t see that changing in a hurry, even with mobile devices.
Early on you could install OS X to UFS volumes; I did it with 10.1 or 10.2 at some point just to try it. It is also an obsolescent FS, and there were some gotchas, like weak support for resource forks and a 4 GB file-size limit, because it was an ancient UFS dialect pulled straight out of NeXTSTEP, but support for an archetypal Unix-style filesystem was there.
It stopped being usable as the system volume in 10.5 ( https://support.apple.com/kb/HT2316 ) and has apparently been entirely excised as of 10.7, which suggests they did something terrible, either to their VFS semantics or with renewed use of resource forks.
The main problem was the endianness of this UFS variant. It was/is big-endian, whereas x86 is little-endian. The result is tons of overhead.
Ah, right, I remember that from SPARC/Intel Solaris and their different UFS endianness (and other implementation details). Apple’s descends from an m68k platform, so of course it would be big-endian.
And their desire to be fully backwards-compatible with previous versions hasn’t helped. I’m pretty sure any “new” FS will still be related to HFS, and will still be backwards compatible (and big endian)
According to John Siracusa’s review of Lion, endianness has little performance impact.
… Just not especially interesting ones.
But bit rot is on the hardware side. So much of our digital stuff won’t be readable in 50-100 years, unlike our vinyl, our magnetic tapes and our bound paper books. Besides the storage issue, you need to keep obsolete digital technologies around to read and translate all this digital data.
0 = the number of my Mac files I’ve lost due to a system crash, a drive problem, an HFS problem, or bit rot *IN 22 YEARS*.
This “reinstall” thing is a Windows thing. I swear it’s like non-Mac users just accept “oh well, have to reinstall” as a tech-support solution. I literally have some data on my drive from six Macs ago, I have about four journaled backup drives around the house, and the cloud has some stuff on it too.
I have indeed seen a Mac eat its hard drive, and I couldn’t save anything with disk tools, but it was a friend’s machine; he completely ignored any maintenance or warning signs and just kept using the iBook until the drive ate itself along with his non-backed-up data. So it’s possible, just not all that common.
Just saying…
Readable life spans:
- hard discs: assumed to be ~5-15 years
- burned CDs: known to be ~10 years
- pressed CDs: assumed to be 20-100 years
- magnetic tape: known to be ~30 years
- standard print paper: known to be as low as ~50 years
- 35mm film: known to be ~80 years
- burned ‘gold’ CDs: assumed to be ~300 years
- vinyl: assumed to be 500+ years
- ‘archive’ cotton paper: known to be 500+ years
- vellum: known to be 1000+ years
- bloody great slabs of stone engraved in big letters: known to be 10,000+ years
All this is before any consideration of whether the data is stored in analogue or digital form, and what technology is required to read it back.
M.Onty,
It’s an interesting list, good food for thought!
With analog representations, like film or audio tape, there’s no “moment” when the data went bad; it’s just a slow process, and there’s no way to recover lost fidelity. This is an advantage of digital representations, which can be reproduced indefinitely with no loss of data, by taking care to verify that each copy is accurate. With this in mind, it’s unlikely that humanity will ever lose the digital works created today, so long as society possesses the will and technical ability to create exact duplicates.
It depends on whether it’s exposed to the weather or not. Visiting a cemetery full of gravestones shows that they really don’t fare so well:
http://www.freephotos.se/view_photo.php?photo=889&cat=0&order=date
Marble is probably a better choice.
I suspect pressed disks (aka DVDs) would last a great deal longer if they were less information dense. The more material there is to represent a state, the more difficult it is to change/misread that state.
With some media the decay is indeed very gradual, film most notably. I was using the point at which it becomes impossible to recover the most important data as shorthand for ‘life span’. So in film’s case, after 80 years you would probably just about be able to recover the original colour values, albeit without much confidence in their accuracy.
With some others like magnetic tape it seems to be more clearly defined.
Also, I must point out that marble is a kind of stone. Pick the right type, carve your letters large enough and you can get 10,000+ years. (Even if they have to be so large they basically just say “I was here”.)
But regardless of the technicalities, I was emphasising that all this life span business is really no more complex than picking the most robust material you can, writing the data as big as you can, in an actual language (which is neither analogue nor digital).
A better way is to make something interesting enough that you know people will be continually maintaining it and copying it. See the English hill figures.
But you do get “bit rot” in stone slabs and vellum (another good reason to store as much data as possible in plain-text formats, which is not possible in the case of photos).
someone,
In Rome, I was amazed to see how the monuments appear to have “melted” due to the effects of rain over roughly two thousand years:
https://en.wikipedia.org/wiki/Colosseum#mediaviewer/File:Rome_%2…
Also the Roman Forum is littered with stone monuments that have cracked apart and fallen to the ground. The remaining pieces are amazing, yet it’s clear that only fragments of the original structures managed to survive:
https://en.wikipedia.org/wiki/Roman_Forum#mediaviewer/File:Roman_for…
Many stone monuments undergo preservation efforts to ensure they don’t break apart further; even relatively new stone structures are already cracking:
http://www.nytimes.com/1989/11/24/us/a-face-lift-for-mount-rushmore…
So, while some stone remnants might last thousands of years, most will not survive, at least not without some kind of sheltered environment. For what it’s worth, the scientists at NASA decided to use golden disks containing both audio and data to leave a mark of humanity in outer space for the ages:
https://en.wikipedia.org/wiki/Voyager_Golden_Record
The KEO project, a time capsule of humanity, is also relevant here; it uses specially made DVDs:
https://en.wikipedia.org/wiki/KEO
This is more interesting than HFS+, right?
Most of this “melt” is from the last 100 years or so, due to acid rain and general industrial pollution (smog, etc.). Sure, clean rainwater will eat away at it, but it takes a lot longer than two or three thousand years for any “hard” rock like marble or granite.
But there are text-based image formats!
Install IrfanView and open any photo with it. Now choose “Save as…”, pick PBM/PGM/PPM, and in the save options for those formats choose “ASCII encoding”.
Now open the resulting file in a text editor.
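For example, here is a tiny Python sketch (made-up filename) that writes a 2x2 image as an ASCII (“P3”) PPM; the resulting file is nothing but human-readable numbers:

    # Write a 2x2 image (red, green, blue, white) as an ASCII ("P3") PPM.
    # The file is plain text: magic number, width and height, max channel
    # value, then one "R G B" triple per pixel.
    pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
    with open("tiny.ppm", "w") as f:
        f.write("P3\n2 2\n255\n")
        for r, g, b in pixels:
            f.write(f"{r} {g} {b}\n")

Open tiny.ppm in a text editor and you will see exactly those numbers; most image viewers, IrfanView included, will still display it as an image.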
Great list! I have my most important music on vinyl, one of the finest uses of PVC plastic ever.
I’d like to add some small observations:
Magnetic tape lasts longer than 30 years; I have cassettes that old that still play, and many 8-tracks from the ’70s still work, and cassette was a cheap and fragile tape. The TV stations I worked at kept their entire archives on expensive 2″ tape and expected it to last 30+ years. They stored it properly: away from moisture, heat, sunlight and temperature changes.
I had a buddy about 10 years ago who really got into CD brands, types of coating, etc., so that he could archive all of his production work and his software in the safest way he could afford. After researching, he bought pretty expensive CD-Rs and burned them slowly, filling a whole CD book with his content.
Not six months later he’s bragging about how smart he was to do this, and I say I want to see these amazing CD-Rs. He pulls out his book and is flipping the pages, all proud of himself, and a few pages in I see what looks like a crack going down one of the discs. He flipped past it, so I asked him to go back; I wanted to see what that was, and sure enough it was a full crack in the foil layer inside the plastic. He set the book down, mad, so I flipped through, and about every fifth disc was disintegrating inside the plastic, the foil turning into dried, crispy flakes. Completely unreadable after six months.
Luckily he hadn’t wiped all his external drives yet, so I don’t think he lost any data, but since then I’ve never trusted optical media for anything critical.
You are right, but checksumming in the FS is designed to detect bit rot caused by hardware failure.
Hi,
ZFS can’t shrink pools, which I consider critically important for end users. This has been a long-known issue which may never get fixed.
https://blogs.oracle.com/jhenning/entry/shrink_a_zfs_root_pool
A long-discussed feature, and a unicorn
The word “shrink” does not appear in the ZFS admin guide.
Discussions at an archived ZFS discussion group assert that the feature was under active development in 2007, but by 2010 the short summary was “it’s hiding behind the unicorn”.
Apparently, the feature is difficult, and demand simply has not been high enough.
The only option to shrink your ZFS pool is to add more disks to the system and copy your data over. For a server with a SAN this can be done fairly easily, but for desktops?
I quite like btrfs, which allows you to add and remove disks easily, even though the rebalancing can take quite a while.
ZFS does have some very nice SSD caching options which I would love to see in btrfs.
I’m so pleased this has been picked up again. I love the Mac, but the file system is the worst I have ever used. It corrupts so easily; I do everything safely but it will still corrupt. For example, I had my MacBook Air running for about a month before needing to restart for updates and found that there were loads of corrupted files, and this system has an SSD!!!
I have had a cascading file-system corruption/crash where every folder/file corrupted one after the other.
It’s about time Apple did something big with the file system instead of giving it band-aid after band-aid. Personally I’m still a massive fan of ZFS; I’m sure they could strip out some of the non-essential things like compression and dedup and come up with something that is secure, reliable and fast, but without too much overhead (as I know the killer for ZFS is the overhead).
What you probably mean is “a non-checksumming file system is vulnerable to bit rot”. It’s not specific to HFS or anything.
Technical question: if you run the fabled ZFS on a single hard drive, so it has just one copy of the data, does it store enough checksum data to recover from bit rot?
The checksum is just there to detect whether or not the block is inconsistent; to repair it, you need another copy of the data or a parity block. I’m not quite sure how ZFS does it, but I’d imagine it’s similar to Btrfs, which basically just halves the space on your drive if you set it up with two data copies.
ZFS just halves the space on your drive if you specify “copies=2”, so you can repair data with a single disk because ZFS doubles all data. You can also specify “copies=3”.
I have heard that ZFS is the only filesystem able to repair data with a single disk, so I doubt BTRFS has copied this feature yet.
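Either way, the mechanics behind “copies=2” are easy to picture. A purely conceptual Python sketch (not ZFS code): every logical block exists twice on the one disk, the expected checksum is kept by the filesystem, and a read that finds one copy rotten returns the good copy and rewrites the bad one.

    import hashlib

    def csum(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    block = b"important photo data"
    copies = [bytearray(block), bytearray(block)]  # two physical copies, one disk
    expected = csum(block)                         # checksum kept by the filesystem

    copies[0][3] ^= 0x01  # bit rot flips one bit in the first copy

    # Read path: find a copy that still matches the expected checksum...
    good = next(bytes(c) for c in copies if csum(bytes(c)) == expected)

    # ...return it to the application, and "self-heal" by rewriting any copy
    # that no longer verifies.
    for i, c in enumerate(copies):
        if csum(bytes(c)) != expected:
            copies[i] = bytearray(good)

    assert all(csum(bytes(c)) == expected for c in copies)

It still won’t save you if the whole drive dies, of course, which is why copies=2 is a complement to backups rather than a replacement for them.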
It isn’t as refined in that regard, then, but Btrfs can be set up as RAID 1 on a single drive, just using two partitions as the block devices, and it will still repair files using the good copy of the data. No protection against disk failure there, though, obviously.
Really, it is. The Adobe programs are so pants-on-head stupid that they refuse to install to case-sensitive HFS+ volumes, let alone sane filesystems like ext4 or ZFS. Can you imagine the uproar if Photoshop didn’t work on a fresh install of OS X without hacky workarounds?