Apple announced a new file system that will make its way into all of its OS variants (macOS, tvOS, iOS, watchOS) in the coming years. Media coverage to this point has been mostly breathless elongations of Apple’s developer documentation. With a dearth of detail I decided to attend the presentation and Q&A with the APFS team at WWDC. Dominic Giampaolo and Eric Tamura, two members of the APFS team, gave an overview to a packed room; along with other members of the team, they patiently answered questions later in the day. With those data points and some first hand usage I wanted to provide an overview and analysis both as a user of Apple-ecosystem products and as a long-time operating system and file system developer.
An incredibly detailed look at Apple’s new filesystem, APFS.
So APFS is led by Dominic Giampaolo… the BeFS guy. Former Be and Palm engineers can definitely be found in all sorts of leading positions today (Apple, Android).
It's just sad that Apple wasted over 10 years of Dominic's talent.
They should have let him do his job a long time ago…
No, he was responsible for Spotlight and also did a lot of work on making HFS+ not suck. Not wasted – because as a filesystem guy, that is filesystem work.
Making HFS+ not suck is precisely the sort of waste of talent I'm thinking of, personally. That was always going to be a Sisyphean task; they should have set him to work building a new filesystem from the start.
They did, but it got scrapped.
Yeah, I wondered last week whether Dominic was involved:
http://www.osnews.com/permalink?630316
He seems a bit short on detail and a bit obsessed with how much better ZFS supposedly is, even though we don't have a full implementation of APFS yet to compare against.
Did you read all the chapters? It’s not just the linked article.
Yes Thom, I did. I'm not saying the shortness of detail is his fault, mind you, but the technical details I'd have expected weren't there. I was more taking issue with that in conjunction with his claims of how much better ZFS is, even though we don't have enough in-depth detail to make such a claim. I don't think I was very clear about that though.
Don't worry – I was just checking to make sure you read the whole thing and didn't miss it.
Thom, like you I read this yesterday on Hacker News too.
https://news.ycombinator.com/item?id=11934457
The same guy had a massive piece on Apple rejecting ZFS that also surfaced on Hacker News last week.
http://dtrace.org/blogs/ahl/2016/06/15/apple_and_zfs/
https://news.ycombinator.com/item?id=11909606
So it’s not surprising he likes ZFS. He seems to have a couple of other ZFS articles:
http://dtrace.org/blogs/ahl/2016/03/07/big-news-for-zfs-on-linux/
http://dtrace.org/blogs/ahl/2012/11/08/zfs-trivia-metaslabs/
This isn't just some guy. It's Adam Leventhal, one of the many great and outspoken leaders of Solaris projects at Sun. The reason everything he talks about has a ZFS twist is that he's an actual ZFS expert and has experience developing hardware and software systems, unlike most of us who are just random blowhards.
Hey, dude – I know who Dominic Giampaolo is; this guy, I had no idea. I've used Solaris, I've installed Solaris, but I'm not really all clued up on it. It does explain why he is so interested in ZFS though.
I think the fact that Thom didn't mention the author was a big Solaris guy justifies my opening line. Anyone can cut and paste news articles from Hacker News, really. Not doing any original research on top of that, I dunno, that just seems sloppy.
Dude, ZFS has been out for a long time. It's had time to stabilize, improve and add features since then. It would be insane, on the scale of Apple having released the iPhone back in the year 2000, to think that ZFS today isn't a better file system than anything Apple is releasing soon.
Oh I agree, which is why I find such comparisons misleading at best. He's comparing something which has been around for more than a decade against something that isn't even fully released yet. It felt like said comparisons were there to pad an article that was otherwise short on genuine detail.
I agree. The article seems to be extremely short on detail. And even for the few technical details given, I find it hard to distinguish actual APFS features from what seems to be wild guessing by the author.
In fact, I'd say there's more wild guessing than actual information in the article. And despite that, he's already asking for it to be ported to other operating systems as if it were Christ's third coming or something like that. I understand there's some excitement here because of the people involved in APFS, but this is a bit too much…
This worries me.
albertp,
I agree.
For (many) consumers, disk integrity may already be good enough without FS checksums. But file systems that can’t detect/fix bitrot are inferior for enterprise and “prosumers”, who produce a lot of digital content and need to archive it.
IMHO the feature should not have been omitted; instead it should have been optional, something that can be enabled to improve integrity or disabled to improve performance. Maybe it could be added later… but content being produced today really should be checksummed sooner rather than later.
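In the meantime you can approximate this yourself in userspace. As a rough sketch (the paths here are just placeholders), keep a sidecar checksum file for anything you archive and re-verify it periodically:
find ~/Archive -type f -exec shasum -a 256 {} + > ~/archive.sha256
shasum -a 256 -c ~/archive.sha256 | grep -v ': OK$'
The second command prints only the files that fail verification. Of course that only tells you a file has rotted; you still need a second copy somewhere (backup or mirror) to actually get the good bits back.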
I don't know if it is still the case, but disk controllers used to have error-recovery circuitry and tried hard to remap sectors and recover their data, sometimes not even reporting errors to the OS if they could cope with them or, with SMART, only reporting when the error rate or count passed some threshold. Perhaps the situation is not as bad as it seems. The much worse scenarios come from faults in the controller itself, in the processor cache, or in RAM. It is a shame that most computers ship without ECC memory (bit flips used to be a serious risk, and with ever-growing miniaturization the risk is twofold: less isolation and less charge density).
Anyway, granted, OS FS checksums are an added bonus and welcome layer.
Hard disks have CRC/ECC error-correcting codes; Wikipedia explains it better than I can:
https://en.wikipedia.org/wiki/Hard_disk_error_rates_and_handling#Err…
“Only a tiny fraction of the detected errors ends up as not correctable. For example, specification for an enterprise SAS disk (a model from 2013) estimates this fraction to be one uncorrected error in every 10^16 bits,[44] and another SAS enterprise disk from 2013 specifies similar error rates.[45] Another modern (as of 2013) enterprise SATA disk specifies an error rate of less than 10 non-recoverable read errors in every 10^16 bits.[46] An enterprise disk with a Fibre Channel interface, which uses 520 byte sectors to support the Data Integrity Field standard to combat data corruption, specifies similar error rates in 2005.[47]
The worst type of errors are those that go unnoticed, and are not even detected by the disk firmware or the host operating system. These errors are known as silent data corruption, some of which may be caused by hard disk drive malfunctions.[48]”
But the big problem is that there are lots of other things that can affect the data: a flaky PSU, faulty RAM DIMMs, bugs in disk firmware, etc. So all in all, the real-world error rates are far higher than 1 in 10^16 – as reported by CERN and Amazon. Read the Wikipedia article on ZFS and data integrity, and the links there, for more information.
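To put the vendor figure in perspective: 10^16 bits / 8 = 1.25 × 10^15 bytes, which is roughly 1,250 TB (about 1.1 PiB). So even on paper that's about one unrecoverable read error per petabyte or so of reads, and the CERN and Amazon studies suggest the effective rate across the whole stack (PSU, RAM, firmware, cabling) is considerably worse.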
Fwiw, this is the approach we took with ReFS for pretty much this reason.
In Apple’s defense, checksums without redundancy are pretty pointless. You know you lost data, but you still lost it. They are useful with transparent redundancy (the file system can know to read a good copy), or even manual redundancy (you know you need to reload that file from a backup) but there still needs to be redundancy somewhere. For Apple’s use case, this is unlikely to be present.
malxau,
True, but anyone with data worth keeping should already have some kind of backup in place, be it automatic or manual.
I use this and have been extremely happy with it:
http://www.snapraid.it/
It's basically offline RAID, but it works at the file system level, not the block level. I use it on Windows, but it works fine on Linux from my understanding.
It may not meet your needs though; read up on it first.
I use it in a 3-disk array (18TB), so I have 12TB of usable storage. The daily snapshots only take a few seconds to a few minutes to run for me. Pretty painless. My data changes very infrequently though; if you have a lot of data that changes often, it may not be so hot.
PS. I'm not using it on my boot partition, so I don't use it to back up applications or the OS. It can even be set up to filter by directory and filenames – it is very flexible. You probably wouldn't want to use it to back up things that change a whole lot though (from my understanding of the docs).
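To give a rough idea of the setup (the disk paths below are placeholders, not my real layout), a minimal snapraid.conf looks something like:
parity /mnt/disk3/snapraid.parity
content /mnt/disk1/snapraid.content
content /mnt/disk2/snapraid.content
data d1 /mnt/disk1/
data d2 /mnt/disk2/
exclude *.tmp
exclude Thumbs.db
The daily run is then just "snapraid sync" to update parity, with an occasional "snapraid scrub" to re-verify the block checksums and "snapraid fix" to recover anything that turns up bad.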
galvanash,
That’s very interesting. It sounds like it adds some redundancy at the FS level rather than at the volume level.
This part sounds concerning:
Well, in my case it's the applications that I have trouble with. For example, say my computer were to die, and assume I have a completely up-to-date FS + registry backup. Now I have to borrow my wife's computer or buy a new system and restore my work applications quickly (dev tools, collaboration apps, VPN software, etc), but therein lies the rub: Windows application dependencies are scattered throughout Windows directories, application directories, user-data directories, the registry, etc., and to make things even more complicated, ever since MS started coming up with schemes to virtualize application and registry paths, many resources are not even where an application thinks they are.
So long story short, I've given up trying to back up individual Windows applications because I have no practical way to restore them when I need to.
I've given some thought to how I'd solve this starting anew, and IMHO GoboLinux has the best design for this so far. For existing legacy software with files scattered everywhere, I think we'd have to modify the OS to record all the dependencies generated/used by an application in a database. That database would effectively tell us everything the application depends on to run.
That's one of the reasons I love Debian and Linux/UNIX dotfiles. Having found myself in the same scenario but with a different OS, knowing that I would have to replace a system that was about to go bust, all I needed to do was simply…
dpkg --get-selections > appslist.txt
… and back up my /home partition. And then…
dpkg --set-selections < appslist.txt
apt-get -u dselect-upgrade
… and then restore my /home partition.
At the end, I'd have a system fully restored to the point it was at when I backed it up. Sure, I could have cloned partitions, but that is only useful when you're going to restore onto the exact same system or an identical replacement, and that wasn't the case then.
It surprises me that there isn’t anything nearly as straightforward with non-Linux OSes.
Some Windows apps (usually developer-oriented) have portable installs, which fix this issue.
Unfortunately though, most Windows stuff relies on arcane Registry keys in incomprehensible and inconsistent locations, and on using/misusing hidden application directories in the user folder.
Some Linux stuff is just as horrible (sadly some Gnome and KDE components, and Steam is actually worse on Linux than on Windows), but it tends to be the exception rather than the rule.
…and don't lump all the Unixes out there together, as if they were as bad in this regard as Windows.
Is ReFS dead or what? It's not in the news as much as it should be if it is to be the superior dancer.
ReFS has a server focus, and server things don’t get the press coverage that consumer things do. Perhaps more generally it’s also hard to be in the news for enhancements, even important ones. ZFS and btrfs don’t get too much press coverage these days either. But file systems, particularly big redesigns, aren’t very usable in their first release, so enhancements are still the real story.
ReFS in Server 2016 still contains plenty of enhancements. Some of the significant things I remember are 4 KB clusters, to allow for checksums of VHD files without performance-prohibitive read/modify/write; a recovery log, so clusters and other tasks needing fast durability are more efficient; block cloning, which can be used to create multiple files sharing the same physical blocks; and my personal favorite, per-cluster tracking of valid data, so a large allocated file doesn't need to implicitly zero disk blocks just because the file wasn't written sequentially.
Amen, thanks.
I think the comparison with ZFS is apt, since OS X at one point actually had (hidden) ZFS support. While it's true that APFS would need to mature before a 'proper' comparison could be made, that would not have been necessary had they used ZFS (or Hammer, or btrfs, etc.). It's still a real concern.
I wonder if anyone has any thoughts about how Apple will move people off HFS+ on production machines?
Via Time Machine, perhaps?
They don't need to, really. They only support a 4-5 year lifespan for machines. Just preinstall it on new machines and allow it to take over organically. If they make it too easy, no one will pay for the new feature.