Ubuntu 16.04 LTS (Xenial) is only a few short weeks away, and with it comes one of the most exciting new features Linux has seen in a very long time… ZFS — baked directly into Ubuntu – supported by Canonical.
A very welcome addition.
…would replace HFS+ with ZFS or something like it in Mac OS X. Main reason why I haven’t bought an Intel Mac yet. Seriously.
Why does a consumer OS need ZFS?
Consumers need no data integrity?
What’s next? Consumers needing more than 640KB of RAM? Ridiculous!!!
ZFS has a number of features, but I think one of the important ones I read about is that it can automatically tell you, from checksums, if a file is damaged. I don’t know if BTRFS can do that.
Yes, it can. BTRFS does also track checksums.
It’s not that they need ZFS, it’s that they need something better than the horrible HFS+.
Apple still claims OSX to be “world’s most advanced OS”, even though it ships with a filesystem older than half of their userbase. The reality distortion field did not perish with Jobs, apparently.
I know very little about ZFS, but my curiosity is piqued. From what I can tell (so far) the average consumer has next to nothing to gain from ZFS. It’s overkill (on something like a home computer) and seems to carry its own risks. But hopefully an article will appear, sooner or later, to explain some of this.
Exactly.
Bitrot.
It seems like Ars Technica did an article on this a couple of years ago.
vtpoet,
Regularly scheduled snapshots – very useful for recovering from mistakes no matter who you are.
Integrity hashes, good for enterprise and average joes.
Possibly dedup, but not for low end users.
If your Linux computer with ZFS is under attack by malware, you can recover it by restoring an earlier snapshot of your system. And if you were hit by ransomware, there’s no need to give up and pay: just restore the previous snapshot of your files, and that’s it, no need to decrypt anything.
Note, if ransomware were to hit your Linux computer with the intention of forcing you to pay and it didn’t delete the snapshots, well, that’s a bug in the ransomware. If the backups are powered up and connected to the infected system, they can’t be counted on to save you.
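For illustration, a minimal sketch of that snapshot-and-rollback recovery (the pool and dataset names here are made up):

    # Scheduled (or manual) recursive snapshot of the pool:
    zfs snapshot -r tank@2016-02-22
    # After a ransomware hit, roll the affected dataset back to the last
    # clean snapshot; everything written since is discarded:
    zfs rollback -r tank/home@2016-02-22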
I use it on some random computers, like my PC at work [1]. There are some good and bad sides:
Pro:
Snapshots are neat. Easy to roll back, easy to grab files from, very cheap.
The checksum system is solid. Every data block is checksummed, and they’re verified whenever you read a block [2].
Setting up anything from a mirror to multiple RAID6-blocks is easy and well supported (see the sketch after these notes).
The filesystem compression works well – lz4 is super-fast and still helps a fair bit; gzip trades speed for compression but is still quite zippy on a modern CPU.
Con:
ZFS can use a lot of RAM, and even tuned down to the minimum for decent performance it’ll probably want a GB or so; way more than HFS+.
The checksums are at their most useful when you have more than one copy, but in e.g. a laptop with a single disk they can’t do more than tell you that you should restore a file from your backups[3] – and most Macs are laptops.
[1] I’m one of the ~27 people worldwide that use FreeBSD on a desktop.
[2] Or whenever you ask it to verify the entire filesystem, of course.
[3] Well, unless you’ve told it to keep multiple copies of the file system in question – which could make sense for, e.g., your documents folder.
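For illustration, roughly what that looks like from the command line (the pool, dataset and device names are made up):

    # Two-disk mirror:
    zpool create tank mirror /dev/ada0 /dev/ada1
    # Cheap, fast compression pool-wide:
    zfs set compression=lz4 tank
    # Keep two copies of one dataset even on a single-disk machine, so a bad
    # checksum can be healed from the extra copy (footnote [3] above):
    zfs create tank/documents
    zfs set copies=2 tank/documents
    # Verify every checksum on the pool:
    zpool scrub tank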
That’s one of the reasons why I prefer Btrfs. Btrfs can do most of the same things ZFS can, but it requires a fraction of the RAM. The one thing Btrfs doesn’t do yet is automatic background de-duplication, but that’s hardly a priority feature for desktop – or laptop – users.
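A rough illustration of the Btrfs equivalent of snapshotting (the paths are just examples; distros lay out subvolumes differently):

    # Snapshot a subvolume:
    btrfs subvolume snapshot /home /home/.snapshots/home-2016-02-22
    # Or a read-only snapshot, handy as a backup/send source:
    btrfs subvolume snapshot -r /home /home/.snapshots/home-2016-02-22-ro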
Have you had problems recently with BtrFS filesystem out-of-space errors? It’s still listed as an unresolved issue on the BtrFS FAQ page, and it really made me lose all confidence in BtrFS.
No, I haven’t. I did get those ~3-4 years ago, but more recently? No, not once. I do run several Btrfs RAID1 filesystems and one Btrfs RAID5 at the moment, and while I have had some trouble with the RAID5 array, all the single-disk, RAID0 and RAID1 setups I’ve used have worked perfectly. I am running kernel 4.2.0, though, and I hear there are plenty of RAID5-related fixes in the newer kernels — perhaps I should upgrade mine.
And pretty low end laptops in fact… an entry level Macbook has 4gigs of RAM and a 1.2ghz CPU. That’s less hw than many Android phones. xD
Also OSX is not FreeBSD with Fluxbox, OSX is a fat memory hungry pig… trying to run it on top of ZFS using cell-phone like hardware is crazy.
At the very least, a better filesystem than HFS+.
Maybe XFS or ext4 but definitely not ZFS.
ZFS is a _huge_ resource hog (and OSX is too)… running OSX on top of ZFS is complete overkill for a notebook (and today 80% of the Macs are Macbooks).
ZFS could be a great option for an OSX server, though… but not for regular OSX installs.
It’s possible to ship ZFS support in the OS but use something else as the default target for a root volume at install time. Linux distros have supported NTFS, JFS, and other random things for years while defaulting to ext2/3/4 or XFS.
Sure, but then why bother shipping with zfs at all? It’s not as if the majority of use cases for Macintoshes are servers even though they can make damn good servers when configured properly. That still leaves the question of what you use to replace HFS+. Remember that most desktop systems have a root partition on their main drive and that’s it, aside from the EFI system partition and possibly a recovery partition.
There’s a huge difference between resource hogging and making use of available RAM for caching duties. ZFS does the latter.
Yeap, and OSX/HFS+ does it too (buffers), but the problem isn’t the buffering; the problem is the huge amount of in-memory tables that ZFS maintains and the high computational resources it consumes (in comparison to the almost zero resources that HFS/ext/ufs or any other traditional FS consumes).
I love ZFS, I have worked with it since the first day it was released for Solaris 10, and I think it’s one of the most incredible pieces of software ever created… but it’s not a good FS for low-end laptops running on battery.
As I said, I’d love to have ZFS as an option in OSX (in fact we “almost” had it in Snow Leopard), but not as the default OSX FS; it’s complete overkill and a waste of precious battery resources.
that’s like the lowest bar I have seen set in a long time. Even vfat would be a better alternative than HFS+ 😉
Edit: Why are we going off on OSX on an article about linux though?
Not vFAT haha but ext4 or Reiser4 would be a great alternative (resource forks could be implemented with a simple plugin in Reiser).
But hey even the vanilla UFS with soft-updates that *BSDs use would be a huge step forward compared to HFS+!!
I’m not anti-Apple at all, but I agree with you, HFS+ is by far the worst major FS on the market. It sucks from every possible POV and it’s been sucking for at least 15 years.
I really don’t know why Apple keeps using it.
That used to be an option, but you haven’t been able to install on UFS since… 10.3 I think? I think Apple developers just wanted to give up on fully supporting case-sensitivity.
Oddly enough, NTFS would be a great fit for OSX, even supporting multiple data streams natively.
Nah. At least HFS+ has journaling. I sure don’t miss the days of scandisk. Maybe ext3 though…
The journaling in HFS is primitive. You still have to use ScanDisk… I mean DiskUtility with HFS+ volumes regularly.
There really is no excuse for Apple to ship their OS with such outdated filesystem technology.
No kidding there, but it is at least better than nothing, which is what VFAT would give you. That said, I’ve not needed to use Disk Utility on a regular basis. However, I do agree that HFS+ is a piss-poor excuse for a filesystem and should not be shipped with any operating system.
Hardly; an Ubuntu Linux desktop with Docker installed is a consumer OS nowadays. And that seems to be the future anyway: if you run some webapp on Ubuntu, or any distro for that matter, it will increasingly run inside a container.
To answer the question: yes, a consumer desktop needs data integrity, which is what ZFS is all about, and it offers deduplication so you can save a lot of space. But the drawback, in my limited understanding of ZFS, is that it needs more RAM, and how much RAM it eats depends on what containers with ZFS you are using/running. It also might need ECC RAM, which is more expensive and may not go along with desktop applications, especially games.
I don’t want to sound harsh, but using ZFS dedup even on a powerful desktop (let alone a laptop) is almost impossible; crazy.
The ZFS dedup feature is one of the most hardware-hungry features out there: you need between 2 and 5 GB of dedicated RAM for each 1 TB of storage in the pool with dedup enabled… and it also consumes CPU time.
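For a back-of-the-envelope feel for where those numbers come from (assuming roughly 320 bytes of RAM per unique block and a 64 KB average block size – assumptions, not a rule):

    # Rough sizing of the dedup table (DDT):
    #   1 TB / 64 KB   = ~16 million unique blocks
    #   16e6 * 320 B   = ~5 GB of RAM just to keep the DDT in core
    # zdb can simulate dedup on an existing pool before you commit to it
    # ("tank" is a placeholder pool name):
    zdb -S tank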
Take into account that we don’t even enable dedup on mid-tier Sun Fires running Zones, only on dedicated storage servers, and then only for specific volumes/filesystems, never the entire pool. (To save disk space in the Zones/Containers we use writable clones, NOT dedup as people usually think; dedup is some kind of “special” feature, not a commonly used one.)
But as a general rule, ZFS really needs a LOT of resources, no kidding. I think people don’t realize how heavy on resources it is. I love ZFS, it’s wonderful, magic, but that magic comes with a very high price.
Time Slider!
http://www.openindiana.org/wp-content/uploads/2010/08/oi-b148-gui-t…
… would give away OSX and updates for free and just charge for their overpriced hardware.
I’m all for replacing HFS+, but ZFS is a bit ram-hungry to use as a simple general-purpose filesystem. It works best on storage servers with loads of memory, and boy does it ever fly then.
Not to mention that it also sounds like a full time job.
While it may be worth it on a server, I’d rather not bother on my (regularly backed up) laptops and desktops…
ZFS has plenty of uses on desktops and laptops – built-in compression and snapshots are very useful on those systems.
ZFS replication is just awesome, too.
http://arstechnica.com/information-technology/2015/12/rsync-net-zfs…
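A minimal sketch of what that replication looks like in practice (the host, pool and dataset names are invented):

    # First full replication:
    zfs snapshot tank/data@monday
    zfs send tank/data@monday | ssh backuphost zfs receive backup/data
    # Subsequent runs only ship the delta between two snapshots:
    zfs snapshot tank/data@tuesday
    zfs send -i tank/data@monday tank/data@tuesday | ssh backuphost zfs receive backup/data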
Some days ago, when I saw this news item for the first time, everybody was talking about the license: ZFS is licensed under CDDL, which, while a Free license, is supposedly designed to be incompatible with GPL. In my understanding, the situation is not fully clarified yet, as an example see https://twitter.com/conservancy/status/700094818035245057 (sorry for the Twitter link)
That’s what I’ve been wondering about: they have included ZFS in the kernel sources, including modifying the Makefiles to have it built automatically as part of the build process. Their stance is basically that building ZFS as a module means the kernel sources do not become a derivative work!
I just don’t understand that. If I went and slapped some code into the kernel source tree, modified the Makefiles to reference that code and released the sources into the wild, I most definitely would have created a derivative work, so how can Canonical claim that them doing the same thing doesn’t? Building and distributing out-of-tree modules and code is one thing, but this is now in-tree code.
Am I just missing something here?
The licensing experts have yet to come forward with their opinions. What we currently see being discussed comes under “IANAL”.
Personally, I am familiar with this license incompatibility mostly from the old cdrtools debate, and simple logic says that if cdrtools was such a big issue, a kernel module should be an even bigger one. But IANAL.
Tested both; BTRFS is better integrated into the Linux kernel and definitely much more efficient. I really don’t understand why Ubuntu needs to go with ZFS.
Because BTRFS is not production ready, and ZFS is. Perhaps?
It depends what you mean by production ready. ZFS on Linux is far from being production ready: it’s slower, consumes a lot of memory, it’s not integrated into the kernel and I wasn’t able to use it as the boot disk. Maybe Canonical has improved these shortcomings; let’s see.
I only use BTRFS in production (CentOS 7.2), including boot partitions, and have never had any problem.
For filesystems, generally, “production ready” means “won’t ever fail and destroy your data”, which is why BTRFS still isn’t production ready.
So no file-system is production ready, I got more bugs in ZFS and ext4 🙂
sure you did.
Until your production machine gets tossed by a BTRFS bug, so backups should be ready.
ZFS has been production-ready for a long time, even on Linux machines.
ZFS on FreeBSD still consumes a lot of RAM; the FreeNAS folks advise 1 GB of RAM for every terabyte of data.
There’s no perfect software; I’ve hit ZFS bugs too… I think BTRFS has been quite stable for a couple of years now.
It actually is; they just keep those warnings about ENOSPC etc. in the wiki just in case. That’s quite understandable, considering that a bug in a filesystem could lead to a disastrous loss of data, and it’s hard for them to say that there will never be any corner case where such an error could occur.
It’s certainly not as buggy or crashy as some people like to make it out to be, and JBOD, RAID0 and RAID1 modes should be very stable. I can’t recommend RAID5/6 for production systems myself, as I did run into a few issues there, but I am running an outdated kernel so I don’t know if it’s better in the newer kernels.
ZFS has a very efficient read cache, the ARC (with an optional second level, the L2ARC, on e.g. SSD). If you only have 1 GB of RAM in your server, ZFS will have hardly any cache, which lowers performance down to disk speed instead of RAM speed. I have myself run ZFS on a 1 GB PC with Solaris for over a year without problems. If you have 4 GB of RAM you will have a very small cache, so you will never get RAM speed, as ZFS always has to reach out to the disks, degrading performance down to disk speed.
But if you want to use dedupe, then ZFS requires 3-5 GB of RAM for every 1 TB of disk space. Dedupe is a memory hog, and dedupe is broken in ZFS; never use dedupe in ZFS, it is almost useless. Oracle has bought GreenBytes, which has a superior ZFS dedupe engine – best in class, extremely high performance with very low latency – and ZFS will use the GreenBytes dedupe engine in some coming Solaris iteration:
http://www.theregister.co.uk/2012/10/12/greenbytes_chairman/
“…Take 5,000 virtual desktop images, full-fat clones 40GB in size and each with 2GB of swap space. That sums up to 210TB of storage. Apply GreenBytes dedupe and it gets compressed down to under 4TB. A 4TB IO Offload Engine could support 5,000 fat clones….”
http://www.theregister.co.uk/2013/08/27/greenbytes_latency_smash_wi…
“…The system is zero latency…”
But it’s nice to see Linux trying to catch up on what Solaris has been doing for a decade: running lightweight containers on ZFS, which is a killer. Install, configure and test an Oracle database in a container, snapshot it, and then you can deploy a new Oracle database container in a few seconds – fully configured and tested. And each developer has root access to the container (but not root access to the Solaris server).
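Roughly, that snapshot/clone workflow looks like this (the dataset names are invented):

    # Template dataset holding the configured, tested database:
    zfs snapshot tank/zones/oracle-template@gold
    # A writable clone shares all unchanged blocks with the template and
    # is ready in seconds:
    zfs clone tank/zones/oracle-template@gold tank/zones/oracle-dev01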
BTW, regarding BTRFS, the official homepage of BTRFS says:
Q: Is BTRFS stable yet?
A: Maybe.
In other words, BTRFS cannot be trusted if you have important data. Read the forums; there are a lot of data corruption stories with BTRFS.
About the claims that ZFS is a RAM hog, I’m pretty sure it’s just a matter of ZFS using a RAM cache when RAM is available. If you have less RAM it simply won’t use as much of it, if any at all. Or I’m sure you can configure it to use less RAM. There’s a good explanation here:
http://distrowatch.com/weekly.php?issue=20150420#myth
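For what it’s worth, on ZFS on Linux the cache ceiling can also be capped explicitly via the zfs_arc_max module parameter (the 1 GiB value below is just an example; it is specified in bytes):

    # Persistently, via a modprobe option:
    echo "options zfs zfs_arc_max=1073741824" | sudo tee /etc/modprobe.d/zfs.conf
    # Or at runtime:
    echo 1073741824 | sudo tee /sys/module/zfs/parameters/zfs_arc_max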
Not quite. That’s what it’s supposed to do, and indeed what it does; however, without the RAM cache you will notice a definite degradation in your overall performance. With ZFS, the RAM cache is not optional if you want it to work properly.
Hmm OK, so ZFS without a big RAM cache will perform worse than something like EXT4 or XFS with the Linux kernel’s core caching?
I ran OpenSolaris on a 4 GB system for a while and it felt fine to me. I don’t have the benchmarks you’re talking about, so I can’t speak to those.
It is getting tiresome to try to stop this FUD. But a ZFS dev explains:
http://arstechnica.com/gadgets/2016/02/zfs-filesystem-will-be-built…
Richard Yao, a ZFS developer, contacted us after this article was published to clarify some points about ZFS’ RAM requirements. “The code base [for non-deduplication cases] would work fine with 1GB of system RAM for any amount of storage,” Yao told Ars, adding that “the only downside is that the cache rate declines if the working set is big.” The 1GB of RAM per 1TB of storage requirement is actually related to ZFS’ data deduplication features, but according to Yao the math required to calculate the ideal amount of RAM is so highly variable that it defies easy rules of thumb. RAM requirements are related directly to the amount of duplicated data stored on your volumes and a variety of other factors.
“The ‘1GB rule of thumb’ is commonly cited as describing memory requirements for data deduplication and is unfortunately wrong,” said Yao. “It started many years ago and despite my best efforts to kill that misinformation, people keep spreading it faster than I can inform them of the actual requirements.”