The FreeBSD Release Engineering Team is pleased to announce the availability of FreeBSD 10.3-RELEASE. This is the third release of the stable/10 branch, which improves on the stability of FreeBSD 10.2-RELEASE and introduces some new features.
It’s got a ton of improvements to the UEFI boot loader, the Linux compatibility layer, and a whole lot more.
I think one of the really big improvements is support for ZFS boot environments. PC-BSD has had them for a year or so now. On the Linux side of things, openSUSE recently got Btrfs boot environments too. Having used them, I don’t think I will ever run a system again that doesn’t support boot environments. It’s one of those technologies that seems like a small thing, but makes a big difference once it has been put into play.
Like? Could you explain the benefits?
I haven’t used them much myself, but the basic idea is that instead of having ZFS filesystems for /home and /, you have e.g. /home, /ROOT/one and /ROOT/two; at boot you get to choose whether you want to use one or two as your /.
It’s a simple idea, but in practice it means you can fork your current system, chroot into the clone, do a dramatic system upgrade, and then get to pick between the original and the upgraded system when you reboot. Or for that matter, you could set things up so every update happens in a new clone, making it easy to roll back if anything stops working. (ZFS clones are cheap, since they only store the changed blocks.)
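Done by hand, that fork looks roughly like the following, assuming a FreeBSD-style layout where the pool is called zroot and the boot environments live under zroot/ROOT (all the names here are just examples):

    # snapshot the running root and clone it into a second boot environment
    zfs snapshot zroot/ROOT/one@fork
    zfs clone zroot/ROOT/one@fork zroot/ROOT/two

    # mount the clone, chroot in, and do the dramatic upgrade in there
    mount -t zfs zroot/ROOT/two /mnt
    chroot /mnt /bin/sh

    # tell the loader to boot the new environment next time around
    zpool set bootfs=zroot/ROOT/two zroot

Tools like beadm wrap roughly this sequence so you don’t have to type it out yourself.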
Oh, and you could theoretically install debian/kFreeBSD and PC-BSD in their own boot environments and pick between them at boot, as long as their kernels support the ZFS version you use.
Essentially correct, but I think you make it seem overly complicated.
Basically, with boot environments, you use the computer as you usually would. But prior to running any major upgrades or configuration changes, the system makes a snapshot (copy) of the operating system. This requires virtually no additional disk space with ZFS.
If something in the config change or an upgrade breaks the OS, you simply reboot and select the previous snapshot from the boot menu. This rolls back the OS to the last known good state.
This means you can do just about anything to your OS (delete files, break dependencies) and a simple reboot fixes it.
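The manual ZFS equivalent is roughly this (zroot/ROOT/default is just the usual FreeBSD dataset name; adjust for your layout):

    # before the risky change: snapshot the root filesystem
    zfs snapshot zroot/ROOT/default@pre-upgrade

    # if the change breaks things, roll back (e.g. from single-user mode) and reboot
    zfs rollback zroot/ROOT/default@pre-upgrade

The boot environment tooling just automates the snapshot and wires it into the boot menu.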
On openSUSE it is especially nice as YaST does all this for you automatically and you can even do side-by-side diff comparisons of your files. So if a config file changes between snapshots you can see exactly what happened and select which version of the file to keep.
As someone who has worked on servers where another admin changed a config between reboots or a dependency broke during a long uptime, it is very handy to be able to simply roll back to the last known good state.
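For what it’s worth, the command-line side of that on openSUSE looks roughly like this – snapper is what YaST drives under the hood, and the snapshot numbers and path below are made up:

    # list the pre/post snapshots zypper and YaST create around each change
    snapper list

    # list which files changed between snapshots 42 and 43
    snapper status 42..43

    # show the actual diff for one of them
    snapper diff 42..43 /etc/fstab

    # roll the root filesystem back to snapshot 42
    snapper rollback 42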
Isn’t it a bit like booting u-boot and selecting your system image, or selecting between two users with admin rights?
No, I don’t think that parallel works.
I very much like having snapshots at the bootloader level, but the layering violations that ZFS commits trouble me. I’d prefer for it to be at the volume level, independent of the FS. Unfortunately, as useful as logical volumes are for snapshotting and allocating volumes with ease, there are no widely accepted standards for volume management. Linux uses LVM, FreeBSD uses vinum, there’s the Solaris Volume Manager, Windows 8 adopted “storage spaces”. If only we had one good standard, it would be perfect for multibooting different operating systems without worrying about physical partition locations on disk.
While you’re at it, wish for a universal filesystem that has full support built in all operating systems and is not FAT32. Baby steps, after all.
UDF?
You’re joking, right? Format a USB stick as UDF, then put it in a Windows, Linux, and Mac machine and tell me just how well that works. No cheating, for reading and writing.
It can be done, but you have to jump through hoops to do it. Unfortunately, due to some quirks, Windows doesn’t work well out of the box reading Linux- and OS X-formatted UDF drives.
What layering violations? I like that it works well for the most common case. It only prevents some nerds from running filesystems in exotic configurations that virtually nobody else uses or wants.
You have the gasoline, I have the match. Flame on!
Research shows that ZFS is the safest commodity filesystem out there. CERN researchers say ZFS bests even very expensive high-end storage systems in terms of data safety.
Linux devs have long called ZFS a “rampant layering violation”, because ZFS is monolithic.
http://arstechnica.com/staff/2007/05/rampant-layering-syndrome/
In Linux you have the filesystem, raid layer, volume manager, etc. Different layers.
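To make the contrast concrete, this is roughly what the two approaches look like on the command line (the device, volume group and pool names are made up):

    # Linux, layered: md RAID, then LVM on top, then a filesystem on top of that
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    pvcreate /dev/md0
    vgcreate vg0 /dev/md0
    lvcreate -n data -L 100G vg0
    mkfs.xfs /dev/vg0/data

    # ZFS, monolithic: pool, redundancy and filesystem handled by one layer
    zpool create tank mirror da0 da1
    zfs create tank/data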
ZFS’s creators rethought the storage basics, started afresh, and came up with this monolithic ZFS. The reason ZFS has superior data integrity is BECAUSE it is monolithic. That is why ZFS can provide superior data integrity, besting very expensive storage systems out there. As the CERN researchers say:
-Adding checksums to detect data corruption is not enough for data integrity.
(For instance, hard disks have lots of checksums and still return corrupt data: 1 corrupt bit per 10^17 bits read, according to enterprise Fibre Channel disk specs.)
You must use a _special_ type of checksum to detect all the different kinds of data corruption that can occur. Not just any checksum will do. You need end-to-end checksums for data to be safe.
Even if every layer in Linux storage is checksummed (hard disks have checksums, the RAID level has checksums, etc.), the data might still get corrupted when it passes the boundary to another layer. The checksum is _not_ passed on to the next level. Instead a new checksum is calculated in the new level, based on whatever (possibly corrupt) data it received from the underlying level.
ZFS changes this by being MONOLITHIC, so there are no different layers. ZFS is in charge of all the layers, so ZFS knows the checksum no matter which level the data resides in. This means the data does not get corrupted as it passes boundaries. And this is possible ONLY because ZFS is a “rampant layering violation”, which the Linux devs mocked ZFS for. They laughed at ZFS and called it badly designed, said the Sun kernel devs did not know jack, etc.
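In practice that end-to-end verification is what a scrub exercises: ZFS re-reads every block, checks it against the checksum stored in its parent block pointer, and repairs it from a redundant copy if it does not match (the pool name below is made up):

    # walk the whole pool and verify every block against its checksum
    zpool scrub tank

    # see what was found and, if there was redundancy, repaired
    zpool status -v tank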
Linux devs did not really understand ZFS, just as they did not understand DTrace and built the Linux copy SystemTap (which can crash the server; DTrace cannot), nor did they understand Solaris SMF and built the Linux copy systemd. Linux devs just copy without really understanding the reason. For instance, in an old interview Chris Mason said he added checksums to BTRFS only after reading Sun talking about ZFS data integrity; only then did he realise the importance of end-to-end checksums. And guess what? BTRFS is… monolithic. Just like ZFS. I don’t see Linux devs calling BTRFS a layering violation. Not Invented Here syndrome?
But Chris did not understand why ZFS is monolithic: because of the desire for end-to-end checksums. And still he made BTRFS monolithic in the beginning, without end-to-end checksums. He did not understand.
So, Alfman, I don’t think the future is layered storage; it should be monolithic. Different layers defeat data integrity. Well, now you understand one of the design decisions behind ZFS and why “ZFS commits layering violations”.
Basically, with boot environments, you use the computer as you usually would. But prior to running any major upgrades or configuration changes, the system makes a snapshot (copy) of the operating system. This requires virtually no additional disk space with ZFS.
You can effectively achieve this with LVM and use any filesystem you like on top of it.
Sure, but it’s not generally as automatic or simple. Technically I could achieve it with multiple hard drives, too. That doesn’t make it the easiest option to maintain.
Take a snapshot and perform updates. It isn’t that hard. Now, I haven’t used ZFS or BTRFS, so I don’t know how much easier it is, but I wouldn’t trade XFS for either one of them. Neither scales nearly as well.
What’s different is that often the entire process is integrated, as it was with Solaris Live Upgrade. You simply do the upgrade. It handles every step, including taking the snapshot. That’s all I meant. With lvm, if you want the same behavior, you generally have to implement it yourself with scripts. It’s not hard, but it’s that extra bit of work you don’t have to worry about with zfs.
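A rough sketch of what that self-rolled LVM version tends to look like – the volume group, volume name, snapshot size and the apt-get call are all just examples:

    #!/bin/sh
    # snapshot the root LV before touching anything
    lvcreate --snapshot --name pre_upgrade --size 5G /dev/vg0/root

    # run the upgrade; on failure, schedule a merge back to the snapshot
    if ! apt-get -y dist-upgrade; then
        # the merge of the snapshot into the origin happens on the next activation
        lvconvert --merge /dev/vg0/pre_upgrade
        reboot
    fi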
People apparently create petabyte+ pools with ZFS and it works fine – I have no idea if XFS on lvm scales better, but ZFS seems to be good enough for anything I’m likely to throw at it.
As for snapshots, they’re fairly trivial to do (“zfs snapshot pool/filesystem@name”). I haven’t used the boot environment tools, but IIRC beadm isn’t much harder.
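From what I’ve read, the basic beadm flow is roughly this (the boot environment name is just an example):

    # clone the active boot environment before upgrading
    beadm create pre-upgrade

    # list environments and see which is active and which boots next
    beadm list

    # if the upgraded system misbehaves, switch back and reboot
    beadm activate pre-upgrade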
“Works fine” is subjective. ZFS works fine if you have the money to dump into RAM. XFS doesn’t require a massive amount of memory to operate efficiently. In fact ZFS generates more I/O and uses more CPU cycles and memory than XFS. The larger the disk/array, the more metadata there is to traverse during filesystem operations with filesystems like ZFS and Btrfs.
The IBM Sequoia supercomputer has a 55-petabyte storage solution with Lustre, which uses ZFS as its filesystem. It delivers 1 TB/sec of bandwidth. Lustre was rewritten from using ext4 to ZFS because of scaling problems. ZFS was built to scale. It is 128-bit instead of 64-bit, because a 64-bit pool tops out at something like ~20ish petabytes. Ten years ago, when Sun designed ZFS, they realised that 20 PB pools would be built somewhere around 2020 or so. In other words, 64-bit filesystems are not enough. BTRFS is 64-bit, just like XFS. They are not built to scale.
People have built 1 PB pools with ZFS for years. They are now doing 2 PB, and later 4 PB, and then 8 PB, and quite soon you hit the 64-bit limit. That is why ZFS is 128-bit: because it is built to scale. No one can ever fill up a 128-bit pool, so ZFS’s limits are safe. To fill up a 128-bit pool requires as many 3 TB hard disks as the weight of three moons; to move that many electrons into a 128-bit pool requires so much energy that you would vaporize all the water on earth, etc. No, no one will ever fill up a 128-bit pool. But IF someone really harvests the entire earth and several other planets for metal to build 3 TB disks weighing three moons and really fills up a zpool, why, you can just create another ZFS raid pool. It is not uncommon to run several different raid pools at once.
But the reason people run ZFS is not its performance or superior scalability or whatever – no, the reason is that ZFS is safe. Research from different computer scientists shows that XFS, NTFS, ext3, etc. are all unsafe. They might corrupt your data. Hardware RAID is also unsafe, as it might corrupt your data. The problem is that a filesystem should give back exactly the same data that was stored – but most (all?) filesystems fail this. Worse yet, they cannot even DETECT all cases of data corruption. If data gets corrupted, they won’t even notice. How can they repair the data then? Impossible. OTOH, researchers show that ZFS is safe: it detects ALL cases of data corruption, and it also repairs all cases of data corruption.
CERN, which stores a lot of particle data, has researched this a lot and chose ZFS to store its data on Tier 1 and Tier 2 sites (long-term storage). Tier 0 is the particle collider, which produces data that is streamed to the other sites for long-term storage. After researching the storage market, CERN said that ZFS is the only safe solution, even safer than very expensive high-end enterprise storage solutions (because they cannot detect all cases of data corruption, but ZFS can).
All this information, with links and research papers, can be found here (read also the “See also…” section at the top):
https://en.wikipedia.org/wiki/ZFS#Data_integrity
All this safety, performance, scalability, etc. comes at a cost. You need some RAM to run ZFS, and you need CPU. Regarding CPU, I read some benchmarks showing that ZFS uses 2-3% of a single core to do its magic. Regarding RAM, you need 1 GB or so; I have run a Solaris server for over a year with 1 GB RAM. ZFS has a very efficient in-RAM disk cache called the ARC (L2ARC is the optional second-level cache on SSD), and if you only have 1 GB RAM you will not get much of an ARC, so performance degrades to disk speed. But if you have RAM you will have a disk cache, so performance will be excellent. The desire to have 4 GB or even 8 GB RAM on a ZFS server is because you get a bigger disk cache. That is the only reason you need RAM; ZFS does not require much RAM to run. In fact, someone ported ZFS to an ARM CPU with 16 MB RAM or so, and it worked fine.
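On FreeBSD the cache ceiling is just a loader tunable, so you can check it and cap it on small machines (the 512M figure below is arbitrary):

    # show the current maximum ARC size in bytes
    sysctl vfs.zfs.arc_max

    # cap the ARC at 512 MB; takes effect on the next boot
    echo 'vfs.zfs.arc_max="512M"' >> /boot/loader.conf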
YMMV, but the reason people use ZFS is that their data is safe, not its snapshots, rollback of the system disk, etc. If some virus encrypts my system disk, why, I just roll back to an earlier state with boot snapshots. XFS does not scale, is not safe, is cumbersome to administer and is not performant – when compared to ZFS.
Was inotify implemented for this release?
A long time ago I saw a really good set of benchmarks comparing Linux and Open/Net/FreeBSD.
They tested stuff like forking, memory allocation, HTTP latency, scaling with the number of served sockets, etc.
http://bulk.fefe.de/scalability
Well, you can always find flaws in benchmarks, but these were much better than the rather meaningless game frame-rate tests… or installer reviews.
I’d love for someone to run and publish comparisons of the main OSes, testing stuff like network stack scalability and latency, memory efficiency, basic numerical performance, stability over time under stress loads, I/O throughput scalability and latency, multithreading efficiency and degradation limits, and some relevant video testing like CPU load during full-screen video (Netflix, iPlayer). Not everyone is a gamer!
project_2501,
I agree.
It would be interesting if osnews could publish/maintain something like this, but who’s going to pay for it? They need an intern or someone to do some free work.
Phoronix does that, and it has a paid subscription where one can request what kinds of benchmarks should be run.