As Michel Lind mentioned back in August, we wanted to form a Special Interest Group to further the development and adoption of Btrfs in Fedora. As of yesterday, the SIG is now formed.
↫ Neal Gompa
Since I’ve been using Fedora on all my machines for a while now, I’ve also been using Btrfs as my one and only file system for just as long, without ever experiencing any issues. In fact, I recently ordered four used 4TB enterprise hard drives (used, yes, but zero SMART issues) to set up a storage pool to which I can download my favourite YouTube playlists, so I don’t have to rely on internet connectivity and YouTube not being shit. I combined the four drives into a single 16TB Btrfs volume, and it’s working flawlessly.
Of course, not having any redundancy is a terrible idea, but I didn’t care much since it’s just downloaded YouTube videos. However, it’s all working so flawlessly, and the four drives were so cheap, I’m going to order another four drives and turn the whole thing into a 16TB Btrfs volume using one of the Btrfs RAID profiles for proper redundancy, even if it “costs” me half of the 32TB of total storage. This way, I can also use it as an additional backup for more sensitive data, which is never a bad thing.
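For anyone curious what that setup and later conversion look like in practice, here is a rough sketch. The device names and mount point are hypothetical, and everything requires root and btrfs-progs; adapt to your own system:

```shell
# Create one volume spanning four disks with no data redundancy
# ("single" data profile, mirrored metadata):
mkfs.btrfs -L media -d single -m raid1 /dev/sdb /dev/sdc /dev/sdd /dev/sde
mount /dev/sdb /mnt/media

# Later, after adding four more disks, convert to a redundant
# profile in place while the volume stays mounted:
btrfs device add /dev/sdf /dev/sdg /dev/sdh /dev/sdi /mnt/media
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/media
```

The balance runs online, so the volume remains usable while data is rewritten into the new profile.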
The one big downside here is that all of this has to be set up and configured using the command line. While that makes sense in a server environment and I had no issues doing so, I think a product that calls itself Fedora Workstation (or, in my case, Fedora KDE, but the point stands) should have proper graphical tools for managing the file system it uses. Fedora should come with a graphical utility to set up, manage, and maintain Btrfs volumes, so you don’t have to memorise a bunch of arcane commands. I know a lot of people get very upset when you even suggest something like this, but that’s just elitist nonsense. Btrfs has various incredibly useful features that should be exposed to users of all kinds, not just sysadmins and weird nerds – and graphical tools are a great way to do this.
I don’t know exactly what the long-term plans of the new Btrfs SIG are going to be, but I think making the useful features of Btrfs more accessible should definitely be on the list. You shouldn’t need to be a CLI expert to set up resilient, redundant local storage on your machine, especially now that the interest in digital self-sufficiency is increasing.
I mean, openSUSE has had tools for this in YaST for a very long time, so it’s about time the Fedora guys do the same.
Well, I said it not long ago, but… I’d love for BTRFS to replace mdraid, but it needs to become production ready first!!!
There have been longstanding problems requiring manual intervention under raid failures (I tested a few weeks back) and this single problem has been a major holdup for adoption. Because of this I had little choice but to provision my brother’s computer with a more traditional mdraid. BTRFS is not up to the task. And I can’t justify BTRFS on servers because of the implied downtime and manual intervention at the local console needed to get running again in situations where “normal” raid systems can keep the system operational.
This needs to be a top priority in order for BTRFS to be considered a good replacement for older raid systems. BTRFS has a really nice feature set, and I’d like to replace mdraid/lvm… but it’s unsuitable for most production systems where redundancy is used to maximize uptime. It actually maximizes downtime: a failure on any device will leave the system unbootable without manual intervention. Devs have been dragging their feet on this, but IMHO it’s a disastrous position for BTRFS to take, and we end up with ZFS being recommended instead because it has proper support for raid without downtime and manual intervention.
Thom Holwerda,
Are you using this BTRFS RAID for root or just additional storage? If it’s just additional storage then at least it’s not imperative to boot. But know that if any disk in the raid goes offline the entire volume goes offline and you’ll need to intervene to bring it up and do so correctly to minimize the risk of data loss.
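As a rough illustration of the intervention described above, with hypothetical device names and mount point, recovery after a member disk fails looks something like this (root and btrfs-progs required):

```shell
# The volume refuses a normal mount with a member missing; mount degraded:
mount -o degraded /dev/sdb /mnt/data

# Find the devid of the missing disk, then rebuild onto a replacement.
# The devid "3" and /dev/sdf here are placeholders:
btrfs filesystem show /mnt/data
btrfs replace start 3 /dev/sdf /mnt/data
btrfs replace status /mnt/data
```

Until the replace finishes, the volume is running without redundancy, which is part of why commenters above want this handled more automatically.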
I’ve had more unrecoverable failures on BTRFS than on any other FS. I use ZFS and haven’t lost a bit in nearly 2 decades of continual use. I’d love it if BTRFS got some attention. I might even dare to try it out again (it’s been a year)… well, maybe… trust is hard earned and easily lost.
decuser,
ZFS has earned its reputation for reliability, and that’s incredibly important. Perhaps it’s bias on my part, but I’ve had bad experiences with non-mainline drivers. Normal users who just download and install kernels/modules from their distro don’t experience this because their distros handle it… but those of us who build our own kernels often face the consequences of out-of-tree modules in a kernel with an unstable ABI. The maintenance burden can be unpleasant.
The other reason I would prefer BTRFS is its flexibility in dynamically balancing volumes across disks. In ZFS, operations like shrinking a volume or changing raid settings require rebuilding the full array, whereas BTRFS handles it really nicely, and different sized disks are no problem at all.
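That flexibility boils down to a couple of commands. A sketch with hypothetical device names and mount point (root required):

```shell
# Grow the volume with a disk of any size, then spread existing
# data across all members:
btrfs device add /dev/sdx /mnt/data
btrfs balance start /mnt/data

# Shrinking works online too: data is migrated off the disk
# before it is removed from the volume.
btrfs device remove /dev/sdy /mnt/data
```

Both operations run while the filesystem stays mounted, which is the contrast being drawn with rebuilding a ZFS array.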
I don’t know if BTRFS raid will ever be production ready though. In addition to the points I made earlier, they officially warn against using BTRFS raid 6, which is what I use on my backup/archive boxes. Maybe I just have to get over my objection to non-mainline modules and just go with ZFS as a well tested & mature solution. Even though I like some aspects of BTRFS better, it’s just not ready.
Would be interesting to understand your use-case better.
I have been using BTRFS in large “consumer” scenarios (not RAID, since all the relevant stuff is distributed via Git anyway) and it has always just worked, including many hard resets and power losses.
I would try ZFS, but not as long as it’s not in the mainline kernel. Not enough time, interest, or pressure for chasing dependencies.
I’ve been running ZFS on FreeBSD for more than a decade now, slowly expanding from RAIDZ1 with 4x500GB and now I have two pools: one RAIDZ2 with 8x 4TB disks and a second one with 4x 16TB disks, on SAS enclosures. Backup to Amazon Glacier (I am building a server to be at my friend’s place 100km from here, but my budget is low now). The two local pools mirror each other and I have a 5TB dataset in HAST for nextcloud, so I can update the servers without downtime.
When I got my first SAS enclosure, the price was cheap for them coming with 8x 2TB disks, and the seller was very straightforward in telling me that the disks had an average of 45,000 hours on them.
No big deal. I set them up as RAIDZ2 and made a point of replacing one disk per month with a 4TB disk. When all the disks were replaced, the larger capacity was unlocked. I monitor the hours and preemptively replace disks now. My graveyard has 16 disks.
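The rolling replacement routine described above maps onto a handful of zpool commands. Pool and device names here are hypothetical; adapt to your system:

```shell
# Swap one member at a time and let it resilver before touching the next:
zpool replace tank da3 da11
zpool status tank            # wait until resilvering completes

# After the last disk is swapped, let the vdev grow into the new capacity:
zpool set autoexpand=on tank
zpool online -e tank da11
```

With RAIDZ2 the pool survives the temporary loss of redundancy during each resilver, which is what makes the one-disk-per-month approach safe.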
If you have a plan, everything is fine. Just have a tested and proven offsite backup and restore procedure. Now I have only new disks (for the first time since the 4x 500GB disks). If your backup strategy is sound (and can afford the downtime) you can save quite a good amount of money with used equipment. I host all my data myself and share a ton.
However, if I were to start today, I’d probably consider giving BTRFS a shot, just because not being able to use different disk sizes is very wasteful on a home setup. BTRFS is not as mature as ZFS, but I can afford the downtime.
In all this time, I’ve migrated my data around to rebuild (zfs send | receive) a few times, had 2 disks die on me and never lost data. My initial motivation to go to ZFS was losing some old photos due to bit rot. =(
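For anyone unfamiliar with the zfs send | receive migration mentioned here, it is roughly the following; dataset and pool names are hypothetical:

```shell
# Replicate a dataset (with its children) to another pool:
zfs snapshot -r tank/photos@migrate
zfs send -R tank/photos@migrate | zfs receive -F backup/photos

# Later, send only the changes since the last snapshot:
zfs snapshot -r tank/photos@migrate2
zfs send -R -i @migrate tank/photos@migrate2 | zfs receive backup/photos
```

Piped over ssh, the same two commands are the usual way to replicate to an offsite box like the one being planned above.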
And Thom, since you are on your journey to get rid of big tech in your life, I guess self-hosting your stuff is a strong step!
Shiunbird,
I concur, this is a huge selling point for btrfs over ZFS (and most other raid solutions). Adding drives of arbitrary sizes whenever you need to, without having to build a whole new array, is a huge win for flexibility.
My problem is that I specifically need raid to protect against emergency downtime, but if that isn’t important to you then go for it! Having more people kick the tires on BTRFS helps. But do be aware there are known data integrity deficiencies with raid 5 + 6. This is a bit unfortunate because RAID6 is a useful raid level, but it’s still not ready for production.
https://btrfs.readthedocs.io/en/latest/btrfs-man5.html#raid56-status-and-recommended-practices
Yes, many of us feel this way… but for other people it’s a big ask. They either don’t see the point in avoiding the corporate overlords, or they aren’t willing to do so if it inconveniences them.
@Thom, maybe it would be an interesting topic to cover self-hosting for different scenarios: the good, the bad, the ugly.
Scenarios for private employees, entrepreneurs, SMEs, shop owners:
– E-mail
– Website
– Calendar
– Blog/VLog
I am always blown away by how much trust business owners put in Google and Microsoft with their entire business. At the same time, running your own e-mail server is really not feasible for many reasons, and even setting up a WebCal server or a meeting solution can involve a lot of work.
(I belong to the “do everything yourself, on a rented Linux server instance” camp, except for the e-mail server and website.)
I’ve been hosting my email successfully for 2-3 years at home. Both carriers give me a fixed IPv4 address with reverse DNS for an extra ~5 EUR per month. One is DSL, the other an antenna on the roof.
I’ve been running mailcow and slowly migrating to a multimaster setup on freebsd based on Michael Lucas’ book.
Snort and aggressive checks against Spamhaus help me keep my IP reputation, plus throttling. Mailcow makes it a breeze and I can fail over to a VM in the cLoUD in 15 minutes.
Give it a shot!
Shiunbird,
In the US many residential customers aren’t allowed to run email, but I do run email in a datacenter. It’s certainly possible to self host but I wouldn’t say it always goes smoothly. Unfortunately there can be a lot that’s out of your control. Your subnet neighbors are a big one. Consider that if your neighbors are doing things they shouldn’t your “IP reputation” can be affected too. Some real time blacklists justify this on the basis that ISPs don’t feel pressure to stop spammers if legitimate users aren’t affected.
Most of the time I have no problems with email (i.e. set and forget), but occasionally it does become a nuisance when the spam problem falls on your own shoulders (false positives and false negatives). I prefer when other providers bounce emails, because it gives a clear indication of delivery failures. However, I’ve noticed that some providers, most notably gmail, don’t bounce emails even when they fail to deliver them, which makes it impossible to know if an email was delivered. Logs clearly show that google accepted the email, but they can and do drop emails per whatever heuristics they’re using, which is extremely frustrating when it involves business emails.
With self-hosted email you are much more likely to end up in “spam” folders, not because you’ve done anything wrong mind you, but because there’s safety in numbers. Consider that if a blacklist/reputation provider were to block gmail, odds are nearly 100% that hundreds or thousands of customers will complain, but if some small self hosted server gets blocked, those odds might drop down to near 0%. It’s a shame because it’s not fair, but that’s the way it is. I think most self hosting users have a similar experience: things usually work except when they don’t.
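On that note, a quick way to sanity-check the DNS records receivers use to judge a self-hosted server; the domain and IP below are documentation placeholders, substitute your own:

```shell
# SPF record, e.g. "v=spf1 ip4:203.0.113.5 -all":
dig +short TXT example.org
# DKIM public key (the "mail" selector is an assumption; use yours):
dig +short TXT mail._domainkey.example.org
# DMARC policy:
dig +short TXT _dmarc.example.org
# Reverse DNS should match the name your server uses in HELO:
dig -x 203.0.113.5
```

None of this guarantees the inbox, for the reasons given above, but missing or mismatched records are the first thing reputation systems hold against a small server.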
> You shouldn’t need to be a CLI expert to set up resilient, redundant local storage on your machine, especially now that the interest in digital self-sufficiency is increasing.
UIs are expensive to develop and maintain! Also, in case of failure, when it matters most, you often won’t have a UI but just a plain terminal available.
I myself use ChatGPT for CLI programs very successfully. It has become very useful for things where I know the solution but forgot the syntax and don’t care to read up. Tell it your objective and the actual parameters, and it will spell out copy-and-paste commands (with an explanation of why and what).
I actually prefer that over a UI these days because of the copy and paste.
GNOME has GNOME Disks (gnome-disk-utility) to manage disks. It probably needs to be extended for more btrfs features. I’ve created btrfs filesystems, but they’ve only been on one disk.
This goes for the CLI too. Some of the useful features are hard to use with the CLI.
Btrfs has a laundry list of items which need to be improved, to be honest. It checks a lot of boxes, but the features come with caveats. (I do use it quite a bit, knowing its limitations and being okay with them.)