This course walks through the creation of a 64-bit system based on the Linux kernel. Our goal is to produce a small, sleek system well-suited for hosting containers or being employed as a virtual machine.
Because we don’t need every piece of functionality under the sun, we’re not going to include all the software you might find in a typical distro. This distribution is intended to be minimal.
Building my own Linux installation from scratch has always been one of those things I’ve wanted to do, but never got around to. Is this still something many people do? If so, why?
People do it for many reasons. First and foremost is to educate ourselves on how the system works and how all the pieces fit together. You can use something like Linux From Scratch, of course, but I think something like Gentoo Linux is a really good tool. You get nice things like good documentation and package management without the boring bits, like downloading and maintaining stuff manually. And you can dig as deep as you want, for learning purposes, or you can skip most of it, if you’re a seasoned user.
Or Arch Linux 🙂
Arch is a great step between something like Ubuntu and LFS for sure. It’s a great distro in its own right too.
But then you wouldn’t get all the compilation errors. Where’s the fun in that? 🙂
On a more serious note, I think Gentoo and Arch may overlap somewhat, but they have different purposes. I use both. And I also use Gentoo’s Portage on my Arch, for all sorts of meta goodness.
Well, having used both Arch and a few Gentoo based distros, the main difference between the two is that Arch still has a core of binary packages you can just install without having to compile them, but at the same time they do have AUR, which is where you get to see all the awesome compile errors!
Arch is interesting. I tried it for a while and made a howto document for my own use on Google Docs, but in the end it was just too time-consuming. My main desktop and work laptop run Ubuntu MATE (with VirtualBox for the Windows stuff).
Yeah, but LFS forces you to learn (or at least, read), which is what you want to do by building your own Linux anyway
richarson,
I never tried LFS so I don’t know what paths they take with their project, however my first linux distro was based on the GNU toolset. It eventually worked, but it was tons of work to get all the pieces and dependencies up and running. Although back then I did it on my own without a howto.
When I did an overhaul of the OS, I gave busybox a try (popular in routers and such), and what a difference it made. One self-contained busybox build got me up and running in a fraction of the time. It’s the difference between a single install versus hundreds/thousands of files/dependencies strewn across the FS.
Of course the system will be minimal, but you’ll be up and running with far less effort. I would highly recommend going this route for a first attempt, and only then consider moving to a GNU base when you have more experience. In my case busybox was fine and I decided not to go back to the GNU tools except in the cases where busybox was lacking. Pros=very small footprint, Cons=fewer features.
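For anyone curious, the busybox route looks roughly like this (the version number and install prefix here are just illustrative, not a recipe):

```sh
# Rough sketch of the busybox approach (enable CONFIG_STATIC in menuconfig
# for a fully static binary; version and paths are illustrative).
wget https://busybox.net/downloads/busybox-1.36.1.tar.bz2
tar xjf busybox-1.36.1.tar.bz2 && cd busybox-1.36.1
make defconfig                            # or menuconfig to trim applets
make -j"$(nproc)"
make CONFIG_PREFIX=/mnt/rootfs install    # populates bin/, sbin/, usr/ with symlinks
ls -l /mnt/rootfs/bin/busybox             # one binary provides sh, init, mount, ...
```

Add a kernel, a few device nodes and an inittab on top of that and you have something bootable.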
watching shit scroll by for hours doesn’t make you an expert.
https://fun.irq.dk/funroll-loops.org/
You don’t really learn anything with Gentoo that you wouldn’t learn with Arch or any other bare bones distro that sticks as close to vanilla upstream as possible.
I disagree in part, USE flags and the Portage tree make all the difference. They’re still my #1 favourite feature of Gentoo and they can’t really be replicated on a binary distribution.
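As a small illustration of what I mean (the package and flag names here are only an example, check what the package actually supports):

```sh
# Illustrative only: rebuild one package with a trimmed feature set.
echo "app-editors/vim -X -perl minimal" >> /etc/portage/package.use/vim
emerge --ask --newuse app-editors/vim
```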
Please don’t turn this into a Gentoo vs Arch fight, they’re just different.
Yes, but there’s so much more complexity now, security-wise, compared to back in the day (the 90s) when firewalls were, like, optional, so it’s not as appealing.
Anyone seeking to understand or customize their installation will surely want to do this.
People in computer science/engineering will surely also enjoy this.
Otherwise, there’s really not much use if you have a beefy machine.
A small side note, something like this is how I feel every time I install FBSD on a machine, since it asks you quite a lot of stuff and requires you to know what you are doing before installing it.
One of the biggest knowledge gaps I’ve seen with CS students when they graduate is a lack of understanding of the mechanics of how the OS, libraries, linking, static vs shared libs, Makefiles, etc. all work. Basically all of the tooling around writing software that just isn’t covered in the teaching of CS algorithms. Having a basic handle on these things is critical to being a productive software engineer. Sure, some students take a natural curiosity to it and learn on their own, but most don’t and end up graduating with a CS degree but not being able to actually write any software.
I’ve often thought that a good way to address this is to have students in the intro CS courses spend the first 2 weeks building an LFS system, which they will then have to use for the duration of at least CS I and II to do the actual coding assignments.
I was not aware of any other way!
🙂
I’ve built LFS a few times in the last decade or so. I did it mostly to learn how the system actually works. It was actually quite interesting to see how the different components work with each other.
That being said, you end up spending 2-3 days building a base system, only to find you do not have any useful applications. Then you spend another day building Xorg + a few useful packages, and you end up with OpenBox + links2 with graphics, able to browse maybe 30% of the web.
Last time I did LFS (version 7.1), I had to start over because GCC got screwed up somewhere along the way. Speaking of GCC, it took the better part of a day, by itself, to build and test.
I tried Gentoo back in 2006, but my system was slow (even by the standard back then) and I gave up after 2 days.
I question how much people actually learn aside from `in order for this to run, I need A, and in order for A to run I need B`, etc… I know people who have stumbled through successfully compiling things on their own, but they’re far from being package maintainers. I know people who managed to compile customized kernels but are far from understanding what most of the features are, how they affect performance, or what they’re useful for. Following how-tos is one thing, but being competent as a maintainer, debugger, etc. is quite another.
For most people I’d say if you want a minimal system, take something like a Debian base install with all options/add-ons disabled, then install software as needed until you have something that gives you what you need. And last, purge anything that’s not required or useful.
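Something along these lines, roughly (the package names are only examples):

```sh
# Sketch of the minimal-base, then add, then purge workflow (examples only).
debootstrap --variant=minbase stable /mnt/debian http://deb.debian.org/debian
chroot /mnt/debian apt-get update
chroot /mnt/debian apt-get install --no-install-recommends openssh-server
# later, drop whatever crept in but isn't actually needed
chroot /mnt/debian apt-get purge --autoremove nano tasksel
```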
ilovebeer,
I’m a debian fan as well, however when I do an update on a production machine, it really drives up my anxiety since I’ve been burnt by updates in the past and many of the servers aren’t easily accessible (this was actually one of my motivations for making my own linux distro).
Well, that is the complicated solution.
A much simpler solution: create a VM, rsync or even dd the system over, and do an upgrade test. That takes away most of the surprises, except maybe the kernel upgrade and the combination with hardware (ideally you have the same hardware somewhere so you can try the upgrade on similar hardware).
Something else you can do is copy the upgraded system back to the server, if you know what you are doing (depending on the data).
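To sketch the test-in-a-VM idea from above (disk and image names are assumptions):

```sh
# Clone the disk and rehearse the upgrade in a throwaway VM first.
dd if=/dev/sda of=/srv/test/server-clone.img bs=4M status=progress
qemu-system-x86_64 -m 2048 -drive file=/srv/test/server-clone.img,format=raw
# inside the VM: apt-get update && apt-get dist-upgrade, then poke at services
```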
Lennie,
Well sure, VMs give you better options, but with physical server failures it becomes a bigger hassle to fix a broken system, especially if clients don’t have a spare ready to go in case of failure. It just makes upgrades that much riskier.
I actually keep very good full system backups, but when rsync takes a good 8 hours to do a differential backup, you know it’s going to be a bad day if you need to do a full restore.
Edit:
Actually, I wanted to create a fuse file system for this exact scenario. It would intercept file requests, forwarding them to the backup site in real time so that the system could continue to function, albeit slowly. It would work very much like a network file system with the additional function of resyncing the files to the local disk. In our case we’d also need it to work with block devices.
In case it wasn’t clear, the VM was only used for testing an upgrade (and probably does not need all the data, just a subset and all the system files).
Sounds like you have a lot of small files; ZFS (on Linux) or BTRFS with send/receive might be a better option for that diff backup.
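Roughly like this with BTRFS (paths and the backup host name are placeholders):

```sh
# Incremental backup with snapshots + send/receive instead of long rsync walks.
btrfs subvolume snapshot -r /data /data/.snap-today
btrfs send -p /data/.snap-yesterday /data/.snap-today | \
    ssh backup-host "btrfs receive /backups/data"
```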
I think Craigslist used to do something like that a long time ago, something like this:
https://github.com/dpavlin/perl-fuse
Backed by HTTP webservers if I remember correctly.
Something like Varnish could be used as a local caching proxy. If your servers already function as webservers, you can do that without FUSE, of course. I’ve also made webserver configurations that simply check: file exists? Nope? Go get it remotely and create that local file.
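The shell-script version of that check is about this simple (the origin URL and docroot are placeholders):

```sh
#!/bin/sh
# Serve the local copy if present, otherwise fetch it from the origin first.
ORIGIN="https://origin.example.com"   # placeholder
DOCROOT="/srv/www"                    # placeholder
path="$1"
if [ ! -f "$DOCROOT/$path" ]; then
    mkdir -p "$(dirname "$DOCROOT/$path")"
    curl -sf "$ORIGIN/$path" -o "$DOCROOT/$path" || exit 1
fi
cat "$DOCROOT/$path"
```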
Lennie,
Yeah, I guess something like that might work if you could find a way to mount the varnish server as a local file system. I suspect it might perform better to build a tool specifically for this purpose however. I’m not sure how well it would work with random access files or writing, haha.
Building your own Linux from scratch is probably worth it either for educational reasons or for low-end hardware (Raspberry Pi or something similar).
However, for the vast majority of users I suspect it’s probably going to suck more time than it’s worth. I actually like using pre-built distros because – apart from saving a lot of time – you know you’re in the same boat as everyone else (running exactly the same kernel and userland binaries).
If it wasn’t for my new Ryzen machine needing a recent kernel, I’d still be on CentOS 7 rather than Fedora 26 – getting long-term support of up to 10 years is another bonus of some pre-built distros. I’m hoping CentOS 7.4 will work out of the box on Ryzen (yes, I know about the ELrepo kernels, but I want to use the “official” kernel) – if it does, I’m straight back to CentOS again…
When I started my linux distro years ago I had a few goals:
1. Reliability.
The OS used a static squashfs image where modifications were written to a union file system. This meant the system would always revert to a clean slate after boot. Configuration files could be persisted, but a boot menu option offered a 100% clean boot.
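The mount logic was conceptually something like this (overlayfs shown as the modern stand-in for whatever union FS it actually was; paths are illustrative):

```sh
# Read-only base image + throwaway RAM overlay = clean slate on every boot.
mount -o loop,ro /boot/system.squashfs /run/ro
mount -t tmpfs tmpfs /run/rw
mkdir -p /run/rw/upper /run/rw/work
mount -t overlay overlay \
      -o lowerdir=/run/ro,upperdir=/run/rw/upper,workdir=/run/rw/work \
      /mnt/newroot
# persisting config means backing the upper dir with a real partition instead
```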
2. Remote provisioning
Many linux distros are difficult to install remotely without a local console (for partitioning, installing packages, creating user accounts, etc.). My distro automated all of that.
3. Init Daemon
Many of the competing init systems at the time suffered from various problems:
– Too complex
– Inadequate monitoring/restarting of processes.
– Too many processes involved.
– Filesystem dependencies.
– Inability to create ad hoc jobs.
I wanted a usable init system for startup jobs prior to mounting the final file system; however, this can pose problems for init systems that assume everything is mounted in place BEFORE switching to the init system. For example, many distros typically handle mount failures by dropping to a local console, but with my init system I can actually spawn an sshd during boot so that I can fix filesystem corruption remotely if I need to!
Beyond this, I wanted to be able to add/remove ad hoc jobs dynamically without having to touch files for them.
For example
rund @test command=”mydaemon …” auto-restart=30 uid=1000 start
This would spawn “mydaemon” as uid 1000, and restart it after 30s if it shuts down. The process would be invoked from the init system (pid=1) rather than the current user/shell, so even if I logged off the daemon would still be running.
I still use my own init system with no regrets.
4. File system hierarchy
I’ve never been a big fan of the FHS layout in linux and I much prefer the GoboLinux layout! So I set out to use a similar layout in my distro. However, this turned out to be extremely painful because so many of the programs I needed had FHS paths hardcoded. Ultimately it became too much effort and I gave up on it. In hindsight, even though I still feel my layout was simpler/better, it was just too much work to keep all the 3rd-party code working with it.
This OS is on routers, NAS boxes, servers. I’ve added lots of packages and daemons over time, but it’s used mostly on in-house and colocated stuff. I’d like to bring it to the point where others could use it too, but there’s so much competition with linux distros that it never seemed worthwhile – I don’t have the resources of ubuntu or redhat.
My approach to this was to not bother trying to fight a losing battle to begin with. I just created my own hierarchy using symlinks and moved on.
The most compelling argument to NOT build your own Linux is security and maintenance.
Sure, it can be a good learning experience to build it, and the success may bring you a sense of fulfillment. But if you want to run the thing continuously, then you have to maintain the system continuously; a bug or vulnerability can be discovered at any time and in any component, from kernel to libraries to apps. Maintenance would become a full-time job.
nicubunu,
Not to invalidate your point, but playing devil’s advocate here…one could make the case that monocultures are bad for security and you should try to not be like everyone else. Non-standard configurations can make your system more opaque and invalidate an intruder’s assumptions about the installed operating system components.
A standard linux install can be extremely leaky out of the box (like daemons having read access to each other’s or users’ files, and system-wide read access to /bin/, /etc/, /proc/, /lib/, etc.). You are more likely to be aware of these shortcomings when you build your own distro.
Bugs can even be introduced by distros themselves that don’t exist upstream:
https://freedom-to-tinker.com/2013/09/20/software-transparency-debia…
Of course rolling one’s own distro is a big commitment. It can be difficult for small teams to stay on top of things. It’s not for average people.
Not necessarily…
Building your own distro will give you deep insight into how everything fits together – and more importantly – how security fits in.
Now, I wouldn’t recommend doing it for the long term unless you’re willing to commit the resources to maintaining it – both keeping packages up-to-date and the security aspects. But as an educational exercise it’s a great experience.
Really great article, I loved it; well worth posting these kinds of things. Thanks!
I had a go at this a few years back because I was so tired of the standard filesystem hierarchy of Linux (and Unix in general) that I decided to see what it would mean to ‘fix’ it. I wanted something natural and intuitive, with a root directory for ‘system’, ‘data’, ‘configuration’ and so forth…
It was actually more feasible than I thought. I had to change directory names and paths like ‘etc’ and ‘usr’ in the source code and compile and build the whole thing. Of course, in the end it was just a – very instructive – exercise.
I got some familiarity with the source code of the kernel and user land and I have to say the fact that things like ‘etc’ are simply hard coded in the kernel hundreds of times (when it would be so easy to make it configurable) seriously dented my opinion of Linux and its makers, actually of the whole Unix community in general.
I think the most interesting thing about this is that it’s not all that hard and complex. It’s really the management of all of this that makes it hard.
For this reason alone I think everyone should do that at least once. Either with this guide, LFS or Gentoo (if it still works the same way).
It will allow you to see how things are connected, which makes it possible to at least get an idea of where a problem you experience might be coming from. Having a bit of experience also helps in developing a gut feeling, in the sense of knowing in which order to do troubleshooting.
And again it’s not that hard. I am not super smart, and I did it (the first time) when I was around 11 or 12, barely able to understand English words when I’d read them.
Since systems are more complex now I think it would be harder, but skimming over all of this I am surprised how little changed in 15+ years.
And don’t fool yourself into thinking that it’s all different for you cause your OS has some colorful configuration tool. They still do the very same thing.
@Thom: Really, go for it. It doesn’t need huge amounts of time, nor is it all that hard.
An exercise in wasting energy and time. Also, the distro you’ll build will be unmaintainable – i.e. there’s no easy way to upgrade its components.
Back in the early days of Linux, compiling your own kernel was not abnormal, which partly reflected the desire to minimise kernel size (given a few MB of RAM) and partly the ambient techie credentials around.
The thing that dragged me into it was configuring network cards, etc. at non-standard locations, which of course became redundant when PCI arrived. Dynamically loaded device drivers removed another reason. My other recollection was how well organised it all was – it still took a while to build on a 386.
These days I would probably drift to a BSD if I had the need; Linux now carries a whiff of “code once, test EVERYWHERE” about it.
That takes me back to when I got started with Slackware. That distro has always been “lean and mean” but the documentation encouraged you to compile your own optimized kernel for even better performance, and to tailor it to your specific hardware.
These days it’s less of an issue; you can keep the “huge” kernel from the installer or you can switch to the “generic” kernel shipped on the install media, and tweak it to your system, but modern PCs won’t feel any different doing so.
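For anyone who never did it, the tailoring ritual was essentially this (paths are the usual ones, adjust for your setup):

```sh
# The classic custom-kernel build, roughly as the old docs walked you through it.
cd /usr/src/linux
make menuconfig                     # prune drivers you don't have
make -j"$(nproc)" bzImage modules
make modules_install
cp arch/x86/boot/bzImage /boot/vmlinuz-custom
# then point LILO/GRUB at the new image and reboot
```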
I don’t know of anyone who does it by hand anymore, but there are numerous tools to handle most of the work for you. Buildroot (https://buildroot.org/) is a great example of this, and Gentoo (which has been mentioned in other comments) comes close and is used by a very large number of people. The primary arguments I see from people who use these are:
1. You get exactly what you want, no more, no less. This is actually the big reason I stopped using distros with pre-built packages (I went from Ubuntu to Debian, and then eventually switched to Gentoo): I was building enough of the packages locally myself that there was no point in not just building the whole system locally. The pre-built Vim packages on FreeBSD are ironically one of the best examples I have of this: they have one with almost everything turned off at build time, and one with almost everything turned on that requires installing most of a desktop environment, so people who only want some of the functionality, or want everything but the GUI (which is a common case), have to build it themselves.
2. Your system design isn’t constrained by the upstream ideas of how the system should look. This in particular means you can do pretty much anything with the storage stack provided you can boot it, and aren’t limited by things like Fedora’s refusal to support anything but ext4 for the boot partition, or many distros’ insistence that you have a swap partition.
3. Because you put it together, you have a much better chance of being able to put it back together yourself if it breaks. This is also a big one for me because I do kernel testing sometimes. With my current Gentoo system, I can rebuild the entire OS without even needing install media if I have to, simply by virtue of knowing exactly how it’s all put together (and a bit of creative work with the boot options).
Overall though, it really isn’t as hard as so many people seem to think to build a Linux system from scratch. The only tricky parts are ordering the builds right and having a powerful enough system to get it done in a reasonable amount of time (GCC is particularly bad: even on my Ryzen 7 1700 with DDR4-2400 RAM, a GCC build containing just C, C++, Fortran, and Objective-C takes almost an hour to finish with the system otherwise completely idle).
I’m not sure these statements are correct.
Fedora should let you change /boot to at least XFS, and that may be what it defaults to at the moment.
Alpine Linux doesn’t really give you any options on disk partitioning. That’s the one example I’ve run into of a distro insisting on a swap partition. Using Fedora as an example again, Anaconda will complain about not having a swap partition, but it will let you continue with the install.
Hmm, I actually hadn’t realized Fedora supported XFS for /boot. The big problem there for me is that it doesn’t support BTRFS (despite GRUB2, which they use pretty much exclusively, supporting it just fine) or non-hardware RAID (which upstream GRUB 2 also supports just fine).
As far as swap partitions go, Alpine is indeed the only distro I know of where the installer enforces the creation of a swap partition (although it’s pretty easy to get rid of after the fact by booting something like System Rescue CD), but I was more referring to the fact that most mainstream distros will bug you during the install if you don’t have a swap partition. I guess I’m just not too fond of software questioning my judgement (I dislike the ‘rm’ alias for root that Fedora has by default for the same reason).
There is a bug in grubby that prevents Fedora from using /boot on BTRFS volumes. (https://bugzilla.redhat.com/show_bug.cgi?id=864198) Boot entries aren’t created, and grub2-mkconfig needs to be run manually after kernel updates. (https://ask.fedoraproject.org/en/question/72423/can-i-format-boot-as…)
md RAID is supported for /boot in Fedora. I just tried it in a VM with Fedora Server 26 (a UEFI VM, even), and my home server running C7 has /boot on an md mirror.
I’d like a little bit more control of disk partitioning in Alpine. Having to go back and fix the disk partitioning isn’t what I consider user friendly.
I understand aliasing ‘rm’ to ‘rm -i’ for root. That’s a small safety measure.
Yeah, but it’s not all that safe when it gets you in the habit of using `yes | rm` all the time.
As an experienced admin who builds servers, I find it very user friendly. Its stark minimalism is a nice change of pace from distros which, even in their minimalist forms, are still pretty heavy.
How does it drastically deviate from standard package management expectations? Are you talking about ‘data’ mode where only data lives on disk and the OS is run out of RAM?
‘sys’ mode is normal server mode. ‘data’ mode is kind of weird, but it makes sense given its origins as an embedded OS for firewalls and such.
I built my own LFS back in 2012, back when I was just starting with Linux. It was pretty fun, and I actually used my LFS system as my main computer for a while, doing all my daily computer stuff on it.
I’m not sure if many people do it, but it’s something to do if you want a deep dive into the details about how distros are built and operating systems provisioned. The knowledge about compiling code comes in handy if you get really deep into Linux administration too.
Why? It’s fun and for the sense of accomplishment. It’s like running a marathon for some people; that’s what they enjoy doing.
Keep in mind, an LFS build isn’t maintainable in the long run without an automated build system. I’ve found Gentoo and Funtoo to be more maintainable alternatives which are just as fun.
LFS helped me to level up my linux skillz.
But I have built an entire FreeBSD system from ports, after building my own custom kernel. I assume it is a similar process.
In my experience LFS is a bit more involved; it truly is from scratch. You start with an existing Linux host system that you build from, then chroot into the LFS partition once you’re far enough along, and continue from there. When you’re done with the first book, you’ve got a bare bones Linux build you can boot into, and from there you can begin to build up your system natively and ditch the host OS.
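The chroot step looks roughly like this (the mount point and environment are from memory; the book spells out the exact incantation):

```sh
# Bind the host's device/proc/sys trees into the LFS partition, then enter it.
mount --bind /dev       /mnt/lfs/dev
mount -t proc  proc     /mnt/lfs/proc
mount -t sysfs sysfs    /mnt/lfs/sys
chroot /mnt/lfs /usr/bin/env -i HOME=/root TERM="$TERM" \
       PATH=/usr/bin:/usr/sbin /bin/bash --login
```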
I’ve done it twice; the first time was in the mid 2000s and I got stuck with some incompatible libraries before I managed to make it bootable. I tried it again a few years later and achieved a self-booting, standalone Linux installation, but I was so fatigued at that point that I never bothered with X or any other modern desktop stuff.
Morgan,
Alas, it doesn’t seem likely that will be changing any time soon.
Actually, it’s quite different.
Linux is the kernel, and the rest of the userland is downloaded “from the internet”, from here and there, hither and yon. It’s one reason many folks struggle to get LFS to work, despite its detailed instructions.
BSD is not like that, since it’s a combined kernel and userland, with a single source-of-truth code base. It’s much more routine, and reliable, to “build the whole OS” on BSD compared to Linux.
whartung,
I often find linux development less organized and more chaotic. It’s my opinion that the BSDs are better engineered, but alas many people, including myself, still go with linux simply because it’s more popular and better supported.
If someone is primarily interested in creating a minimal Linux distribution and learning how to cross compile the first stage, I recommend going the musl+busybox way. For this purpose I maintain a set of scripts that finally lead to a bootable CD image for x86 or a chrootable root filesystem for ARM: https://github.com/mschlenker/TinyCrossLinux
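For the cross-toolchain itself, something like musl-cross-make is a common shortcut (a separate project, mentioned here only as one possible starting point, not part of TinyCrossLinux; the target triple and prefix are examples):

```sh
# Build an x86_64 musl cross-toolchain.
git clone https://github.com/richfelker/musl-cross-make
cd musl-cross-make
echo 'TARGET = x86_64-linux-musl' > config.mak
make -j"$(nproc)" && make install OUTPUT=/opt/cross
/opt/cross/bin/x86_64-linux-musl-gcc --version
```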
Since I ran into several issues with live CD boot scripts in 2008, I started maintaining my own LFS-based distribution with about 800 packages. With this low number of packages, the risk of breaking something when changing a few packages is much smaller than with distributions that contain 10,000+ packages.
In many cases package management is not needed, the whole system can be built as a single image. For updates one can go the ChromeOS way of keeping a spare boot partition where one can roll out the updated image. Also take a look at OStree for system updates.
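The A/B flavour of that can be sketched like this (device names and the GRUB entry title are assumptions, and GRUB needs to be configured to honour the saved default):

```sh
# Write the new image to the inactive slot, trial-boot it, then make it default.
dd if=system-new.img of=/dev/sda3 bs=4M conv=fsync
grub-reboot "Linux (slot B)"        # boot the new slot once as a trial
grub-set-default "Linux (slot B)"   # commit after it proves itself
```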
So yes, maintaining an LFS-based distribution can make sense, especially for special needs where stripping down an existing system might be more work than building a small system from scratch.
If you are interested in the LessLinux scripts, take a look at
http://blog.lesslinux.org/ and https://github.com/mschlenker/lesslinux-builder. There is not much activity on the blog currently, but I am doing builds as a base for commercial distributions (mostly live systems for rescue and virus-scan purposes) on a regular basis, so I can always point to the latest DVD images and help with questions.
T2 SDE, which is the successor of ROCK Linux (founded around 1998), is still around for automated builds, including cross-compiling embedded systems: https://t2sde.org
Oh Rene, nice to meet you here. Of course, ROCK Linux is an interesting approach, especially where cross-compiling makes sense (many ARM and MIPS targets).
A more generally used approach with lots of traction within the industry is Yocto; this is used, for example, by Automotive Grade Linux.
It was fun to do when learning the ins and outs of Linux. Ten years on, and I still find something new to learn about this.
I hope this won’t die as a trend, since the “PITA” with a Linux box these days is creating a pure minimalist bootable Linux environment and then running everything in micro-containers as you need them.
Keeping all “standard” distribution packages up to date with security updates is just time-consuming and usually no longer worth the time and money invested to maintain it.