One of the most important tasks of the distribution packager is to ensure that the software shipped to our users is free of security vulnerabilities. While finding and fixing the vulnerable code is usually considered upstream’s responsibility, the packager needs to ensure that all these fixes reach the end users ASAP. With the aid of central package management and dynamic linking, the Linux distributions have pretty much perfected the deployment of security fixes. Ideally, fixing a vulnerable dependency is as simple as patching a single shared library via the distribution’s automated update system.
Of course, this works only if the package in question is actually following good security practices. Over the years, many Linux distributions (at the very least, Debian, Fedora and Gentoo) have been fighting these bad practices with some success. However, today the times have changed. Today, for every 10 packages fixed, a completely new ecosystem emerges with the bad security practices at its central point. Go, Rust and to some extent Python are just a few examples of programming languages that have integrated the bad security practices into the very fabric of their existence, and recreated the same old problems in entirely new ways.
This post explains the issue packagers run into very well – and it sure does look like these newer platforms are not very good citizens. I know this isn’t related, but this gives me the same feelings and reservations as Flatpak, Snap, and similar tools.
As both a developer and a Linux user I see pros and cons in both approaches. However, since the article takes the packager’s side of things, I’m going to point out a few things from the opposite side. Linux package managers are really C/C++ package managers. They were designed to do that well, in an era when no C package managers existed. New languages use their own package managers, which are cross-platform, easier to use, and provide a better workflow, so I think they will prevail and will be the source of truth for the code. When the new languages arrived (starting with Ruby), most distros didn’t change their policies or explore new ways to adapt to this paradigm shift, and now I think it is too late.
Also, they talk about pinning, which they say is bad. I don’t think so. I think a program can’t be secure if it doesn’t have a maintainer who at least updates the pinned versions. Otherwise, you’re just keeping zombie programs around, which may seem secure at first but really aren’t. Pinning is good because it gives us reproducibility. In fact, distros also want reproducibility; it’s just that they want to pin versions in their package manager, not in the upstream program. But developers want reproducibility upstream, because that also works on other distros and OSes!
Although the general argument of “dynamic linking is better because we can upgrade the libs without waiting for upstream” is a valid point, it is not a silver bullet: security updates that need breaking API/ABI changes are a thing, and then you still need to wait for upstream (or carry a patch).
aarroyoc,
I’d say Linux package managers are language-agnostic; you’ll find libraries for C, Perl, PHP, Haskell, Node, Ruby, etc. So I don’t really agree that anything is specific to C/C++.
Obviously a lot of language devs decided to build their own language-specific repos: CPAN, PEAR, cargo, npm, pip, etc. IMHO Linux distro admins don’t necessarily have the manpower and competency to do a good job for every language. I think it makes sense to have decentralized repos, and it can be a good thing for languages to have their own repos under their own control.
However, it’s a bit unfortunate that so many languages keep reinventing the repo tools over and over again. This is unnecessary and creates tons of overlap, and it adds needless confusion and complexity. It’s like having every language develop its own source control for the sake of it; there’s just no reason to do that. Not-invented-here syndrome unfortunately prevents us from working together across languages.
For those of us who use a lot of languages, these incompatible tools can be quite frustrating. Ideally we’d have a consortium with representatives from all the languages to come up with a common tool-set that unifies all these custom repos under one standard. That way we could learn one set of tools without having to relearn them for each new language, which is mostly a waste of time and effort.
He was talking figuratively and meant that the package managers are optimized for distributing precompiled, platform-dependent libraries, and are therefore maybe not suitable or practical for other purposes.
In some ways he is correct. There is actually a Linux distribution that has adopted the paradigm of bundling dependencies with the distributed app: NixOS.
Yea, pretty much. C and C++ are lands of anarchy and chaos. It’s something I love about them, but yeah, distro package managers exist to create order in the wilderness.
Distros never accepted the idea that a minimal base should exist. Being everyone’s whole world was great when the Internet wasn’t widely accessible, and it still is for places with spotty or expensive Internet access, but the Internet is now accessible to a large population. Distros just need to be the minimal base needed to run, with everything else on top. You know, the way the BSDs do it. A distro is a great feat of engineering, but it’s really annoying for those who live in the real world and need to get work done.
I’m of the mindset that distros should only ship bootstrapped interpreters for languages like Ruby, or only the resulting binary for languages like Rust or Go. The odds that an application is going to fit into the tiny box defined by the distro package manager are very slim. Rust or Go can deal with the dependencies on their own.
Dynamic linking is better because of lower memory usage, smaller disk usage, and modularity. Those are the reasons we have dynamic linking. Bumping the lib version isn’t one of them, and anyone who’s had to deal with the ripple effects of a library bump knows that it is not a reason. This is a retcon of history. If the devs are really nice, they won’t break anything and everything will keep working, but at some point they do have to make breaking changes. They can’t keep the baggage around, and it’s up to the people using the libraries to maintain their software.
Dealing with 3rd-party libs is a circus.
That’s fair… assuming that your programs share enough libraries to account for the lost opportunity for compiler upgrades to add new memory layout optimizations.
Given how much of the big, shared stuff will remain C or C++ for the foreseeable future, how it’s still possible to manually make a C ABI for dynamic loading, and how little actually gets shared in real life, the Rust developers decided to defer that decision to make room for the introduction of more compile-time optimizations.
(eg. They’ve already added automatic struct packing for non-repr(C) structs since the v1.0 compatibility freeze.)
True. Or you run a program in enough numbers and frequency that it makes sense. (I only have one of those workloads these days instead of many.) Aside from modularity, I find dynamic linking less attractive these days.
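To put the struct-packing remark above into concrete terms, here’s a minimal sketch of my own (not from the article or the comments above): on a typical 64-bit target the default layout is free to reorder fields, while repr(C) keeps the C field order and padding.

    use std::mem::size_of;

    // Default ("Rust") representation: the compiler may reorder fields to
    // minimize padding, and the layout can change between compiler versions.
    #[allow(dead_code)]
    struct DefaultLayout {
        a: u8,
        b: u32,
        c: u8,
    }

    // C representation: field order and padding follow the platform's C
    // rules, so the layout is stable and usable across an FFI boundary.
    #[allow(dead_code)]
    #[repr(C)]
    struct CLayout {
        a: u8,
        b: u32,
        c: u8,
    }

    fn main() {
        // On a typical 64-bit target this prints 8 and 12: the default
        // layout packs both u8 fields next to the u32, while repr(C)
        // must keep the declared order and pad around each u8.
        println!("default layout: {} bytes", size_of::<DefaultLayout>());
        println!("repr(C) layout: {} bytes", size_of::<CLayout>());
    }

The exact numbers depend on the target, but the point is that only repr(C) commits to a layout that a distro-shipped shared library (or a C caller) could rely on.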
My understanding is that dynamic linking was a way to get better use out of limited resources. Dynamic linking didn’t become a thing until the late ’80s, at least with Unix, and Unix was originally all statically linked. 🙂
It makes sense. It’s not a burden to fetch fresh builds from a remote server these days, and setting up a build server is not as mystical as it once was.
The world is also moving to a model where programs are much more isolated than they used to be. Containers, for example, require all of the dependencies to be copied into them, and that eliminates most of the reasons for dynamic linking.
Products also tend to include vendored libraries in their own directory tree anyway, which is kind of a hacky version of static compilation.
I don’t remember reading those reservations. Does anyone have a link?
I ask because Flatpak was specifically designed to balance those two concerns, providing a stable platform for packages to depend on, while still being based on shared libraries that can be updated for security without waiting for upstream.
Also, as the “Gotta go deeper” section of Let’s Be Real About Dependencies points out, compared to ecosystems like Rust, the world of C and C++ has a lot of vendored dependencies that can’t be meaningfully broken out by distro maintainers. Either it’s a case of “every package reinvents its own X with slightly different APIs, semantics, and bugs”, or they’re single-header libraries, fundamentally designed to be vendored and obviously not broken out by any distro I’ve checked, or you’d have four or five times as many packages.
Exactly. It is related, but at least in the case of Flatpak (not AppImage, not sure about Snap), it was designed to share packages “as much as possible”.
Snap and AppImage are giant blobs. They take the road of bundling everything with the application.
For those that don’t know, Flatpak deals with dependencies by creating different paths for each library that applications depend on. The path is consistent between applications, so there only needs to be one version of a library.
For example, a significant portion of the very popular Boost is implemented in header files, which get compiled into every application, even though parts of Boost are also built as shared libraries. If a security flaw is found in the header-file code, fixing it requires recompiling every affected package.
Luke McCarthy,
Rust kind of behaves the same way. Prebuilt rlibs are not binary-compatible: if you update them, you have to rebuild your whole project. This poses a challenge if you want your Rust program to support dynamically loaded plugins. I haven’t found a solution short of defining a C API and resorting to Rust’s C compatibility.
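For what it’s worth, that C-API escape hatch looks roughly like the sketch below. All the names here are made up for illustration; the only real pieces are Rust’s extern "C" / #[no_mangle] / #[repr(C)] machinery and the libloading crate on the host side.

    // Plugin side: built as a cdylib (crate-type = ["cdylib"] in Cargo.toml).
    // Everything crossing the boundary has to be C-compatible.

    #[repr(C)]
    pub struct PluginInfo {
        pub api_version: u32,
    }

    #[no_mangle]
    pub extern "C" fn plugin_info() -> PluginInfo {
        PluginInfo { api_version: 1 }
    }

    #[no_mangle]
    pub extern "C" fn plugin_run(input: i32) -> i32 {
        // A real plugin would do something useful here.
        input * 2
    }

    // Host side (sketch), loading the shared object at runtime:
    //
    //     let lib = unsafe { libloading::Library::new("libmy_plugin.so")? };
    //     let run: libloading::Symbol<unsafe extern "C" fn(i32) -> i32> =
    //         unsafe { lib.get(b"plugin_run")? };
    //     let result = unsafe { run(21) };

It works, but you lose all the rich Rust types at the boundary, which is exactly the annoyance I was complaining about.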
When it comes to the difficulties of defining a stable ABI, Rust is essentially “C++ which prefers to use templates wherever possible”, with the same problems.
The impact of C++ templates on library ABI (Michał Górny, 2012)
As for dynamically loaded plugins, the best solution I’ve seen is the abi_stable crate, which abstracts Rust-to-Rust FFI on top of the C ABI for exactly that sort of use case.
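From what I remember of its docs, the basic idea is that you stick to #[repr(C)] types, derive the crate’s StableAbi trait, and use its FFI-safe replacements for std types at the boundary. A rough sketch (treat the details as an assumption and check the crate’s documentation for the real module/plugin setup):

    use abi_stable::{StableAbi, std_types::{RString, RVec}};

    // An FFI-safe type: repr(C) layout plus a derived StableAbi impl,
    // using abi_stable's RString/RVec instead of String/Vec at the boundary.
    #[repr(C)]
    #[derive(StableAbi)]
    pub struct PluginEvent {
        pub name: RString,
        pub payload: RVec<u8>,
    }

On top of that, the crate layers versioned root modules so the host can check at load time that the plugin was built against a compatible interface.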
ssokolow,
Thank you for the link, I’ll have a closer look at this!
There aren’t any reservations (aside from Snap being proprietary to Canonical and AppImage being flaky). They are a direct reaction to package managers not being able to version libraries. Some ways are smarter than others, though.
Some interesting projects to fix this are Nix/NixOS, Guix, and ostree.
In the Rust case (and Go to a lesser extent), it’s pretty easy for the packager to check if a program uses a particular vulnerable lib version, update that lib, and rebuild/repackage. The Rust ecosystem is pretty good regarding semver, has CVE auditing tools, etc. It may not be the workflow that packagers are used to, but it’s a much better situation than what happens with C++-style vendoring.
That’s the thing, isn’t it? There are lots of practices which make deploying code and bumping lib versions safer and easier, and Rust and Go have put effort into exactly that.
How is recompiling a statically linked binary that bad? That’s why I use Go. Not having to worry about the system bumping a library and breaking my applications is really nice.
Go is super dogmatic about semver. CVE auditing could be better. “go audit” doesn’t exist.
I think the author was not limiting his criticism to how Go or Rust apps are bundled, but was also talking about how the Rust and Go compilers themselves work and are distributed.
If there is a vulnerability in the compiler, all apps compiled with that specific compiler version are potentially vulnerable. When such a vulnerability originates from a common library, it would usually suffice to update that library, and all third-party apps using the library would then be fixed as well. But because Rust uses static compilation to bundle that library into the app itself, we first need to recompile Rust and then recompile the apps written in Rust. So the steps required to fix a single vulnerability are suddenly quite complex.
The “vulnerable compiler” case is equally bad for all languages, statically or dynamically linked. You’ll have to recompile the world either way. Build caches will help.
Short version: The underlying system theories prove no system is secure and have done since before the integrated circuit was invented. Anyone waving a “proof of security” or stack of academic papers in your face claiming a system is secure is deluded. All you can do is mitigate it.
HollyB,
Not this again… If you think that “underlying system theories prove no system is secure”, then link to a proof that backs your claim, I insist.
I’m sorry you still have a problem with mathematical proofs, but you are wrong. Did you not take discrete mathematics, where you learn how mathematical proofs work? It’s not some black magic; we can in fact prove software correctness.
sj87 is correct in saying “If there is a vulnerability in the compiler, all apps compiled with that specific compiler version are potentially vulnerable.” But at the same time he does not imply that all software is automatically proven to be vulnerable.
The reality is that most developers aren’t expected to prove anything, and for most projects QA testing is considered to be good enough. But proof by induction and proof by cases are very powerful methods that can in fact prove correctness in software. That is a fact. I seriously don’t understand why you would deny it unless you are trolling. Is that all this is about?
@Alfman
Insist all you like. Go away and stop replying as I’m not speaking with you.
HollyB,
In other words, you don’t have proof. That’s the thing, why keep making assertions that you know you’ll have to backtrack on if you get called out on it?
@Alfman.
No. It means what it says. We will never speak again.
End.
HollyB,
You do you, just don’t expect me to remain silent.
Not for nothing, but you’re the one who’s chosen to keep bringing up the same topic after disagreeing about seL4.
http://www.osnews.com/story/132990/sel4-micro-kernel-working-towards-a-general-purpose-multi-server-os/
http://www.osnews.com/story/133005/a-look-at-gsm/
… and I left it at that, but you bring it up again…
http://www.osnews.com/story/133066/the-modern-packagers-security-nightmare/
If you didn’t want to discuss it, that’s fine, but then why bring it up again? It kind of appears you’ve been deliberately trolling about proof of security since the seL4 topic, with the intention of provoking. I honestly don’t mind letting it go, water under the bridge I say, but then you need to let it go too.
This is working on the assumption that the library author is going to backport the fix and that there aren’t any breaking changes due to the fix, or in the version containing the fix.
This scenario all depends on the library author being nice enough not to introduce breaking changes.
Dealing with third-party libraries is hard. It takes work keeping things updated, and this is where CI/CD systems and automated testing are invaluable.
I like static compilation. It breaks things down into discrete parts, and it simplifies things. It makes my life easier, and I am all about that.
I am so glad I don’t code anymore. (I will never tire of saying this.) All these snags and snafferoos regardless of system are now very firmly someone else’s problem. Hooray!
I’ve read that article and some commentary on it.
Here’s my commentary:
Package managers haven’t figured out how to deal with multiple versions of a library. They’ve papered over the problem by forcing everything through a single gate and anointing a single golden library. However, in the real world, with devs doing work, this breaks down, and the author is crying about it. A single golden lib is a great ideal, but the real world is messy.
“pip”, “go mod”, “cargo”, perlenv, rnv, and more exist to let people work around the problems of package managers shipping old versions of libraries. An entire cottage industry of tools exists to work around package managers. Think about that.
People don’t necessarily get their software from the repo. They custom-compile their own stuff, or they set up custom directory trees with all of their custom dependencies. This is the entire value proposition of Docker and containers: bundle everything together, and people have an easy way to work around the limitations of the package manager.
I guarantee every ops person has a story where they set up some sort of custom library to make some application happy. I have way too much practical knowledge of autotools, compilers, linkers, and bootstrapping toolchains from on-the-job experience to think otherwise.
Also, this doesn’t seem to bother OpenBSD or FreeBSD, just a particular FOSS OS which has high priests who bless and anoint software in a baseless system, and which has package managers that aren’t built for the way people work. FreeBSD: “pkg audit -F” will let you know which software has outstanding CVEs. OpenBSD: downloads all of the dependencies for Go software so it can all be built offline. There are compromises and workarounds.
Flatpak and Nix (NixOS, Guix) are two packaging formats which acknowledge that life is messy, and they deal with the reality that there are going to be multiple different library versions on a system. They are built to let people run software.
I guess what I’m saying is, “This is life, and I’m sorry it’s not what you want it to be. Operations gets paid to deal with this mess, and I’m sorry you don’t like that… and don’t want to help them fix the problem.” That’s what we get paid to do: fix problems.
@Flatland Spider
This is a very good summary of the problem. What would help is if the problem were actually properly documented from the beginning. Almost all OSes have the same problem, so it’s something which has application all over. Many of the individual problems and solutions vary from system to system and developer to developer, but none of this is unique. Any solution should be built on properly defining the problems, with all the pluses and minuses, and the end result should be a properly documented and workable system.
This kind of work would suit an academic, as they don’t have a dog in the fight and can avoid the office politics. It would also mean various job titles couldn’t hide and would have to justify their decisions. It might save a lot of arguing and duplication, and things would get done quicker… (See also the Linux kernel and Wayland et al.)
I’ve found in life that a lot of job titles and random input do not “solve problems”. Control freakery and empire building can get in the way. Many white papers and reports all too often go ignored…