Apparently, the Bcachefs people are having problems with case-folding, and Linus Torvalds himself is not happy about it. Torvalds holds the only right opinion in this matter, which is that filesystems should obviously be case-sensitive.
Case-insensitive names are horribly wrong, and you shouldn’t have done them at all. The problem wasn’t the lack of testing, the problem was implementing it in the first place.
[…]Dammit. Case [in]sensitivity is a BUG. The fact that filesystem people still think it’s a feature, I cannot understand. It’s like they revere the old FAT filesystem so much that they have to recreate it – badly.
↫ Linus Torvalds on the LKML
It boggles my mind that a modern operating system like macOS still defaults to being case-insensitive (but case-preserving), and opting to install macOS the correct way, i.e. with case-sensitivity, can still lead to issues and bugs because macOS isn’t used to it. In 2025. Windows’ NTFS is at least case-sensitive, but apparently Win32 applications get all weird about it; if you have several files with identical names save for the case used, Win32 applications will only allow you to open one of them. I’m not sure how up to date that information is, though.
Regardless, the notion that Readme.txt is considered the same as readme.txt is absolutely insane, and should be one of those weird relics we got rid of back in the ’90s.
Given How Microsoft Lost the API War, my attitude is that the least worst option is to have it exist but be opt-in on a per-directory basis (basically what they’re implementing) as an analogue to how so much of Microsoft’s competitiveness is down to how, historically, they’d QA-test popular third-party apps for previous versions of the OS and bake tables of overrides into the OS to fake now-fixed bugs. (The example given in the post being SimCity assuming it could keep using memory after freeing it because that worked at the time it was undergoing pre-release testing and so wasn’t caught.)
On the other hand, case-insensitivity was a “feature” because you only stored filenames in CAPITAL 8.3 format in the good ol’ days. Sure, FAT32 provided the means to have long names and non-capital characters, but Readme.txt was still opening README.TXT as intended. Why would you have multiple versions of the “same” file with different capitalization?
Imagine the burden when asked to send a file to someone: you have 5 of them with the same name, so which one do you choose? On GitHub, which ReAdMe.md should be displayed by default?
No, case-sensitivity is the bug. Humans aren’t used to thinking that case matters. Lowercase Latin letters only exist because medieval scribes were too lazy to write in uppercase as the classical Romans did. I know a lot of people who like to type in uppercase because it’s easier (for humans) to read. Some people write in capital letters. Computers should fit human expectations.
I might agree with this one.
But a good compromise is not allowing clashes, yet preserving case.
So, if there is a readme.txt the system may choose not to enable adding a Readme.txt in the same place.
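A minimal sketch of that compromise, assuming a toy in-memory directory rather than any real file system. The class name `CasePreservingDirectory` and its methods are hypothetical, and Python's `str.casefold()` stands in for whatever folding rule a real implementation would pick:

```python
class CasePreservingDirectory:
    """Toy directory: keeps names as typed, rejects names that differ only by case."""

    def __init__(self):
        # Maps the case-folded key to the name exactly as the user wrote it.
        self._entries = {}

    def create(self, name):
        key = name.casefold()
        if key in self._entries:
            raise FileExistsError(
                f"{name!r} clashes with existing {self._entries[key]!r}"
            )
        self._entries[key] = name

    def lookup(self, name):
        # Returns the stored spelling, or None if nothing matches.
        return self._entries.get(name.casefold())

    def listing(self):
        return sorted(self._entries.values())
```

With this sketch, creating readme.txt and then looking up README.TXT returns the stored spelling readme.txt, while a later attempt to create Readme.txt raises `FileExistsError`.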
But… all of this becomes extremely complicated when there are languages with different capitalization rules.
Turkish is notorious for this. And some other Balkan languages.
https://devblogs.microsoft.com/oldnewthing/20241007-00/?p=110345
lira becomes LİRA, ırak becomes IRAK for example.
Since English capitalizes i as I, and does not have small-dotless-i and capital-dotted-I, things get really complicated in mixed-language systems.
(and some other languages as well).
And no, UNICODE cannot fix this.
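To make the Turkish example concrete, here is a small sketch using Python's built-in string methods, which apply the locale-independent Unicode case mappings. The helper `turkish_upper` is hypothetical and only handles the two letters discussed above; it is not a full Turkish casing implementation:

```python
# Default (locale-independent) mappings treat i/I the English way.
print("lira".upper())   # LIRA, not the Turkish LİRA
print("ırak".upper())   # IRAK
print("IRAK".lower())   # irak, so the dotless ı does not survive a round trip


def turkish_upper(s):
    # Hypothetical helper: apply the two Turkish-specific rules first,
    # then fall back to the default mapping for everything else.
    return s.replace("i", "İ").replace("ı", "I").upper()


print(turkish_upper("lira"))  # LİRA
```

The point of the sketch is that the "right" answer depends on which language the name is in, and a file name does not carry that information.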
+1
Set up a firm called NIKe and try defending that at a trademark court, arguing case sensitivity.
Write a book named HaRRy PottER and see if you’ll get sued or not.
We have 26 (or 29 or I don’t know how many in French) letters, not 52 (or 58 or whatever). Isn’t that OBVIOUS? (yeah, in capital letters, so it means something else).
Agreed. Was the ACTUAL homework stored in homework directory or was it in Homework/homeworK/HomeWork/homeWork?
It would be very embarrassing to show the wrong directory to your parents/teachers…
darkhog,
Ah, so ‘Homework’ is for homework and ‘homeworK’ is for porn? I can see your debacle, haha. You might spare yourself the risk of choosing the wrong one by turning on thumbnail view 🙂
Still the question remains: what would be the added benefit of having various ‘homework’ folders with different capitalization, other than adding confusion?
Imagine explaining this over voice and not text.
“Can you open the Homework.DOC? That’s capital H, lower case o m e w o r k. full stop. Capital D, capital O, capital C”
Case sensitivity is definitely the bug, and there’s no good reason to use it.
And? The Romans also didn’t use Arabic/Indian numerals or electricity…
So bizarre how the whole concept of technological/cultural/scientific advancement escapes a few people in a blog about technology…
Thom Holwerda,
Case folding ropes in an awful lot of technical complexity inside file systems. It’s a lot easier to do a byte comparison. However from a user perspective, which may be the one that matters more on windows and macos, it doesn’t seem nearly as obvious to me why a user would vote to be able to store multiple files that differ only by letter case?
I have to ask, is your claim of “obviousness” based on implementation complexity, or is it more of a preference as a user of the operating system?
Interesting that we had an article about unicode canonicalization in filenames not long ago. Some characters can be represented using different unicode sequences, which creates a very similar dilemma of whether file systems need to convert filenames to a canonical form. Otherwise you may get file not found errors.
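As a rough illustration of the canonicalization dilemma, the sketch below builds the same visible file name from two different code point sequences and compares them with and without normalization. It uses Python's standard `unicodedata` module; the function name `same_name` is just for this example:

```python
import unicodedata

composed = "caf\u00e9.txt"     # é as a single code point (U+00E9)
decomposed = "cafe\u0301.txt"  # e followed by a combining acute accent (U+0301)


def same_name(a, b):
    # Compare after normalizing both names to the same form (NFC here).
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)           # the raw strings differ
print(same_name(composed, decomposed))  # they match once normalized
```

A file system that compares raw bytes treats these as two different names, which is where the "file not found" surprises come from.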
It doesn’t matter what the merits of case-sensitivity vs case-insensitivity are, the market has declared case-insensitivity the winner, see my comment below for details.
Let me de-boggle your mind:
1) The market has already decided on case-insensitivity by choosing the FAT filesystems as the filesystems for SD cards. The entire holy war is pointless, the market has picked the winner.
2) If you want to be able to cut and paste directories to bog-standard SD cards without getting weird “file already exists, overwrite?” messages, the source filesystem also has to be case-insensitive.
It’s one of the things I hate about Desktop Linux every time I use my employer’s Desktop Linux computer to slack off at work (and before that in uni): I will end up with those different-only-in-casing filenames for my saved funny pics (for example funny.jpg and Funny.jpg), and when I copy the files over to USB sticks or MicroSD cards I get “file already exists, overwrite?” errors.
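One way to spot these clashes before copying is sketched below. It groups names by their case-folded form; `str.casefold()` is only an approximation of the rules a FAT driver actually applies, and the function name `case_collisions` is made up for this example:

```python
from collections import defaultdict


def case_collisions(names):
    """Return groups of names that differ only by letter case."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Keep only the groups where more than one spelling maps to the same key.
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(case_collisions(["funny.jpg", "Funny.jpg", "cat.png"]))
```

Running something like this over `os.listdir()` of the source directory would list the pairs that would trigger the overwrite prompt on a case-insensitive target.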
kurkosdr,
I guess you are saying that the debate on case insensitivity is irrelevant because FAT won anyway. However it does not follow that case insensitivity was a driving factor in choosing FAT. A far bigger factor is the fact that all operating systems support FAT while the same can’t be said for anything else.
In the same sense, we are getting tariffs on account of voters having elected trump, but it does not actually mean that voters wanted tariffs.
I don’t think that’s so bad, but you’re right this would happen.
I don’t find this to be a big problem for me, but I respect your opinion. I think that dolphin (my file manager) suggests a new name when a conflicting file is encountered – I wonder if this works in your scenario.
Alfman,
Yes, FAT is definitely not ideal, but being universally available it has “won”.
But does it really matter for an operating system? For most directories under the system, they will not be directly copied to FAT anyway. Moving from /bin to A: would mean we would lose all information like owners, exec bits, and access control anyway.
It only makes sense for user directories like Desktop or Documents.
And even then, the system can offer to just ZIP all those case sensitive ones, in case there is actually a rare conflict. (ZIP is another lowest common denominator that has won, wish we had something more modern).
Anyway, there are compromises, but we cannot go back and change Linux to match “one true solution” either.
sukru,
Yeah, it’s hard to imagine there are many users that are significantly inconvenienced by case differences.. I’m guessing that I might encounter several filenames with differing cases like “readme.txt” “Readme.txt” “README.TXT”, etc in my archives, but how often am I going to copy these files into the same directory? I don’t recall it ever being a problem. Hypothetically I wouldn’t mind if the file system treated them as the same name, but it’s not that important to me. When searching, I use case insensitivity flags with find, grep, etc virtually 100% of the time (ie find -iname, grep -i, …). Given the way I use files, I can’t claim that sensitivity is that important to me, but I just don’t have a strong opinion either way. Neither the windows nor linux way bothers me. Although there is a bug in windows where I can’t rename a file to the same filename under a different letter case.
Renaming ‘README.TXT’ to ‘readme.txt’ doesn’t work. I have to rename ‘README.TXT’ to ‘README2.TXT’ and then to ‘readme.txt’, which IMHO is a windows bug.
It’s just a display-bug in explorer.
You can hit F5 to refresh the view and you will see, that the file was renamed as it should be. (just tested it on Win 10)
smashIt,
Makes sense. Thanks for testing it. 🙂
That’s retarded. It wasn’t an informed choice by people free to consume any filesystem. FAT is the filesystem of last resort. People use it because nothing else besides patent-encumbered NTFS will work on a standard Windows install. If Microsoft dropped an F2FS driver into the standard Windows install everyone would switch in about six months.
> It boggles my mind that a modern operating system like macOS still defaults to being case-insensitive (but case-preserving), and opting to install macOS the correct way, i.e. with case-sensitivity, can still lead to issues and bugs because macOS isn’t used to it
ah, bless your soul, and I’m so happy you are nowhere near a decision-making place 😀
I love how macOS does it and it just works right, and I set all my Linux machines to mimic that.
Thom never misses a chance to bash macOS.
macOS could just stop being trash.
Thom Holwerda,
I had a mental picture of somebody using this line at a family reunion, haha.
Well, Apple DID release a Mac that looks like a trash can…
This is such a juvenile, childish view for the sole editor running an operating system news website.
It’s getting tiring to have to wade through your bias and sour-pissing every damn article.
Preach.
Bye then
I like case-insensitive.
And Microsoft’s PowerShell on Linux simulates case-insensitivity. (PowerShell is another thing I like, btw.)
If I type “cd dow” in PowerShell on Linux and press the Tab key, it completes to “cd ./Downloads”.
“dir s*” shows me all files and directories in the current directory that begin with an “s” or “S”.
theuserbl,
When searching files, (including auto-complete) case insensitivity seems to be much more useful IMHO. The question is whether the filesystem itself should be case insensitive or if it’s good enough that shells and tooling can perform insensitive searches on a case sensitive FS. I don’t have a strong opinion that a file system has to be one way or the other, however it’s clearly simpler for the FS to operate on opaque bytes without any logic for unicode or letter cases.
But then again, if you want to use case insensitivity in the shell, then technically every file lookup operation has to involve a full directory scan if the file system doesn’t support case insensitive indexing..
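A sketch of how a shell or tool could do this kind of case-insensitive matching on top of a case-sensitive file system, assuming nothing more than the Python standard library. The function name `iglob_names` is hypothetical, and this is not a description of how PowerShell implements it:

```python
import fnmatch
import os


def iglob_names(directory, pattern):
    """Return entries in `directory` that match `pattern`, ignoring letter case."""
    folded = pattern.casefold()
    # Every entry has to be read and folded, because the file system
    # offers no case-insensitive index to consult.
    return sorted(
        name
        for name in os.listdir(directory)
        if fnmatch.fnmatchcase(name.casefold(), folded)
    )
```

For example, `iglob_names(".", "s*")` would return entries starting with either "s" or "S", at the cost of listing the whole directory on each call.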
The problem is not whether case sensitivity is good or bad in UI. The problem is that case-insensitivity is INCONSISTENT.
Linus argues that security and safety are more important than “nice vibes” – and the only way to achieve them is to be consistent.
And the only way to be consistent, at the filesystem level, is to not have any complicated rules. Because consistent case-insensitive rules don’t exist in today’s world.
And THAT makes case-insensitivity a bug. Almost as awful as NULL. And almost as hard to fix.
zde,
I’m not saying his opinion is wrong, only that it’s just one opinion. Here’s the thing, not everyone feels the same way about it. I don’t have a problem with the way linux does it and I also don’t have a problem with the way windows does it. Clearly it’s more complex to have case insensitivity, but if you ask some people they’d say it’s justified. It’s not so much a matter of right and wrong and more a matter of personal preferences.
There are many people who prefer the case insensitive behavior of windows. Writing an OS that caters to people’s preferences is not a bug. I can’t agree with you on that. The same logic you use to discredit case insensitivity in file systems would rule out case insensitivity in databases too. But just because it’s hard doesn’t mean it’s not genuinely useful to do at times. Whether it’s an FS or database, it’s not always appropriate to store the same strings with different letter case. I don’t think it needs to be one size fits all or that we should be forcing our preferences on everyone else by decree.
https://dev.mysql.com/doc/refman/8.4/en/case-sensitivity.html
What’s wrong with NULL? I guess I’m lacking the context to understand your point, but NULLs are important and meaningful in programming. C# for example supports both nullable and non-nullable types and is very good at enforcing NULL correctness even at compile time.
Just look for the video titled “Null References: The Billion Dollar Mistake – Tony Hoare”. It’s pretty nice one. And to understand the context: C. A. R. Hoare is, quite literally, THE GUY WHO INVENTED THEM.
It’s the same with case-insensitivity: that’s something that, quite literally, did billions of dollars of damage (and would do more).
P.S. And saying “I’m not saying his opinion is wrong, only that it’s just one opinion” is bullshit. Saying “that’s just opinion” is to ignore facts. And facts are simple: case sensitivity causes MINOR INCONVENIENCE, AT MOST while case insensitivity causes ACTUAL AND SERIOUS DAMAGE. These are just facts, there is no room for opinions.
zde,
Hmm, I’ve become wary of watching hour long videos to get an answer to a question, haha. Thankfully I found a synopsis that seems to cover it well:
https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/
Just as an aside, even if that’s true, who invented NULLs doesn’t really affect their merit. Neil Armstrong was the first person to set foot on the moon, which is an awesome accomplishment, but if he had been absent or sick that week, then in reality someone else on the backup team would have gotten credit instead. I’d say the same is true of NULLs. It’s not so much the product of any one person, because anyone doing computer science at the time would have “invented” it too.
Most of his arguments seem to be about languages he was using at the time, like C and fortran. And to that end I am in agreement with him. All of his examples from old languages have merit.
Language constraints at the time turned safety into a performance tradeoff, and I accept this made NULLs dangerous. He does mention modern languages, however IMHO he fails to acknowledge a really critical point.
It’s not just that they offer non-nullable types to replace nullable types, but that the language type systems elegantly and safely solve the nullable problem. Modern languages make NULLs well behaved. Had C (or indeed ALGOL) implemented NULLs safely from the very outset, then NULLs would not have turned into the “billion dollar mistake”.
I think we can be in agreement that NULLs inside of unsafe languages are a problem, but they’re not a problem inside of safe languages.
That’s quite an allegation…how so?
I don’t ignore facts and I’ve been very candid that case insensitivity is more complex. However the importance of case insensitivity is very subjective and the opinion that it’s not important is just one opinion and not something that’s factually right or wrong.
You have to be more specific.
I would also like to point out that just because facts may lean one way doesn’t mean opinions lean the same way. For example: does the fact that rustlang protects us from many types of faults imply that everyone prefers rustlang over C? Clearly not. Of course you’re free to disagree with them, but people do have different priorities.
“Readme.txt is considered the same as readme.txt”
It is indeed the same. Ask anyone outside of the tech sphere and they will tell you. It’s the same but looks different. They will tell you the “R” is capital. but it doesn’t change the meaning. The sentences I am writing at this very moment clearly illustrate it. The word ‘the’ is both written as ‘The’ and ‘the’. It just so happens the ‘the’ used at the beginning of a sentence has a capital ‘t’, written as a “T” because it’s, well, proper to write the word ‘the’ at the beginning of a sentence as ‘The’.
What a bizarre mindset to think otherwise.
tuaris,
Yes, that’s a valid opinion. I think you missed the point of the example though. Case insensitive file systems still manage to store the case of file names (with the exception of legacy DOS). It’s both valid and appropriate under a case insensitive file system to have the file system save the user’s desired letter casing and in principle a user should be able to change the letter case after creation.
I don’t recall if I saw this issue on a windows file system or a network share; I just remember that it didn’t work unless I used multiple renames.
“Give my Dick a hand.”
“Give my dick a hand.”
“I use iOS.”
“I use IOS.”
“I believe in God.”
“I believe in god.”
Shakespeare’s Hamlet/a hamlet; Ionic vs. ionic; mandarin vs. Mandarin; etc.
Even English has a ton of capitonyms. In other languages, it’s even more meaningful. Just because you lack the linguistic knowledge to understand language properly doesn’t mean we all have to suffer.
Most file systems are case sensitive in the sense that you can write README.txt or readme.txt or Readme.txt. But it is absolutely delirious to expect to have a README.txt and a readme.txt in the same place.
Yes, case sensitivity is the “sensible” way of doing things, and the way to go. But the Bcachefs people are probably just trying to support use cases that need case insensitivity, such as Windows software being ported to Linux (or coaxed into using Linux for their storage).
Of course, one might argue that such software should instead use a user-space API for case folding. However, there are also reasons for sticking with a kernel-space implementation. Efficiency, for instance: while user-space case folding might be fine for desktop use, it might not be for a cloud service used by millions. Or inability to modify the client software (which might even only be available as a binary) to use the case-folding API.
So, yeah. Case insensitivity is an ugly hack. But I suspect it’s useful for a whole bunch of people, and would cause mayhem if someone were to banish it from the kernel.
Case insensitivity is good for people, I agree.
But for the love of God, implement it in the (GUI and much more locale-context aware) file manager and not in the file system.
How is it good? Please elaborate.
sixtyfive,
This is an interesting tradeoff.
Having the file system & kernel just use opaque bytes with no logic at all clearly simplifies things a lot. However I see some consequences to the design you propose:
1) There’s no longer system-wide enforcement/consistency across processes. Even if your file manager implements case insensitivity, many programs likely will not and therefore you’ll end up with an inconsistent application of “the rules”.
2) Doing it in userspace implies significantly worse performance since you can’t just open a file anymore, you have to scan the entire directory for matches first. When done at the file system level this isn’t a problem because the files can be indexed case insensitively.
3) Atomic semantics may be different. When done in the file system, opening a file for reading/writing can be one atomic operation. However when done in a userspace process, case insensitive file matching becomes a composite operation and can no longer be atomic.
My own take is that IF case insensitivity is a desired feature, THEN the technical consequences above strongly favor a kernel implementation.
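Points 2) and 3) can be sketched with a user-space helper like the one below. It is a hypothetical illustration (the name `open_ignoring_case` is made up), not how any particular file manager works, and it uses `str.casefold()` as the matching rule:

```python
import os


def open_ignoring_case(directory, name, mode="r"):
    """Open `name` inside `directory`, matching the file name case-insensitively."""
    wanted = name.casefold()
    # Step 1: scan the whole directory for names that fold to the same key.
    matches = [entry for entry in os.listdir(directory) if entry.casefold() == wanted]
    if not matches:
        raise FileNotFoundError(name)
    if len(matches) > 1:
        # A case-sensitive file system can hold several candidates.
        raise LookupError(f"ambiguous name {name!r}: {sorted(matches)}")
    # Step 2: open the match. Another process may rename or delete the file
    # between the scan and this call, so the two steps are not atomic.
    return open(os.path.join(directory, matches[0]), mode)
```

The cost grows with the number of directory entries, and the gap between the two steps is exactly the window that a file-system-level implementation would not have.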
Linus is delirious. Case sensitivity in filenames? Sure.
A fundamental weakness of computer minds is they are symbol bound. They don’t know what anything really is. Making them categorize a cat.img as different to a Cat.img is a step in the wrong direction. I would desensitise beyond just text. A computer should categorise a Cat.img the same as a Chat.img or a Koshka.img . This should be supported at the lowest level possible, so at least at the OS. Imagine an OS that knew what something was. Consider what overhead that would save applications. Torvalds is proposing adding overhead to applications, making systems more brittle.
lapx432,
Not sure if your comment was sarcastic, but wow your post is very interesting food for thought. +1 🙂
It’s a serious comment. My personal theory on the breakthrough we need for AGI is that symbols are grounded, i.e. hard linked to the actual thing they represent. Like the hard links back-up software uses, except the links would somehow (a big somehow) extend not to just the original file but to the original object in the real world. This needs to be supported from the sensory hardware up through the OS so that applications become trivial. “Pick the cat up” would be a complete program to pick the cat up. I see LLMs as a side show on the way to L-ROMs, large real object models. And I just made that up :-).
I’m trying to achieve just that, but what would you use as culture neutral pivot language to avoid favoring one over another ? How the selected language would be able to describe anything even in languages that offer more nuances ? And how would you sort things in a natural way to allow programming in a natural language without anyone trying to mess with the system by renaming a file explorer to the name of an aquatic animal or a web browser to a feral feline ?
My thought is that names are tied to a shared immutable archetype. So to call a browser a feral feline would require breaking a chain that links browser to the immutable browser object representation. Such a system will have a limited core dictionary and systems can fantasize around that all they want, just like in the real world. You are not going to create a language that is invulnerable to deception, but you can create one that has some fundamental knowledge that is hard coded. Machine code is a good example. Your interloper can repoint 0xb8 to ADD but to an X86 processor it will always be MOV. This is immutable. Realistically I think some specific hardware will have to be available to any interpreter of this one truth name system.
macOS does it the right way.
But, if you’re a developer and want to compile some library from the Linux world, you could always create a case-sensitive APFS volume (with the size of the parent, like on ZFS) and place all your code there.
The customer is always right.
No matter how many Devs on the forum here refer to themselves as client programmers, they ain’t the customer!
Personally, I prefer case sensitive programming languages, and a case ignorant OS. I like strict implementations like V, that won’t even accept camel case for variables. Get rid of upper case, this isn’t 1950 and we aren’t using converted typewriters.
7bit ascii ftw and then … possible. If only we could.
You’ll prise my 6 bit ZX81 character set from my cold, dead, hand.
LOL! Mandarin laughs at your stupid “case” problem. So do Devanagari and other scripts collectively used by billions of people. The perspective here is entirely too western.
I would love to hear what the usecase is for those who think case sensitive filesystems is a good idea. For me, a case sensitive file system makes no sense, just as case sensitive sorting makes no sense – mainly because i see no workflow where i would see this as a desired property, but plenty where i don’t. Yes, i have used the hack before with upper casing a file to get it to be displayed at the top of a list, but that was just that, a hack. I don’t think we should mess with how the user named it though, so i believe Apple got it right by preserving the case. This is about collation, not presentation.
Troels,
I think most people who argue in favor of case sensitivity including Linus Torvalds are NOT making the case that users prefer their file systems to be case sensitive. Instead they’re making the case that case insensitive file systems introduce undesirable implementation complexity. So even if users never have a reason to save files that differ only in case, to them it’s not worth the complexity to make a file system that is case insensitive.
Those who are for case insensitivity are focusing on user expectations: readme.txt and Readme.txt are one and the same and the file system should treat them this way. It’s quite clear from the comments that people have different opinions about it. While I would agree in principle with those who say it’s a bad idea to have both readme.txt and Readme.txt (likewise “C:\program files\” == “c:\Program Files\”), in practice I’d say I’ve managed fine on both types of file systems, so I don’t have a strong opinion that it HAS to be one or the other. It’s very subjective.
There are 2 problems:
* Case (in)sensitivity
* The fact that OSes rely on a name for files that is shared with users.
Solution: each file should have a filename for its identity, and a friendly name (possibly the same) for end users.
That way:
* A file could have a case insensitive file access for users
* A filename could be translated
* A file could have 0 or more than 1 user friendly filename (for instance, you can have a “technical name” (invariant), and other friendly names – “my invoices” for “proof of payments” for HR)
Accessing a filename through its friendly name would require some operator or markup in scripts.
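A rough sketch of what such a split could look like, assuming a toy in-memory registry. Everything here (the class `NameRegistry`, its methods, the use of a random hex ID as the invariant identity) is a hypothetical illustration of the idea, not an existing API:

```python
import uuid


class NameRegistry:
    """Toy model: each file has one invariant ID and zero or more friendly names."""

    def __init__(self):
        self._by_friendly = {}  # case-folded friendly name -> file ID
        self._friendly = {}     # file ID -> friendly names as typed

    def create(self):
        file_id = uuid.uuid4().hex  # stands in for the technical, invariant name
        self._friendly[file_id] = []
        return file_id

    def add_name(self, file_id, friendly):
        key = friendly.casefold()
        owner = self._by_friendly.get(key)
        if owner is not None and owner != file_id:
            raise FileExistsError(friendly)
        if owner is None:
            self._by_friendly[key] = file_id
            self._friendly[file_id].append(friendly)

    def resolve(self, friendly):
        # Case-insensitive access for users; returns None if unknown.
        return self._by_friendly.get(friendly.casefold())

    def names(self, file_id):
        return list(self._friendly[file_id])
```

Scripts would address the file by its ID, while users could reach the same file through any of its friendly names, in any letter case.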
Coming from a DOS background, for me case-insensitivity makes sense, though I can see some people may prefer case-sensitivity. However, using Unicode in filenames is a far bigger problem, with all the equivalence rules, glyphs that look the same for different code points, and whatnot.
jalnl,
Indeed. This comes up from time to time in osnews articles…
https://www.osnews.com/story/134396/unicode-normalization-forms-when-o-o/
I don’t know if it was the right decision to have different ways of representing characters. But so much has been asked of unicode that it seems these issues aren’t going away; they’re going to remain with us for the long term. At issue isn’t so much the number of characters, but rather the fact that there are so many variations that overlap, plus questionable decisions to add things like color modifiers. It raises the question of whether a file system should even aim to be unicode compliant, for example with filenames composed of different color hearts. It can be impractical to use those filenames with command line tools even when they support unicode.