The article’s from 2021, but I think it’s still worth discussing.
A hard reality of C and C++ software development on Windows is that there has never been a good, native C or C++ standard library implementation for the platform. A standard library should abstract over the underlying host facilities in order to ease portable software development. On Windows, C and C++ is so poorly hooked up to operating system interfaces that most portable or mostly-portable software — programs which work perfectly elsewhere — are subtly broken on Windows, particularly outside of the English-speaking world. The reasons are almost certainly political, originally motivated by vendor lock-in, than technical, which adds insult to injury. This article is about what’s wrong, how it’s wrong, and some easy techniques to deal with it in portable software.
↫ Chris Wellons
As someone who doesn’t know how to code or program, articles like these are always difficult to properly parse. I understand the primary problem the article covers, but what I’m curious about is how much of this problem is personal – skill issue – and how much of it is a widely held belief by Windows developers and programmers. I know there’s quite a few of you in our audience, so I’d love to hear from you how you feel about this.
The author also wrote his own fix, something called libwinsane, which I’m also curious about – is this the only solution, or are there more options out there?
The linked article is silly: he just blames Windows for every difference between Windows and UNIX, when it is just as plausible, if not more so, that UNIX is the outlier. (E.g., AmigaOS works much the same as Windows as far as the aspects mentioned in the article are concerned.) The Unicode support he complains about can be enabled if it is wanted, but changing the default would be unwise; I certainly would not want it for my own programs.
The article is too focused on the standard C++ library and the lack of utf-8 as the default.
However, if you use the modern C++/WinRT (the standard C++17 language projection that succeeded C++/CX), there are simple tools to move between those worlds:
https://devblogs.microsoft.com/oldnewthing/20210922-00/
Basically, most programs do not care what their strings contain internally (as long as the character set is representable). However, care needs to be taken at the I/O boundaries.
Am I receiving a utf-8 request from HTTP?
Will I write this file as ASCII or utf-16?
Does the Remote Procedure Call require any conversions?
Inside those boundaries, once again, you are free to stick with “std::string” or “std::wstring” without care.
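To make the boundary idea concrete, here is a minimal, portable sketch of converting between UTF-8 and UTF-16 at the edges of a program. The helper names `utf8_to_utf16` and `utf16_to_utf8` are made up for illustration, and `std::u16string` stands in for `std::wstring` so the code behaves the same on every platform. A Windows-only program would more likely call `MultiByteToWideChar` / `WideCharToMultiByte`, or `winrt::to_hstring` / `winrt::to_string` under C++/WinRT. Replacing malformed input with U+FFFD is an assumption of this sketch, not something the comment above prescribes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
// Malformed sequences are replaced with U+FFFD instead of being rejected.
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point allowed for each sequence length (rejects overlong forms).
    static const std::uint32_t min_cp[] = {0x0, 0x80, 0x800, 0x10000};
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)              { cp = lead;        extra = 0; }
        else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; extra = 1; }
        else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; extra = 2; }
        else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // stray or invalid lead byte
        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;  // truncated sequence; the offending byte is re-examined
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        const bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
        if (!ok || cp < min_cp[extra] || cp > 0x10FFFF || surrogate) {
            out.push_back(0xFFFD);
            continue;
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {  // encode as a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Hypothetical helper: encode UTF-16 code units as UTF-8 bytes.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;  // consumed the low surrogate as well
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

The idea is to call something like `utf8_to_utf16` once when bytes arrive (an HTTP body, a file) and `utf16_to_utf8` once when they leave, so that everything in between can work with a single string type and never think about encodings.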
C and Unix were born as twins. Any C implementation that differs from Unix is wrong by definition.
That’s incorrect; UNIX is not even mentioned in the official C standard.