Friday, 1 January 2016

What annoys a C++ programmer on Windows

If you want your app to be a good citizen of the user's operating system, you typically respect the system settings, for example the localization settings. For instance, if you want to show 10^6 formatted the way the user expects, you could write a simple program:
#include <iostream>
#include <locale>

using namespace std;

int main()
{
    const int mega = 1'000'000;

    locale systemLocale("");
    cout.imbue(systemLocale);
    cout << "System's locale: " << mega << endl;

}

It is plain C++14! The apostrophes in the definition of the mega constant are digit separators, which arrived in C++14; if you removed them, the program would be valid C++98.

A side note about creating a std::locale object. It can be done in many ways, but three basic ones are worth knowing:
  • with the default constructor, a copy of the current global locale is created; this is the "C" locale unless the program has changed it with std::locale::global
  • with a single string parameter (const char* or std::string), the locale corresponding to the given name is created
  • with a single, empty string parameter, the system's default (user-preferred) locale object is created.
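The three ways of constructing a locale listed above can be compared with a small sketch. The helper names (defaultLocaleName, namedLocaleName, systemLocaleName) are made up for this illustration, and the "(unavailable)" fallback is only an assumption about how one might want to handle a locale name the platform does not know.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Name of the locale produced by the default constructor. This is a copy
// of the current global locale, which is "C" unless std::locale::global()
// has been called earlier in the program.
std::string defaultLocaleName()
{
    return std::locale().name();
}

// Name of the locale constructed from an explicit name, or the
// hypothetical marker "(unavailable)" if the constructor throws because
// the platform does not know that name.
std::string namedLocaleName(const std::string& name)
{
    try {
        return std::locale(name.c_str()).name();
    } catch (const std::runtime_error&) {
        return "(unavailable)";
    }
}

// Name of the user's preferred locale, selected with the empty string.
std::string systemLocaleName()
{
    return namedLocaleName("");
}
```

Called at the start of a program, defaultLocaleName() should give "C", while the result of systemLocaleName() depends entirely on the environment the program runs in.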
Back to the program. When run, it should write "1000000" to the console, with thousands separators in it if they are set in the system. On my Ubuntu box there are no thousands separators, so the program's output is
System's locale: 1000000
Why there are no separators I do not know; in Polish there should be spaces. But the program's output is at least consistent with the system settings. On my Windows box, on the other hand, thousands separators are set, so the program's output should be "1 000 000". But it is not. It is:
System's locale: 1á000á000
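One way to investigate output like this is to ask the locale which separator it actually supplies. The sketch below reads the std::numpunct facet and reports the separator as a raw byte value; describeGrouping is a name invented for this example. My guess, which I have not verified on the machine above, is that the Polish locale on Windows uses byte 0xA0 (a non-breaking space in Windows-1250) and that the console's OEM code page shows that byte as "á".

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Describes the numeric punctuation of a locale, for example
// "sep=0x2C grouping=none". The separator is shown as a byte value so
// that invisible or non-ASCII characters can be told apart.
std::string describeGrouping(const std::locale& loc)
{
    const std::numpunct<char>& punct =
        std::use_facet<std::numpunct<char> >(loc);
    const std::string grouping = punct.grouping();

    std::ostringstream out;
    out.imbue(std::locale::classic());  // keep the report itself unformatted

    const unsigned sep = static_cast<unsigned char>(punct.thousands_sep());
    out << "sep=0x" << std::hex << std::uppercase
        << std::setw(2) << std::setfill('0') << sep;

    // An empty grouping string means that digits are not grouped at all,
    // whatever the separator character is.
    out << std::dec << " grouping=";
    if (grouping.empty())
        out << "none";
    for (std::string::size_type i = 0; i < grouping.size(); ++i) {
        if (i != 0)
            out << ',';
        out << static_cast<int>(grouping[i]);
    }
    return out.str();
}
```

Passing std::locale("") shows what the system locale provides; an empty grouping would also explain output without any separators, as on the Ubuntu box.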
It is almost garbage! It is as if Windows were saying to you: "Be a naughty programmer: use only the C locale and not the user's settings." ;) But if the user has a more recent version of Windows and runs the program in PowerShell, there is a solution. He or she should set the console output encoding to the system's default:
[System.Console]::OutputEncoding = [System.Text.Encoding]::Default
After doing that, the program's output is correct and consistent with the system settings. In the source of this solution, "$OutputEncoding to the rescue", the authors said (in 2006!) that most commands do not process Unicode correctly, and that this is the reason for falling back to ASCII. To me, that is a poor excuse for not forcing programmers to write good software!
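The same idea can probably be applied from inside the program instead of asking the user to type a PowerShell command. The following is only a sketch, assuming that switching the console output code page to the system's ANSI code page has the same effect as the setting above; I have not tested it on the machine described here. SetConsoleOutputCP and GetACP are Windows API functions, while useSystemCodePageForConsole is a name made up for this example.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Switches the console output code page to the system's ANSI code page,
// so that bytes produced by the system locale are displayed as intended.
// Returns true on success. On platforms other than Windows the function
// does nothing and returns true.
bool useSystemCodePageForConsole()
{
#ifdef _WIN32
    return SetConsoleOutputCP(GetACP()) != 0;
#else
    return true;
#endif
}
```

It would be called once at the beginning of main, before anything is written to cout. The setting belongs to the console rather than to the program, so a polite program might remember the value of GetConsoleOutputCP() and restore it before exiting.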