238 points GalaxySnail | 3 comments
1. jillesvangurp No.40169328
Not relying on flaky system defaults is a good thing. These things have a way of turning out to be different from what you assume. A few years ago I was dealing with Ubuntu and some init.d scripts. One issue I ran into was that a script we used to launch Java (this was before Docker) was running as root (bad, I know) and with a shell that did not set UTF-8 as the default, as would be completely normal for regular users. And of course that revealed some bad APIs that we were using in Java that use the OS default encoding. Most of these APIs now have variants that let you set the encoding, and a lot of static code checkers will warn you if you use the wrong one. But of course it only takes one place for this to start messing up content.

These days it's less of an issue, but I would simply never rely on the OS to get this right. Most uses of encodings other than UTF-8 are extremely likely to be unintentional at this point. And if it is intentional, you should be very explicit about it and not rely on weird indirect configuration through the OS that may or may not line up.

So, good change. Anything that breaks over this is probably better off with the simple fix added. And it's not worth leaving everything else as broken as it is with content corruption bugs just waiting to happen.
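
To make the "be explicit" point concrete, here is a minimal sketch in Python (not the Java code from the story above). The helper names `write_text` and `read_text` are made up for illustration; the only real API used is the `encoding` parameter of the built-in `open()`.

```python
import os
import tempfile


def write_text(path, text, encoding="utf-8"):
    # Hypothetical helper: always pass the encoding explicitly instead of
    # letting open() fall back to whatever the platform default happens to be.
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(text)


def read_text(path, encoding="utf-8"):
    # Hypothetical helper: read with the same explicit encoding.
    with open(path, "r", encoding=encoding, newline="") as f:
        return f.read()


path = os.path.join(tempfile.mkdtemp(), "sample.txt")
write_text(path, "naïve café ✓")
round_trip = read_text(path)

# Inspect the raw bytes to confirm what actually landed on disk.
with open(path, "rb") as f:
    raw = f.read()
```

With the encoding pinned at both ends, the bytes on disk are the same no matter which user, shell, or init script starts the process.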

replies(2): >>40176282 >>40196606
2. ok_computer No.40176282
I was using a .gitignore generated by an aliased touch function in PowerShell. Despite my best efforts, I could not get git to respect its .gitignore. I eventually figured out that the touched text file was UTF-16, so git basically ignored it entirely. Lesson learned: I changed a system default to UTF-8, but I just rely on my text editor now.
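
A rough way to catch this kind of file is to look for a byte order mark and re-encode. The sketch below is Python and assumes the file starts with a BOM, which may not match every setup; `detect_bom` and `to_utf8` are illustrative names, not part of git or PowerShell.

```python
import codecs


def detect_bom(data: bytes):
    # Check UTF-32 before UTF-16: the UTF-32 LE BOM begins with the same
    # two bytes as the UTF-16 LE BOM.
    candidates = (
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    )
    for bom, name in candidates:
        if data.startswith(bom):
            return name
    return None  # no BOM found; the encoding is unknown


def to_utf8(data: bytes) -> bytes:
    # Re-encode BOM-marked input as plain UTF-8 without a BOM.
    encoding = detect_bom(data)
    if encoding is None:
        return data  # leave BOM-less input untouched rather than guess
    return data.decode(encoding).encode("utf-8")


# Simulated file contents: Python's "utf-16" codec writes a BOM first.
sample = "node_modules/\n*.log\n".encode("utf-16")
converted = to_utf8(sample)
```

Running `to_utf8` over the bytes of a suspect file and writing the result back in binary mode would give git plain UTF-8 to parse.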
3. account42 No.40196606
Global locales were a mistake in general, not just the encoding part. printf("%f", 4.2) should not magically output different strings depending on the environment; that just causes more problems than it solves. Instead you should have to explicitly pass the locale information (or the relevant parts of it) to functions that you want to make locale-dependent.
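
As a sketch of what "explicitly pass the locale information" could look like, here is a small Python version. `NumberFormat`, `C_FORMAT`, `DE_FORMAT`, and `format_fixed` are hypothetical names for illustration, not an existing library API; the sketch relies on the fact that Python's `f` format specifier always uses "." and ignores the process locale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumberFormat:
    # Hypothetical carrier for only the locale detail this function needs.
    decimal_point: str = "."


C_FORMAT = NumberFormat()                    # "." like the C locale
DE_FORMAT = NumberFormat(decimal_point=",")  # "," as used in German


def format_fixed(value: float, fmt: NumberFormat = C_FORMAT, digits: int = 6) -> str:
    # The "f" specifier never consults the global locale, so the output
    # depends only on the arguments passed in.
    text = f"{value:.{digits}f}"
    return text.replace(".", fmt.decimal_point)


plain = format_fixed(4.2)
german = format_fixed(4.2, DE_FORMAT)
```

The default stays environment-independent, and a caller who wants a comma has to ask for it at the call site.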