238 points GalaxySnail | 11 comments
nerdponx ◴[] No.40169967[source]
Default text file encoding being platform-dependent always drove me nuts. This is a welcome change.

I also appreciate that they did not attempt to tackle filesystem encoding here, which is a separate issue that drives me nuts, but separately.
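To make the two separate issues concrete, here is a minimal Python sketch (my choice of language for illustration; the file name `demo.txt` is hypothetical). It prints the locale-derived encoding that has historically been used for file contents when no encoding is given, prints the separate filesystem encoding used for file names, and then shows that passing `encoding=` explicitly removes the dependence on either.

```python
import locale
import os
import sys
import tempfile

# Locale-derived preferred encoding. Historically this is what text-mode
# open() falls back to when encoding= is omitted, so it varies by platform
# and by user settings.
locale_default = locale.getpreferredencoding(False)

# The encoding used for file *names* is a separate setting.
filesystem_default = sys.getfilesystemencoding()

print("text default:", locale_default)
print("filesystem:  ", filesystem_default)

# Passing encoding= explicitly makes the bytes on disk predictable.
text = "naïve café €"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "rb") as f:
        raw = f.read()
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

print(raw == text.encode("utf-8"), round_trip == text)
```

The two printed encodings can differ from machine to machine; the final comparison should not.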

replies(4): >>40171063 #>>40171211 #>>40172228 #>>40173633 #
layer8 ◴[] No.40171063[source]
Historically it made sense, when most software was local-only, and text files were expected to be in the local encoding. Not just platform-dependent, but user’s preferred locale-dependent. This is also how the C standard library operates.

For example, on Unix/Linux, using iso-8859-1 was common when using Western-European languages, and in Europe it became common to switch to iso-8859-15 after the Euro was introduced, because it contained the € symbol. UTF-8 only began to work flawlessly in the later aughts. Debian switched to it as the default with the Etch release in 2007.
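The iso-8859-1 versus iso-8859-15 difference can be checked directly. This is a small Python sketch using only the standard codecs; the variable names are just for illustration.

```python
euro = "€"

# ISO-8859-15 (Latin-9) places the euro sign at byte 0xA4.
latin9_bytes = euro.encode("iso-8859-15")

# ISO-8859-1 (Latin-1) has no euro sign, so encoding raises an error.
try:
    euro.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# The same byte decoded as Latin-1 is the generic currency sign instead.
misread = latin9_bytes.decode("iso-8859-1")

# UTF-8 uses three bytes for the same character.
utf8_bytes = euro.encode("utf-8")

print(latin9_bytes, latin1_ok, misread, utf8_bytes)
```

A file written under one locale and read under the other would silently swap € for ¤, which is the kind of mismatch a locale-dependent default invites.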

replies(4): >>40172024 #>>40172052 #>>40172183 #>>40177841 #
1. Dylan16807 ◴[] No.40172052[source]
> Not just platform-dependent, but user’s preferred locale-dependent.

Historically it made sense to be locale-dependent, but even then it was annoying to be platform-dependent.

One is not a subset of the other.

replies(2): >>40172171 #>>40172645 #
2. hermitdev ◴[] No.40172171[source]
> platform-dependent.

It's 2024 and we still can't all agree on line endings. Mac vs Win vs Unix...
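For reference, the three conventions are LF (Unix), CRLF (Windows) and a bare CR (classic Mac OS). A quick Python sketch of how they differ, and how a naive split on `"\n"` goes wrong:

```python
unix_text = b"one\ntwo\n"          # LF
windows_text = b"one\r\ntwo\r\n"   # CRLF
classic_mac_text = b"one\rtwo\r"   # CR (classic Mac OS)

# str.splitlines() recognizes all three conventions.
results = [
    raw.decode("ascii").splitlines()
    for raw in (unix_text, windows_text, classic_mac_text)
]

# Splitting on "\n" alone leaves stray carriage returns on Windows input.
naive = windows_text.decode("ascii").split("\n")

print(results)
print(naive)
```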

replies(2): >>40172265 #>>40172368 #
3. Y-bar ◴[] No.40172265[source]
Mac OS and Unix agreed about twenty years ago to use the same ending: https://superuser.com/a/439443
replies(1): >>40172390 #
4. Longhanks ◴[] No.40172368[source]
It's 2024, everything but Windows has been UTF-8 with \n for twenty years.
replies(1): >>40174531 #
5. Dylan16807 ◴[] No.40172390{3}[source]
By which time XP was already in the middle of releasing, so it was too late to get Windows on board.

It's too bad. With a bit more planning and an earlier realization that Unicode cannot in fact fit into 16 bits, Windows might have used UTF-8 internally.
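The 16-bit limit is easy to see with any code point above U+FFFF. The Python sketch below (the helper `utf16_units` is a hypothetical name) counts UTF-16 code units and works out the surrogate pair by hand for U+1F600.

```python
bmp_char = "\u20ac"         # EURO SIGN, fits in 16 bits
astral_char = "\U0001F600"  # an emoji, above U+FFFF


def utf16_units(s: str) -> int:
    """Number of 16-bit code units needed to store s in UTF-16."""
    return len(s.encode("utf-16-le")) // 2


# Manual surrogate-pair computation for a code point above U+FFFF.
offset = ord(astral_char) - 0x10000
high = 0xD800 + (offset >> 10)
low = 0xDC00 + (offset & 0x3FF)

print(utf16_units(bmp_char), utf16_units(astral_char))
print(hex(high), hex(low))
print(len(bmp_char.encode("utf-8")), len(astral_char.encode("utf-8")))
```

Once surrogate pairs are needed, UTF-16 is a variable-width encoding just like UTF-8, so the fixed-width argument for 16-bit characters no longer holds.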

replies(2): >>40174470 #>>40196503 #
6. layer8 ◴[] No.40172645[source]
Not sure what you mean by that with regard to encodings. The C APIs were explicitly designed to abstract from that, and together with libraries like iconv it was rather straightforward. You only needed to be aware that there is a difference between internal and external encoding, and maybe decide between char and wchar_t.
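The internal/external split looks roughly like this. The sketch is in Python rather than C, so it is only an analogy to what iconv and the wide-character functions do, and `transcode` is a hypothetical helper name: bytes are decoded at the boundary into an internal string type and encoded again on the way out.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Hypothetical helper: convert bytes from one external encoding to another."""
    text = data.decode(source)   # external encoding -> internal string
    return text.encode(target)   # internal string -> external encoding


latin1_input = b"caf\xe9"  # "café" in ISO-8859-1
utf8_output = transcode(latin1_input, "iso-8859-1", "utf-8")

print(utf8_output)
```

The program logic in the middle never needs to know which external encoding was used, as long as the caller names it correctly.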
replies(1): >>40175699 #
7. jmb99 ◴[] No.40174470{4}[source]
Unless I’m mistaken, Rhapsody (released 1997) used LF, not CR. At that point it was pretty clear Mac was moving towards Unix through NeXTSTEP, meaning every OS except Windows would be using LF. Microsoft would’ve had around 4 years before the release of XP, and probably would’ve had time to start the transition with Win2K at the end of 1999.
replies(1): >>40177690 #
8. int_19h ◴[] No.40174531{3}[source]
Linux was definitely not uniformly UTF-8 twenty years ago. It was one of the many available locales, but it was still common to use other encodings, and plenty of software didn't handle multibyte well in general.
9. Dylan16807 ◴[] No.40175699[source]
Not everything is C, and nothing like that saves you when you move your floppy between computers.
10. mixmastamyk ◴[] No.40177690{5}[source]
Every OS except the one that had 95% market share in the late 90s. Apple was only propped up “Weekend at Bernie’s” style to appease regulators.
11. account42 ◴[] No.40196503{4}[source]
> and an earlier realization that Unicode cannot in fact fit into 16 bits

The Unicode Consortium already realized it when they decided on Han unification; they just didn't accept it yet.