
238 points GalaxySnail | 1 comments
a-french-anon ◴[] No.40170353[source]
Why not utf-8-sig, though? It handles optional BOMs. Had to fix a script last week that choked on it.
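
For anyone who hasn't run into it, here is a minimal sketch of what Python's `utf-8-sig` codec does on decode. The sample string and variable names are just for illustration.

```python
import codecs

text = "héllo"
with_bom = codecs.BOM_UTF8 + text.encode("utf-8")  # EF BB BF prefix
without_bom = text.encode("utf-8")

# utf-8-sig drops a leading BOM if there is one and otherwise
# behaves like plain utf-8, so both inputs decode to the same string.
decoded_with = with_bom.decode("utf-8-sig")
decoded_without = without_bom.decode("utf-8-sig")

# Plain utf-8 accepts the bytes but keeps the BOM as U+FEFF,
# which is what tends to trip up later parsing.
decoded_plain = with_bom.decode("utf-8")
```

Note that encoding with `utf-8-sig` always writes a BOM, so it is mostly useful on the reading side.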
replies(3): >>40170707 #>>40170832 #>>40171048 #
shellac ◴[] No.40171048[source]
At this point nothing ought to be inserting BOMs in utf-8. It's not recommended, and I think choking on it is reasonable behaviour these days.
replies(3): >>40171192 #>>40173969 #>>40178398 #
1. BoingBoomTschak ◴[] No.40178398[source]
Only reason I used it was to force MSVC to understand my u8"" literals. Should've forced /utf-8 in our build system, in retrospect.

For UTF-16/32, knowing the endianness doesn't seem like frivolous functionality. And in fact, having to rely on heuristics-based detection via uchardet is a big mess; some kind of header should have been standardized from the start.
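
To illustrate the non-heuristic side of this, a rough sketch of BOM sniffing using the constants in Python's `codecs` module. The helper names `sniff_bom` and `decode_with_bom` are made up for this example, and the UTF-8 fallback is an assumption, not a rule.

```python
import codecs

# Order matters: the UTF-32 LE BOM (FF FE 00 00) starts with the
# UTF-16 LE BOM (FF FE), so the 4-byte marks are checked first.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (codec name, BOM length), or (None, 0) if no BOM is found."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_with_bom(data: bytes, fallback: str = "utf-8") -> str:
    """Decode using the BOM if present, else the (assumed) fallback codec."""
    name, skip = sniff_bom(data)
    return data[skip:].decode(name or fallback)
```

One caveat: UTF-16 LE text whose first character after the BOM is U+0000 has the same leading bytes as a UTF-32 LE BOM, so even this is not fully unambiguous.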