238 points GalaxySnail | 1 comments
lexicality ◴[] No.40169320[source]
> Additionally, many Python developers using Unix forget that the default encoding is platform dependent. They omit to specify encoding="utf-8" when they read text files encoded in UTF-8

"Forget", or possibly simply aren't made well enough aware? I genuinely thought that Python would use UTF-8 for everything unless you explicitly asked it to do otherwise.

replies(2): >>40170712 #>>40171480 #
1. aktiur ◴[] No.40171480[source]
It actually depends!

`bytes.decode` (and `str.encode`) have used UTF-8 as a default since at least Python 3.

However, the default encoding used for decoding file names is `sys.getfilesystemencoding()`, which is also UTF-8 on Windows and macOS, but varies with the locale on Linux (specifically with CODESET).

Finally, `open` uses `locale.getencoding()` directly when no `encoding` argument is given (unless UTF-8 mode is enabled, in which case it uses UTF-8).
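
If you want to check what your own interpreter does, here is a rough sketch that reports the three defaults mentioned above and shows the explicit `encoding="utf-8"` round trip. The helper names `default_encodings` and `roundtrip` are made up for illustration, and the fallback to `locale.getpreferredencoding(False)` assumes an interpreter older than 3.11, where `locale.getencoding()` does not exist.

```python
import locale
import sys
import tempfile
from pathlib import Path


def default_encodings() -> dict:
    """Report the three defaults discussed above (hypothetical helper)."""
    # locale.getencoding() was added in Python 3.11; older versions only
    # have locale.getpreferredencoding(False), used here as a fallback.
    get_locale_encoding = getattr(locale, "getencoding", None)
    if get_locale_encoding is None:
        locale_encoding = locale.getpreferredencoding(False)
    else:
        locale_encoding = get_locale_encoding()
    return {
        # Default argument of bytes.decode and str.encode.
        "bytes.decode / str.encode": "utf-8",
        # Used for file names; follows the locale on Linux.
        "file names": sys.getfilesystemencoding(),
        # What open() falls back to without encoding= (outside UTF-8 mode).
        "open() without encoding": locale_encoding,
    }


def roundtrip(text: str) -> str:
    """Write and read a temp file with an explicit encoding (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.txt"
        # Passing encoding="utf-8" makes the result independent of the locale.
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        with open(path, encoding="utf-8") as f:
            return f.read()


if __name__ == "__main__":
    for what, encoding in default_encodings().items():
        print(f"{what}: {encoding}")
    print(roundtrip("héllo"))
```

The values for the last two entries depend on the platform and locale, which is exactly why spelling out `encoding="utf-8"` in `open` is the safer habit.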