(timkellogg.me)

75 points asicsp | 1 comments | 21 Apr 25 10:35 UTC


nickez ◴[21 Apr 25 10:56 UTC] No.43750462[source]▶

Found an error immediately: "Any lowercase character" doesn't match all Swedish lowercase characters.

comrade1234 ◴[21 Apr 25 11:05 UTC] No.43750527[source]▶

lol really? Why not? Is that true for all encodings? Is it a bug or a feature? What about a simple character set like GSM-7 Swedish?

replies(2): >>43750578 #>>43750584 #

lalaithion ◴[21 Apr 25 11:12 UTC] No.43750584[source]▶

>>43750527 #

The author says “any lowercase character” but they mean “any character between the character ‘a’ and the character ‘z’”, which happens to correspond to the lower case letters in English but doesn’t include ü, õ, ø, etc.
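A quick way to check this is a minimal sketch with Python's standard `re` module. The name `ascii_lower` and the list of sample characters are only illustrative choices, not anything from the article.

```python
import re

# [a-z] is a range of code points from 'a' (U+0061) to 'z' (U+007A).
# It is not a "lowercase letter" class.
ascii_lower = re.compile(r"[a-z]")

# Illustrative sample: two ASCII letters plus some lowercase letters with diacritics.
samples = ["a", "z", "å", "ä", "ö", "ü", "õ", "ø"]

for ch in samples:
    matched = ascii_lower.fullmatch(ch) is not None
    print(f"{ch!r}: matched by [a-z] = {matched}")
```

Only the two ASCII letters should report a match. Other regex engines may behave differently under locale-aware modes, so treat this as a statement about Python's `re` with default flags.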

replies(2): >>43750991 #>>43751211 #

comrade1234 ◴[21 Apr 25 12:07 UTC] No.43750991[source]▶

>>43750584 #

I would expect [a-z] to mean any lowercase letter in any language, not just the lowercase letters a through z. So I’d get bitten by that one.
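For anyone who wants the "any lowercase letter in any language" behavior, here is one possible sketch in Python. The standard `re` module has no `\p{Ll}` class, so this checks the Unicode general category instead. The helper name `is_lowercase_letter` is hypothetical.

```python
import unicodedata

def is_lowercase_letter(ch: str) -> bool:
    """Return True if ch is a single character in Unicode category Ll (lowercase letter)."""
    return len(ch) == 1 and unicodedata.category(ch) == "Ll"

# "\u00e5" is å, "\u00f8" is ø; escapes avoid any ambiguity about composed forms.
for ch in ["z", "\u00e5", "\u00f8", "Z", "1"]:
    print(f"{ch!r}: lowercase letter = {is_lowercase_letter(ch)}")
```

The third-party `regex` package does support `\p{Ll}` in patterns, if a regex-based solution is preferred over a per-character check.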

replies(1): >>43751137 #

1. deciduously ◴[21 Apr 25 12:21 UTC] No.43751137[source]▶

>>43750991 #

The letters with diacritics sort after 'z' by code point, so it does stand to reason they wouldn't appear in that range.
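The ordering can be seen by printing code points. This is a small Python sketch, assuming the precomposed forms of the Swedish letters; `in_a_to_z` is a hypothetical helper that mirrors what a `[a-z]` range checks.

```python
def in_a_to_z(ch: str) -> bool:
    """Mirror the [a-z] range test: compare the character's code point to 'a' and 'z'."""
    return ord("a") <= ord(ch) <= ord("z")

# "\u00e5" is å, "\u00e4" is ä, "\u00f6" is ö (precomposed forms).
for ch in ["a", "z", "\u00e5", "\u00e4", "\u00f6"]:
    print(f"{ch!r}: U+{ord(ch):04X}, inside a..z = {in_a_to_z(ch)}")

# Python's default sort also compares code points, so the three letters land after 'z'.
by_code_point = sorted(["\u00f6", "z", "\u00e5", "a", "\u00e4"])
print(by_code_point)
```

Note that code point order puts ä before å, which is not the Swedish alphabetical order (å, ä, ö), so this ordering reflects the encoding rather than any language's collation rules.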


Regex Isn't Hard (2023)