71 points zobweyt | 11 comments
1. frizlab ◴[] No.43553870[source]
Great library!

Does it support non-English title casing?

For instance in French, title casing for “les maisons bleues” is “Les Maisons bleues” while for “des maisons bleues” it’s “Des maisons bleues”.
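To make the two examples concrete, here is a minimal sketch of just that rule: capitalize the first word, and also the word right after it when the title opens with a definite article. The function name `french_title` and the article list are hypothetical and not part of any library. It assumes the word after the article is the noun, and it ignores elided articles (l'), adjectives placed before the noun, and proper nouns.

```python
# Hypothetical helper, not part of any library: covers only the two
# examples above and assumes the word after the article is the noun.
DEFINITE_ARTICLES = {"le", "la", "les"}


def french_title(text: str) -> str:
    words = text.split()
    if not words:
        return text
    # The first word is always capitalized.
    result = [words[0].capitalize()] + words[1:]
    # After a leading definite article, the next word is capitalized too.
    if words[0].lower() in DEFINITE_ARTICLES and len(words) > 1:
        result[1] = words[1].capitalize()
    return " ".join(result)


print(french_title("les maisons bleues"))  # Les Maisons bleues
print(french_title("des maisons bleues"))  # Des maisons bleues
```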

replies(1): >>43553910 #
2. zobweyt ◴[] No.43553910[source]
Thanks!

It does not support non-English title casing. From the documentation:

> It also works non-ascii characters. However, no inferences on the language itself is made. For instance, the digraph ij in Dutch will not be capitalized, because it is represented as two distinct Unicode characters. However, æ would be capitalized
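The behaviour described in that quote can be seen with Python's built-in `str.capitalize()`, which applies the default Unicode case mapping one code point at a time. This sketch uses only the standard library and does not call textcase itself, so treat it as an illustration of the principle rather than of the library's API.

```python
# Built-in str methods only; this does not exercise textcase.
dutch_two_letters = "ijsselmeer".capitalize()    # "i" and "j" are separate code points
dutch_ligature = "\u0133sselmeer".capitalize()   # U+0133 is the single-code-point ij ligature
danish = "æble".capitalize()                     # "æ" is one code point

print(dutch_two_letters)  # Ijsselmeer (Dutch orthography expects IJsselmeer)
print(dutch_ligature)     # Ĳsselmeer
print(danish)             # Æble
```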

replies(3): >>43554503 #>>43554857 #>>43555116 #
3. zvr ◴[] No.43554503[source]
Nice work, but since it does not handle anything other than strings, maybe it should be named "stringcase" or something.
replies(1): >>43554547 #
4. zobweyt ◴[] No.43554547{3}[source]
Thank you for the feedback!

I appreciate your suggestion regarding the name, but unfortunately this name was already taken, so "textcase" was chosen.

I also have ideas for adding dictionary key conversion and other features in the future that will handle more than just strings. In addition, you can use this library to convert cases of Iterable[str] using textcase.pattern.

replies(1): >>43555576 #
5. frizlab ◴[] No.43554857[source]
I was talking about the specific rules that are in place for title capitalization. As you can see in my example, the uppercase letters seem randomly placed for a title, but they are indeed correct. For German too, there are issues where capitalization changes the meaning of the word itself. That kind of thing.

It looks like your library does not support it, which is understandable; it is a huge problem to tackle, but I just wanted to be sure.

replies(1): >>43554915 #
6. zobweyt ◴[] No.43554915{3}[source]
Thank you for the clarification! I understand that title capitalization can be quite complex, especially with specific rules in languages like German where capitalization can change the meaning of a word.

I guess handling these nuances falls under the broader categories of internationalization (i18n) and localization (l10n).

replies(1): >>43568972 #
7. re ◴[] No.43555116[source]
> It does not support non-English title casing

Perhaps document that clearly—it's an important restriction that the library assumes English-language strings. ("no inferences on the language itself is made" isn't quite true since the language is inferred to be English, or to at least follow English-compatible rules for casing)

replies(1): >>43555144 #
8. zobweyt ◴[] No.43555144{3}[source]
Thanks for your feedback! You're right; I should clarify that the library assumes English-language strings for casing. I'll update the documentation to make this limitation clear. I appreciate you pointing it out!
9. zvr ◴[] No.43555576{4}[source]
My issue with using "text" is that I assume that a text like "I THINK I DO" should be converted to "I think I do", not "i think i do".

And that's just in English...
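For the English example, a rough sketch of what "text-aware" lowercasing would need at minimum: lowercase everything, then restore the standalone pronoun "I" and the sentence-initial capital. The function name `sentence_case` is hypothetical, and the regex will also hit false positives such as the "i" in "i.e.", so this is an illustration and not a complete solution.

```python
import re


def sentence_case(text: str) -> str:
    """Hypothetical helper: lowercase text but keep the pronoun "I"."""
    lowered = text.lower()
    # A standalone "i" (including in contractions like "i'm") becomes "I".
    fixed = re.sub(r"\bi\b", "I", lowered)
    # Capitalize the first character of the string.
    return fixed[:1].upper() + fixed[1:]


print(sentence_case("I THINK I DO"))  # I think I do
print("I THINK I DO".lower())         # i think i do
```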

If "text" is in Greek, like "Καλημέρα", the upper form should be "ΚΑΛΗΜΕΡΑ", not a juxtaposition of upper() conversions of each letter.
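The Greek case can be reproduced with the standard library: `str.upper()` uses the language-independent Unicode mapping, so the accent (tonos) survives uppercasing. A minimal sketch of a workaround, assuming the input is Greek and that dropping every acute accent is acceptable; `greek_upper` is a hypothetical name, and real Greek uppercasing rules have exceptions (for example the disjunctive "ή" and dialytika handling) that this ignores.

```python
import unicodedata


def greek_upper(text: str) -> str:
    """Hypothetical helper: uppercase Greek text and drop the tonos."""
    # Decompose so that accented letters become base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text.upper())
    # U+0301 (combining acute accent) is the decomposed form of the tonos.
    stripped = "".join(ch for ch in decomposed if ch != "\u0301")
    return unicodedata.normalize("NFC", stripped)


print("Καλημέρα".upper())       # ΚΑΛΗΜΈΡΑ (accent kept)
print(greek_upper("Καλημέρα"))  # ΚΑΛΗΜΕΡΑ
```

Note that this would also strip acute accents from Latin letters, so it should only be applied to text known to be Greek.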

replies(1): >>43555617 #
10. zobweyt ◴[] No.43555617{5}[source]
Thanks for the clarification!

Yeah, there is such a problem with the naming: "text" suggests something different than just a "string".

I guess handling these nuances falls under the broader categories of internationalization (i18n) and localization (l10n).

11. frizlab ◴[] No.43568972{4}[source]
Just to be excessively clear and maybe borderline annoying, this is not a simple nuance. In German the meaning of a word can actually change depending on its capitalization. Even in English, lowercasing the I is very weird.