
237 points JnBrymn | 1 comments
hbarka ◴[] No.45676016[source]
Chinese writing is logographic. Could this be giving Chinese developers a better intuition for pixels as input rather than text?
replies(3): >>45676915 #>>45678830 #>>45679059 #
1. anabis ◴[] No.45676915[source]
Yeah, mapping Chinese characters into a linear UTF-8 byte space throws away a lot of information. Each language brings its own ideas to text processing. For example, the inventor of SentencePiece is Japanese, and Japanese has no explicit word delimiters.
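A rough Python sketch of both points, in case it helps. The sample characters, the sentence, and the helper name `utf8_hex` are my own picks for illustration, not anything from the thread. It prints the UTF-8 bytes for three characters that visibly share the water component 氵, then shows that whitespace splitting finds no word boundaries in a Japanese sentence.

```python
# Illustrative only: sample characters, sentences, and utf8_hex are assumptions.

# River, lake, sea: all three contain the water component 氵 on the left.
water_chars = ["河", "湖", "海"]


def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of a string as space-separated hex bytes."""
    return ch.encode("utf-8").hex(" ")


for ch in water_chars:
    # Each character becomes three opaque bytes; no byte explicitly
    # marks the shared visual component that a reader sees at a glance.
    print(ch, f"U+{ord(ch):04X}", utf8_hex(ch))

# Japanese is written without spaces, so whitespace splitting yields one chunk.
japanese = "東京都に住んでいます"
english = "I live in Tokyo"

print(japanese.split())  # a single-element list
print(english.split())   # four separate words
```

A model reading pixels would see the shared left-hand component directly, while a byte-level or subword model has to learn that relationship from co-occurrence statistics. The second half is the kind of problem that pushes a tokenizer to work on raw text instead of assuming pre-split words.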