
310 points JnBrymn | 1 comment
hbarka No.45676016
Chinese writing is logographic. Could this be giving Chinese developers a better intuition for pixels as input rather than text?
replies(3): >>45676915 >>45678830 >>45679059
1. hobofan No.45678830
Yeah, that sounds quite interesting. I'm wondering whether the performance (i.e. quality) gap between text-only and vision-based OCR is bigger for Chinese than for English.
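If someone wanted to actually measure that gap, a rough way would be to compare character error rate (CER) of the two pipelines against the same ground truth. Minimal Python sketch below; the reference and the two hypothetical transcripts are made-up placeholders, not real model outputs.

    # Quantifying a text-only vs. vision OCR gap via character error rate.
    # All strings below are illustrative placeholders.

    def levenshtein(a: str, b: str) -> int:
        """Edit distance between two strings (insert/delete/substitute)."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                  # deletion
                               cur[j - 1] + 1,               # insertion
                               prev[j - 1] + (ca != cb)))    # substitution
            prev = cur
        return prev[-1]

    def cer(reference: str, hypothesis: str) -> float:
        """Character error rate: edit distance normalized by reference length."""
        return levenshtein(reference, hypothesis) / max(len(reference), 1)

    reference = "森林里有很多树"   # "there are many trees in the forest"
    text_only = "森材里有很多对"   # hypothetical text-pipeline transcript
    vision    = "森林里有很多树"   # hypothetical vision-pipeline transcript

    print(f"text-only CER: {cer(reference, text_only):.2f}")  # 0.29
    print(f"vision CER:    {cer(reference, vision):.2f}")     # 0.00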

There is indeed a lot of semantic information contained in the characters themselves that should help an LLM. E.g. there is a clear visual connection between 木 (wood/tree) and 林 (forest), while an LLM that has to connect the English words "tree" and "forest" would have a much harder time seeing that relationship, regardless of whether it gets them as text or vision tokens.
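To make that concrete, here's a rough Python sketch of the idea: the shared 木 component between the characters is explicit structure a model could exploit, while the English words only overlap in letters that carry no shared meaning. The tiny decomposition table is hand-written for illustration; a real system would pull this from an IDS (Ideographic Description Sequence) database.

    # Hand-written decomposition table, for illustration only.
    DECOMPOSITION = {
        "木": ["木"],              # tree: an atomic character
        "林": ["木", "木"],        # forest/grove: two trees side by side
        "森": ["木", "木", "木"],  # (dense) forest: three trees
    }

    def shared_components(a: str, b: str) -> set[str]:
        """Components that two characters have in common."""
        return set(DECOMPOSITION.get(a, [a])) & set(DECOMPOSITION.get(b, [b]))

    # The structural overlap is explicit for the characters:
    print(shared_components("木", "林"))   # {'木'} -- the shared piece *is* "tree"

    # The English words overlap only in coincidental letters:
    print(set("tree") & set("forest"))     # {'t', 'r', 'e'} -- no shared meaning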