(twitter.com)

237 points JnBrymn | 1 comments | 21 Oct 25 17:43 UTC

https://xcancel.com/karpathy/status/1980397031542989305

hbarka ◴[22 Oct 25 22:32 UTC] No.45676016[source]▶

Chinese writing is logographic. Could this be giving Chinese developers a better intuition for pixels as input rather than text?

replies(3): >>45676915 #>>45678830 #>>45679059 #

1. anabis ◴[23 Oct 25 00:40 UTC] No.45676915[source]▶

>>45676016 #

Yeah, mapping Chinese characters to a linear UTF-8 space throws a lot of information away. Each language brings its own ideas to text processing. For example, the inventor of SentencePiece is Japanese, and Japanese doesn't have explicit word delimiters.
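
A rough sketch of both points, in case it helps. The sample strings and the helper names `utf8_bytes` and `whitespace_words` are made up for illustration and are not from the thread or from any tokenizer library. The idea is only that a byte-level view of a character is a short run of numbers with no visible trace of the glyph's shape, and that splitting on whitespace finds no word boundaries in Japanese text.

```python
# Illustrative only: sample strings and helper names are assumptions.

def utf8_bytes(text):
    """Return the raw UTF-8 byte values a byte-level model would see."""
    return list(text.encode("utf-8"))


def whitespace_words(text):
    """Naive segmentation that relies on explicit delimiters (whitespace)."""
    return text.split()


if __name__ == "__main__":
    # A single Chinese character becomes a few bytes; nothing about its
    # visual structure is explicit in those numbers.
    print(utf8_bytes("河"))

    # English has spaces between words, so naive splitting works.
    print(whitespace_words("I am a student"))

    # Japanese has no explicit delimiters, so the whole sentence comes
    # back as one chunk.
    print(whitespace_words("私は学生です"))
```

This says nothing about how SentencePiece itself works internally; it only shows why a whitespace-based pre-tokenization step is not an option for such text.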

Karpathy on DeepSeek-OCR paper: Are pixels better inputs to LLMs than text?