
237 points by JnBrymn | 1 comment
nl No.45677385
Karpathy's points are correct (of course).

One thing I like about text tokens, though, is that the model learns some understanding of the text input method (particularly the QWERTY keyboard).

"Hello" and "Hwllo" are closer in semantic space than you'd think because "w" and "e" are next to each other.

This is much easier to see in hand-coded spelling models, where you can get better results by combining a "keyboard distance" metric with a string distance metric.

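As an illustration (my own sketch, not from the comment): a weighted edit distance where substituting physically adjacent QWERTY keys costs less than substituting distant ones.

    # Minimal sketch: edit distance with keyboard-aware substitution costs.
    QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
    KEY_POS = {ch: (r, c) for r, row in enumerate(QWERTY_ROWS)
               for c, ch in enumerate(row)}

    def sub_cost(a: str, b: str) -> float:
        """Substitution cost scaled by physical key distance, capped at 1.0."""
        if a == b:
            return 0.0
        pa, pb = KEY_POS.get(a), KEY_POS.get(b)
        if pa is None or pb is None:
            return 1.0
        dist = ((pa[0] - pb[0]) ** 2 + (pa[1] - pb[1]) ** 2) ** 0.5
        return min(1.0, dist / 3.0)  # adjacent keys cost ~0.33

    def keyboard_edit_distance(s: str, t: str) -> float:
        """Standard Levenshtein DP, but substitutions use keyboard distance."""
        m, n = len(s), len(t)
        d = [[0.0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            d[i][0] = float(i)
        for j in range(1, n + 1):
            d[0][j] = float(j)
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                d[i][j] = min(
                    d[i - 1][j] + 1.0,  # deletion
                    d[i][j - 1] + 1.0,  # insertion
                    d[i - 1][j - 1] + sub_cost(s[i - 1], t[j - 1]),
                )
        return d[m][n]

    print(keyboard_edit_distance("hello", "hwllo"))  # ~0.33 (w is next to e)
    print(keyboard_edit_distance("hello", "hxllo"))  # ~0.75 (x is far from e)
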
1. swyx No.45677581
I'm particularly sympathetic to typo learning, which I think gets lost in the synthetic-data discussion (my take here: https://www.youtube.com/watch?v=yXPPcBlcF8U).

But I think in this case you could still generate typos in the rendered images and they'd be learnable, so it isn't a hard problem for the OP's approach.
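
For concreteness, a minimal sketch (my assumption, not swyx's actual pipeline) of generating keyboard-plausible typos that could then be rendered into synthetic training images:

    # Sketch: inject a keyboard-plausible typo into a word before rendering.
    import random

    QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
    ADJACENT = {ch: [row[c + d] for d in (-1, 1) if 0 <= c + d < len(row)]
                for row in QWERTY_ROWS for c, ch in enumerate(row)}

    def inject_typo(word: str, rng: random.Random) -> str:
        """Replace one letter with a neighboring QWERTY key, e.g. hello -> hwllo."""
        idxs = [i for i, ch in enumerate(word) if ADJACENT.get(ch)]
        if not idxs:
            return word
        i = rng.choice(idxs)
        return word[:i] + rng.choice(ADJACENT[word[i]]) + word[i + 1:]

    rng = random.Random(0)
    print(inject_typo("hello", rng))  # e.g. "hwllo"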