
Something weird is happening with LLMs and chess

(dynomight.substack.com)

696 points crescit_eundo | 1 comments | 14 Nov 24 17:05 UTC


azeirah ◴[14 Nov 24 22:43 UTC] No.42141993[source]▶

>>42138289 (OP) #

Maybe I'm really stupid... but perhaps if we want really intelligent models we need to stop tokenizing at all? We're literally limiting what a model can see and how it perceives the world by limiting the structure of the information streams that come into the model from the very beginning.

I know working with raw bits or bytes is slower, but it should be relatively cheap and easy to at least falsify this hypothesis that many huge issues might be due to tokenization problems but... yeah.

Surprised I don't see more research into radically different tokenization.
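
For readers unsure what "working with raw bytes" would look like in practice, here is a minimal sketch. It assumes the model input is simply the UTF-8 byte values of the text (a 256-symbol vocabulary). The `toy_tokens` function is a hypothetical stand-in for a subword tokenizer, not any real model's tokenizer, and the move string is an invented example.

```python
import re


def byte_ids(text):
    """Encode text as raw UTF-8 byte values; every id is in range(256)."""
    return list(text.encode("utf-8"))


def toy_tokens(text):
    """Hypothetical stand-in for a tokenizer: letter runs, digit runs,
    and single non-space characters. Real tokenizers (e.g. BPE) differ."""
    return re.findall(r"[A-Za-z]+|\d+|\S", text)


# Invented example input: the opening moves of a chess game.
moves = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6"

ids = byte_ids(moves)
toks = toy_tokens(moves)

# The byte view is lossless and needs no learned vocabulary,
# but it produces a longer sequence than the chunked view.
print(len(ids), "byte ids;", len(toks), "toy tokens")
```

The byte sequence round-trips exactly back to the original string, which is the appeal: nothing about the input is hidden from the model. The cost is the longer sequence.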

replies(14): >>42142033 #>>42142384 #>>42143197 #>>42143338 #>>42143381 #>>42144059 #>>42144207 #>>42144582 #>>42144600 #>>42145725 #>>42146419 #>>42146444 #>>42149355 #>>42151016 #

1. layer8 ◴[15 Nov 24 02:09 UTC] No.42143338[source]▶

Going from tokens to bytes explodes the model size. I can’t find the reference at the moment, but reducing the average token size induces a corresponding quadratic increase in the width (size of each layer) of the model. This doesn’t just affect inference speed, but also training speed.
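
The width claim above comes without a reference and is not reproduced here. A related effect that is easy to check is sequence length: standard self-attention compares every position with every other, so the number of score entries grows with the square of the sequence length. The sketch below assumes roughly 4 bytes per token and a 1,000-token input; both numbers are illustrative assumptions, not measurements.

```python
def attention_pairs(seq_len):
    """Entries in a full self-attention score matrix (per head, per layer)."""
    return seq_len * seq_len


# Illustrative assumptions, not measured values.
BYTES_PER_TOKEN = 4
n_tokens = 1_000
n_bytes = n_tokens * BYTES_PER_TOKEN

token_cost = attention_pairs(n_tokens)
byte_cost = attention_pairs(n_bytes)

# Lengthening the sequence by a factor k multiplies the pair count by k**2.
ratio = byte_cost // token_cost
print(f"{n_tokens} tokens: {token_cost} pairs")
print(f"{n_bytes} bytes: {byte_cost} pairs ({ratio}x)")
```

Under these assumptions the same text costs 16 times as many attention score entries at the byte level, in both training and inference.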