
688 points crescit_eundo | 2 comments
azeirah ◴[] No.42141993[source]
Maybe I'm really stupid... but perhaps if we want really intelligent models we need to stop tokenizing altogether? We're literally limiting what a model can see and how it perceives the world by constraining the structure of the information streams that come into the model from the very beginning.

I know working with raw bits or bytes is slower, but it should be relatively cheap and easy to at least try to falsify the hypothesis that many of these huge issues are due to tokenization problems... but yeah.

Surprised I don't see more research into radically different tokenization.
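
For anyone who wants to see concretely what a tokenizer hides, here's a minimal sketch (assuming the tiktoken library is installed; "cl100k_base" is the GPT-4-era encoding) comparing a token-level view of a string with its raw byte-level view:

    import tiktoken

    # Token view: the model receives opaque integer IDs, each standing for a
    # multi-byte chunk whose internal characters it never sees directly.
    enc = tiktoken.get_encoding("cl100k_base")
    text = "strawberry"
    tokens = enc.encode(text)
    print(tokens)  # a few integer IDs
    # The byte chunks those IDs stand for (exact split depends on the vocabulary):
    print([enc.decode_single_token_bytes(t) for t in tokens])

    # Byte view: every character is individually visible, at the cost of
    # much longer sequences.
    print(list(text.encode("utf-8")))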

replies(14): >>42142033 #>>42142384 #>>42143197 #>>42143338 #>>42143381 #>>42144059 #>>42144207 #>>42144582 #>>42144600 #>>42145725 #>>42146419 #>>42146444 #>>42149355 #>>42151016 #
aithrowawaycomm ◴[] No.42142384[source]
FWIW I think most of the "tokenization problems" are in fact reasoning problems being falsely blamed on a minor technical thing when the issue is much more profound.

E.g. I still see people claiming that LLMs are bad at basic counting because of tokenization, but the same LLM counts perfectly well if you use chain-of-thought prompting. So it can't be explained by tokenization! The problem is reasoning: the LLM needs a human to tell it that a counting problem can be accurately solved if it goes step by step. Without this assistance the LLM is likely to simply guess.
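
The claim is easy to test. A minimal sketch of such a falsification harness, with `ask` as a hypothetical stand-in for whatever model API you use:

    import random

    def ask(prompt: str) -> str:
        # Hypothetical stand-in: plug in your actual LLM call here.
        raise NotImplementedError

    # Generate a counting task with known ground truth.
    letters = "".join(random.choice("ab") for _ in range(30))
    truth = letters.count("a")

    direct = f"How many 'a's are in: {letters}? Answer with just a number."
    stepwise = f"Count the 'a's in: {letters}. Go letter by letter, then give the total."

    # If tokenization alone were the problem, both prompts should fail equally;
    # if reasoning is the problem, the step-by-step prompt should do better.
    for prompt in (direct, stepwise):
        print(prompt, "->", ask(prompt), "(truth:", truth, ")")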

replies(6): >>42142733 #>>42142807 #>>42143239 #>>42143800 #>>42144596 #>>42146428 #
meroes ◴[] No.42143800[source]
At a certain level they are identical problems. My strongest piece of evidence is that I get paid as an RLHF'er to find ANY case of error, including "tokenization" errors. Do you know how many errors LLMs make on even the simplest grid puzzles, with CoT, with specialized models that don't try to "one-shot" problems, with multiple models, etc.?

My assumption is that these large companies wouldn't pay hundreds of thousands of RLHF'ers, through dozens of third-party companies, livable wages if the problems were merely tokenization errors.
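
For a sense of what "simplest grid puzzles" means here, a minimal sketch of the kind of check involved (the prompt wording and grid size are my own assumptions):

    import random

    # Place a single X in a small grid and ask for its coordinates;
    # the answer is known, so any model reply can be scored automatically.
    size = 3
    r, c = random.randrange(size), random.randrange(size)
    rows = [["." for _ in range(size)] for _ in range(size)]
    rows[r][c] = "X"
    grid = "\n".join(" ".join(row) for row in rows)

    prompt = f"In this grid, where is the X? Answer as (row, column), 0-indexed.\n{grid}"
    print(prompt)
    print("expected:", (r, c))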

replies(1): >>42149054 #
1propionyl ◴[] No.42149054[source]
> hundreds of thousands of RLHF'ers through dozens of third party companies

Out of curiosity, what are these companies? And where do they operate?

I'm always interested in these sorts of "hidden" industries. See also: outsourced Facebook content moderation in Kenya.

replies(1): >>42159108 #
meroes ◴[] No.42159108[source]
Scale AI is a big one, and it owns companies that do this as well, such as Outlierai.

There are many other AI-trainer job companies, though. A lot of it is gig work, but the pay is higher than in the vast majority of gig jobs.