
sksxihve ◴[] No.41881015[source]
Isn't generating the training data by running Stockfish on all the board positions for all the games just encoding the search tree into the transformer model?

So increasing the number of parameters of the model would allow it to encode more of the search tree and give better performance, which doesn't seem all that interesting.

replies(1): >>41881152 #
1. mewpmewp2 ◴[] No.41881152[source]
How could it be possible to encode a search tree like this, though?
replies(1): >>41882578 #
2. timmg ◴[] No.41882578[source]
Imagine you collected a billion unique, feasible board positions (enumerating all positions is intractable, but most legal positions never arise in practice anyway) and the best next move for each. That "best next move" is the result of a tree search.

Now use a transformer to "compress" that information into its weights. It sounds like that is approximately what is going on here. The model is likely to generalize some aspects of the data (just as LLMs do), but for the most part it encodes the information from the Stockfish evaluations.

(This is just my guess of what we are seeing.)
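To make that concrete, here is a minimal sketch of how such (position, best-move) training pairs might be generated, assuming python-chess and a local Stockfish binary; the engine path, depth limit, and function name are hypothetical, not taken from the paper:

    # Label each position with Stockfish's chosen move, so a model can later be
    # trained to predict that move from the position alone.
    # Assumes python-chess is installed; ENGINE_PATH and the depth limit are
    # illustrative guesses.
    import chess
    import chess.engine

    ENGINE_PATH = "/usr/local/bin/stockfish"  # hypothetical path

    def label_positions(fens, depth=20):
        """Return (FEN, best-move-UCI) pairs using Stockfish as the oracle."""
        engine = chess.engine.SimpleEngine.popen_uci(ENGINE_PATH)
        try:
            pairs = []
            for fen in fens:
                board = chess.Board(fen)
                result = engine.play(board, chess.engine.Limit(depth=depth))
                pairs.append((fen, result.move.uci()))
            return pairs
        finally:
            engine.quit()

    # A transformer is then trained on these pairs, mapping a tokenized FEN to
    # the move label -- effectively compressing the output of Stockfish's tree
    # search into the network's weights.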

replies(1): >>41883000 #
3. sksxihve ◴[] No.41883000[source]
Exactly. The title says "without search", but the paper itself says "without explicit search". Having a system learn to play chess at grandmaster level without any search at all, either during play or during training-data generation, would be far more impressive. What this paper does seems fairly obviously like it would work.