
166 points | lawrenceyan | 4 comments
1. hlfshell | No.41873501
I did a talk about this! (And also wrote it up here[1].) The paper is a great example of knowledge distillation. It's less a paper about chess and more about how a complicated nonlinear search function - complete with whatever hand tuning experts can prepare - can be distilled into a (quasi-linear, given a standardized input like a chess position) transformer model.

[1]: https://hlfshell.ai/posts/deepmind-grandmaster-chess-without...
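
For anyone curious what the data side of that distillation looks like, here's a minimal sketch (my own, not the paper's actual pipeline, and the transformer training itself is omitted), assuming python-chess is installed and a stockfish binary is on PATH:

    # Sketch: build (position, eval) pairs by querying a time-limited Stockfish.
    # These pairs are the supervised signal a network would be trained to imitate.
    import random
    import chess
    import chess.engine

    def random_position(max_plies=40):
        """Play random legal moves from the start to get a varied position."""
        board = chess.Board()
        for _ in range(random.randrange(max_plies)):
            if board.is_game_over():
                break
            board.push(random.choice(list(board.legal_moves)))
        return board

    with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
        dataset = []
        for _ in range(100):  # toy run; the real dataset is vastly larger
            board = random_position()
            info = engine.analyse(board, chess.engine.Limit(time=0.05))  # ~50 ms eval
            cp = info["score"].pov(board.turn).score(mate_score=10_000)
            dataset.append((board.fen(), cp))  # (input, target) pair for distillation

    # A transformer over a tokenized board/FEN would then be trained to regress
    # (or classify bins of) these scores, replacing the search at inference time.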

replies(1): >>41874003 #
2. janalsncm | No.41874003
I think the vs. humans result should be taken with a huge grain of salt. These are blitz games, and their engine's Elo was far higher against humans than against other bots. So time was likely a factor: humans tend to flag (run out of time) or blunder in low-time situations.

It’s still very cool that they could learn a very good eval function that doesn’t require search. I would’ve liked the authors to throw out the games where the Stockfish fallback kicked in though. Even for a human, mate in 2 vs mate in 10 is the difference between a win and a draw/loss on time.

I also would’ve liked to see a head to head with limited search depth Stockfish. That would tell us approximately how much of the search tree their eval function distilled.

replies(1): >>41874038 #
3. hlfshell | No.41874038
The blitz time control makes sense because what's being distilled is Stockfish's eval run for 50ms per position. The engine would likely do relatively worse at longer time controls, since only the human would benefit from the additional time.

As for the limited-depth comparison, I like the idea! I think it's tough to measure, though, since the time it takes to search to a given depth varies wildly with the complexity of the position. You'd probably have to compile a dataset of specific positions known to require significant search depth to find a "good" move.
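
Something along these lines, maybe (a rough sketch with made-up thresholds, assuming python-chess and a stockfish binary on PATH): tag each position by the shallowest depth at which Stockfish already finds the move a deeper search settles on, and keep the positions where that depth is high.

    import chess
    import chess.engine

    def depth_needed(engine, board, reference_depth=20, max_probe_depth=12):
        """Shallowest depth whose best move matches a deep reference search."""
        ref = engine.analyse(board, chess.engine.Limit(depth=reference_depth))
        best = ref["pv"][0]
        for d in range(1, max_probe_depth + 1):
            probe = engine.analyse(board, chess.engine.Limit(depth=d))
            if probe["pv"][0] == best:
                return d  # shallow search already agrees with the deep one
        return max_probe_depth + 1  # "needs deep search" bucket

    with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
        board = chess.Board()  # in practice, loop over a large pool of positions
        print(depth_needed(engine, board))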

replies(1): >>41874149 #
4. janalsncm | No.41874149
My point is that if the computer never flags, it has an inherent advantage at low time controls. If time weren't a factor, why not just test it in hyperbullet games? Games where humans flag in a drawn or winning position need to be excluded, otherwise it's unclear what the match is even measuring.

And limited-depth games would not have been difficult to run. You can run a depth-limited Stockfish on a laptop using the UCI protocol: https://github.com/official-stockfish/Stockfish/wiki/UCI-%26...
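
For instance, with python-chess (which talks to the engine over UCI), capping the search depth is one line. This self-play loop is just a sketch, assuming a stockfish binary on PATH:

    import chess
    import chess.engine

    with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
        board = chess.Board()
        while not board.is_game_over():
            # Limit(depth=4) caps the search depth; swap in Limit(nodes=...)
            # or Limit(time=...) for other handicaps.
            result = engine.play(board, chess.engine.Limit(depth=4))
            board.push(result.move)
        print(board.result())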