
131 points xlinux | 9 comments
1. deadbabe No.42187475
Why build a chess engine these days? Just use an LLM.
replies(4): >>42187631 #>>42187635 #>>42187948 #>>42188822 #
2. emptybits No.42187631
Why consume when you can create?

Building is often more fun and almost always more educational than just using. Even in “failure”. :-)

3. ziddoap No.42187635
Fun, a learning experience, practicing a skill set, curiosity, etc.

Take your pick.

4. achierius No.42187948
LLMs are actually quite bad at chess, even compared to human players -- and certainly compared to proper modern chess engines.
replies(1): >>42188073 #
5. PaulHoule No.42188073
They usually struggle to always generate valid moves.

And that's pivotal.

If you have a program which always makes valid moves and gives up when it has lost, you've written a proper chess-playing program. It may play badly, but it plays.
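
A minimal sketch of that idea, assuming the python-chess library: it plays random but always-legal moves and concedes on checkmate, so it plays badly, but it plays.

    import random
    import chess  # pip install python-chess

    board = chess.Board()
    while True:
        if board.is_checkmate():
            print("I lose. Good game.")  # gives up when it has lost
            break
        if board.is_game_over():  # stalemate, repetition, etc.
            print("Draw:", board.result())
            break
        move = random.choice(list(board.legal_moves))  # always a valid move
        print(board.san(move))  # SAN must be computed before pushing
        board.push(move)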

replies(1): >>42189228 #
6. janalsncm No.42188822
Your question sounds flippant, but it’s actually quite deep. Why aren’t LLMs good at chess? The answer is likely that being good at chess beyond some level requires search (and probably some ordering constraint on your evaluation function too, which I’m not smart enough to figure out).

LLMs aren’t searching. They are memorizing. This explains their extremely poor performance on out-of-domain positions, whereas Stockfish handles them easily.
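
For a concrete sense of what “search” means here, a toy negamax sketch over python-chess with a crude material evaluation. Real engines add alpha-beta pruning, move ordering (the ordering constraint mentioned above), and far better evaluation functions, but this look-ahead is the ingredient next-token prediction lacks.

    import chess

    PIECE_VALUE = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                   chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0}

    def evaluate(board):
        # Crude material count from the side to move's perspective.
        score = 0
        for piece_type, value in PIECE_VALUE.items():
            score += value * len(board.pieces(piece_type, board.turn))
            score -= value * len(board.pieces(piece_type, not board.turn))
        return score

    def negamax(board, depth):
        if board.is_checkmate():
            return -float("inf")  # the side to move has lost
        if depth == 0 or board.is_game_over():
            return evaluate(board)
        best = -float("inf")
        for move in board.legal_moves:
            board.push(move)
            best = max(best, -negamax(board, depth - 1))  # opponent's score, negated
            board.pop()
        return best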

replies(1): >>42192469 #
7. ElFitz No.42189228
That could now be achieved by precomputing all valid moves and using outlines[0] or Structured Outputs[1] to constrain the output. A sketch follows the links below.

[0]: https://github.com/dottxt-ai/outlines

[1]: https://openai.com/index/introducing-structured-outputs-in-t...
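
A rough sketch of that approach, assuming python-chess for move generation and outlines’ generate.choice API (the library’s interface has changed across versions, and the model name is just a placeholder):

    import chess
    import outlines  # [0]

    board = chess.Board()
    # Precompute all valid moves in SAN, as suggested.
    legal_san = [board.san(m) for m in board.legal_moves]

    # Placeholder model; anything outlines supports would do.
    model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")

    # Decoding is constrained to exactly one of the legal moves.
    pick = outlines.generate.choice(model, legal_san)
    san = pick(f"FEN: {board.fen()}\nBest move in SAN:")
    board.push_san(san)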

replies(1): >>42190437 #
8. PaulHoule No.42190437
… or just a valid-move checker that prompts it to try again if it fails to make a valid move.
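
A minimal sketch of that loop, with a hypothetical ask_llm() standing in for a real API call:

    import chess

    def ask_llm(prompt: str) -> str:
        ...  # stand-in for an actual LLM API call

    def get_valid_move(board: chess.Board, max_tries: int = 5) -> chess.Move:
        prompt = f"Position (FEN): {board.fen()}\nReply with one legal move in SAN:"
        for _ in range(max_tries):
            reply = ask_llm(prompt).strip()
            try:
                return board.parse_san(reply)  # raises if illegal or garbled
            except ValueError:
                prompt += f"\n{reply!r} is not legal here. Try again:"
        raise RuntimeError("no legal move after several tries")
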
9. deadbabe No.42192469
I think chess is a great example for demonstrating that LLMs don’t actually know anything, and will be entirely unsuitable to replace humans on anything except thoroughly solved and well-documented problems.