Something weird is happening with LLMs and chess

(dynomight.substack.com)

696 points crescit_eundo | 1 comments | 14 Nov 24 17:05 UTC

niobe ◴[15 Nov 24 00:40 UTC] No.42142885[source]▶

I don't understand why educated people expect that an LLM would be able to play chess at a decent level.

It has no idea about the quality of its data. "Act like x" prompts are no substitute for actual reasoning and deterministic computation, which chess clearly requires.

replies(20): >>42142963 #>>42143021 #>>42143024 #>>42143060 #>>42143136 #>>42143208 #>>42143253 #>>42143349 #>>42143949 #>>42144041 #>>42144146 #>>42144448 #>>42144487 #>>42144490 #>>42144558 #>>42144621 #>>42145171 #>>42145383 #>>42146513 #>>42147230 #

viraptor ◴[15 Nov 24 01:17 UTC] No.42143060[source]▶

>>42142885 #

Given enough training information, this is a puzzle. An LLM can successfully print out the state of the board after the given moves. It can also produce a not-terrible summary of the position and is able to list dangers at least one move ahead. "Decent" is subjective, but that should beat at least beginners. And the lowest level of Stockfish used in the blog post is lowest intermediate.

I don't really know what level we should be thinking of here, but I don't see any reason to dismiss the idea. Also, it really depends on whether you're thinking of the current public implementations of the tech, or the LLM idea in general. If we wanted to get better results, we could feed it way more chess books and past game analysis.

replies(2): >>42143139 #>>42143871 #

1. shric ◴[15 Nov 24 04:05 UTC] No.42143871[source]▶

>>42143060 #

Stockfish level 1 is well below "lowest intermediate".

A friend of mine just started playing chess a few weeks ago and can beat it about 25% of the time.

It will hang pieces, and if you hang your own queen, there's about a 50% chance it won't take it.
