Something weird is happening with LLMs and chess

(dynomight.substack.com)

696 points crescit_eundo | 2 comments | 14 Nov 24 17:05 UTC


niobe ◴[15 Nov 24 00:40 UTC] No.42142885[source]▶

I don't understand why educated people expect that an LLM would be able to play chess at a decent level.

It has no idea about the quality of its data. "Act like x" prompts are no substitute for actual reasoning and deterministic computation, which chess clearly requires.

replies(20): >>42142963 #>>42143021 #>>42143024 #>>42143060 #>>42143136 #>>42143208 #>>42143253 #>>42143349 #>>42143949 #>>42144041 #>>42144146 #>>42144448 #>>42144487 #>>42144490 #>>42144558 #>>42144621 #>>42145171 #>>42145383 #>>42146513 #>>42147230 #

viraptor ◴[15 Nov 24 01:17 UTC] No.42143060[source]▶

>>42142885 #

This is a puzzle given enough training information. An LLM can successfully print out the state of the board after the given moves. It can also produce a not-terrible summary of the position and is able to list dangers at least one move ahead. "Decent" is subjective, but that should beat at least beginners. And the lowest level of Stockfish used in the blog post is lowest intermediate.

I don't know really what level we should be thinking of here, but I don't see any reason to dismiss the idea. Also, it really depends on whether you're thinking of the current public implementations of the tech, or the LLM idea in general. If we wanted to get better results, we could feed it way more chess books and past game analysis.

replies(2): >>42143139 #>>42143871 #

grugagag ◴[15 Nov 24 01:33 UTC] No.42143139[source]▶

>>42143060 #

LLMs like GPT aren’t built to play chess, and here’s why: they’re made for handling language, not playing games with strict rules and strategies. Chess engines, like Stockfish, are designed specifically for analyzing board positions and making the best moves, but LLMs don’t even "see" the board. They’re just guessing moves based on text patterns, without understanding the game itself.

Plus, LLMs have limited memory, so they struggle to remember previous moves in a long game. It’s like trying to play blindfolded! They’re great at explaining chess concepts or moves but not actually competing in a match.

replies(5): >>42143316 #>>42143409 #>>42143940 #>>42144497 #>>42150276 #

1. codebolt ◴[15 Nov 24 06:53 UTC] No.42144497[source]▶

>>42143139 #

> they’re made for handling language, not playing games with strict rules and strategies

Here's the opposite theory: Language encodes objective reasoning (or at least, it does some of the time). A sufficiently large ANN trained on sufficiently large amounts of text will develop internal mechanisms of reasoning that can be applied to domains outside of language.

Based on what we are currently seeing LLMs do, I'm becoming more and more convinced that this is the correct picture.

replies(1): >>42144685 #

2. wruza ◴[15 Nov 24 07:36 UTC] No.42144685[source]▶

>>42144497 (TP) #

I share this idea, but from a different perspective. It doesn’t develop these mechanisms, but casts a high-dimensional-enough shadow of their effect on itself. This vaguely explains why the deeper you are, Gell-Mann-wise, the less sharp that shadow is, because specificity cuts off “reasoning” hyperplanes.

It’s hard to explain emerging mechanisms because of the nature of generation, which is one-pass sequential matrix reduction. I say this while waving my hands, but listen. Reasoning is similar to Turing complete algorithms, and what LLMs can become through training is similar to limited pushdown automata at best. I think this is a good conceptual handle for it.

“Line of thought” is an interesting way to loop the process back, but it doesn’t show that much improvement, afaiu, and is still finite.

Otoh, a chess player takes as much time and “loops” as they need to get the result (ignoring competitive time limits).
