
695 points crescit_eundo | 1 comment
niobe No.42142885
I don't understand why educated people expect that an LLM would be able to play chess at a decent level.

It has no idea about the quality of its data. "Act like x" prompts are no substitute for actual reasoning and deterministic computation, which chess clearly requires.

replies(20): >>42142963 #>>42143021 #>>42143024 #>>42143060 #>>42143136 #>>42143208 #>>42143253 #>>42143349 #>>42143949 #>>42144041 #>>42144146 #>>42144448 #>>42144487 #>>42144490 #>>42144558 #>>42144621 #>>42145171 #>>42145383 #>>42146513 #>>42147230 #
computerex No.42142963
The question here is why gpt-3.5-turbo-instruct can then beat Stockfish.
replies(4): >>42142975 #>>42143081 #>>42143181 #>>42143889 #
lukan No.42143181
Cheating (using an internal chess engine) would be the obvious reason to me.
replies(2): >>42143214 #>>42165535 #
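
For concreteness, "using an internal chess engine" would look roughly like the sketch below: route the position to a real engine and return its move. This is purely illustrative of the hypothesis, not evidence that any deployed system does this; it assumes python-chess is installed and a stockfish binary is on PATH.

    # Sketch of the "hidden engine" hypothesis: a responder wired to a real
    # engine would produce moves like this. Assumes python-chess and a local
    # `stockfish` binary (both assumptions for illustration only).
    import chess
    import chess.engine

    board = chess.Board()  # starting position
    with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
        result = engine.play(board, chess.engine.Limit(time=0.1))
        print(board.san(result.move))  # always a legal move by construction

Note the tell: an engine-backed pipeline can only ever emit legal moves, which is exactly what the reply below pushes on.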
nske No.42165535
But in that case there shouldn't be any invalid moves, ever. Another tester found gpt-3.5-turbo-instruct suggesting at least one illegal move in 16% of its games (source: https://blog.mathieuacher.com/GPTsChessEloRatingLegalMoves/)
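
For reference, the legality measurement in the linked post comes down to parsing each model reply against the current board state, roughly as in this sketch (python-chess assumed; model_move is a placeholder for whatever text the LLM actually returned):

    # Minimal legality check for an LLM-suggested move. python-chess raises
    # ValueError (IllegalMoveError and friends) when SAN doesn't resolve to
    # a legal move in the current position.
    import chess

    board = chess.Board()
    board.push_san("e4")
    board.push_san("e5")

    model_move = "O-O"  # hypothetical LLM output; castling is illegal here
    try:
        board.push_san(model_move)
        print("legal:", model_move)
    except ValueError:
        print("illegal move suggested:", model_move)

Run over a batch of games, counting games with at least one such exception gives the 16% figure cited above.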