Something weird is happening with LLMs and chess

(dynomight.substack.com)

696 points crescit_eundo | 3 comments | 14 Nov 24 17:05 UTC

niobe ◴[15 Nov 24 00:40 UTC] No.42142885[source]▶

I don't understand why educated people expect that an LLM would be able to play chess at a decent level.

It has no idea about the quality of its data. "Act like x" prompts are no substitute for actual reasoning and deterministic computation, which chess clearly requires.

replies(20): >>42142963 #>>42143021 #>>42143024 #>>42143060 #>>42143136 #>>42143208 #>>42143253 #>>42143349 #>>42143949 #>>42144041 #>>42144146 #>>42144448 #>>42144487 #>>42144490 #>>42144558 #>>42144621 #>>42145171 #>>42145383 #>>42146513 #>>42147230 #

1. chipdart ◴[15 Nov 24 07:22 UTC] No.42144621[source]▶

>>42142885 #

> I don't understand why educated people expect that an LLM would be able to play chess at a decent level.

The blog post demonstrates that an LLM plays chess at a decent level.

The blog post explains why. It addresses the issue of data quality.

I don't understand what point you thought you were making. Regardless of where you stand, the blog post showcases a surprising result.

You stress your prior unfounded belief, you were presented with data that proves it wrong, and your reaction was to post a comment with a thinly veiled accusation of people not being educated when clearly you are the one that's off.

To make matters worse, this topic is also about curiosity, which has a strong link with intelligence and education. And you are here criticizing others on those grounds in spite of showing your deficit right in the first sentence.

This blog post was a great read. Very surprising, engaging, and thought-provoking.

replies(1): >>42156892 #

2. wibwobble12333 ◴[16 Nov 24 15:29 UTC] No.42156892[source]▶

>>42144621 (TP) #

The only service performing well is a closed source one that could simply use a real chess engine for questions that look like chess, for marketing purposes. There’s nothing thought provoking about a bunch of engineers doing “experiments” against a service, other than how sad it is to debase themselves in this way.

replies(1): >>42157193 #

3. chipdart ◴[16 Nov 24 16:27 UTC] No.42157193[source]▶

>>42156892 #

> The only service performing well is a closed source one that could simply use a real chess engine for questions that look like chess, for marketing purposes.

That conspiracy theory holds no traction in reality. This blog post is so far the only reference to using LLMs to play chess. The "closed-source" model (whatever that is) is an older version that does worse than the newer version. If your conspiracy theory had any bearing on reality, how come this fictional "real chess engine" was only used in a single release? Unbelievable.

Back in reality, it is well known that newer models that are made available to the public are adapted to business needs by constraining their capabilities and limiting liability.
