It has no idea about the quality of its data. "Act like X" prompts are no substitute for actual reasoning and deterministic computation, which chess clearly requires.
If some mental model says that LLMs should be bad at chess, then it fails to explain why we have LLMs playing strong chess. If another mental model says the inverse, then it fails to explain why so many of these large models fail spectacularly at chess.
Clearly, there's more going on here.
Within that framing, my mental model is that LLMs would be good at modern-style, long-form chess, but would likely be easy to trip up with certain move combinations that most humans would not normally play. My prediction is that, once such patterns were found, the models would be comically susceptible to them.
Clearly, we have no real basis for saying it is "good" or "bad" at chess, and even using chess performance as a measure of ability is a highly biased choice, likely born of marketing rather than principle.