Instead they're barely able to eek out wins against a bot that plays completely random moves: https://maxim-saplin.github.io/llm_chess/
https://dynomight.substack.com/p/chess
Discussion here: https://news.ycombinator.com/item?id=42138289
Open AI, Anthropic and the like simply don't care much about their LLMs playing chess. That or post training is messing things up.
I mean, surely there's a reason you decided to mention 3.5 turbo instruct and not.. 3.5 turbo? Or any other model? Even the ones that came after? It's clearly a big outlier, at least when you consider "LLMs" to be a wide selection of recent models.
If you're saying that LLMs/transformer models are capable of being trained to play chess by training on chess data, I agree with you.
I think AstroBen was pointing out that LLMs, despite having the ability to solve some very impressive mathematics and programming tasks, don't seem to generalize their reasoning abilities to a domain like chess. That's surprising, isn't it?