LLMs don’t do reasoning or exploration; they generate text based on previous text. So to us it may look like playing, but it’s really smart guesswork based on previous games. It’s like Kasparov writing moves without imagining the actual board positions.
What would be interesting is to see whether a model, given only the rules, will play. I bet it won’t.
At this moment it’s replaying from memory but definitely not chasing goals. There’s no such thing as forward attention yet, and beam search is expensive enough that one would prefer to fall back to classic chess algorithms anyway.
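To make the cost point concrete: beam search keeps the top-k continuations at every step, so the work per step scales with beam width times branching factor, on top of one full model evaluation per candidate. A minimal sketch, where `toy_lm` is a made-up stand-in for a next-move distribution (not a real model):

```python
import math

def toy_lm(seq):
    # Hypothetical next-token distribution, invented for illustration.
    if not seq:
        return {"a": 0.6, "b": 0.4}
    if seq == ["a"]:
        return {"x": 0.5, "y": 0.5}
    return {"z": 1.0}

def beam_search(step_fn, beam_width, steps):
    # Each beam is (sequence, cumulative log-probability).
    beams = [([], 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, logp in beams:
            # One model call per live beam, per step -- this is the expensive part.
            for token, prob in step_fn(seq).items():
                candidates.append((seq + [token], logp + math.log(prob)))
        # Keep only the top-k continuations by total log-probability.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]
```

With width 1 this degenerates to greedy decoding and picks the locally best first token ("a", p=0.6) into a weak continuation; width 2 recovers the globally better sequence ("b" then "z", total p=0.4 vs. 0.3). Widening the beam fixes that, but every extra beam multiplies the model evaluations, which is why a classic alpha-beta engine is the cheaper tool for actual lookahead.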
OpenAI has never done anything except conversational agents.
“In the summer of 2018, simply training OpenAI's Dota 2 bots required renting 128,000 CPUs and 256 GPUs from Google for multiple weeks.”
Tell me you haven't been following this field without telling me you haven't been following this field[0][1][2]?
[0]: https://github.com/openai/gym
[1]: https://openai.com/index/jukebox/
[2]: https://openai.com/index/openai-five-defeats-dota-2-world-ch...