Every LLM tested at chess played terribly against the Stockfish engine, with one exception: gpt-3.5-turbo-instruct, a closed OpenAI model.