
688 points crescit_eundo | 2 comments
1. a_wild_dandan No.42148917
Important testing excerpts:

- "...for the closed (OpenAI) models I tried generating up to 10 times and if it still couldn’t come up with a legal move, I just chose one randomly."

- "I ran all the open models (anything not from OpenAI, meaning anything that doesn’t start with gpt or o1) myself using Q5_K_M quantization"

- "...if I gave a prompt like “1. e4 e5 2. ” (with a space at the end), the open models would play much, much worse than if I gave a prompt like “1 e4 e5 2.” (without a space)"

- "I used a temperature of 0.7 for all the open models and the default for the closed (OpenAI) models."

Between the tokenizer weirdness, temperature, quantization, random moves, and the chess prompt, there's a lot going on here. I'm unsure how to interpret the results. Fascinating article though!
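
The trailing-space effect is a familiar tokenizer quirk: in GPT-style BPE vocabularies, spaces attach to the token that follows them, so a prompt ending in "2. " leaves a dangling space token that rarely appears in PGN training data. A quick way to see the difference, assuming a Hugging Face tokenizer (the gpt2 tokenizer here is illustrative, not necessarily the one from the article):

```python
from transformers import AutoTokenizer

# Any GPT-style BPE tokenizer shows the effect; "gpt2" is illustrative.
tok = AutoTokenizer.from_pretrained("gpt2")

print(tok.tokenize("1. e4 e5 2. "))  # trailing space becomes its own token ("Ġ")
print(tok.tokenize("1. e4 e5 2."))   # in PGN, move text follows "." directly
```

Since the next move almost always follows "2." immediately in PGN data, the space-terminated prompt drops the model into a lower-probability context, which is consistent with the much worse play the author observed.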

replies(1): >>42158124 #
2. NohatCoder No.42158124
Ah, buried in the post-article part. I was wondering how all of the models were seemingly capable of making legal moves, since the last time I saw something about LLMs playing chess, they were very much not capable of that.