
695 points | crescit_eundo | 1 comment | source
fabiospampinato ◴[] No.42145891[source]
It's probably worth playing around with different prompts and different board positions.

For context, this [1] is the board position the model is being prompted on.

There may be more than one weird thing about this experiment; for example, giving instructions to the non-instruction-tuned variants may be counterproductive.

More importantly, let's say you just give the model the truncated PGN: does this look like a position where White is a grandmaster-level player? I don't think so. Even if the model understands chess really well, it's going to try to predict the most probable move given the position at hand. If the model is good at understanding chess but thinks that White is a bad player, it's going to rank bad moves as the more likely ones, because those better predict what is most likely to happen next.

[1]: https://i.imgur.com/qRxalgH.png
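To make the next-move-prediction point above concrete, here is a minimal sketch of the bare completion setup; the PGN prefix and the complete() stub are placeholders, not the experiment's actual prompt or API:

    # Minimal sketch of the completion-style setup: a base model only sees a
    # truncated PGN, so its "move choice" is next-token prediction over
    # plausible continuations of whatever kind of game the prefix looks like.
    pgn_prefix = "1. e4 e5 2. Nf3 Nc6 3. Bc4 Bc5 4. b4 "  # placeholder moves, not the real test position

    def complete(prompt: str) -> str:
        """Stand-in for whatever completion endpoint is being evaluated."""
        return "Bxb4"  # stub; a real call would sample the model's continuation

    # Nothing in a bare prefix tells the model the players are strong, so if
    # the game so far reads like a weak amateur game, weak moves can genuinely
    # be the most probable continuation, even for a model that knows chess well.
    predicted_move = complete(pgn_prefix).split()[0]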

replies(4): >>42146161 #>>42147006 #>>42147866 #>>42150105 #
Closi ◴[] No.42146161[source]
Agree with this. A few prompt variants:

* What if you allow the model to do Chain of Thought (explicitly disallowed in this experiment)?

* What if you explain the board position at each step to the model in the prompt, so it doesn't have to calculate/estimate it internally? (A sketch of this follows the list.)
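
A rough sketch of the second variant, assuming the python-chess library is available; the PGN text, prompt wording, and the final instruction are illustrative rather than the experiment's actual prompt:

    # Sketch of the "spell out the position" variant: replay the PGN with
    # python-chess so the model doesn't have to track the board internally.
    import io
    import chess.pgn

    pgn_text = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6"  # placeholder game prefix
    game = chess.pgn.read_game(io.StringIO(pgn_text))

    board = game.board()
    for move in game.mainline_moves():
        board.push(move)

    prompt = (
        f"Moves so far: {pgn_text}\n"
        f"Current position (FEN): {board.fen()}\n"
        f"Current position:\n{board}\n\n"
        f"It is {'White' if board.turn else 'Black'} to move. "
        "Think step by step, then give your move in SAN."  # also enables the CoT variant above
    )

Combining both variants (explicit board state plus chain of thought) in one prompt like this would be a cheap way to test whether either actually moves the needle.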

replies(1): >>42149903 #
1. int_19h ◴[] No.42149903[source]
They also tested GPT-o1, which always uses CoT, yet it was still worse.