(dynomight.substack.com)

696 points crescit_eundo | 1 comments | 14 Nov 24 17:05 UTC

snickerbockers ◴[15 Nov 24 08:31 UTC] No.42144943[source]▶

Does it ever try an illegal move? OP didn't mention this and I think it's inevitable that it should happen at least once, since the rules of chess are fairly arbitrary and LLMs are notorious for bullshitting their way through difficult problems when we'd rather they just admit that they don't have the answer.

replies(2): >>42145004 #>>42145793 #

sethherr ◴[15 Nov 24 08:42 UTC] No.42145004[source]▶

>>42144943 #

Yes, he discusses using a grammar to restrict the output to legal moves.

replies(4): >>42147380 #>>42148708 #>>42150800 #>>42152205 #

topaz0 ◴[15 Nov 24 14:47 UTC] No.42147380[source]▶

>>42145004 #

Still an interesting direction of questioning. Maybe it could be rephrased as "how much work is the grammar doing"? Are the results with the grammar very different than without? If/when a grammar is not used (like in the OpenAI case), how many illegal moves does it try on average before finding a legal one?

replies(3): >>42147422 #>>42150017 #>>42151815 #
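
One way the "how many illegal moves before a legal one" question could be measured is sketched below. This is only an illustration: `propose` stands in for whatever call samples a move from the model, and `legal_moves` is assumed to come from a chess library or engine. Both names are hypothetical and do not come from the article.

```python
from typing import Callable, Iterable, List, Optional


def illegal_attempts_before_legal(
    propose: Callable[[], str],
    legal_moves: Iterable[str],
    max_tries: int = 10,
) -> Optional[int]:
    """Count illegal proposals made before the first legal one.

    `propose` is a hypothetical stand-in for sampling one move from a model.
    Returns None if no legal move shows up within `max_tries` proposals.
    """
    legal = set(legal_moves)
    for illegal_so_far in range(max_tries):
        if propose() in legal:
            return illegal_so_far
    return None


def mean_illegal_attempts(counts: List[Optional[int]]) -> Optional[float]:
    """Average over positions where a legal move was eventually found."""
    finished = [c for c in counts if c is not None]
    return sum(finished) / len(finished) if finished else None


# Scripted "model" that proposes two illegal moves, then a legal one.
scripted = iter(["Ke9", "Qh7", "e4"]).__next__
print(illegal_attempts_before_legal(scripted, ["e4", "d4", "Nf3"]))
```

Positions where the cap is hit are reported as None rather than folded into the average, so the failure rate can be tracked separately from the mean.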

1. gs17 ◴[15 Nov 24 22:14 UTC] No.42151815[source]▶

>>42147380 #

I'd be more interested in what the distribution of grammar-restricted predictions looks like compared to moves Stockfish says are good.
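
A rough sketch of that comparison, under stated assumptions: `move_probs` is a hypothetical dict of model probabilities per candidate move, and `engine_moves` is a hypothetical list of moves an engine such as Stockfish rates as good. The restriction here is done at the level of whole moves by dropping illegal ones and renormalizing. Token-by-token grammar masking can produce a somewhat different distribution, so treat this as an approximation.

```python
from typing import Dict, Iterable


def restrict_to_legal(
    move_probs: Dict[str, float], legal_moves: Iterable[str]
) -> Dict[str, float]:
    """Drop illegal moves and renormalize what is left to sum to 1."""
    legal = set(legal_moves)
    kept = {m: p for m, p in move_probs.items() if m in legal and p > 0}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("model put no probability on any legal move")
    return {m: p / total for m, p in kept.items()}


def mass_on_engine_moves(
    restricted: Dict[str, float], engine_moves: Iterable[str]
) -> float:
    """Probability the restricted model assigns to engine-approved moves."""
    return sum(restricted.get(m, 0.0) for m in set(engine_moves))


# Made-up numbers for illustration only.
probs = {"e4": 0.4, "d4": 0.2, "Ke9": 0.4}
restricted = restrict_to_legal(probs, ["e4", "d4", "Nf3"])
print(restricted)
print(mass_on_engine_moves(restricted, ["e4", "Nf3"]))
```

Averaging the second number over many positions would give one summary of how much of the grammar-restricted distribution lands on moves the engine likes.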

Something weird is happening with LLMs and chess