(dynomight.substack.com)

696 points crescit_eundo | 2 comments | 14 Nov 24 17:05 UTC

1. ynniv ◴[14 Nov 24 22:57 UTC] No.42142141

I don't think one model is statistically significant. As people have pointed out, it could have chess-specific responses that the others do not. There should be at least another one or two, preferably unrelated, "good" data points before you can claim there is a pattern. Also, where's Claude?

replies(1): >>42142225 #

2. famouswaffles ◴[14 Nov 24 23:07 UTC] No.42142225

>>42142141 (TP) #

There are other transformers trained on chess text that play chess fine (just not as well as 3.5 Turbo instruct, with the exception of the "grandmaster level without search" paper).

Something weird is happening with LLMs and chess