82 points by meetpateltech | 1 comment
RayVR ◴[] No.45311094[source]
A faster model that outperforms its slower version on multiple benchmarks? Can anyone explain why that makes sense? Are they simply retraining on the benchmark tests?
replies(4): >>45311127 #>>45311184 #>>45311402 #>>45311754 #
1. NitpickLawyer ◴[] No.45311127[source]
> Can anyone explain why that makes sense?

It could be anything: a different architecture, more data, RL, etc. It's probably RL. In recent months, top-tier labs seem to have "cracked" RL to a level not yet seen in open models, and by a large margin.