A faster model that outperforms its slower version on multiple benchmarks? Can anyone explain why that makes sense? Are they simply retraining on the benchmark tests?
They're just two different models branded under similar names. That's it. Grok 4 is not the slower version of Grok 4 Fast, just like GPT-4 is not the slower version of GPT-4o.