579 points | paulpauper | 1 comment
1. OtherShrezzing No.43604605
Assuming that models getting better at SWE benchmarks and math tests will translate into positive outcomes in all other domains could be an act of spectacular hubris by the big frontier labs, which are themselves chock-full of mathematicians and software engineers.