
196 points | zmccormick7 | 1 comment
asdev No.45387765
I don't think intelligence is increasing. Arbitrary benchmarks don't reflect real world usage. Even with all the context it could possibly have, these models still miss/hallucinate things. Doesn't make them useless, but saying context is the bottleneck is incorrect.
replies(3): >>45388096 #>>45388362 #>>45398947 #
1. Jweb_Guru No.45398947
Gemini 2.5 Pro is okay if you ask it to work on a very tiny problem. That's about it for me; the other models don't even create a convincing facsimile of reasoning.