555 points maheshrijal | 15 comments
1. jdross ◴[] No.43707849[source]
The pace of notable releases across the industry right now is unlike any time I remember since I started doing this in the early 2000s. And it feels like it's accelerating.
replies(3): >>43707964 #>>43708571 #>>43712041 #
2. emp17344 ◴[] No.43707964[source]
Not really. We’re definitely in the incremental improvement stage at this point. Certainly no indication that progress is “accelerating”.
replies(3): >>43708074 #>>43708367 #>>43712868 #
3. nwienert ◴[] No.43708074[source]
ChatGPT 3 : iPhone 1

A bunch of models later, we're about on the iPhone 4-5 now. Feels about right.

replies(1): >>43711992 #
4. Workaccount2 ◴[] No.43708367[source]
Integration is accelerating rapidly. Even if model development froze today, we would still probably have ~5 years of adoption and integration before it started to level off.
replies(1): >>43709182 #
5. qoez ◴[] No.43708571[source]
Lots of releases but very little actual performance increase.
replies(1): >>43708812 #
6. int_19h ◴[] No.43708812[source]
Sonnet and Gemini saw fairly substantial perf increases recently.
replies(1): >>43709040 #
7. mchusma ◴[] No.43709040{3}[source]
Love Sonnet, but 3.7 is not obviously an improvement over 3.5 in my real-world usage. Gemini 2.5 Pro is great and has replaced most others for me (I use Grok for things that require realtime answers).
replies(2): >>43710308 #>>43711956 #
8. littlestymaar ◴[] No.43709182{3}[source]
You are both correct. It feels like the tech itself is kinda plateauing but it's still massively under-used. It will take a decade or more before the deployment starts slowing down.
9. BriggyDwiggs42 ◴[] No.43710308{4}[source]
It does a lot better on philosophy questions.
10. int_19h ◴[] No.43711956{4}[source]
Are you comparing it with or without thinking? I'd say it's a fairly big improvement in long thinking mode.
11. int_19h ◴[] No.43711992{3}[source]
It's more like GPT-3 is the Manchester Baby, and we're somewhere around IBM 700 series right now. Still a long way to go to iPhone, as much as the industry likes to pretend otherwise.
replies(1): >>43725631 #
12. achierius ◴[] No.43712041[source]
How is this a notable release? It's strictly worse than Gemini 2.5 on coding &c, and only an iterative improvement over their own models. The only thing that struck me as particularly interesting was the native visual reasoning.
replies(1): >>43712778 #
13. og_kalu ◴[] No.43712778[source]
It's not worse on coding. SWE-bench, Aider, and LiveBench coding all show noticeably better results.
14. adncors ◴[] No.43712868[source]
But we're seeing incremental improvements every two months, so...
15. nwienert ◴[] No.43725631{4}[source]
Both were big consumer commercial breakouts and far better than predecessors. And several years later both see only iterative improvements.

Neither applies to your analogy.