OpenAI o3 and o4-mini | slacker news

1. jdross ◴[16 Apr 25 17:12 UTC] No.43707849[source]▶

The pace of notable releases across the industry right now is unlike any time I remember since I started doing this in the early 2000s. And it feels like it's accelerating.

replies(3): >>43707964 #>>43708571 #>>43712041 #

2. emp17344 ◴[16 Apr 25 17:21 UTC] No.43707964[source]▶

>>43707849 (TP) #

Not really. We’re definitely in the incremental improvement stage at this point. Certainly no indication that progress is “accelerating”.

replies(3): >>43708074 #>>43708367 #>>43712868 #

3. nwienert ◴[16 Apr 25 17:27 UTC] No.43708074[source]▶

>>43707964 #

ChatGPT 3 : iPhone 1

A bunch of models later, we're about on the iPhone 4-5 now. Feels about right.

replies(1): >>43711992 #

4. Workaccount2 ◴[16 Apr 25 17:50 UTC] No.43708367[source]▶

>>43707964 #

Integration is accelerating rapidly. Even if model development froze today, we would still probably have ~5 years of adoption and integration before it started to level off.

replies(1): >>43709182 #

5. qoez ◴[16 Apr 25 18:08 UTC] No.43708571[source]▶

>>43707849 (TP) #

Lots of releases but very little actual performance increase.

replies(1): >>43708812 #

6. int_19h ◴[16 Apr 25 18:34 UTC] No.43708812[source]▶

>>43708571 #

Sonnet and Gemini saw fairly substantial perf increases recently.

replies(1): >>43709040 #

7. mchusma ◴[16 Apr 25 18:55 UTC] No.43709040{3}[source]▶

>>43708812 #

Love Sonnet, but 3.7 is not obviously an improvement over 3.5 in my real-world usage. Gemini 2.5 Pro is great and has replaced most others for me (I use Grok for things that require real-time answers).

replies(2): >>43710308 #>>43711956 #

8. littlestymaar ◴[16 Apr 25 19:08 UTC] No.43709182{3}[source]▶

>>43708367 #

You are both correct. It feels like the tech itself is kinda plateauing but it's still massively under-used. It will take a decade or more before the deployment starts slowing down.

9. BriggyDwiggs42 ◴[16 Apr 25 20:57 UTC] No.43710308{4}[source]▶

>>43709040 #

It does a lot better on philosophy questions.

10. int_19h ◴[17 Apr 25 00:55 UTC] No.43711956{4}[source]▶

>>43709040 #

Are you comparing it with or without thinking? I'd say it's a fairly big improvement in long thinking mode.

11. int_19h ◴[17 Apr 25 01:00 UTC] No.43711992{3}[source]▶

>>43708074 #

It's more like GPT-3 is the Manchester Baby, and we're somewhere around IBM 700 series right now. Still a long way to go to iPhone, as much as the industry likes to pretend otherwise.

replies(1): >>43725631 #

12. achierius ◴[17 Apr 25 01:08 UTC] No.43712041[source]▶

>>43707849 (TP) #

How is this a notable release? It's strictly worse than Gemini 2.5 on coding &c, and only an iterative improvement over their own models. The only thing that struck me as particularly interesting was the native visual reasoning.

replies(1): >>43712778 #

13. og_kalu ◴[17 Apr 25 03:20 UTC] No.43712778[source]▶

>>43712041 #

It's not worse on coding. SWE-bench, Aider, and LiveBench coding all show noticeably better results.

14. adncors ◴[17 Apr 25 03:34 UTC] No.43712868[source]▶

>>43707964 #

But we're seeing incremental improvements every two months, so...

15. nwienert ◴[18 Apr 25 06:46 UTC] No.43725631{4}[source]▶

>>43711992 #

Both were big consumer commercial breakouts and far better than predecessors. And several years later both see only iterative improvements.

Neither applies to your analogy.