(www.lesswrong.com)

579 points paulpauper | 1 comments | 06 Apr 25 18:01 UTC

gundmc ◴[06 Apr 25 18:48 UTC] No.43603886[source]▶

This was published the day before Gemini 2.5 was released. I'd be interested if they see any difference with that model. Anecdotally, that is the first model that really made me go wow and made a big difference for my productivity.

replies(4): >>43603928 #>>43603961 #>>43604159 #>>43610218 #

1. usaar333 ◴[06 Apr 25 19:22 UTC] No.43604159[source]▶

>>43603886 #

Ya, I find it hard to imagine this aging well. Gemini 2.5 solved (or at least handled much better) multiple real-world systems questions I've had in the past that other models could not. Its visual reasoning on charts also jumped significantly (e.g. planning around train schedules).

Even Sonnet 3.7 was able to do refactoring work on my codebase that Sonnet 3.6 could not.

Really not seeing the "LLMs not improving" story

Recent AI model progress feels mostly like bullshit