
413 points martinald | 2 comments
blauditore ◴[] No.46198377[source]
These kinds of future-prediction posts keep coming, and I'm tired of them. Reality is always more boring, less extreme, and slower to change, because there are too many factors involved and the authors never account for all of them.

Maybe we should collect all of these predictions, then go back in 5-10 years and see if anyone was actually right.

replies(2): >>46198882 #>>46199383 #
tobyjsullivan ◴[] No.46198882[source]
Despite a couple of forward-looking statements, I didn't read this as a prediction. It reads more like a subjective/anecdotal assessment of where things are in December 2025. (Yes, with some conjecture about the implications for next year.)

Overall, it echoes my experience with Claude Opus 4.5 in particular. We've passed a threshold (one of several, no doubt).

replies(2): >>46201487 #>>46202546 #
aeonfox ◴[] No.46201487[source]
Just to test out the OP article's theory, I was about to write some unit tests, so I decided to let Opus 4.5 have a go. It did a pretty good job, but I spent probably as much time parsing what it had done as I would have spent writing the code from scratch. I still needed to clean it up, and of course, unsurprisingly, it had written a few tests that only really exercised the mocks it had created. The kind of mistake I wouldn't be caught dead sending in for peer review.
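
To illustrate that last kind of mistake, here's a made-up Python sketch (not the actual code it produced): the "test" patches out the dependency and then only asserts against the mock, so it passes regardless of what the real logic does.

    # Hypothetical sketch of a test that only exercises its own mock.
    # fetch_total() is a made-up function that sums prices from an API client.
    from unittest.mock import MagicMock

    def fetch_total(client):
        return sum(item["price"] for item in client.list_items())

    def test_fetch_total():
        client = MagicMock()
        client.list_items.return_value = [{"price": 2}, {"price": 3}]
        fetch_total(client)
        # Only checks that the mock was called -- never checks the sum,
        # so this passes even if fetch_total() is completely wrong.
        client.list_items.assert_called_once()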

I'm glad the OP feels fine just letting Opus do whatever it wants without pausing to look under the covers, and perhaps we all have to learn to stop worrying and love the LLM? But I think, really, here and now, we're witnessing just another hype article written by a professional blogger and speaker who is highly motivated to produce engagement bait like this.

replies(3): >>46201651 #>>46204231 #>>46204469 #
1. benjiro ◴[] No.46204469[source]
That's the thing... how long ago did we get agent mode? In Copilot, that feature is only 7 months old.

Things evolve faster than people realize... First agent mode, then came MCP servers and sub-agents, and now RAG databases that let the LLMs pull in data directly.
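
For that last piece, a rough sketch of what I mean by RAG (hypothetical names, plain Python, no particular vector store): embed the question, rank stored snippets by similarity, and stuff the best ones into the prompt.

    # Minimal, hypothetical retrieval-augmented lookup.
    # embed() and ask_llm() stand in for whatever embedding/LLM client is in use.
    def retrieve(query_vec, store, k=3):
        # store: list of (vector, text) pairs; dot product as a crude similarity score
        scored = sorted(store, key=lambda it: -sum(a * b for a, b in zip(query_vec, it[0])))
        return [text for _, text in scored[:k]]

    def answer(question, store, embed, ask_llm):
        context = "\n".join(retrieve(embed(question), store))
        return ask_llm("Context:\n" + context + "\n\nQuestion: " + question)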

The development of LLMs looks slow, but with each iteration things improve. Ask yourself: what would the result of those same tests have been 21 months ago, with Claude 3.0? How about with Claude 4.0, which is only 8 months old?

Right now Opus 4.5 is darn functional. The issue is usually not the code it writes; more often it gets stuck on "it's too complex, let me simplify it", with the biggest limitation being context capacity.

LLMs are still bad at deeper tasks, but compared to the previous generations the jumps have been enormous. What about a year from now? Two years? I have a hard time believing Claude 3 came out not two years ago but just 21 months ago. We considered that a massive jump, useful for working on a single file... Now we throw entire codebases at these models, and they're darn good at debugging, editing, etc.

Do I like the results? No, there are plenty of times when the results are not what I wanted, but that is often the fault of my own prompting being too generic.

LLMs are never going to truly replace experienced programmers, but boy, is the progress scary.

replies(1): >>46211262 #
2. aeonfox ◴[] No.46211262[source]
I can't say my opinion has changed. It didn't give me results any more exciting or useful than Sonnet did. Is it worth 3x the price per token? I'm not so sure.

(It wasn't clear in my comment, but I already use agents for my code. I just think the OP's claims are overblown.)