Alignment is capability

(www.off-policy.com)
106 points drctnlly_crrct | 2 comments
xnorswap ◴[] No.46192597[source]
I've only been using it a couple of weeks, but in my opinion, Opus 4.5 is the biggest jump in tech we've seen since ChatGPT 3.5.

The difference between juggling Sonnet 4.5 / Haiku 4.5 and just using Opus 4.5 for everything is night & day.

Unlike Sonnet 4.5, which merely showed promise of being able to go off and complete complex tasks, Opus 4.5 seems genuinely capable of doing so.

Sonnet needed hand-holding and correction at almost every step. Opus just needs correction and steering at an early stage, and sometimes will push back and correct my understanding of what's happening.

It's astonished me with its capability to produce easy-to-read PDFs via Typst, and has produced large documents outlining how to approach very tricky tech migration tasks.

Sonnet would get there eventually, but not without a few rounds of dealing with compilation errors or hallucinated data. Opus seems to like to do "And let me just check my assumptions" searches, which make all the difference.

replies(5): >>46192783 #>>46192922 #>>46193718 #>>46194371 #>>46196267 #
airstrike ◴[] No.46192783[source]
I'm not so sure. Opus 4.1 was more capable than 4.5, but it was too damn expensive and slow.

Opus 4.5 is like a cheaper, faster Opus 4.1. It's so much cheaper, in fact, that the weekly limits on Claude Code now apply to Sonnet, not to Opus, as they phased out 4.1 in favor of 4.5.

replies(1): >>46193458 #
1. chrisweekly ◴[] No.46193458[source]
Capable how?
replies(1): >>46194257 #
2. airstrike ◴[] No.46194257[source]
Able to independently find bugs, think through a complex codebase, better at "big picture" thinking and planning bigger edits, in my experience.