
I'm absolutely right

(absolutelyright.lol)
648 points by yoavfr | 1 comment
trjordan | No.45138620
OK, so I love this, because we all recognize it.

It's not just a tic of language, though. Responses that start off with "You're right!" are alignment mechanisms. The LLM, with its single-token prediction approach, follows up with a suggestion that much more closely follows the user's desires, instead of latching onto its own previous approach.
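
To make that concrete: decoding is autoregressive, so whatever the model has already emitted (including a "You're right," opener) is part of the context the rest of the reply is predicted from. A toy sketch, using the Hugging Face transformers library and the small gpt2 checkpoint purely for illustration (not the production models we're actually talking about):

    # Compare greedy continuations with and without an acknowledgement opener.
    # gpt2 is just a stand-in model; the point is that the opener becomes part
    # of the conditioning context for everything generated after it.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "User: I think we should retry with exponential backoff.\nAssistant:"

    for opener in ("", " You're right,"):
        ids = tok(prompt + opener, return_tensors="pt")
        out = model.generate(**ids, max_new_tokens=40, do_sample=False,
                             pad_token_id=tok.eos_token_id)
        # Print only the newly generated tokens for each variant.
        print(repr(tok.decode(out[0][ids["input_ids"].shape[1]:])))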

The other tic I love is "Actually, that's not right." That happens because once agents finish their tool-calling, they'll do a self-reflection step. That step generates the "here's what I did" response or, if it sees an error, the "Actually, ..." change in approach. And again, that message contains a stub of how the approach should change, which lets the subsequent tool calls actually pull that thread instead of the model stubbornly sticking to its guns.
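
Roughly the shape of the loop I mean (just a sketch, not any vendor's actual agent code; call_llm and run_tool are made-up stand-ins for the real chat-completion and tool-execution calls):

    # Sketch of one agent turn with an explicit self-reflection pass after the
    # tool results come back. A reflection that starts "Actually, that's not
    # right..." lands in the transcript, so the next round of tool calls is
    # conditioned on the correction instead of the original plan.
    def call_llm(messages):
        # Stand-in for a real chat-completion call; canned reply for the sketch.
        return {"content": "Actually, that's not right - the test still fails.",
                "tool_calls": []}

    def run_tool(tool_call):
        # Stand-in for real tool execution (shell command, file edit, ...).
        return "exit code 1: AssertionError"

    def agent_step(messages):
        reply = call_llm(messages)
        messages.append({"role": "assistant", "content": reply["content"]})

        for tool_call in reply.get("tool_calls", []):
            messages.append({"role": "tool", "content": run_tool(tool_call)})

        # Self-reflection: ask the model to review what just happened.
        reflection = call_llm(messages + [{
            "role": "user",
            "content": "Review the tool output above. Did it do what was "
                       "intended? If not, say what went wrong and how to "
                       "change approach."}])
        messages.append({"role": "assistant", "content": reflection["content"]})
        return messages

    print(agent_step([{"role": "user", "content": "Fix the failing retry test."}]))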

The people behind the agents are fighting with the LLM just as much as we are, I'm pretty sure!

replies(11): >>45138772 >>45138812 >>45139686 >>45139852 >>45140141 >>45140233 >>45140703 >>45140713 >>45140722 >>45140723 >>45141393
libraryofbabel | No.45140233
> The LLM, with its single-token prediction approach, follows up with a suggestion that much more closely follows the user's desires, instead of latching onto its own previous approach.

Maybe? How would we test that one way or the other? If there’s one thing I’ve learned in the last few years, it’s that reasoning from “well LLMs are based on next-token prediction, therefore <fact about LLMs>” is a trap. The relationship between the architecture and the emergent properties of the LLM is very complex. Case in point: I think two years ago most of us would have said LLMs would never be able to do what they are able to do now (actually effective coding agents) precisely because they were trained on next-token prediction. That turned out to be false, and so I don’t tend to make arguments like that anymore.

> The people behind the agents are fighting with the LLM just as much as we are

On that, we agree. No doubt Anthropic has tried to fine-tune some of this stuff out, but perhaps it’s deeply linked in the network weights to other (beneficial) emergent behaviors in ways that are organically messy and can’t be easily untangled without making the model worse.

replies(2): >>45140484 >>45140568
adastra22 | No.45140484
I don’t think there is any basis for GP’s hypothesis that this is related to the cursor being closer to the user’s example. The attention mechanism is position-independent by default; token positions actually have to be shoehorned in separately (via positional encodings).
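
This is easy to check numerically: with plain (unmasked) scaled dot-product attention and no positional encodings, permuting the input tokens just permutes the output rows. A small NumPy sketch:

    # Plain scaled dot-product attention: no positional encodings, no mask.
    import numpy as np

    rng = np.random.default_rng(0)
    d = 8
    X = rng.normal(size=(5, d))                    # 5 token embeddings
    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

    def attention(X):
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(d)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)         # row-wise softmax
        return w @ V

    perm = rng.permutation(5)
    print(np.allclose(attention(X)[perm], attention(X[perm])))   # True

    # Shuffling the tokens just shuffles the outputs: the layer itself has no
    # idea where a token sits. Position only enters through positional
    # encodings (learned, sinusoidal, RoPE) or, partially, through a causal mask.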