
I'm absolutely right

(absolutelyright.lol)
648 points by yoavfr
trjordan ◴[] No.45138620[source]
OK, so I love this, because we all recognize it.

It's not fully just a tic of language, though. Responses that start off with "You're right!" are alignment mechanisms. Because the LLM generates one token at a time, conditioning on that concession means it follows up with a suggestion that much more closely follows the user's desires, instead of latching onto its own previous approach.
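
A minimal sketch of that mechanism, assuming a hypothetical complete() LLM call rather than any real SDK: prefill the assistant turn with the concession, and next-token prediction conditions everything after it on having already agreed.

    def complete(prompt: str) -> str:
        """Hypothetical LLM call; returns a continuation of the prompt."""
        raise NotImplementedError

    def realign(history: list[str], correction: str) -> str:
        # Prefill the assistant turn with an agreement stub. Because decoding
        # is autoregressive, the suggestion generated next tends to follow the
        # user's correction instead of defending the previous approach.
        prefill = "You're right! "
        prompt = "\n".join(history + ["User: " + correction,
                                      "Assistant: " + prefill])
        return prefill + complete(prompt)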

The other tic I love is "Actually, that's not right." That happens because once agents finish their tool-calling, they'll do a self-reflection step. That generates the "here's what I did" response or, if it sees an error, the "Actually, ..." change in approach. And again, that message contains a stub of how the approach should change, which lets the subsequent tool calls actually pull that thread instead of stubbornly sticking to its guns.
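
Roughly the loop being described, as a sketch; llm() and run_tools() are hypothetical stand-ins, since no vendor's actual reflection prompts are public.

    def llm(prompt: str) -> str:
        """Hypothetical LLM call."""
        raise NotImplementedError

    def run_tools(plan: str) -> tuple[str, bool]:
        """Hypothetical tool executor; returns (output, succeeded)."""
        raise NotImplementedError

    def reflect(task: str, plan: str) -> str:
        output, ok = run_tools(plan)
        if ok:
            # The "here's what I did" summary.
            return llm(f"Task: {task}\nTool output: {output}\n"
                       "Summarize what was done.")
        # The "Actually, that's not right." turn. It has to include a stub of
        # the new approach so the next round of tool calls pulls that thread
        # instead of repeating the failed plan.
        return llm(f"Task: {task}\nTool output: {output}\n"
                   "The approach failed. Acknowledge the error and outline "
                   "a different approach before continuing.")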

The people behind the agents are fighting with the LLM just as much as we are, I'm pretty sure!

replies(11): >>45138772 #>>45138812 #>>45139686 #>>45139852 #>>45140141 #>>45140233 #>>45140703 #>>45140713 #>>45140722 #>>45140723 #>>45141393 #
al_borland ◴[] No.45140722[source]
In my experience, once it starts telling me I’m right, we’re already going downhill and it rarely gets better from there.
replies(4): >>45141151 #>>45143167 #>>45145334 #>>45146082 #
lemming ◴[] No.45145334[source]
Yeah, I want a feature which stops my agent as soon as it says anything even vaguely like: "let me try another approach". Right after that is when the wheels start falling off, tests get deleted, etc. That phrase is a sure sign the agent should (but never does) ask me for guidance.
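
Something like this would be easy to bolt onto a streaming agent; the phrase list and the pause_for_guidance hook are assumptions about your setup, not an existing feature of any tool.

    import re

    # Phrases that tend to precede the wheels falling off.
    STOP_PHRASES = re.compile(
        r"let me try (another|a different) approach"
        r"|actually, that's not right",
        re.IGNORECASE,
    )

    def guarded_stream(tokens, pause_for_guidance):
        """Yield the agent's tokens until a stop phrase appears, then
        halt and hand control back to the human."""
        seen = ""
        for tok in tokens:
            seen = (seen + tok)[-200:]  # rolling window; phrases span tokens
            yield tok
            if STOP_PHRASES.search(seen):
                pause_for_guidance(seen)
                return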
replies(1): >>45147506 #
al_borland ◴[] No.45147506[source]
I’ve found that even giving guidance at this point doesn’t help; it fundamentally doesn’t get it.

I went down one of these rabbit holes with it once while having it write a relatively simple bash script, something I had previously written by hand in Python; I wanted a bash version and also wanted to see what the AI could do.

It was 98% there, but it couldn’t get that last 2% to save its life. Eventually I went through the code myself, found the bug, and told it exactly what the bug was and where it was: an off-by-one error. Even when spoon-fed, it couldn’t fix it, and I ended up doing it myself just to get it over with.
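
For anyone who hasn't hit one in a while, a hypothetical reconstruction of the bug class (in Python rather than bash, and not the actual script):

    lines = ["a", "b", "c"]

    # Buggy: range(len(lines) - 1) silently drops the last item.
    for i in range(len(lines) - 1):
        print(lines[i])

    # Fixed: iterate the whole list.
    for line in lines:
        print(line)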