
321 points distantprovince | 2 comments
majormajor ◴[] No.44617506[source]
LLMs are very very good at adding words in a way that looks "well written" (to our current mental filters) without adding meaning or value.

I wonder how long it will be before LLM-text trademarks come to be seen as a sign of bad writing or laziness instead? And then maybe we'll have an arms race of stylistic changes.

---

Completely agree with the author:

Earlier this week I asked Claude to summarize a bunch of code files since I was looking for a bug. It wrote paragraphs and had 3 suggestions. But when I read it, I realized it was mostly super generic and vague. The conditions that would be required to trigger the bug in those ways couldn't actually exist, but it put a lot of words around the ideas. I took longer to notice that they were incorrect suggestions as a result.

I told it "this won't happen those ways [because blah blah blah]" and it gave me the "you are correct!" compliment-dance and tried again. One new suggestion and a claimed reason about how one of its original suggestions might be right. The new suggestion seemed promising, but I wasn't entirely convinced. Tried again. It went back to the first three suggestions - the "here's why that won't happen" was still in the context window, but it hit some limit of its model. Like it was trying to reconcile being reinforcement-learning'd into "generate something that looks like a helpful answer" with "here is information in the context window saying the text I want to generate is wrong" and failing. We got into a loop.

It was a rare bug, so we'll see whether the useful-seeming suggestion was right; I don't know yet. I added some logging around it and some other stuff too.

The counterfactuals are hard to evaluate:

* would I have identified that potential change quicker without asking it? Or at all?

* would I have identified something else that it didn't point out?

* what if I hadn't noticed the problems with some other suggestions and spent a bunch of time chasing them?

The words-to-information ratio made it much harder to spot the issues.

So did the "text completion" nature of its results: the RL-shaped assumption that "if you're asking about a problem here, there must be a solution I can offer." It didn't seem to be truly evaluating the code and then deciding, so much as saying "yes, I will definitely tell you there are things we can change; here are some that seem plausible."

Imagine if my coworker had asked me the question and I'd just copy-pasted Claude's first crap attempt to them in response? Rude as hell.

replies(1): >>44617636 #
1. drewvlaz ◴[] No.44617636[source]
One of the largest issues I've experienced is LLMs being too agreeable.

I don't want my theories parroted back to me on why something went wrong. I want my ideas challenged in a way that forces me to think and hopefully leads me to a perspective I otherwise would have missed.

Perhaps a large portion of people do enjoy the agreeableness, but this becomes a problem for two reasons: there are larger societal issues that stem from this echo-chamber-like environment, and companies training these models may interpret agreeableness as somehow better and something to be optimized for.

replies(1): >>44617960 #
2. scarface_74 ◴[] No.44617960[source]
That’s simple: after it tries to be helpful and agreeable, I just ask for a “devil’s advocate” response. I have a much longer prompt I sometimes use that involves it being a “sparring partner”.

And I sometimes go back and forth between correcting its devil’s advocate responses and its “steel man” responses.
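In practice this can be as simple as a canned follow-up message appended to the conversation. A minimal sketch (the function name and exact prompt wording are illustrative assumptions, not from the comment; it just builds the message dict most chat APIs expect):

```python
def devils_advocate_followup(claim: str) -> list[dict]:
    """Build a follow-up chat message asking the model to argue
    against a conclusion instead of agreeing with it.

    The wording here is an illustrative assumption; any phrasing
    that explicitly requests counterarguments tends to cut through
    the default agreeableness.
    """
    return [
        {
            "role": "user",
            "content": (
                "Play devil's advocate against the following conclusion. "
                "List the strongest reasons it could be wrong, then give "
                "a steel-man of the opposing view:\n\n" + claim
            ),
        }
    ]


# Example: challenge the model's own earlier diagnosis.
messages = devils_advocate_followup(
    "The bug is caused by a race condition in the cache layer."
)
```

The point is that the challenge is a separate, deliberate turn: you let the model produce its agreeable first pass, then force a round of counterarguments before deciding what to trust.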