The "confident idiot" problem: Why AI needs hard rules, not vibe checks

(steerlabs.substack.com)

323 points steerlabs | 1 comments | 04 Dec 25 20:48 UTC

toddmorey ◴[08 Dec 25 13:29 UTC] No.46191994[source]▶

Confident idiot: I’m exploring using an LLM for diagram creation.

I’ve found that after about three prompts to edit an image with Gemini, it will randomly respond with an entirely new image. Another quirk: it will respond “here’s the image with those edits” with no edits made. It’s like a toaster that catches fire every eighth or ninth time.

I am not sure how to mitigate this behavior. Maybe an LLM-as-a-judge step with vision that evaluates the output before passing it on to the poor user.
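
One way to picture that gate, as a rough sketch rather than a tested recipe: run a cheap deterministic check first (is the "edited" image byte-identical to the input?), and only then ask a judge. Everything named here is an assumption for illustration: `gate_edit`, `is_unchanged`, and the `judge` callable are hypothetical, and `judge` stands in for whatever vision model call you would actually use. The hash check only catches exact repeats; a re-encoded but visually identical image would slip past it and need the judge.

```python
import hashlib
from typing import Callable

# Hypothetical judge signature: (before, after, instruction) -> True if the edit looks right.
Judge = Callable[[bytes, bytes, str], bool]


def is_unchanged(before: bytes, after: bytes) -> bool:
    """Hard rule: the output is byte-for-byte the same as the input."""
    return hashlib.sha256(before).digest() == hashlib.sha256(after).digest()


def gate_edit(before: bytes, after: bytes, instruction: str, judge: Judge) -> str:
    """Decide whether an edited image may be shown to the user."""
    # Deterministic check first: catches "here's the image with those edits"
    # when the model returned the input untouched.
    if is_unchanged(before, after):
        return "retry: output identical to input"
    # Only then spend a model call on the fuzzier question.
    if not judge(before, after, instruction):
        return "retry: judge rejected the edit"
    return "accept"
```

The ordering is the point: the hard rule is free and cannot be talked out of its answer, so the judge only sees cases the rule cannot decide.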

replies(5): >>46193250 #>>46193673 #>>46194370 #>>46194578 #>>46195816 #

codazoda ◴[08 Dec 25 18:26 UTC] No.46195816[source]▶

>>46191994 #

I had a similar result trying to create 16 similarly styled images. After half a dozen it just started kicking out the same image over and over again no matter what the prompt said. Even the “thinking” looked right, but the image was just a repeat. I don’t know if this is some type of context limitation or what.

I got around it by using a new prompt/context for each image. This required some rethinking about how to make them match. What I did was create a sprite sheet with the first prompt and then, for each image, only replaced (edited) the second prompt.

I still got some consistency problems because there were a few important details left out of my sprite sheet. Next time I think I’ll create those individually and then attach them as context for additional prompts.
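
A minimal sketch of that workflow, assuming the model is wrapped in a stateless function: each image gets its own call with the shared reference attached, and nothing carries over between calls. The names `generate_set` and `generate` are hypothetical, and `generate` is a placeholder for the real image API. The hash set is an extra guard against the repeated-image failure described above; it only detects exact duplicates.

```python
import hashlib
from typing import Callable, List

# Hypothetical generator: (style_reference, prompt) -> image bytes.
# Each call is assumed to start a fresh context with no chat history.
Generate = Callable[[bytes, str], bytes]


def generate_set(
    style_reference: bytes,
    prompts: List[str],
    generate: Generate,
    max_retries: int = 2,
) -> List[bytes]:
    """Generate one image per prompt, each in its own context."""
    seen = set()
    results: List[bytes] = []
    for prompt in prompts:
        for _ in range(max_retries + 1):
            # Same reference every time, new prompt, no shared history.
            image = generate(style_reference, prompt)
            digest = hashlib.sha256(image).hexdigest()
            if digest not in seen:
                seen.add(digest)
                results.append(image)
                break
        else:
            # Every attempt repeated an earlier image: fail loudly.
            raise RuntimeError(f"repeated image for prompt: {prompt!r}")
    return results
```

Individually created detail images could be passed the same way, by widening `style_reference` to a list of attachments.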

replies(1): >>46197315 #

1. toddmorey ◴[08 Dec 25 20:37 UTC] No.46197315[source]▶

>>46195816 #

Oh, smart. This is good guidance. Yeah, it’s fascinating how a longer-running context causes these side effects, especially the bug where the image is repeated with no changes.
