The "confident idiot" problem: Why AI needs hard rules, not vibe checks

I wrote about something like this a couple months ago: https://thelisowe.substack.com/p/relentless-vibe-coding-part.... Even started building a little library to prove out the concept: https://github.com/Mockapapella/containment-chamber

Spoiler: there won't be a part 2, or if there is it will be with a different approach. I wrote a followup that summarizes my experiences trying this out in the real world on larger codebases: https://thelisowe.substack.com/p/reflections-on-relentless-v...

tl;dr I use a version of it in my codebases now, but the combination of LLM reward hacking and the long tail of verfiers in a language (some of which don't even exist! Like accurately detecting dead code in Python (vulture et. al can't reliably do this) or valid signatures for property-based tests) make this problem more complicated than it seems on the surface. It's not intractable, but you'd be writing many different language-specific libraries. And even then, with all of those verifiers in place, there's no guarantee that when working in different sized repos it will produce a consistent quality of code.