
323 points by steerlabs | 1 comment
steerlabs ◴[] No.46155755[source]
OP here. I wrote this because I got tired of agents confidently guessing answers when they should have asked for clarification (e.g. guessing "Springfield, IL" instead of asking "Which state?" when asked "weather in Springfield").

I built an open-source library to enforce these logic/safety rules outside the model loop: https://github.com/imtt-dev/steer
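To make the idea concrete, here is a minimal sketch of a rule enforced outside the model loop (hypothetical names throughout, not steer's actual API): a guard inspects a proposed tool call, and if the argument is ambiguous it blocks the call and forces a clarification question instead of letting the agent guess.

    # Hypothetical out-of-loop ambiguity guard; illustrative only, not steer's real API.
    AMBIGUOUS_CITIES = {"springfield", "portland", "columbus"}  # cities that exist in several states

    def check_weather_call(tool_name: str, args: dict) -> dict:
        """Inspect a proposed tool call before it is executed."""
        if tool_name == "get_weather":
            location = args.get("location", "")
            city = location.split(",")[0].strip().lower()
            has_state = "," in location
            if city in AMBIGUOUS_CITIES and not has_state:
                # Deterministic rule: refuse the call and return a clarification prompt.
                return {"allow": False,
                        "ask_user": f"Which {city.title()}? Please include the state."}
        return {"allow": True}

    # Example: the model proposes get_weather(location="Springfield")
    decision = check_weather_call("get_weather", {"location": "Springfield"})
    if not decision["allow"]:
        print(decision["ask_user"])  # -> Which Springfield? Please include the state.

Because the check is plain code rather than another prompt, it behaves the same way on every run regardless of how confident the model sounds.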

replies(3): >>46191943 #>>46191953 #>>46196578 #
1. condiment ◴[] No.46191943[source]
This approach kind of reminds me of taking an open-book test. Performing mandatory verification against a ground truth is like taking the test, then going back over your answers and checking whether they match the book.

Unlike a student, the LLM never arrives at a sort of epistemic coherence, where it knows what it knows, how it knows it, and how likely it is to be true. So you have to structure every problem into a format where the response can be evaluated against an external source of truth.
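A rough sketch of that pattern (invented names, only meant to illustrate the shape of it): the model's answer is returned only after a lookup against an external source of truth, and a mismatch is corrected or refused rather than asserted.

    # Hypothetical verification step: accept a model answer only if it matches ground truth.
    def verify_against_source(question: str, model_answer: str, lookup) -> str:
        """lookup(question) returns the trusted value, or None if unavailable."""
        truth = lookup(question)
        if truth is None:
            return "I can't verify this answer."        # no ground truth -> don't assert
        if model_answer.strip().lower() == str(truth).strip().lower():
            return model_answer                          # matches the source, safe to return
        return f"Correction: {truth}"                    # mismatch -> prefer the source

    # Toy lookup table standing in for an external source of truth.
    facts = {"capital of australia": "Canberra"}
    print(verify_against_source("capital of australia", "Sydney", facts.get))
    # -> Correction: Canberra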