derefr ◴[] No.43661557[source]
> Today’s AI Still Has a PB&J Problem

If this is how you're modelling the problem, then I don't think you learned the right lesson from the PB&J "parable."

Here's a timeless bit of wisdom, several decades old at this point:

Managers think that if you can just replace code with something that isn't text with a formal syntax, then all of a sudden "regular people" (like them, maybe?) will be able to "program" a system. But it never works. And the reason it never works is fundamental to how humans relate to computers.

Hucksters continually reinvent the concept of "business rules engines" to sell to naive CTOs. As a manager, you might think it's a great idea to encode logic/constraints into some kind of database — maybe one you even "program" visually like UML or something! — and to then have some tool run through and interpret those. You can update business rules "live and on the fly", without calling a programmer!
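Concretely, the pitch usually amounts to something like this (a toy sketch of my own, not any particular product): the rules live as plain data, and a generic interpreter evaluates them at runtime, so in theory anyone can edit them "live" without touching code.

    OPS = {"==": lambda a, b: a == b, ">": lambda a, b: a > b}

    RULES = [
        {   # "VIP customers spending over 1000 get a 10% discount"
            "when": [("customer_tier", "==", "vip"), ("total", ">", 1000)],
            "then": {"action": "discount", "amount": 0.10},
        },
    ]

    def run_rules(order: dict) -> list:
        # Fire every rule whose conditions all hold against this order.
        return [
            rule["then"]
            for rule in RULES
            if all(OPS[op](order[field], value) for field, op, value in rule["when"])
        ]

    print(run_rules({"customer_tier": "vip", "total": 1500}))
    # -> [{'action': 'discount', 'amount': 0.1}]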

They think it's a great idea... until the first time they try to actually use such a system in anger to encode a real business process. Then they hit the PB&J problem. And, in the end, they must get programmers to interface with the business rules engine for them.

What's going on there? What's missing in the interaction between a manager and a business rules engine, that gets fixed by inserting a programmer?

There are actually two things:

1. Mechanical sympathy. The programmer knows the solution domain — and so the programmer can act as an advocate for the solution domain (in the same way that a compiler does, but much more human-friendly and long-sighted/predictive/10k-ft-view-architectural). The programmer knows enough about the machine and about how programs should be built to know what just won't work — and so will push back on a half-assed design, rather than carrying the manager through on a shared delusion that what they're trying to do is going to work out.

2. Iterative formalization. The programmer knows what information is needed by a versatile union/superset of possible solution architectures in the solution space — not only to design a particular solution, but also to "work backward", comparing/contrasting which solution architectures might be a better fit given the design's parameters. And when the manager hasn't provided this information — the programmer knows to ask questions.

Asking the right questions to get the information needed to determine the right architecture and design a solution — that's called requirements analysis.

And no matter what fancy automatic "do what I mean" system you put in place between a manager and a machine — no matter how "smart" it might be — if it isn't playing the role of a programmer, both in guiding the manager through the requirements analysis process, and in pushing back through knowledge of mechanical sympathy... then you get PB&J.

That being said: LLMs aren't fundamentally incapable of "doing what programmers do", I don't think. The current generation just seems to be:

1. highly sycophantic and constitutionally scared of speaking as an authority / pushing back / telling the user they're wrong; and

2. trained to always try to solve the problem as stated, rather than asking questions "until satisfied."

replies(1): >>43663642 #
dsjoerg ◴[] No.43663642[source]
You're right about everything, except that you underestimate the current generation of LLMs. With the right prompting and guidance, they can _already_ give pushback and ask questions until satisfied.
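For example, something along these lines (a rough sketch using the OpenAI Python SDK; the prompt wording is just illustrative, not a recipe I'm claiming is optimal):

    from openai import OpenAI

    client = OpenAI()

    SYSTEM = (
        "You are a senior engineer doing requirements analysis. "
        "Do not propose a design or write code yet. First list the "
        "assumptions in my request, then ask clarifying questions, one "
        "at a time, until you could defend a design to another engineer. "
        "Push back explicitly if anything I ask for is infeasible or "
        "underspecified."
    )

    resp = client.chat.completions.create(
        model="gpt-4o",  # any recent chat model
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": "Build me a tool that syncs our inventory."},
        ],
    )

    print(resp.choices[0].message.content)  # ideally: questions and pushback, not code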
replies(1): >>43665011 #
derefr ◴[] No.43665011[source]
Well, yes and no.

You can in-context-learn an LLM into being a domain expert in a specific domain — at which point it'll start challenging you within that domain.

But — AFAIK — you can't get current LLMs to do the thing that experienced programmers do, where they can "know you're wrong, even though they don't know why yet" — where the response isn't "no, that's wrong, and here's what's right:" but rather "I don't know about that... one minute, let me check something" — followed by motivated googling, consulting docs, etc.

And yes, the "motivated googling" part is something current models (DeepResearch) are capable of. But the layer above that is missing. You need a model with:

1. trained-in reflective awareness — "knowing what you know [and what you don't]" — such that there's a constant signal within the model representing "how confident I am in the knowledge / sources that I'm basing what I'm saying upon", discriminated as a synthesis/reduction over the set of "memories" the model is relying upon;

2. and a trained-in capability to evaluate the seeming authoritativeness and domain experience of the user, through their statements (or assertions-from-god in the system prompt about the user) — in order for the model to decide whether to trust a statement you think sounds "surprising", vs. when to say "uhhhhh lemme check that."
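
Put very crudely, the control flow I'm after looks something like this (a toy sketch with stub functions standing in for the trained-in machinery; none of these names are real APIs):

    def model_answer(prompt: str) -> str:
        return f"draft answer to: {prompt}"       # stand-in for the LLM itself

    def model_confidence(prompt: str) -> float:
        return 0.4                                # stand-in for signal (1): "how sure am I?"

    def check_sources(prompt: str) -> str:
        return "notes gathered from docs/search"  # stand-in for the motivated googling

    def respond(user_claim: str) -> str:
        draft = model_answer(user_claim)
        # Gate the reply on the model's own confidence in what the draft
        # leans on, rather than on how confident the user sounds.
        if model_confidence(user_claim) < 0.7:    # arbitrary threshold
            # "uhhhh lemme check that": verify before asserting or caving.
            evidence = check_sources(user_claim)
            draft = model_answer(f"{user_claim}\n(evidence: {evidence})")
        return draft

    print(respond("Postgres can't do row-level security"))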

replies(1): >>43691983 #
dsjoerg ◴[] No.43691983[source]
Yeah, I agree that the current generation of LLMs don't appear to have been trained on solid "epistemological behavior". I believe the underlying architecture is capable of it, but I see signs that the training data doesn't contain that sort of thing. In fact, in either the training or the prompting (or both), it seems like the LLMs I use have been tuned to do the opposite.