[imprecise thinking]
  v  <--- LLMs do this for you
[specific and exact commands]
  v
[computers]
  v
[specific and exact output]
  v  <--- LLMs do this for you
[contextualized output]
In many cases, you don't want or need that. In some, you do. Use the right tool for the job, etc.
To your point, which I think is separate but related: that IS a case where LLMs are good at producing specific and exact commands. The models + the right prompt are pretty reliable at tool calling by themselves, because you give them a list of specific and exact things they can do. And they can be fully specific and exact at inference time with constrained output (although you may still wish it had called a different tool).
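To make that concrete: a tool is typically described to the model as a name, a description, and a JSON Schema for its arguments. A minimal sketch, in the style of the OpenAI chat completions API (get_weather is a made-up tool, purely for illustration):

    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {  # JSON Schema: the "specific and exact" contract
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

The model never runs anything itself; it only has to emit a call matching one of these contracts, which is exactly the kind of output constrained decoding can enforce.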
What it's trying to communicate is that, in general, a human operating a computer has to turn their imprecise thinking into "specific and exact commands", and subsequently understand the "specific and exact output" in whatever terms they're thinking in, prioritizing and filtering out data based on situational context. LLMs enter the picture in two places:
1) In many situations, they can do the "imprecise thinking" -> "specific and exact commands" step for the user;
2) In many situations, they can do the "specific and exact output" -> "contextualized output" step for the user.
In such scenarios, LLMs are not replacing software; they're being slotted in as an intermediary between the user and classical software, so the user can operate closer to what's natural for them, rather than translating between that and rigid computer language themselves.
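A minimal sketch of that intermediary pattern, where llm() is a hypothetical placeholder for whatever model/API you actually use:

    import subprocess

    def llm(prompt: str) -> str:
        # Hypothetical placeholder: call your model of choice here.
        raise NotImplementedError

    def answer(user_request: str) -> str:
        # Step 1: imprecise thinking -> specific and exact command
        command = llm(f"Emit a single shell command for: {user_request}")
        # Classical software does the real work, exactly as rigid as ever
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        # Step 2: specific and exact output -> contextualized output
        return llm(f"The user asked: {user_request}\n"
                   f"Command output:\n{result.stdout}\n"
                   "Answer in the user's terms, filtering for what matters.")

The classical software in the middle stays as exact as it always was; only the translation layers on either side are fuzzy.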
This is not applicable everywhere, but then, this is also not the only way LLMs are useful - it's just one broad class of scenarios in which they are.
The model's output is a probability for every token. Constrained output is a feature of the inference engine: with a strict schema, the engine can mask out every token that doesn't adhere to the schema and select the top token among those that do.
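A minimal sketch of that logit-masking idea, assuming a grammar/schema checker has already computed which token ids are currently legal:

    import numpy as np

    def constrained_next_token(logits: np.ndarray, allowed_ids: list[int]) -> int:
        # Mask out every token that would violate the schema...
        mask = np.full_like(logits, -np.inf)
        mask[allowed_ids] = 0.0
        # ...then greedily pick the most likely token that remains legal.
        return int(np.argmax(logits + mask))

    logits = np.array([1.2, 3.4, 0.5, 2.8])
    print(constrained_next_token(logits, allowed_ids=[0, 2]))  # -> 0

The output is guaranteed to parse; whether it calls the tool you wanted is still up to the probabilities.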
This would be great if LLMs did not tend to output nonsense. Truly it would be grand. But they do. So it isn't. It's wasting resources hoping for a good outcome and risking frustration, misapprehensions, prompt injection attacks... It's non-deterministic algorithms hoping P=NP, except instead of branching at every decision you're doing search by tweaking vectors whose values you don't even know and whose influence on the outcome is impossible to foresee.
Sure, a VC-subsidized LLM is a great way to make CVs in LaTeX (I do it all the time), translate text, maybe even generate some code if you know what you need and can describe it well. I will give you that. I even created a few - very mediocre - songs. Am I contradicting myself? I don't think I am, because I would love to live in a hotel if I only had to pay a tiny fraction of the cost. But I would still think that building hotels would be a horrible way to address the housing crisis in modern metropolises.
I didn't mean it to be condescending - though I can see how it can come across as such. FWIW, I opted for a diagram after I typed half a page's worth of "normal" text and realized I still wasn't able to elucidate my point - so I deleted it and drew something matching my message more closely.
> This would be great if LLMs did not tend to output nonsense. Truly it would be grand. But they do. So it isn't.
I find this critique to be tiring at this point - it's just as wrong as assuming LLMs work perfectly and all is fine. Both views are too definite, too binary. In reality, LLMs are just non-deterministic - that is, they have an error rate. How big it is, and how small it can get in practice for a given task - those are the important questions.
Pretty much every aspect of computing is only probabilistically correct - either because the algorithm is explicitly so (UUIDs and primality testing, for starters), or just because it runs on real hardware, and physics happens. Most people get away with pretending that our systems are either correct or not, but that's only possible because the error rate is low enough. And it's never that low by accident - it got pushed there by careful design at every level, hardware and software. LLMs are just another probabilistically correct system that, over time, we'll learn how to use in ways that get the error rate low enough to stop worrying about it.
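Primality testing is a nice concrete example. A sketch of Miller-Rabin, where each extra round shrinks the false-positive rate by at least 4x, so you just dial the error down until it's negligible:

    import random

    def is_probably_prime(n: int, rounds: int = 40) -> bool:
        # Chance of a false "prime" verdict is at most 4**-rounds.
        if n < 2:
            return False
        for p in (2, 3, 5, 7, 11, 13):
            if n % p == 0:
                return n == p
        # Write n - 1 as d * 2**s with d odd
        d, s = n - 1, 0
        while d % 2 == 0:
            d, s = d // 2, s + 1
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(s - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False  # witness found: definitely composite
        return True  # no witness in `rounds` tries: almost certainly prime

Nobody loses sleep over that 4**-40; the interesting question is what the equivalent error-control levers look like for LLMs.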
How can we get there - now, that is an interesting challenge.
LLMs are cool technology, sure. There are a lot of cool things in the ML space. I love it.
But don't pretend the context of this conversation isn't the current hype, or that the hype isn't reaching absurd levels.
So yeah, we're all tired. Tired of the hype, of pushing LLMs, agents, whatever, as some sort of silver bullet. Tired of the corporate smoke screen around it. NLP is still a hard problem, we're nowhere near solving it, and bolting it onto everything is not a better idea now than it was before transformers and scaling laws.
On the other hand, my security research business is booming, and hey, the rational thing for me to say is: by all means, keep putting NLP everywhere.
Those are the big challenges of housing: not just how many units there are, but what they are, and how much of the "how many" is plain cheating.