I don't get it at all.
[imprecise thinking]
v <--- LLMs do this for you
[specific and exact commands]
v
[computers]
v
[specific and exact output]
v <--- LLMs do this for you
[contextualized output]
In many cases, you don't want or need that. In some, you do. Use the right tool for the job, etc.

To your point, which I think is separate but related, that IS a case where LLMs are good at producing specific and exact commands. The models + the right prompt are pretty reliable at tool calling by themselves, because you give them a list of specific and exact things they can do. And they can be made fully specific and exact at inference time with constrained output (although you may still wish they'd called a different tool).
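To make that concrete, here's a rough Python sketch of what "a list of specific and exact things they can do" amounts to. The tool names, schemas, and the model's reply are all made up for illustration; the point is that the model only picks a name and fills in arguments, and anything off the list gets refused:

    import json

    # Hypothetical tools the model is allowed to call. Names, descriptions and
    # parameter shapes are invented for this example.
    TOOLS = {
        "get_weather": {
            "description": "Current weather for a city",
            "parameters": {"city": str},
            "fn": lambda city: f"Sunny in {city}, 21C",
        },
        "list_files": {
            "description": "List files in a directory",
            "parameters": {"path": str},
            "fn": lambda path: "notes.txt, todo.md",
        },
    }

    def dispatch(tool_call_json: str) -> str:
        """Take the model's tool call, e.g.
        '{"name": "get_weather", "arguments": {"city": "Oslo"}}',
        and run the matching function -- or refuse if it's not on the list."""
        call = json.loads(tool_call_json)
        tool = TOOLS.get(call["name"])
        if tool is None:
            return f"error: unknown tool {call['name']!r}"
        return tool["fn"](**call["arguments"])

    # Pretend the model emitted this after seeing the tool list in its prompt:
    print(dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}'))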
What it's trying to communicate is that, in general, a human operating a computer has to turn their imprecise thinking into "specific and exact commands", and subsequently understand the "specific and exact output" in whatever terms they're thinking of, prioritizing and filtering data based on situational context. LLMs enter the picture in two places:
1) In many situations, they can do the "imprecise thinking" -> "specific and exact commands" step for the user;
2) In many situations, they can do the "specific and exact output" -> "contextualized output" step for the user.
In such scenarios, LLMs are not replacing software; they're being slotted in as an intermediary between the user and classical software, so the user can operate closer to what's natural for them instead of translating back and forth between that and rigid computer language.
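A toy sketch of that intermediary role, in Python. The ask_model stub stands in for a real LLM call (hard-coded here so the sketch runs end to end), and in practice you'd want to sanity-check the generated command before executing it:

    import subprocess

    def ask_model(prompt: str) -> str:
        """Stand-in for a real LLM call. Canned answers here; in practice this
        is where the model does the two translation steps."""
        if "shell command" in prompt:
            return "df -h /"   # pretend translation of the user's request
        return "Your root disk is about 40% full, so no need to clean up yet."

    def handle(user_request: str) -> str:
        # imprecise thinking -> specific and exact command (LLM, step 1)
        command = ask_model(f"Turn this into a single shell command: {user_request}")
        # the classical tool does exactly what it's told (no LLM involved)
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        # specific and exact output -> contextualized output (LLM, step 2)
        return ask_model(f"Explain this output of `{command}` to the user:\n{result.stdout}")

    print(handle("am I running out of disk space?"))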
This is not applicable everywhere, but then, this is also not the only way LLMs are useful - it's just one broad class of scenarios in which they are.
The model's output is a probability distribution over every token in its vocabulary. Constrained output is a feature of the inference engine: given a strict schema, it can mask out every token that doesn't adhere to the schema and select the top token among those that do.
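Roughly like this, as a minimal sketch. In a real inference engine the allowed set comes from a grammar or JSON-schema state machine tracking what's valid at the current position; here it's just hard-coded, and the logit numbers are made up:

    import math

    def constrained_pick(logits: dict[str, float], allowed: set[str]) -> str:
        """Pick the highest-scoring token, but only among tokens the schema
        allows at this position. Masking a token is equivalent to setting its
        logit to -inf, so it can never win."""
        masked = {tok: (lp if tok in allowed else -math.inf) for tok, lp in logits.items()}
        return max(masked, key=masked.get)

    # Model's raw preferences for the next token (invented numbers):
    logits = {'"': 1.2, "Sure": 3.5, "{": 0.9, "xyz": -2.0}
    # Mid-way through emitting JSON, suppose the schema only permits these:
    allowed = {'"', "{"}

    # Unconstrained, the model would say "Sure"; constrained, it emits '"',
    # the top token that still adheres to the schema.
    print(constrained_pick(logits, allowed))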