
A non-anthropomorphized view of LLMs

(addxorrol.blogspot.com)
477 points by zdw | 1 comment
Al-Khwarizmi No.44487564
I have the technical knowledge to know how LLMs work, but I still find it pointless to not anthropomorphize, at least to an extent.

The language of "generator that stochastically produces the next word" is just not very useful when you're talking about, e.g., an LLM that is answering complex world-modeling questions or generating a creative story. It's at the wrong level of abstraction, just as if you were discussing a UI events API in terms of zeros and ones, or voltages in transistors. Technically fine, but totally useless for reaching any conclusion about the high-level system.
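
(To make concrete what "generator that stochastically produces the next word" refers to at the low level, here is a minimal toy sketch in Python. The vocabulary, the scores, and the temperature value are invented for illustration and do not come from any real model; only the sampling step itself is the point.)

    # Toy sketch of stochastic next-word selection. The candidate words and
    # scores below are made up; a real LLM scores tens of thousands of tokens,
    # but the final sampling step looks essentially like this.
    import math
    import random

    def sample_next_word(scores, temperature=1.0):
        # Softmax with temperature: higher temperature flattens the
        # distribution, making lower-scored words more likely to be picked.
        words = list(scores.keys())
        scaled = [scores[w] / temperature for w in words]
        m = max(scaled)
        exps = [math.exp(x - m) for x in scaled]
        total = sum(exps)
        probs = [e / total for e in exps]
        return random.choices(words, weights=probs, k=1)[0]

    # Hypothetical scores for continuations of "The cat sat on the".
    scores = {"mat": 4.1, "floor": 2.3, "keyboard": 1.7, "moon": -0.5}
    print(sample_next_word(scores, temperature=0.8))

Run repeatedly, this usually prints "mat" but sometimes one of the others, which is all that "stochastically produces the next word" commits to; it says nothing useful about the higher-level behavior being discussed here.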

We need a higher abstraction level to talk about higher-level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels. So, considering that LLMs somehow imitate humans (at least in terms of output), anthropomorphization is the best abstraction we have, hence people naturally resort to it when discussing what LLMs can do.

1. lo_zamoyski No.44491378
These anthropomorphizations are best described as metaphors when people use them to describe LLMs in common or loose speech. We already use anthropomorphic metaphors when talking about computers. LLMs, like all computation, are a matter of simulation; they can appear to be conversing without actually conversing. What distinguishes the real thing from the simulation is the cause that produces the appearance of the effect. Problems occur when people forget these words are being used metaphorically, as if they were univocal.

Of course, LLMs are multimodal and are used to simulate all sorts of things, not just conversation. So there are many possible metaphors we can use, and these metaphors don't necessarily align with the abstractions you might use to talk about LLMs accurately. This is like the difference between "synthesizes text" (abstraction) and "speaks" (metaphor), or "synthesizes images" (abstraction) and "paints" (metaphor). You can, of course, still use "speaks" or "paints" when talking about the abstractions.