A non-anthropomorphized view of LLMs

The problem with viewing LLMs as just sequence generators, and malbehaviour as bad sequences, is that it simplifies too much. LLMs have hidden state not necessarily directly reflected in the tokens being produced and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer term outcomes (or predictions, if you prefer).

Is it too anthropomorphic to say that this is a lie? To say that the hidden state and its long term predictions amount to a kind of goal? Maybe it is. But we then need a bunch of new words which have almost 1:1 correspondence to concepts from human agency and behavior to describe the processes that LLMs simulate to minimize prediction loss.

Reasoning by analogy is always shaky. It probably wouldn't be so bad to do so. But it would also amount to impenetrable jargon. It would be an uphill struggle to promulgate.

Instead, we use the anthropomorphic terminology, and then find ways to classify LLM behavior in human concept space. They are very defective humans, so it's still a bit misleading, but at least jargon is reduced.

Maybe it's just because so much of my work for so long has focused on models with hidden states but this is a fairly classical feature of some statistical models. One of the widely used LLM textbooks even started with latent variable models; LLMs are just latent variable models just on a totally different scale, both in terms of number of parameters but also model complexity. The scale is apparently important, but seeing them as another type of latent variable model sort of dehumanizes them for me.

Latent variable or hidden state models have their own history of being seen as spooky or mysterious though; in some ways the way LLMs are anthropomorphized is an extension of that.

I guess I don't have a problem with anthropomorphizing LLMs at some level, because some features of them find natural analogies in cognitive science and other areas of psychology, and abstraction is useful or even necessary in communicating and modeling complex systems. However, I do think anthropomorphizing leads to a lot of hype and tends to implicitly shut down thinking of them mechanistically, as a mathematical object that can be probed and characterized — it can lead to a kind of "ghost in the machine" discourse and an exaggeration of their utility, even if it is impressive at times.