A non-anthropomorphized view of LLMs

> In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".

I think that's a bit pessimistic. I think we can say for instance that the probability that a person will say "the the the of of of arpeggio halcyon" is tiny compared to the probability that they will say "I haven't been getting that much sleep lately". And we can similarly see that lots of other sequences are going to have infinitesimally low probability. Now, yeah, we can't say exactly what probability that is, but even just using a fairly sizable corpus as a baseline you could probably get a surprisingly decent estimate, given how much of what people say is formulaic.

The real difference seems to be that the manner in which humans generate sequences is more intertwined with other aspects of reality. For instance, the probability of a certain human saying "I haven't been getting that much sleep lately" is connected to how much sleep they have been getting lately. For an LLM it really isn't connected to anything except word sequences in its input.

I think this is consistent with the author's point that we shouldn't apply concepts like ethics or emotions to LLMs. But it's not because we don't know how to predict what sequences of words humans will use; it's rather because we do know a little about how to do that, and part of what we know is that it is connected with other dimensions of physical reality, "human nature", etc.

This is one reason I think people underestimate the risks of AI: the performance of LLMs lulls us into a sense that they "respond like humans", but in fact the Venn diagram of human and LLM behavior only intersects in a relatively small area, and in particular they have very different failure modes.