A non-anthropomorphized view of LLMs

(addxorrol.blogspot.com)

475 points zdw | 1 comments | 06 Jul 25 22:26 UTC

barrkel ◴[06 Jul 25 23:14 UTC] No.44485012[source]▶

The problem with viewing LLMs as just sequence generators, and malbehaviour as bad sequences, is that it simplifies too much. LLMs have hidden state not necessarily directly reflected in the tokens being produced and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer term outcomes (or predictions, if you prefer).

Is it too anthropomorphic to say that this is a lie? To say that the hidden state and its long term predictions amount to a kind of goal? Maybe it is. But we then need a bunch of new words which have almost 1:1 correspondence to concepts from human agency and behavior to describe the processes that LLMs simulate to minimize prediction loss.

Reasoning by analogy is always shaky, so coining those new words probably wouldn't be so bad. But it would also amount to impenetrable jargon, and it would be an uphill struggle to promulgate.

Instead, we use the anthropomorphic terminology, and then find ways to classify LLM behavior in human concept space. They are very defective humans, so it's still a bit misleading, but at least jargon is reduced.

replies(7): >>44485190 #>>44485198 #>>44485223 #>>44486284 #>>44487390 #>>44489939 #>>44490075 #

positron26 ◴[07 Jul 25 02:46 UTC] No.44486284[source]▶

>>44485012 #

> Is it too anthropomorphic to say that this is a lie?

Yes. Current LLMs can only introspect from output tokens. To lie, you need hidden reasoning inside the black box, self-knowledge, intent, and motive.

I rather think accusing an LLM of lying is like accusing a mousetrap of being a murderer.

When models have online learning, complex internal states, and reflection, I might consider one to have consciousness and to be capable of lying. It will need to manifest behaviors that can only emerge from the properties I listed.

I've seen similar arguments where people assert that LLMs cannot "grasp" what they are talking about. I strongly suspect a high degree of overlap between those willing to anthropomorphize error bars as lies and those who decline to award LLMs "grasping". Which is it? It can think or it cannot? (Objectively, SoTA models today cannot yet.) The willingness to waffle and pivot to whichever perspective damns the machine betrays the lack of honesty in such conversations.

replies(1): >>44486303 #

lostmsu ◴[07 Jul 25 02:51 UTC] No.44486303[source]▶

>>44486284 #

> Current LLMs can only introspect from output tokens

The only interpretation of this statement I can come up with is plain wrong. There's no reason an LLM shouldn't be able to introspect without any output tokens. As the GP correctly says, most of the processing in LLMs happens over hidden states. Output tokens are just an artefact for our convenience, which also happens to be the way the hidden state processing is trained.

replies(3): >>44486324 #>>44487399 #>>44487619 #

positron26 ◴[07 Jul 25 02:57 UTC] No.44486324[source]▶

>>44486303 #

There are no recurrent paths besides tokens. How may I introspect something if it is not an input? I may not.

replies(3): >>44487610 #>>44488622 #>>44488738 #

barrkel ◴[07 Jul 25 10:09 UTC] No.44488622[source]▶

>>44486324 #

The recurrence comes from replaying tokens during autoregression.

It's as if you have a variable in a deterministic programming language, only you have to replay the entire history of the program's computation and input to get the next state of the machine (program counter + memory + registers).

Producing a token for an LLM is analogous to a tick of the clock for a CPU. It's the crank handle that drives the process.
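
To make the replay analogy concrete, here is a rough sketch in Python. It is a toy, not a transformer: `step`, `hidden_state`, `VOCAB_SIZE` and the arithmetic inside them are all hypothetical names and choices made up for illustration. The only point is that a state which is never stored between ticks can still be fully determined by replaying the token history, so carrying it forward (loosely like caching intermediate results) produces exactly the same output.

```python
from typing import List

VOCAB_SIZE = 16          # hypothetical toy vocabulary size
MODULUS = 1_000_003      # hypothetical; just keeps the toy state bounded


def step(state: int, token: int) -> int:
    """Hypothetical deterministic update: fold one token into the state."""
    return (state * 31 + token + 7) % MODULUS


def hidden_state(tokens: List[int]) -> int:
    """Rebuild the state from scratch by replaying the whole token history."""
    state = 0
    for token in tokens:
        state = step(state, token)
    return state


def generate_by_replay(prompt: List[int], n: int) -> List[int]:
    """Each tick replays the full history; nothing but tokens is kept."""
    tokens = list(prompt)
    for _ in range(n):
        tokens.append(hidden_state(tokens) % VOCAB_SIZE)
    return tokens


def generate_with_carried_state(prompt: List[int], n: int) -> List[int]:
    """Same process, but the state is carried forward instead of replayed."""
    tokens = list(prompt)
    state = hidden_state(tokens)
    for _ in range(n):
        token = state % VOCAB_SIZE
        tokens.append(token)
        state = step(state, token)
    return tokens


if __name__ == "__main__":
    prompt = [1, 2, 3]
    # Both strategies yield the same continuation.
    print(generate_by_replay(prompt, 5) == generate_with_carried_state(prompt, 5))
```

In this toy, each appended token is the "tick of the clock": the state only advances when a token is produced, whether it is replayed or carried.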
