
A non-anthropomorphized view of LLMs

(addxorrol.blogspot.com)
475 points | zdw
barrkel ◴[] No.44485012[source]
The problem with viewing LLMs as just sequence generators, and misbehaviour as bad sequences, is that it simplifies too much. LLMs have hidden state that is not necessarily reflected directly in the tokens being produced, and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer-term outcomes (or predictions, if you prefer).
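
To make "hidden state" concrete, here is a minimal sketch (GPT-2 via the HuggingFace transformers library is only a stand-in, and the prompt is arbitrary): the per-layer activations the model computes dwarf the single token that finally gets sampled from the logits.

    # Minimal sketch: per-layer hidden states vs. the one sampled token (GPT-2 as a stand-in).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("The capital of France is", return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)

    # 13 tensors (embedding output + 12 layers), each of shape (1, seq_len, 768)
    print(len(out.hidden_states), out.hidden_states[-1].shape)

    # All of that is collapsed into a single sampled token id
    next_id = out.logits[0, -1].argmax().item()
    print(tok.decode(next_id))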

Is it too anthropomorphic to say that this is a lie? To say that the hidden state and its long term predictions amount to a kind of goal? Maybe it is. But we then need a bunch of new words which have almost 1:1 correspondence to concepts from human agency and behavior to describe the processes that LLMs simulate to minimize prediction loss.

Reasoning by analogy is always shaky, so coining a bunch of new words probably wouldn't be so bad. But it would also amount to impenetrable jargon, and it would be an uphill struggle to promulgate.

Instead, we use the anthropomorphic terminology, and then find ways to classify LLM behavior in human concept space. Treated this way, LLMs are very defective humans, so it's still a bit misleading, but at least the jargon is reduced.

replies(7): >>44485190 #>>44485198 #>>44485223 #>>44486284 #>>44487390 #>>44489939 #>>44490075 #
positron26 ◴[] No.44486284[source]
> Is it too anthropomorphic to say that this is a lie?

Yes. Current LLMs can only introspect from output tokens. Lying requires hidden reasoning inside the black box, self-knowledge, intent, and motive.

I rather think accusing an LLM of lying is like accusing a mousetrap of being a murderer.

When models have online learning, complex internal states, and reflection, I might consider one to have consciousness and to be capable of lying. It will need to manifest behaviors that can only emerge from the properties I listed.

I've seen similar arguments where people assert that LLMs cannot "grasp" what they are talking about. I strongly suspect a high degree of overlap between those willing to anthropomorphize error bars as lies and those declining to credit LLMs with "grasping". Which is it? Can it think or can it not? (Objectively, SoTA models today cannot yet.) The willingness to waffle and pivot to whichever perspective damns the machine betrays the lack of honesty in such conversations.

replies(1): >>44486303 #
lostmsu ◴[] No.44486303[source]
> Current LLMs can only introspect from output tokens

The only interpretation of this statement I can come up with is plainly wrong. There's no reason an LLM shouldn't be able to introspect without any output tokens. As the GP correctly says, most of the processing in LLMs happens over hidden states. Output tokens are just an artefact for our convenience, which also happens to be the way the hidden-state processing is trained.
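
A toy, dependency-light sketch of what I mean (the layer sizes are arbitrary and the causal mask is omitted for brevity): in a decoder-only transformer, almost all of the computation maps hidden states to hidden states; token logits are a single projection bolted on at the very end.

    # Toy decoder-style stack: hidden-state -> hidden-state, tokens only at the end.
    import torch
    import torch.nn as nn

    d_model, n_heads, n_layers, vocab = 64, 4, 4, 1000
    embed = nn.Embedding(vocab, d_model)
    layers = nn.ModuleList(
        [nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
         for _ in range(n_layers)]
    )
    lm_head = nn.Linear(d_model, vocab)

    tokens = torch.randint(0, vocab, (1, 8))   # dummy prompt
    h = embed(tokens)                          # hidden state, no longer tokens
    for layer in layers:
        h = layer(h)                           # hidden -> hidden, no tokens involved
    logits = lm_head(h[:, -1])                 # tokens only appear here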

replies(3): >>44486324 #>>44487399 #>>44487619 #
positron26 ◴[] No.44486324[source]
There are no recurrent paths besides tokens. How may I introspect something if it is not an input? I may not.
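
Concretely, here is the generation loop as a minimal sketch (GPT-2 as a stand-in, greedy decoding): the only thing carried from one step to the next is the growing list of token ids; every activation computed along the way is dropped.

    # Greedy decoding: tokens are the only recurrent path.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("There are no recurrent paths besides", return_tensors="pt").input_ids
    for _ in range(10):
        with torch.no_grad():
            logits = model(ids).logits               # all hidden activations live here...
        next_id = logits[0, -1].argmax().view(1, 1)  # ...and are reduced to one token id
        ids = torch.cat([ids, next_id], dim=-1)      # the only "memory" carried forward
    print(tok.decode(ids[0]))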
replies(3): >>44487610 #>>44488622 #>>44488738 #
hackinthebochs ◴[] No.44488738[source]
Important attention heads or layers within an LLM can be repeated, giving you an "unrolled" recursion.
replies(1): >>44488792 #
positron26 ◴[] No.44488792[source]
An unrolled loop in a feed-forward network is just that: unrolled. The computation is a DAG.
replies(1): >>44488860 #
hackinthebochs ◴[] No.44488860[source]
But the function of an unrolled recursion is the same as that of a recursive function with bounded depth, as long as the number of unrolled steps matches the depth. The point is that whatever capability recursion is supposed to provide can plausibly be present in LLMs.
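
A tiny sketch of the formal claim, with one weight-tied linear block standing in for a transformer layer (whether trained, non-shared layers actually implement such a recursion is the open question): a recursion bounded to N steps and its N-step unrolling compute the same function.

    # Bounded recursion vs. its unrolling: same function when depths match.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    block = nn.Linear(16, 16)    # stand-in for a (weight-tied) layer
    x = torch.randn(1, 16)
    N = 4

    def recurse(h, depth):
        return h if depth == 0 else recurse(block(h), depth - 1)

    h_unrolled = x
    for _ in range(N):           # explicit unrolling: a pure DAG of N applications
        h_unrolled = block(h_unrolled)

    assert torch.allclose(recurse(x, N), h_unrolled)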
replies(1): >>44489307 #
positron26 ◴[] No.44489307[source]
And then, at the next token, all of that bounded-depth computation is thrown away except for the single token of output.

You're fixating on the pseudo-computation within a single token pass. It is very limited compared to genuine hidden-state retention and the introspection that would enable, if we knew how to train for it and already had online learning.

The "reasoning" hack would not be a realistic implementation choice if the models had hidden state and could ruminate on it without showing us output.
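
One way to see that nothing but the tokens survives between steps: the KV cache, the only thing an implementation carries across them, is a pure function of the visible tokens and exists purely as a speed optimization. A small check, sketched with GPT-2 and greedy decoding (the prompt is arbitrary):

    # Greedy decoding with and without the KV cache (barring float round-off,
    # the outputs match: the cache adds no information beyond the tokens).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    ids = tok("The reasoning hack", return_tensors="pt").input_ids

    kwargs = dict(max_new_tokens=20, do_sample=False, pad_token_id=tok.eos_token_id)
    with_cache = model.generate(ids, use_cache=True, **kwargs)
    no_cache = model.generate(ids, use_cache=False, **kwargs)
    print((with_cache == no_cache).all())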

replies(1): >>44489453 #
hackinthebochs ◴[] No.44489453[source]
Sure. But notice that "ruminate" is different from "introspect", which was what your original comment was about.