A non-anthropomorphized view of LLMs

(addxorrol.blogspot.com)
475 points by zdw | 4 comments
Al-Khwarizmi ◴[] No.44487564[source]
I have the technical knowledge to know how LLMs work, but I still find it pointless to not anthropomorphize, at least to an extent.

The language of "generator that stochastically produces the next word" is just not very useful when you're talking about, e.g., an LLM that is answering complex world-modeling questions or generating a creative story. It's at the wrong level of abstraction, just as if you were discussing a UI events API by talking about zeros and ones, or voltages in transistors. Technically fine, but totally useless for reaching any conclusion about the high-level system.

We need a higher abstraction level to talk about higher level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels. So, considering that LLMs somehow imitate humans (at least in terms of output), anthropomorphization is the best abstraction we have, hence people naturally resort to it when discussing what LLMs can do.

replies(18): >>44487608 #>>44488300 #>>44488365 #>>44488371 #>>44488604 #>>44489139 #>>44489395 #>>44489588 #>>44490039 #>>44491378 #>>44491959 #>>44492492 #>>44493555 #>>44493572 #>>44494027 #>>44494120 #>>44497425 #>>44500290 #
grey-area ◴[] No.44487608[source]
On the contrary, anthropomorphism IMO is the main problem with narratives around LLMs. People are genuinely talking about them thinking and reasoning when they are doing nothing of the sort (actively encouraged by the companies selling them), and it is completely distorting discussions of their use and perceptions of their utility.
replies(13): >>44487706 #>>44487747 #>>44488024 #>>44488109 #>>44489358 #>>44490100 #>>44491745 #>>44493260 #>>44494551 #>>44494981 #>>44494983 #>>44495236 #>>44496260 #
fenomas ◴[] No.44488109[source]
When I see these debates it's always the other way around - one person speaks colloquially about an LLM's behavior, and then somebody else jumps on them for supposedly believing the model is conscious, just because the speaker said "the model thinks.." or "the model knows.." or whatever.

To be honest the impression I've gotten is that some people are just very interested in talking about not anthropomorphizing AI, and less interested in talking about AI behaviors, so they see conversations about the latter as a chance to talk about the former.

replies(4): >>44488326 #>>44489402 #>>44489673 #>>44492369 #
Wowfunhappy ◴[] No.44492369[source]
As I write this, Claude Code is currently opening and closing various media files on my computer. Sometimes it plays the file for a few seconds before closing it, sometimes it starts playback and then seeks to a different position, sometimes it fast forwards or rewinds, etc.

I asked Claude to write an E-AC3 audio component so I can play videos with E-AC3 audio in the old version of QuickTime I really like using. Claude's decoder includes the ability to write debug output to a log file, so Claude is studying how QuickTime and the component interact, and it's controlling QuickTime via AppleScript.

Sometimes QuickTime crashes, because this ancient API has its roots in the classic Mac OS days and is not exactly good. Claude reads the crash logs on its own—it knows where they are—and continues on its way. I'm just sitting back and trying to do other things while Claude works, although it's a little distracting that something else is using my computer at the same time.

I really don't want to anthropomorphize these programs, but it's just so hard when it's acting so much like a person...

replies(1): >>44495063 #
godelski ◴[] No.44495063[source]
Would it help you to know that trial and error is a common tactic for machines? Yes, humans do it too, but that doesn't mean the process isn't mechanical. In fact, in computing we might call this a "brute force" approach. You don't have to cover the entire search space to brute force something, and brute forcing certainly doesn't mean you have no optimization strategy and must resort to a naive grid search (e.g. you can use Bayesian methods, multi-armed bandit approaches, or a whole world of other things).

I would call "fuck around and find out" a rather simple approach. That is why we use it! It is why lots of animals use it, even very dumb animals. Though we do notice that more intelligent animals use more efficient optimization methods. All of this is technically hypothesis testing, even a naive grid search. But it is still in the class of "fuck around and find out" or "brute force", right?
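
To make "mechanical trial and error" concrete, here's a toy epsilon-greedy multi-armed bandit in Python (a minimal sketch with made-up payoff numbers, not anything Claude actually runs). It mostly exploits the best-known option and occasionally pokes a random one, which from the outside can look like purposeful experimentation:

    import random

    def epsilon_greedy(pull, n_arms, steps=1000, eps=0.1):
        # Mechanical trial and error: mostly exploit the best-known arm,
        # occasionally try a random one to see what happens.
        counts = [0] * n_arms
        values = [0.0] * n_arms  # running mean reward per arm
        for _ in range(steps):
            if random.random() < eps:
                arm = random.randrange(n_arms)  # explore ("fuck around")
            else:
                arm = max(range(n_arms), key=lambda a: values[a])  # exploit
            reward = pull(arm)  # observe the outcome ("find out")
            counts[arm] += 1
            values[arm] += (reward - values[arm]) / counts[arm]
        return values

    # Hypothetical setup: arm 2 secretly pays off most often.
    payoffs = [0.2, 0.5, 0.8]
    estimates = epsilon_greedy(
        lambda a: 1.0 if random.random() < payoffs[a] else 0.0, 3)
    print(estimates)  # the estimates converge toward the true payoff rates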

I should also mention two important things.

1) As humans, we are biased to anthropomorphize. We see faces in clouds. We tell stories of mighty beings controlling the world in an effort to explain why things happen. This is anthropomorphization of the universe itself!

2) We design LLMs (and many other large ML systems) to optimize towards human preference. This reinforces an anthropomorphized interpretation.

The reason for doing this (2) is based on a naive assumption[0]: If it looks like a duck, swims like a duck, and quacks like a duck, then it *probably* is a duck. But the duck test doesn't rule out a highly sophisticated animatronic. It's a good rule of thumb, but wouldn't it also be incredibly naive to assume that it *is* a duck? Isn't the duck test itself entirely dependent on our own personal familiarity with ducks? I think this is important to remember and can help combat our own propensity for creating biases.

[0] It is not a bad strategy to build in that direction. When faced with many possible ways to go, this is a very reasonable approach. The naive part is if you assume that it will take you all the way to making a duck. It is also a perilous approach because you are explicitly making it harder for you to evaluate. It is, in the fullest sense of the phrase, "metric hacking."

replies(1): >>44495413 #
Wowfunhappy ◴[] No.44495413[source]
It wasn't a simple brute force. When Claude was working this morning, it was pretty clearly only playing a file when it actually needed to see packets get decoded; otherwise it would simply open and close the document. Similarly, it would only seek or fast forward when it was debugging specific issues related to those actions. And it even "knew" which test files to open for specific channel layouts.

Yes this is still mechanical in a sense, but then I'm not sure what behavior you wouldn't classify as mechanical. It's "responding" to stimuli in logical ways.

But I also don't quite know where I'm going with this. I don't think LLMs are sentient or something, I know they're just math. But it's spooky.

replies(1): >>44495544 #
godelski ◴[] No.44495544{3}[source]

  > It wasn't a simple brute force.
I think you misunderstood me.

"Simple" is the key word here, right? You agree that it is still under the broad class of "brute force"?

I'm not saying Claude is naively brute forcing. In fact, given the lack of interpretability of these machines, it is difficult to say what kind of optimization it is doing and how complex it is (this was a key part, tbh).

My point was to help with this:

  > I really don't want to anthropomorphize these programs, but it's just so hard when it's acting so much like a person...
Which requires you to understand how some actions can be mechanical. You admitted to cognitive dissonance (something we all do and I fully agree is hard not to do) and wanting to fight it. We're just trying to find some helpful avenues to do so.

  > It's "responding" to stimuli in logical ways.
And so too can a simple program, right? A program can respond to user input and there is certainly a logic path it will follow. Our non-ML program is likely going to have a deterministic path (there is still probabilistic programming...), but that doesn't mean it isn't logic, right?
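
As a minimal sketch of that point (a made-up thermostat in Python, nothing from this thread): a plain deterministic program also "responds" to stimuli in logical ways, with nothing mind-like going on:

    # A thermostat responds to stimuli in logical ways, with no cognition:
    def respond(temperature_c: float) -> str:
        if temperature_c < 18.0:
            return "heat on"
        if temperature_c > 24.0:
            return "cooling on"
        return "idle"

    print(respond(15.0))  # -> heat on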

But the real question here, which you have to ask yourself (constantly) is "how do I differentiate a complex program that I don't understand from a conscious entity?" I guarantee you that you don't have the answer (because no one does). But isn't that a really good reason to be careful about anthropomorphizing it?

That's the duck test.

How do you determine if it is a real duck or a highly sophisticated animatronic?

If you anthropomorphize, you rule out the possibility that it is a highly sophisticated animatronic, and you *MUST* make the assumption that you are not only an expert, but a perfect duck detector. But simultaneously we cannot rule out that it is a duck, right? Because we aren't a perfect duck detector *AND* we aren't experts in highly sophisticated animatronics (especially of the duck kind).

Remember, there are not two answers to every True-False question; there are three: "True", "False", or "Indeterminate". So don't naively assume it is binary. We all know the Halting Problem, right? (Also see my namesake, or quantum physics, if you want to see such things pop up outside computing.)
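
A minimal sketch of the three-valued framing (hypothetical names and thresholds, just to make the shape concrete):

    from enum import Enum

    class Verdict(Enum):
        TRUE = "true"
        FALSE = "false"
        INDETERMINATE = "indeterminate"

    def duck_test(confidence: float) -> Verdict:
        # Only commit when the evidence is overwhelming; otherwise admit
        # we don't know. "Indeterminate" is a legitimate answer.
        if confidence > 0.95:
            return Verdict.TRUE
        if confidence < 0.05:
            return Verdict.FALSE
        return Verdict.INDETERMINATE

    print(duck_test(0.6))  # -> Verdict.INDETERMINATE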

Though I agree, it can be very spooky. But that only increases the importance of trying to develop mental models that help us more objectively evaluate things. And that requires "indeterminate" be a possibility. This is probably the best place to start to combat the cognitive dissonance.

replies(1): >>44497330 #
donkeybeer ◴[] No.44497330{4}[source]
I have no idea why some people take so much offense at the fact that humans are just another machine; there's no reason another machine can't surpass us here, as machines already have in all other avenues. Many of the reasons people give for LLMs not being conscious are just as applicable to humans too.
replies(2): >>44497561 #>>44498411 #
godelski ◴[] No.44498411{5}[source]
I don't think the question is whether humans are machines or not, but rather what is meant by "machine". Most people interpret it as meaning deterministic and thus having no free will. That's probably not what you're trying to convey, so it might not be the best word to use.

But the question is: what is special about the human machine? What is special about the animal machine? These are different from all the machines we have built. Is it complexity? Is it indeterminism? Is it something more? Certainly these machines have feelings, and we need to account for them when interacting with them.

Though we're getting well off topic from determining whether a duck is a duck or a machine (you know what I mean by this word, and that I don't mean a normal duck).

replies(1): >>44508550 #
donkeybeer ◴[] No.44508550[source]
What is indeterminism here? I am not sure the question of having or not having free will has any impact on how to make human machines. We are just as in the dark about the future whether we have free will or not. I am not certain of any physical problem in which free will, or the lack of it, plays a role. I could be wrong. So it's probably an interesting question, but rather pointless.

Even with the everyday machines and programs we have, we can make them behave based on random input taken, for example, from physical noise. It doesn't suddenly make them a special or different type of machine.
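
A minimal sketch of that point (Python standard library only; which physical sources feed the OS entropy pool varies by platform):

    import os
    import random

    # Deterministic machine: same seed, same behavior, every time.
    fixed_rng = random.Random(42)

    # Seeded from the OS entropy pool, which mixes in physical noise
    # (interrupt timings, hardware RNGs, etc., depending on the platform).
    noisy_rng = random.Random(int.from_bytes(os.urandom(8), "big"))

    # Both expose the same interface; the source of randomness doesn't
    # turn the program into a different kind of machine.
    print(fixed_rng.random(), noisy_rng.random())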

replies(1): >>44515911 #
godelski ◴[] No.44515911[source]
I think you missed my point. I agree, being a machine doesn't mean you don't have free will. I agree, free will is orthogonal to being (or not being) a machine.

But that's not what my comment was about.

My comment was about *what the average person interprets*.

You asked why people take offense to being called a machine, and I'm trying to explain that. But to understand this we have to understand that there isn't a singular objective way to interpret statements. We can agree that language is fuzzy, right?

So let me try to translate, again.

You say: "People are machines"

(Many) People hear: "People are mechanical automata, running pre-defined routines"

I hear you, this is not what you are trying to communicate. That's not what you want them to hear. But if you want them to hear what you actually mean it is very helpful to understand that some people will hear something different.

Why do they hear the other thing? Because they don't have intimate familiarity with machines and how general that word is. *You have a better understanding of what a machine is than most people.* That's likely the cause for miscommunication.

When they think of a machine they think of things like a car, a computer, a blender, a TV, an oven, or a multitude of other similar things. Even if some of these use probabilistic programming, the average person is not going to know what probabilistic programming even is. They just see something mechanical. Deterministic.

I'm sure you know this, but it is worth reiterating. Communication has 3 main components: What you intend to communicate, the words/gestures/etc you use to communicate, and what the other person hears. Unfortunately (fortunately?) we can't communicate telepathically, so don't forget that the person you're talking to can have a reasonable interpretation that is significantly different from what you intended to say.

replies(1): >>44516027 #
donkeybeer ◴[] No.44516027{3}[source]
Oh of course. I feel I should have been clearer that I meant among technical persons, not average randoms.

When talking about people who are not mathematicians or computer scientists, then on average, yes, absolutely, they hear something like that when told humans are machines.