A non-anthropomorphized view of LLMs

(addxorrol.blogspot.com)
475 points by zdw | 16 comments
barrkel ◴[] No.44485012[source]
The problem with viewing LLMs as just sequence generators, and misbehaviour as bad sequences, is that it simplifies too much. LLMs have hidden state that is not necessarily reflected directly in the tokens being produced, and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer-term outcomes (or predictions, if you prefer).
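
To make the hidden-state point concrete, here's a toy decode loop (pure Python, no real LLM API; the toy_step function is just an assumed stand-in): at every step the model computes a full distribution and carries forward internal activations, but only one sampled token is ever surfaced.

    import random

    def toy_step(context, state):
        # Hypothetical stand-in for a transformer forward pass: returns a
        # distribution over a tiny vocabulary plus updated internal
        # activations (the "hidden state").
        vocab = ["yes", "no", "maybe"]
        logits = [1.0 + len(context) % 3, 1.0, 0.5]
        total = sum(logits)
        probs = [x / total for x in logits]
        return vocab, probs, state + [probs]  # activations the user never sees

    context, state = ["<prompt>"], []
    for _ in range(3):
        vocab, probs, state = toy_step(context, state)
        token = random.choices(vocab, weights=probs)[0]  # only this is emitted
        context.append(token)

    print(context)  # the visible token sequence
    print(state)    # internal state the tokens only partially reflect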

Is it too anthropomorphic to say that this is a lie? To say that the hidden state and its long term predictions amount to a kind of goal? Maybe it is. But we then need a bunch of new words which have almost 1:1 correspondence to concepts from human agency and behavior to describe the processes that LLMs simulate to minimize prediction loss.

Reasoning by analogy is always shaky, so coining new words probably wouldn't be such a bad idea. But it would also amount to impenetrable jargon, and it would be an uphill struggle to promulgate.

Instead, we use the anthropomorphic terminology, and then find ways to classify LLM behavior in human concept space. By that framing they are very defective humans, so it's still a bit misleading, but at least the jargon is reduced.

replies(7): >>44485190 #>>44485198 #>>44485223 #>>44486284 #>>44487390 #>>44489939 #>>44490075 #
1. cmiles74 ◴[] No.44485198[source]
IMHO, anthropomorphization of LLMs is happening because it's perceived as good marketing by big corporate vendors.

People are excited about the technology and it's easy to use the terminology the vendor is using. At that point I think it gets kind of self-fulfilling. Kind of like the meme about how to pronounce GIF.

replies(6): >>44485304 #>>44485383 #>>44486029 #>>44486290 #>>44487414 #>>44487524 #
2. Angostura ◴[] No.44485304[source]
IMHO it happens for the same reason we see shapes in clouds. Over millions of years the human mind has evolved to equate and conflate the ability to generate cogent verbal or written output with intelligence. It's an instinct to equate the two, and an extraordinarily difficult instinct to break. LLMs are optimised for the one job that will make us confuse them for being intelligent.
replies(2): >>44485539 #>>44494579 #
3. brookst ◴[] No.44485383[source]
Nobody cares about what’s perceived as good marketing. People care about what resonates with the target market.

But yes, anthropomorphising LLMs is inevitable because they feel like an entity. People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.

replies(3): >>44485423 #>>44485584 #>>44485837 #
4. cmiles74 ◴[] No.44485423[source]
Alright, let’s agree that good marketing resonates with the target market. ;-)
replies(1): >>44485456 #
5. brookst ◴[] No.44485456{3}[source]
I 1000% agree. It’s a vicious, evolutionary, and self-selecting process.

It takes great marketing to actually have any character and intent at all.

6. DrillShopper ◴[] No.44485584[source]
> People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.

Children do, sometimes, but it's a huge sign of immaturity when adults, let alone tech workers, do it.

I had a professor at university who would yell at us if/when we personified/anthropomorphized the tech, and I have that same urge when people ask me "What does <insert LLM name here> think?".

7. roywiggins ◴[] No.44485837[source]
The chat interface was a choice, though a natural one. Before they'd RLHFed it into chatting, when it was just GPT-3 offering completions, 1) not very many people used it, and 2) it was harder to anthropomorphize.
8. sothatsit ◴[] No.44486029[source]
I think anthropomorphizing LLMs is useful, not just a marketing tactic. A lot of intuitions about how humans think map pretty well to LLMs, and it is much easier to build an understanding of how LLMs work on top of those intuitions than to try to build that understanding from scratch.

Would this question be clear for a human? If so, it is probably clear for an LLM. Did I provide enough context for a human to diagnose the problem? Then an LLM will probably have a better chance of diagnosing the problem. Would a human find the structure of this document confusing? An LLM would likely perform poorly when reading it as well.

Re-applying human intuitions to LLMs is a good starting point to gaining intuition about how to work with LLMs. Conversely, understanding sequences of tokens and probability spaces doesn't give you much intuition about how you should phrase questions to get good responses from LLMs. The technical reality doesn't explain the emergent behaviour very well.

I don't think this is mutually exclusive with what the author is talking about either. There are some ways that people think about LLMs where I think the anthropomorphization really breaks down. I think the author says it nicely:

> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.

replies(2): >>44487443 #>>44494411 #
9. positron26 ◴[] No.44486290[source]
> because it's perceived as good marketing

We are making user interfaces. Good user interfaces are intuitive and purport to be things that users are familiar with, such as people. Any alternative explanation of such a versatile interface will be met with blank stares. Users with no technical expertise would come to their own conclusions, helped in no way by telling the user not to treat the chat bot as a chat bot.

10. mikojan ◴[] No.44487414[source]
True, but researchers also want to believe they are studying intelligence, not just some approximation to it.
11. otabdeveloper4 ◴[] No.44487443[source]
You think it's useful because Big Corp sold you that lie.

Wait till the disillusionment sets in.

replies(1): >>44488342 #
12. Marazan ◴[] No.44487524[source]
Anthropomorphisation happens because humans are absolutely terrible at evaluating systems that give conversational text output.

ELIZA fooled many people into thinking it was conscious, and it wasn't even trying to do that.

13. sothatsit ◴[] No.44488342{3}[source]
No, I think it's useful because it is useful, and I've made use of it a number of times.
14. cmiles74 ◴[] No.44494411[source]
Take a look at the judge’s ruling in this Anthropic case:

https://news.ycombinator.com/item?id=44488331

Here’s a quote from the ruling:

“First, Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16). But Authors cannot rightly exclude anyone from using their works for training or learning as such. Everyone reads texts, too, then writes new texts. They may need to pay for getting their hands on a text in the first instance. But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable. For centuries, we have read and re-read books. We have admired, memorized, and internalized their sweeping themes, their substantive points, and their stylistic solutions to recurring writing problems.”

They literally compare an LLM learning to a person learning and conflate the two. Anthropic will likely win this case because of this anthropomorphization.

replies(1): >>44496167 #
15. ◴[] No.44494579[source]
16. sothatsit ◴[] No.44496167{3}[source]
> First, Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16).

It sounds like the Authors were the ones who brought this argument, not Anthropic? In which case, it seems like a big blunder on their part.