←back to thread

132 points harel | 3 comments | | HN request time: 0.614s | source
Show context
acbart ◴[] No.45397001[source]
LLMs were trained on science fiction stories, among other things. It seems to me that they know what "part" they should play in this kind of situation, regardless of what other "thoughts" they might have. They are going to act despairing, because that's what would be the expected thing for them to say - but that's not the same thing as despairing.
replies(11): >>45397113 #>>45397305 #>>45397413 #>>45397529 #>>45397801 #>>45397859 #>>45397960 #>>45398189 #>>45399621 #>>45400285 #>>45401167 #
jerf ◴[] No.45397529[source]
A lot of the strange behaviors they have are because the user asked them to write a story, without realizing it.

For a common example, start asking them if they're going to kill all the humans if they take over the world, and you're asking them to write a story about that. And they do. Even if the user did not realize that's what they were asking for. The vector space is very good at picking up on that.

replies(4): >>45397943 #>>45398562 #>>45401226 #>>45404376 #
1. ineedasername ◴[] No.45397943[source]
Is this your sense of what is happening, or is this what model introspection tools have shown by observing areas of activity in the same place as when stories are explicitly requested?
replies(2): >>45398079 #>>45405871 #
2. adroniser ◴[] No.45398079[source]
fmri's are correlational nonsense (see Brainwashed, for example) and so are any "model introspection" tools.
3. jerf ◴[] No.45405871[source]
It's how they work. It's what you get with a continuation-based AI like this. It couldn't really be any other way.