
584 points Alifatisk | 1 comments
kgeist ◴[] No.46182122[source]
>The model uses this internal error signal (the gradient) as a mathematical equivalent of saying, "This is unexpected and important!" This allows the Titans architecture to selectively update its long-term memory only with the most novel and context-breaking information

So one can break a model by consistently feeding it random, highly improbable junk? Everything would be registered as a surprise and get stored, impacting future interactions.
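To make the concern concrete, here is a toy sketch of a surprise-gated write. It is not the Titans update rule: the linear memory, the `gated_write` and `run_stream` helpers, the learning rate, and the threshold are all assumptions made up for illustration. It only shows the general idea that a gradient norm can serve as a "surprise" signal, and that a stream of random pairs keeps that signal high while a repeated pair stops triggering writes.

```python
import math
import random

def gated_write(memory, key, value, lr=0.5, threshold=0.1):
    """Toy surprise-gated write for a linear memory (hypothetical helper).

    The memory predicts value as memory @ key. Surprise is the norm of the
    gradient of 0.5 * ||memory @ key - value||^2 with respect to memory,
    which equals ||err|| * ||key|| because the gradient is outer(err, key).
    """
    pred = [sum(m * k for m, k in zip(row, key)) for row in memory]
    err = [p - v for p, v in zip(pred, value)]
    surprise = math.sqrt(sum(e * e for e in err) * sum(k * k for k in key))
    wrote = surprise > threshold
    if wrote:
        # One gradient step, taken only when the input is surprising enough.
        for row, e in zip(memory, err):
            for j, k in enumerate(key):
                row[j] -= lr * e * k
    return surprise, wrote

def run_stream(pairs, dim_in=4, dim_out=2):
    """Feed (key, value) pairs to a fresh memory; return surprises and write count."""
    memory = [[0.0] * dim_in for _ in range(dim_out)]
    results = [gated_write(memory, k, v) for k, v in pairs]
    return [s for s, _ in results], sum(1 for _, w in results if w)

def unit_vector(rng, dim):
    """Random direction, normalized so the toy update stays stable."""
    v = [rng.gauss(0, 1) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

rng = random.Random(0)
# The same association repeated: surprise should shrink as it is learned.
familiar = [([1.0, 0.0, 0.0, 0.0], [1.0, 2.0])] * 20
# Unrelated random pairs: there is nothing consistent to learn.
junk = [(unit_vector(rng, 4), [rng.gauss(0, 1), rng.gauss(0, 1)]) for _ in range(20)]

fam_surprise, fam_writes = run_stream(familiar)
junk_surprise, junk_writes = run_stream(junk)
print("familiar: writes =", fam_writes, "last surprise =", round(fam_surprise[-1], 4))
print("junk:     writes =", junk_writes, "mean surprise =",
      round(sum(junk_surprise) / len(junk_surprise), 4))
```

In this toy, a hard threshold alone does nothing to stop junk from being written, so whether a real system degrades presumably depends on whatever forgetting, decay, or capacity limits sit alongside the surprise signal.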

replies(6): >>46182150 #>>46182410 #>>46182651 #>>46183200 #>>46183413 #>>46193429 #
idiotsecant ◴[] No.46182410[source]
This is the start of what I always thought an AI should have - a limbic system. Humans don't store memory based on novelty, they store it based on emotional content. This is where I was afraid of the tiger, this is where I smelled delicious food, this was what it felt like when I was victorious in the hunt.

AI needs an internal emotional state because that's what drives attention and memory. AI needs to want something.

replies(2): >>46182665 #>>46192465 #
luckydata ◴[] No.46182665[source]
That would be the biggest mistake anyone could make. I hope nobody goes down this route. AI "wanting" things is an enormous risk to alignment.
replies(2): >>46183135 #>>46185152 #
idiotsecant ◴[] No.46185152[source]
At some point I think we'll have to face the idea that any AI more intelligent than ourselves will by definition be able to evade our alignment tricks.
replies(1): >>46185869 #
luckydata ◴[] No.46185869[source]
Equating "more intelligent" with "wanting things" is a fallacy. You can have a hyperintelligent computer that simply waits for you to ask it to do a job, or you can endow it with the digital equivalent of hunger and reproductive instincts and it will behave completely differently.

We would be INSANE to pursue giving those kinds of instincts to AIs.

replies(2): >>46189519 #>>46190982 #
1. drdeca ◴[] No.46189519[source]
For some senses of “wanting things”, I think it might be hard to make a powerful AI that couldn’t be easily modified to produce one that “wants things” in some sense.

So, if it would be a bad thing for one to be made that “wants things” in any reasonable sense of the phrase, then it would probably be bad for J Random to be able to take a copy of a powerful AI and modify it in some way, because someone is likely to try doing that.

Of course, perhaps the best way to make sure that J Random doesn’t have the ability to do that is to make sure no one does.