
584 points by Alifatisk | 1 comment
kgeist ◴[] No.46182122[source]
>The model uses this internal error signal (the gradient) as a mathematical equivalent of saying, "This is unexpected and important!" This allows the Titans architecture to selectively update its long-term memory only with the most novel and context-breaking information

So one can break a model by consistently feeding it random, highly improbable junk? Everything would be registered as a surprise and get stored, impacting future interactions.
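
For concreteness, here's a minimal sketch of that gating rule in PyTorch. The linear memory module, the threshold, and the shapes are illustrative assumptions, not the actual Titans update (which, per the paper, scales writes by surprise with momentum rather than hard-gating them):

    import torch

    dim = 64
    memory = torch.nn.Linear(dim, dim)       # stand-in long-term memory module
    opt = torch.optim.SGD(memory.parameters(), lr=1e-2)
    SURPRISE_THRESHOLD = 1.0                 # assumed gating constant

    def observe(key: torch.Tensor, value: torch.Tensor) -> None:
        """Commit an update to memory only when the input is surprising."""
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(memory(key), value)
        loss.backward()                      # the gradient is the surprise signal
        surprise = torch.nn.utils.clip_grad_norm_(memory.parameters(), float("inf"))
        if surprise > SURPRISE_THRESHOLD:    # high surprise -> write to memory
            opt.step()                       # low surprise -> update is skipped

    # Random junk never becomes predictable, so its loss (and gradient) stays
    # large and every sample passes the gate -- which is exactly the concern.
    for _ in range(10):
        observe(torch.randn(dim), torch.randn(dim))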

replies(6): >>46182150 #>>46182410 #>>46182651 #>>46183200 #>>46183413 #>>46193429 #
bethekidyouwant ◴[] No.46182651[source]
In what world can you not always break the response of an AI by feeding it a bunch of random junk?
replies(3): >>46182745 #>>46182845 #>>46186503 #
kgeist ◴[] No.46182845[source]
I mean, current LLMs are stateless, and you can get rid of all the poisoned data just by starting a new conversation (context). The OP introduces "long-term memory", where junk will accumulate over time.
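
As a toy sketch of the difference (the classes and fields here are made up, not any real API):

    class StatelessChat:
        def __init__(self):
            self.context = []            # per-conversation state

        def new_conversation(self):
            self.context = []            # poisoned data is gone

    class ChatWithLongTermMemory(StatelessChat):
        def __init__(self):
            super().__init__()
            self.long_term_memory = []   # survives across conversations

        def new_conversation(self):
            self.context = []            # junk in long_term_memory is NOT cleared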
replies(2): >>46182912 #>>46184405 #
dmix ◴[] No.46182912[source]
In something like Cursor, if it messes something up you can click 'undo'. I'd imagine a small snapshot would only be persisted to memory if you keep its output, and even then it's mostly just a summary.

There are probably lots of small signals of "the user is happy with the output", and the longer the history, the more it will converge on the middle of what you want, including when the user says "don't do [x]", which overrides past stuff.
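
A toy sketch of that kind of acceptance-gated persistence (the class, the signals, and the override scheme are all hypothetical):

    class EditorMemory:
        def __init__(self):
            self.entries: list[str] = []

        def maybe_persist(self, summary: str, user_kept_output: bool) -> None:
            # Persist a small summary snapshot only if the user keeps the output.
            if user_kept_output:
                self.entries.append(summary)

        def add_override(self, rule: str) -> None:
            # Explicit "don't do [x]" instructions go first so they take
            # precedence over anything persisted earlier.
            self.entries.insert(0, "OVERRIDE: " + rule)

    mem = EditorMemory()
    mem.maybe_persist("user prefers small diffs", user_kept_output=True)
    mem.maybe_persist("rejected whole-file rewrite", user_kept_output=False)
    mem.add_override("don't reformat whole files")
    print(mem.entries)   # the override first, then only the accepted snapshot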