
584 points by Alifatisk | 1 comment
kgeist ◴[] No.46182122[source]
>The model uses this internal error signal (the gradient) as a mathematical equivalent of saying, "This is unexpected and important!" This allows the Titans architecture to selectively update its long-term memory only with the most novel and context-breaking information

So one could break a model by consistently feeding it random, highly improbable junk? Everything would register as a surprise and get stored, affecting future interactions.
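For intuition, here is a minimal sketch of the gradient-as-surprise gating the quote describes (not the actual Titans code; the memory module, threshold, and learning rate are illustrative assumptions):

```python
# Minimal sketch: treat the gradient of a small memory module's reconstruction
# loss as a "surprise" signal, and only write inputs whose gradient norm is large.
# MemoryMLP, surprise_threshold, and lr are assumed names, not from the paper's code.
import torch
import torch.nn as nn

class MemoryMLP(nn.Module):
    """Tiny associative memory: maps a key vector to a value vector."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, k):
        return self.net(k)

def maybe_update_memory(memory, key, value, lr=1e-2, surprise_threshold=1.0):
    # How badly does the current memory predict this (key, value) pair?
    loss = ((memory(key) - value) ** 2).mean()
    grads = torch.autograd.grad(loss, list(memory.parameters()))
    surprise = torch.sqrt(sum(g.pow(2).sum() for g in grads))  # gradient norm
    if surprise > surprise_threshold:
        # "Unexpected and important": write it into the memory's weights.
        with torch.no_grad():
            for p, g in zip(memory.parameters(), grads):
                p -= lr * g
    return surprise.item()

# Usage
dim = 32
mem = MemoryMLP(dim)
s = maybe_update_memory(mem, torch.randn(dim), torch.randn(dim))
```

Whether random junk actually saturates such a memory would come down to how the threshold and any decay/forgetting terms are set.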

bethekidyouwant ◴[] No.46182651[source]
In what world can you not always break the response of an AI by feeding it a bunch of random junk?
kgeist ◴[] No.46182845[source]
I mean, currently LLMs are stateless and you can get rid of all the poisoned data by just starting a new conversation (context). And OP introduces "long-term memory", where junk will accumulate over time.
soerxpso ◴[] No.46184405{3}[source]
I believe you're misunderstanding what the OP means by "long-term" memory. From what I can tell, it's not actively modifying the weights of the underlying model; it just "remembers" things from a large number of tokens earlier in its context. The point is that this lets it recall something it read ~200 pages ago in a very long context window, not that it can carry something over from one session into another clean session.
AlexCoventry ◴[] No.46186798{4}[source]
This model has fast weights, which actually are modified during inference.
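In other words, the backbone's "slow" weights stay frozen while a small memory module's "fast" weights are updated on the fly at inference time. A minimal sketch of that split (module names and sizes are assumptions, not the paper's code):

```python
# Sketch of the fast-vs-slow weight split: the backbone stays frozen at
# inference, while a tiny memory module's parameters are updated per chunk.
import torch
import torch.nn as nn

backbone = nn.Linear(64, 64)     # stands in for the pretrained model ("slow" weights)
fast_memory = nn.Linear(64, 64)  # tiny module updated during inference ("fast" weights)

for p in backbone.parameters():
    p.requires_grad_(False)      # slow weights are never touched at inference

def inference_step(x, lr=1e-2):
    with torch.no_grad():
        h = backbone(x)          # ordinary frozen forward pass
    # Test-time update: fit the fast weights to the new activations.
    loss = ((fast_memory(x) - h) ** 2).mean()
    grads = torch.autograd.grad(loss, list(fast_memory.parameters()))
    with torch.no_grad():
        for p, g in zip(fast_memory.parameters(), grads):
            p -= lr * g          # only the fast weights change
        return h + fast_memory(x)  # retrieved memory contributes to the output

# Usage
y = inference_step(torch.randn(8, 64))
```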
energy123 ◴[] No.46187397{5}[source]
Marketplace for fast weights inbound