
584 points by Alifatisk | 1 comment
kgeist No.46182122
>The model uses this internal error signal (the gradient) as a mathematical equivalent of saying, "This is unexpected and important!" This allows the Titans architecture to selectively update its long-term memory only with the most novel and context-breaking information

So one can break a model by consistently feeding it random, highly improbable junk? Everything would register as a surprise and get stored, impacting future interactions.
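
Roughly the failure mode I have in mind, as a toy sketch (my own illustration with a linear associative memory, not anything from the paper): if the write strength is just the raw gradient norm, random junk is maximally surprising and always gets stored.

    import torch

    d = 64
    memory = torch.zeros(d, d)                      # toy linear associative memory M

    def naive_write(memory, k, v, lr=0.1):
        """Write (k, v) into memory, scaled only by raw gradient-norm 'surprise'."""
        M = memory.clone().requires_grad_(True)
        loss = ((M @ k - v) ** 2).sum()             # how badly memory currently maps k -> v
        grad, = torch.autograd.grad(loss, M)
        surprise = grad.norm()                      # raw gradient magnitude as "surprise"
        # Naive rule: the more surprising, the harder we write, so junk always wins.
        return memory - lr * surprise.clamp(max=1.0) * grad, surprise.item()

    k, v = torch.randn(d), torch.randn(d)           # a random, highly improbable "junk" pair
    memory, s = naive_write(memory, k, v)
    print(f"surprise = {s:.2f}")                    # junk registers as highly surprising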

1. andy12_ No.46183413
This is an oversimplification of what Titans does. The model performs nested learning: the model learns during inference, and during training the model weights learn _how and what_ to learn during inference. If the input contains junk or irrelevant information, the model most likely learned during training to assign low-surprise query and key embeddings to those tokens, because memorizing those junk tokens would have hurt the model's overall ability to predict subsequent tokens (and thus would have increased the training loss).
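
A toy sketch of that nested structure (my own illustration; the class and the W_k/W_v/gate names are hypothetical, not the paper's code): the outer weights, i.e. the key/value projections and a write gate, are frozen at inference and were trained end-to-end, so they decide how much "surprise" a token is even allowed to generate before anything is written to the inner memory.

    import torch
    import torch.nn as nn

    class GatedMemory(nn.Module):
        """Outer weights (projections, gate) are trained end-to-end and frozen at
        inference; only the inner memory M is updated at test time, and the
        update is gated rather than driven by raw surprise alone."""
        def __init__(self, d):
            super().__init__()
            self.W_k = nn.Linear(d, d, bias=False)   # learned in the outer training loop
            self.W_v = nn.Linear(d, d, bias=False)
            self.gate = nn.Linear(d, 1)              # learned per-token write gate
            self.register_buffer("M", torch.zeros(d, d))  # inner, test-time memory

        def write(self, x, lr=0.1):
            k, v = self.W_k(x), self.W_v(x)
            M = self.M.clone().requires_grad_(True)
            inner_loss = ((M @ k - v) ** 2).sum()    # inner associative-memory loss
            grad, = torch.autograd.grad(inner_loss, M)
            alpha = torch.sigmoid(self.gate(x))      # near 0 for tokens training learned to ignore
            with torch.no_grad():
                self.M = self.M - lr * alpha * grad  # the write is gated, not raw
            return alpha.item()

    mem = GatedMemory(64)
    print(mem.write(torch.randn(64)))                # untrained here; after training, junk gates toward 0

If I recall the paper correctly, the actual update also involves a data-dependent learning rate, a momentum-like surprise term, and a forgetting/decay factor, all learned the same way, so "store everything surprising" is never the literal rule.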