
nubg No.46181850
Very interesting. Is it correct for me to imagine it as some kind of "LoRA" that's continuously adapted as the model goes through its day?

If so, could there perhaps be a step where the LoRA is merged back into the main model?

That would be like sleeping :-)
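
Mechanically, that merge step would just fold the low-rank product back into the dense weights. A rough sketch (the names and the alpha/rank scaling follow the common LoRA convention; nothing here is specific to this work):

    import torch

    def merge_lora(base_weight, lora_A, lora_B, alpha, rank):
        # base_weight: (out_features, in_features) dense layer weight
        # lora_A: (rank, in_features), lora_B: (out_features, rank)
        # Fold the adapter in: W' = W + (alpha / rank) * B @ A
        return base_weight + (alpha / rank) * (lora_B @ lora_A)

After that, the adapter can be discarded and the layer behaves as if the update had always been part of the base weights.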

andy12_ No.46183818
Kind of. You could in theory use LoRA for this, but it probably wouldn't have enough capacity to be a proper substitute for the attention mechanism. Instead, a full MLP is trained as input chunks are processed.
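
A toy sketch of the shape of that idea (the denoising objective, shapes, and learning rate here are illustrative assumptions, not the actual training recipe):

    import torch
    import torch.nn as nn

    dim = 64
    # A small MLP acting as a fast-weight memory in place of attention.
    memory = nn.Sequential(
        nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
    )
    opt = torch.optim.SGD(memory.parameters(), lr=1e-2)

    stream = torch.randn(10, 16, dim)  # toy input: 10 chunks of 16 tokens
    for chunk in stream:
        # Take a gradient step on each chunk as it arrives, so the MLP's
        # weights come to encode the context seen so far. A denoising
        # objective stands in for whatever self-supervised loss is used.
        corrupted = chunk + 0.1 * torch.randn_like(chunk)
        loss = ((memory(corrupted) - chunk) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Reading from the memory is then just a forward pass: memory(query)

The point is that the memory's capacity lives in the MLP's full weights rather than in a low-rank delta, which is why LoRA alone probably wouldn't cut it.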