161 points belleville | 1 comment
gwern ◴[] No.43677261[source]
https://www.reddit.com/r/MachineLearning/comments/1jsft3c/r_...

I'm still not quite sure how to think of this. Maybe as being like unrolling a diffusion model, the equivalent of BPTT for RNNs?

replies(2): >>43677696 #>>43684636 #
1. ActorNightly ◴[] No.43684636[source]
I think we need to start thinking about one-shot training. I.e., instead of feeding context into an LLM, you should be able to tell it a fact, and it would encode that fact directly into updated weights.
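
To make the idea concrete, here is a toy sketch, not how any actual LLM is trained or edited: a linear associative memory where a single rank-one update writes a key/value "fact" into the weights. The names `matvec` and `one_shot_write`, and the example vectors, are made up for illustration. It assumes the fact can be represented as a key vector mapped to a value vector.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def one_shot_write(W, key, value):
    """Return a copy of W with one rank-one update so that W @ key == value.

    Update rule: W' = W + (value - W @ key) key^T / (key . key).
    Hypothetical helper for illustration only.
    """
    norm_sq = sum(k * k for k in key)
    if norm_sq == 0:
        raise ValueError("key must be non-zero")
    pred = matvec(W, key)
    return [
        [w_ij + (v_i - p_i) * k_j / norm_sq for w_ij, k_j in zip(row, key)]
        for row, v_i, p_i in zip(W, value, pred)
    ]


# Start with an empty 2x3 memory and write one fact in a single step.
W = [[0.0] * 3 for _ in range(2)]
key = [1.0, 0.0, 2.0]      # made-up encoding of the "question"
value = [0.5, -1.0]        # made-up encoding of the "answer"

W_new = one_shot_write(W, key, value)
recalled = matvec(W_new, key)

# A key orthogonal to the stored one is left untouched by the update.
unrelated = matvec(W_new, [2.0, 0.0, -1.0])
```

After the single update, `recalled` matches `value` up to floating-point error, while the orthogonal key still maps to zeros. Real models are nonlinear and their "keys" overlap, so this only illustrates the one-update-per-fact idea, not a solution to it.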