Honest feedback - I was really excited when I read the opening. However, I did not come away from this with a greater understanding than I already had.
For reference, my initial understanding was somewhat low: basically, I know a) what an embedding is, b) that transformers work by matrix multiplication, and c) that it's something like a multi-threaded Markov chain generator with the benefit of prior-trained embeddings.