
566 points PaulHoule | 3 comments
1. thelastbender12 ◴[] No.44490388[source]
The speed here is super impressive! I'm curious: are there any qualitative ways in which modeling text with diffusion differs from modeling it with autoregressive models? For example, the kinds of problems it works better on, creativity, and the like.
replies(1): >>44491026 #
2. orbital-decay ◴[] No.44491026[source]
One works coarse-to-fine, the other works start-to-end, which means different directionality biases, at least. Differences in speed, generalization, etc. are less clear and need to be proven in practice, as fundamentally the two are closer than they seem. Diffusion models have some well-studied shortcuts for trading quality against speed, but nothing stops you from implementing the same for the other type.
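
To make the ordering difference concrete, here is a toy sketch in Python. It is not any real model: the `scores` list is a made-up stand-in for per-position confidence, and `parallel_unmask_order` only loosely imitates how a masked-diffusion-style sampler might commit several positions per step. Both function names are hypothetical.

```python
def autoregressive_order(n):
    """Start-to-end: commit one position per step, strictly left to right."""
    return [[i] for i in range(n)]


def parallel_unmask_order(scores, per_step):
    """Stand-in for coarse-to-fine decoding: each step commits the
    `per_step` most confident remaining positions, wherever they sit."""
    by_confidence = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [sorted(by_confidence[s:s + per_step])
            for s in range(0, len(by_confidence), per_step)]


# Hypothetical, fixed confidence scores. A real model would recompute
# these after every step instead of ranking them once up front.
scores = [0.9, 0.2, 0.8, 0.1, 0.7, 0.3, 0.6, 0.4]

ar_steps = autoregressive_order(len(scores))
diff_steps = parallel_unmask_order(scores, per_step=4)

print(len(ar_steps), ar_steps)      # one position per step, in order
print(len(diff_steps), diff_steps)  # fewer steps, positions out of order
```

Here `per_step` is the speed knob: raising it cuts the number of steps, and in a real model it would presumably cost quality because more positions get committed without seeing each other. With `per_step=1` the step count is the same as the autoregressive one and only the order differs, which is roughly the "closer than it seems" point.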
replies(1): >>44494575 #
3. ekunazanu ◴[] No.44494575[source]
I once read that diffusion is essentially just autoregression in the frequency domain. Honestly, that comparison didn’t seem too far off.
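
A rough way to see where that comparison comes from, as a sketch under strong assumptions: suppose the signal's power falls off as 1/k^2 with frequency k (a power law of the kind often reported for natural images; whether anything similar holds for text is not established here), and the added Gaussian noise is white, so its expected power is flat across frequencies. The function name `resolved_band` and the specific noise levels are made up for illustration.

```python
def resolved_band(noise_std, n_freqs=64):
    """Return the frequencies whose signal power exceeds the noise power.

    Assumption: signal power at frequency k is 1 / k**2, and white noise
    contributes the same expected power, noise_std**2, at every frequency.
    """
    noise_power = noise_std ** 2
    return [k for k in range(1, n_freqs + 1) if 1.0 / k ** 2 > noise_power]


# Walk the noise level downward, as the reverse (denoising) process does.
for noise_std in (0.6, 0.15, 0.03):
    band = resolved_band(noise_std)
    print(f"noise_std={noise_std}: {len(band)} frequencies above the noise")
```

Under these assumptions the band of frequencies that stands out from the noise only grows as the noise shrinks, always from the low end upward. So each denoising step effectively gets to decide the next-higher frequencies given the lower ones, which is the sense in which it looks like autoregression over frequency rather than over position.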