
454 points | nathan-barry | 1 comment
1. briandw No.45645540
I love seeing these simple experiments. They're easy to read through quickly, and they help you understand a bit more of the underlying principles.

One of my stumbling blocks with text diffusers is that ideally you wouldn't treat the tokens as discrete but rather as probability fields. Image diffusers have the natural property that a pixel is a continuous value: you can smoothly transition from one color to another. Not so with tokens. In this case they just do a full replacement. You can't add noise to a token; you have to work in the embedding space. But how can you train embeddings directly? I found a bunch of different approaches that have been tried, but they are all much more complicated than the image-based diffusion process.
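To make the contrast concrete, here is a minimal toy sketch (my own illustration, not from any of the papers mentioned) of what "noising in embedding space" looks like: the token id itself can't be perturbed, but its embedding vector can be interpolated toward Gaussian noise, and getting back to a discrete token requires a rounding step such as a nearest-neighbor lookup. The vocabulary, embedding table, and function names here are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary of 5 tokens with random 4-dim embeddings (made-up values).
vocab = ["the", "cat", "sat", "on", "mat"]
E = rng.normal(size=(len(vocab), 4))  # embedding table

def noise_token(token_id: int, t: float) -> np.ndarray:
    """Continuous forward-noising applied in embedding space:
    interpolate the clean embedding toward Gaussian noise.
    At t=0 this returns the clean embedding; at t=1, pure noise."""
    x0 = E[token_id]
    eps = rng.normal(size=x0.shape)
    return np.sqrt(1.0 - t) * x0 + np.sqrt(t) * eps

def project_to_token(x: np.ndarray) -> int:
    """Round a noisy embedding back to the nearest discrete token --
    the extra step a discrete-data diffuser needs that pixels don't."""
    dists = np.linalg.norm(E - x, axis=1)
    return int(np.argmin(dists))

noisy = noise_token(vocab.index("cat"), t=0.1)
recovered = vocab[project_to_token(noisy)]  # at low noise, usually "cat"
```

The rounding step is exactly where the discreteness bites: small perturbations that would be harmless to a pixel can flip the nearest-neighbor result to an unrelated token, which is part of why embedding-space approaches end up more complicated than image diffusion.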