The problem with this approach to text generation is that it's still not flexible enough. If, during inference, the model changes its mind and wants to output something considerably different, it can't, because too many tokens are already locked in place.
That's not true. If you look at the first gif animation in the OP, you can see that tokens disappear; the only part that stays untouched is the prompt. Adding noise back is part of the diffusion process, and the code that does it is even posted in the article (ctrl+f "def diffusion_collator").
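To make the point concrete, here's a minimal sketch of what that kind of collator does: non-prompt tokens get re-masked with some probability so the model can revise them on later denoising steps, while the prompt is never touched. The names (`MASK_ID`, `diffusion_collate`) and the uniform masking schedule are my assumptions for illustration, not the article's actual code.

```python
import random

MASK_ID = 0  # hypothetical mask-token id (assumption, not from the article)

def diffusion_collate(token_ids, prompt_len, t):
    """Re-noise a sequence for one diffusion step.

    Every token after the prompt is replaced with MASK_ID with
    probability t, so previously generated tokens can 'disappear'
    and be re-predicted. The first prompt_len tokens are left intact.
    """
    out = list(token_ids)
    for i in range(prompt_len, len(out)):
        if random.random() < t:
            out[i] = MASK_ID
    return out

seq = [11, 12, 13, 21, 22, 23, 24]   # first 3 tokens are the prompt
noised = diffusion_collate(seq, prompt_len=3, t=1.0)
# with t=1.0 every generated token is masked; the prompt survives as-is
```

In a real training collator `t` would be sampled per example from the noise schedule, but the key behavior is the same: generated tokens are fair game for masking, the prompt isn't.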