I'm more excited about approaches like this one:
https://openreview.net/forum?id=c05qIG1Z2B
They're doing continuous latent diffusion combined with autoregressive transformer-based text generation. The autoencoder and transformer are (or can be) trained in tandem.
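To make the shape of the idea concrete, here's a toy numpy sketch of that kind of joint objective. This is not the paper's actual method or architecture; every name, dimension, and the linear maps standing in for the encoder, denoiser, and transformer decoder are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (hypothetical; the paper's real dimensions differ).
vocab, d_model, d_latent, seq_len = 50, 16, 8, 10

# --- Autoencoder: token embeddings -> continuous latent -> embeddings ---
# Linear maps stand in for the real encoder/decoder networks.
embed = rng.normal(0, 0.1, (vocab, d_model))
W_enc = rng.normal(0, 0.1, (seq_len * d_model, d_latent))
W_dec = rng.normal(0, 0.1, (d_latent, seq_len * d_model))

tokens = rng.integers(0, vocab, seq_len)
x = embed[tokens].reshape(-1)              # flattened sequence embeddings
z = x @ W_enc                              # continuous latent

# --- Diffusion in latent space: noise the latent, predict the noise ---
t = rng.uniform(0.01, 1.0)                 # random noise level
eps = rng.normal(size=d_latent)
z_noisy = np.sqrt(1 - t) * z + np.sqrt(t) * eps
W_denoise = rng.normal(0, 0.1, (d_latent, d_latent))
eps_hat = z_noisy @ W_denoise              # stand-in for the denoiser
diffusion_loss = np.mean((eps_hat - eps) ** 2)

# --- Decoding: reconstruct embeddings from the latent ---
# (In the paper this is an autoregressive transformer conditioned on z;
# here it's a linear map so the sketch stays tiny.)
x_hat = z @ W_dec
recon_loss = np.mean((x_hat - x) ** 2)

# Joint objective: one loss trains the autoencoder and denoiser in tandem.
loss = recon_loss + diffusion_loss
print(f"recon={recon_loss:.4f} diffusion={diffusion_loss:.4f} total={loss:.4f}")
```

The point of the tandem setup is that a single backward pass through `loss` shapes the latent space to be both reconstructable and easy to denoise, rather than freezing a pretrained autoencoder first.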