
566 points PaulHoule | 1 comment
chc4 No.44490102
Using the free playground link, and it is in fact extremely fast. The "diffusion mode" toggle is also pretty neat as a visualization, although I'm not sure how accurate it is - it renders as line noise and then refines, while in reality presumably those are tokens from an imprecise vector in some state space that then become more precise until it's only a definite word, right?
replies(3): >>44490131 #>>44490209 #>>44492011 #
icyfox No.44492011
Some text diffusion models use a continuous latent space, but historically those haven't done well. Most of the ones we're seeing now are trained to predict actual token output that's fed forward into the next timestep. The diffusion property comes from their ability to modify previous timesteps to converge on the final output.
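To make the iterative-unmasking idea concrete, here's a minimal toy sketch (not Mercury's actual architecture, which isn't described in the OP; the predictor and all names are hypothetical). The sequence starts as all-mask "noise", and each denoising step commits the model's single most confident token prediction, so the output refines over timesteps rather than being generated left to right:

```python
MASK = "<mask>"

def toy_denoise_step(seq, predict):
    """Score every masked position, then commit only the most
    confident (token, confidence) prediction this step."""
    candidates = [
        (i, *predict(seq, i)) for i, t in enumerate(seq) if t == MASK
    ]
    if not candidates:
        return seq, True  # fully denoised
    i, token, _conf = max(candidates, key=lambda c: c[2])
    out = list(seq)
    out[i] = token
    return out, False

def diffusion_decode(length, predict, max_steps=100):
    """Start from an all-mask sequence and iteratively refine it."""
    seq = [MASK] * length
    for _ in range(max_steps):
        seq, done = toy_denoise_step(seq, predict)
        if done:
            break
    return seq

# Stand-in "model": a fixed table of guesses, most confident at the left.
def dummy_predict(seq, i):
    words = ["hello", "diffusion", "world"]
    return words[i], 1.0 - 0.1 * i

print(diffusion_decode(3, dummy_predict))
# -> ['hello', 'diffusion', 'world']
```

Real discrete-diffusion decoders go further than this sketch: they can also re-mask low-confidence tokens that were already committed, which is the "modify previous timesteps" property mentioned above, and is why the playground visualization shows earlier text changing as it converges.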

I have an explanation about one of these recent architectures that seems similar to what Mercury is doing under the hood here: https://pierce.dev/notes/how-text-diffusion-works/

replies(1): >>44495285 #
chc4 No.44495285
Oh neat, thanks! The OP is surprisingly light on details on how it actually works and is mostly benchmarks, so this is very appreciated :)