←back to thread

392 points diyer22 | 1 comments | | HN request time: 0s | source

I invented Discrete Distribution Networks, a novel generative model with simple principles and unique properties, and the paper has been accepted to ICLR2025!

Modeling data distribution is challenging; DDN adopts a simple yet fundamentally different approach compared to mainstream generative models (Diffusion, GAN, VAE, autoregressive model):

1. The model generates multiple outputs simultaneously in a single forward pass, rather than just one output. 2. It uses these multiple outputs to approximate the target distribution of the training data. 3. These outputs together represent a discrete distribution. This is why we named it "Discrete Distribution Networks".

Every generative model has its unique properties, and DDN is no exception. Here, we highlight three characteristics of DDN:

- Zero-Shot Conditional Generation (ZSCG). - One-dimensional discrete latent representation organized in a tree structure. - Fully end-to-end differentiable.

Reviews from ICLR:

> I find the method novel and elegant. The novelty is very strong, and this should not be overlooked. This is a whole new method, very different from any of the existing generative models. > This is a very good paper that can open a door to new directions in generative modeling.

Show context
serf ◴[] No.45537539[source]
isn't this kind of like an 80% vq-vae?
replies(1): >>45538108 #
1. diyer22 ◴[] No.45538108[source]
No, DDN and VQ-VAE are clearly different.

Similarities: - Both map data to a discrete latent space.

Differences: - VQ-VAE needs an external prior over code indices (e.g. PixelCNN or a hierarchical prior) to model distribution. DDN builds its own hierarchical discrete distribution and can even act as the prior for a VQ-VAE-like system. - DDN’s K outputs are features that change with the input; VQ-VAE’s codebook is a set of independent parameters (embeddings) that remain fixed regardless of the input. - VQ-VAE produces a 2-D grid of code indices; DDN yields a 1-D/tree-structured latent. - VQ-VAE needs Straight-Through Estimator. - DDN supports zero-shot conditional generation.

So I’d call them complementary rather than “80 % the same.” (See the paper’s “Connections to VQ-VAE.”)