Defining Statistical Models in JAX?

(statmodeling.stat.columbia.edu)

118 points hackandthink | 2 comments | 13 Oct 24 15:20 UTC

JHonaker ◴[17 Oct 24 13:28 UTC] No.41869481[source]▶

I'm very excited by the work being put in to make Bayesian inference more manageable. It's in a spot that feels very similar to deep learning circa the mid-2010s, when Caffe, Torch, and hand-written gradients were the options. We can do it, but doing anything more complicated than common model structures like hierarchical Gaussian linear models requires dropping out of the nice places and into the guts.

I've had a lot of success with NumPyro (a JAX library), and have used quite a lot of tools that are simpler interfaces to Stan. I've also had to write quite a few model-specific things from scratch by hand (more for sequential Monte Carlo than MCMC). I'm very excited for a world where PPLs become scalable and easier to use and customize.

> I think there is a good chance that normalizing flow-based variational inference will displace MCMC as the go-to method for Bayesian posterior inference as soon as everyone gets access to good GPUs.

Wow. This is incredibly surprising. I'm only tangentially aware of normalizing flows, but apparently I need to look at the intersection of them and Bayesian statistics now! Any sources from anyone would be most appreciated!

replies(3): >>41869632 #>>41869638 #>>41874672 #

sarosh ◴[17 Oct 24 13:46 UTC] No.41869632[source]▶

>>41869481 #

I defer to other experts, but (briefly): normalizing flows are a method for constructing complex distributions by transforming a simple probability density through a series of invertible transformations. They are trained using a plain log-likelihood objective, and they are capable of exact density evaluation and efficient sampling. See:

Danilo Rezende and Shakir Mohamed. Variational inference with normalizing flows. In ICML, 2015. Link: https://bigdata.duke.edu/wp-content/uploads/2022/08/1505.057...

Laurent Dinh, David Krueger, and Yoshua Bengio. NICE: Non-linear independent components estimation. In ICLR Workshop, 2015. Link: https://arxiv.org/pdf/1410.8516

And for your direct question, the paper "Efficient Bayesian Sampling Using Normalizing Flows to Assist Markov Chain Monte Carlo Methods" appears, at a superficial glance, to be relevant. Link: https://arxiv.org/pdf/2107.08001
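
To make the "invertible transformations, exact density, efficient sampling" description concrete, here is a minimal one-dimensional sketch in plain Python. It is not taken from any of the papers above: the class names `Affine` and `Sinh` and the helpers `flow_sample` and `flow_logpdf` are made up for illustration, and the layer parameters are fixed by hand rather than trained.

```python
import math
import random

def std_normal_logpdf(z):
    # Log-density of the base distribution, a standard normal.
    return -0.5 * (z * z + math.log(2.0 * math.pi))

class Affine:
    """Hypothetical layer: x = scale * z + shift, with scale != 0."""
    def __init__(self, scale, shift):
        self.scale, self.shift = scale, shift
    def forward(self, z):
        return self.scale * z + self.shift
    def inverse(self, x):
        return (x - self.shift) / self.scale
    def log_abs_det(self, z):
        # log |dx/dz| is constant for an affine map.
        return math.log(abs(self.scale))

class Sinh:
    """Hypothetical layer: x = sinh(z), invertible on the whole real line."""
    def forward(self, z):
        return math.sinh(z)
    def inverse(self, x):
        return math.asinh(x)
    def log_abs_det(self, z):
        # d sinh(z) / dz = cosh(z), which is always positive.
        return math.log(math.cosh(z))

def flow_sample(layers, rng):
    # Sampling: draw from the base density, then push forward through each layer.
    z = rng.gauss(0.0, 1.0)
    for layer in layers:
        z = layer.forward(z)
    return z

def flow_logpdf(layers, x):
    # Exact density: invert the layers in reverse order and subtract
    # each layer's log |Jacobian| from the base log-density.
    total_log_det = 0.0
    for layer in reversed(layers):
        z = layer.inverse(x)
        total_log_det += layer.log_abs_det(z)
        x = z
    return std_normal_logpdf(x) - total_log_det

layers = [Affine(0.5, 0.3), Sinh()]
draw = flow_sample(layers, random.Random(0))
log_density_at_draw = flow_logpdf(layers, draw)
```

In a real flow the layers would carry learnable parameters and the log-density above would be the training objective; this sketch only shows the bookkeeping.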

replies(2): >>41869811 #>>41870592 #

1980phipsi ◴[17 Oct 24 15:31 UTC] No.41870592[source]▶

>>41869632 #

So it's like converting a normal distribution to a log-normal one (and then back), but as a more general way of thinking about it.
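
The normal-to-log-normal analogy can be checked directly. The sketch below, in standard-library Python, treats `x = exp(z)` as a one-layer flow and compares the resulting log-density with the textbook log-normal formula; the function names are illustrative, not from any library.

```python
import math

def normal_logpdf(z, mu=0.0, sigma=1.0):
    # Log-density of N(mu, sigma^2).
    return (-0.5 * ((z - mu) / sigma) ** 2
            - math.log(sigma)
            - 0.5 * math.log(2.0 * math.pi))

def lognormal_logpdf_via_flow(x, mu=0.0, sigma=1.0):
    # Treat x = exp(z) as a single invertible layer.
    z = math.log(x)      # inverse transform
    log_abs_det = z      # log |d exp(z) / dz| = log exp(z) = z
    return normal_logpdf(z, mu, sigma) - log_abs_det

def lognormal_logpdf_textbook(x, mu=0.0, sigma=1.0):
    # Closed-form log-normal log-density, for comparison.
    return (-math.log(x)
            - math.log(sigma)
            - 0.5 * math.log(2.0 * math.pi)
            - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2))

# Largest disagreement between the two routes over a few test points.
max_gap = max(
    abs(lognormal_logpdf_via_flow(x, 0.2, 0.7) - lognormal_logpdf_textbook(x, 0.2, 0.7))
    for x in (0.1, 0.5, 1.0, 2.0, 10.0)
)
```

The two functions agree up to floating-point error, which is the sense in which a log-normal is the simplest possible flow over a normal base.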

Where does the name "normalizing flows" come from?

replies(1): >>41871586 #

1. hotstickyballs ◴[17 Oct 24 17:18 UTC] No.41871586[source]▶

>>41870592 #

It comes from the Jacobian, which you can get from autodiff. It measures how much distortion the function creates and normalizes for it, so that you can integrate correctly without blowing up gradients.
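
For reference, the Jacobian enters through the standard change-of-variables identity. This is the usual textbook statement rather than something spelled out in the thread; the symbols $f_k$, $z_k$, and $K$ are notation introduced here.

```latex
% One invertible map x = f(z), with z drawn from the base density p_Z:
p_X(x) = p_Z\!\left(f^{-1}(x)\right)
         \left| \det \frac{\partial f^{-1}(x)}{\partial x} \right|

% A flow composes K such maps, z_k = f_k(z_{k-1}) with x = z_K,
% so the log-density accumulates one log-determinant per layer:
\log p_X(x) = \log p_Z(z_0)
              - \sum_{k=1}^{K} \log \left| \det
                \frac{\partial f_k(z_{k-1})}{\partial z_{k-1}} \right|
```

The determinant factor is what keeps the transformed density integrating to one.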

replies(1): >>41874986 #

2. theGnuMe ◴[17 Oct 24 23:47 UTC] No.41874986[source]▶

>>41871586 (TP) #

I mean the whole thing sounds like a deep neural network…
