
Bayesian Statistics: The three cultures

(statmodeling.stat.columbia.edu)
309 points by luu | 2 comments
tfehring No.41081746
The author is claiming that Bayesians vary along two axes: (1) whether they generally try to inform their priors with their knowledge or beliefs about the world, and (2) whether they iterate on the functional form of the model based on its goodness-of-fit and the reasonableness and utility of its outputs. He then labels 3 of the 4 resulting combinations as follows:

    ┌───────────────┬───────────┬──────────────┐
    │               │ iteration │ no iteration │
    ├───────────────┼───────────┼──────────────┤
    │ informative   │ pragmatic │ subjective   │
    │ uninformative │     -     │ objective    │
    └───────────────┴───────────┴──────────────┘
My main disagreement with this model is the empty bottom-left box - in fact, I think that's where most self-labeled Bayesians in industry fall:

- Iterating on the functional form of the model (and therefore the assumed underlying data generating process) is generally considered obviously good and necessary, in my experience.

- Priors are usually uninformative or weakly informative, partly because data is often big enough to overwhelm the prior (see the sketch below).
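
To make the "data overwhelms the prior" point concrete, here's a minimal sketch (my own toy numbers, not from the article) using a conjugate Beta-Binomial model: with 10,000 observations, a flat prior and a fairly informative prior land on essentially the same posterior.

    from scipy import stats

    successes, n = 5_200, 10_000    # hypothetical large sample

    for name, (a, b) in [("flat Beta(1,1)", (1, 1)),
                         ("informative Beta(2,8)", (2, 8))]:
        # Conjugate update: Beta(a, b) prior + Binomial data -> Beta posterior
        post = stats.beta(a + successes, b + n - successes)
        print(f"{name:>22}: posterior mean {post.mean():.4f}, "
              f"95% interval ({post.ppf(0.025):.4f}, {post.ppf(0.975):.4f})")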

The need for iteration feels so obvious to me that the entire "no iteration" column feels like a straw man. But the author, who knows far more academic statisticians than I do, explicitly says that he had the same belief and "was shocked to learn that statisticians didn’t think this way."

replies(3): >>41081867 #>>41082105 #>>41084103 #
klysm No.41081867
The no-iteration thing is very real, and I don't think it's even for particularly bad reasons. We iterate on models to make them better, by some definition of better. It's no secret that scientific work is subject to rather perverse incentives around thresholds of significance and positive results. Publish or perish. Perverse incentives lead to perverse statistics.

The iteration itself is sometimes viewed directly as a problem. The “garden of forking paths”, where the analysis depends on the data, is seen as a direct cause of some of the statistical and epistemological crises in science today.
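
A toy simulation of that worry (my own illustration, not from the linked post): under a true null, if the analyst tries a few data-dependent variants of the analysis and reports whichever p-value looks best, the false-positive rate climbs well above the nominal 5%.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_sims, n, alpha = 2_000, 50, 0.05
    hits = 0
    for _ in range(n_sims):
        x = rng.normal(size=n)    # "treatment" group, no real effect
        y = rng.normal(size=n)    # "control" group
        p_values = [
            stats.ttest_ind(x, y).pvalue,                  # full-sample comparison
            stats.ttest_ind(x[:25], y[:25]).pvalue,        # an "early look" at half the data
            stats.ttest_ind(np.exp(x), np.exp(y)).pvalue,  # a transformed outcome
        ]
        hits += min(p_values) < alpha    # report whichever path "worked"
    print(f"false-positive rate: {hits / n_sims:.3f} vs nominal {alpha}")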

Iteration itself isn’t inherently bad. It’s just that the objective function usually isn’t what we want from a scientific perspective.

I suspect that, to those actually doing scientific work, iterating on their models feels like doing something unfaithful.

Furthermore, I believe a lot of these issues are strongly related to the flawed epistemological framework on which many scientific fields seem to have converged: p<0.05 means it's true, otherwise it's false.

edit:

Perhaps another way to characterize this discomfort is by the number of degrees of freedom that the analyst controls. In a Bayesian context, where we are picking priors either by belief or from previous data, the analyst has a _lot_ of control over how the results come out the other end.
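
As a rough illustration of those knobs (again, my own toy numbers): with only a handful of observations, the choice of prior visibly moves the posterior, in contrast to the large-sample case sketched above.

    from scipy import stats

    successes, n = 3, 10    # hypothetical small sample

    for name, (a, b) in [("flat Beta(1,1)", (1, 1)),
                         ("optimistic Beta(8,2)", (8, 2)),
                         ("skeptical Beta(2,8)", (2, 8))]:
        post = stats.beta(a + successes, b + n - successes)
        print(f"{name:>21}: posterior mean {post.mean():.3f}")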

I think this is why fields have trended towards a set of ‘standard’ tests instead of building good statistical models. These take most of the knobs out of the hands of the analyst, and generally are more conservative.

replies(3): >>41081904 #>>41082486 #>>41082720 #
slashdave No.41081904
In particle physics, it was quite fashionable (and may still be) to iterate on blinded data: data deliberately altered by a secret random number, and/or analyses relying entirely on Monte Carlo simulation.
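
For readers unfamiliar with the idea, here's a rough sketch of one such "hidden offset" blinding scheme (a generic illustration, not any specific experiment's procedure):

    import numpy as np

    rng = np.random.default_rng()            # seed kept away from the analysts
    secret_offset = rng.uniform(-5.0, 5.0)   # the hidden blinding shift

    def blind(measurements):
        # Analysts only ever see the shifted values while tuning the analysis.
        return measurements + secret_offset

    def unblind(blinded_result):
        # Run exactly once, after every analysis choice has been frozen.
        return blinded_result - secret_offset

    data = np.array([10.2, 9.8, 10.5, 10.1])    # hypothetical measurements
    blinded_estimate = blind(data).mean()       # iterate on cuts/fits using this
    final_estimate = unblind(blinded_estimate)  # revealed only at the end
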
replies(2): >>41082107 #>>41082159 #
klysm No.41082107
Interesting, I wasn’t aware of that. Another thing I’ve only briefly read about is registering studies in advance, which quite literally prevents iteration.
replies(1): >>41085898 #
disgruntledphd2 No.41085898
Given the set of scientific publication assumptions (predominantly p<=0.05), this kind of iteration can easily let you find whatever proof you were looking for, which is problematic.

That being said, it's completely fair to use a cross-validation-style split: fit models on the training set, iterate against the test set, and only at the end calculate p-values on the validation set.
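
A minimal sketch of that workflow (the split sizes, variable names, and synthetic data are just assumptions for illustration): iterate freely on the first two splits, and compute the confirmatory p-value exactly once on the untouched holdout.

    import numpy as np
    from scipy import stats
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    X = rng.normal(size=(3_000, 5))
    y = 0.3 * X[:, 0] + rng.normal(size=3_000)    # synthetic data for the sketch

    # 60% train, 20% test, 20% untouched validation/holdout.
    X_work, X_val, y_work, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X_work, y_work, test_size=0.25, random_state=0)

    # ... fit candidate models on (X_train, y_train), compare and iterate on (X_test, y_test) ...

    # Confirmatory step, run once on the holdout: is feature 0 related to y?
    r, p_value = stats.pearsonr(X_val[:, 0], y_val)
    print(f"holdout correlation {r:.3f}, p = {p_value:.2g}")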

The problem with that approach is that you need to collect much, much more data than people generally would. Given that most statistical tests were developed for a small-data world, this can often work, but in some cases (medicine, particularly) it's almost impossible, and you need to rely on the much less useful bootstrapping or leave-one-out cross-validation (LOO-CV) approaches.
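
For completeness, here's what the small-sample fallback can look like: a basic nonparametric bootstrap confidence interval for a mean difference, with made-up numbers.

    import numpy as np

    rng = np.random.default_rng(2)
    treated = np.array([5.1, 6.0, 4.8, 5.9, 6.3, 5.5])   # made-up small samples
    control = np.array([4.9, 5.2, 4.7, 5.0, 5.4])

    # Resample each group with replacement and recompute the mean difference.
    boot_diffs = [
        rng.choice(treated, size=len(treated), replace=True).mean()
        - rng.choice(control, size=len(control), replace=True).mean()
        for _ in range(10_000)
    ]
    low, high = np.percentile(boot_diffs, [2.5, 97.5])
    print(f"bootstrap 95% CI for the mean difference: ({low:.2f}, {high:.2f})")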

I guess the core problem is that the methods of statistical testing assume no iteration, but actually understanding data requires iteration, so there's a conflict here.

If the scientific industry were OK with publishing exploratory data analyses (EDAs) to tease out work for future experimental studies, then we'd see more of this. But it's hard to get an EDA published, so everyone does the EDA anyway and then rewrites the paper as though they'd expected whatever they found from the start, which is the worst of both worlds.