
Bayesian Statistics: The three cultures

(statmodeling.stat.columbia.edu)
309 points by luu
thegginthesky ◴[] No.41080693[source]
I miss the college days where professors would argue endlessly on Bayesian vs Frequentist.

The article is very succinct and even explains why my Bayesian professors had different approaches to research and analysis. I never knew about the third camp, Pragmatic Bayes, but it is definitely in line with one professor's research, which was very thorough about probability fit and the many iterations needed to get the prior and joint PDF just right.

Andrew Gelman has a very cool talk, "Andrew Gelman - Bayes, statistics, and reproducibility (Rutgers, Foundations of Probability)", which I highly recommend to any data scientist.

replies(4): >>41080841 #>>41080979 #>>41080990 #>>41087094 #
spootze ◴[] No.41080841[source]
Regarding the frequentist vs Bayesian debates, my slightly provocative take on these three cultures is:

- subjective Bayes is the strawman that frequentist academics like to attack

- objective Bayes is a naive self-image that many Bayesian academics tend to possess

- pragmatic Bayes is the approach taken by practitioners who actually apply statistics to something (or, in Gelman’s terms, do science)

replies(3): >>41081070 #>>41081400 #>>41083494 #
DebtDeflation ◴[] No.41081400[source]
A few things I wish I knew when took Statistics courses at university some 25 or so years ago:

- Statistical significance testing and hypothesis testing are two completely different approaches, with different philosophies behind them, developed by different groups of people (Fisher on one side; Neyman and Pearson on the other). They kinda do the same thing, but not quite, and textbooks tend to completely blur the distinction.

- The above approaches were developed in the early 1900s in the context of farms and breweries, where three things were true: 1) data was extremely limited, often to only 5 or 6 data points; 2) there were no electronic computers, so computation was limited to pen and paper and slide rules; and 3) the cost, in time and money, of running experiments (e.g., planting a crop differently and waiting for harvest) was enormous.

- The majority of classical statistics was focused on two simple questions: 1) what can I reliably say about a population based on a sample taken from it, and 2) what can I reliably say about the differences between two populations based on samples taken from each? That's it. An enormous mathematical apparatus was built around answering those two questions under the limitations in point #2 (see the sketch below for what question #2 looks like in code).
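
A minimal sketch of question #2 in Python (my own illustration with made-up yield numbers, assuming SciPy is available): the classic two-sample t-test, exactly the kind of small-sample analytical recipe that apparatus produced.

    # Classical recipe for question #2: compare two populations
    # using only a handful of data points from each.
    from scipy import stats

    # Hypothetical crop yields under two treatments (made-up numbers).
    yields_a = [21.5, 24.5, 18.5, 17.2, 14.5, 23.2]
    yields_b = [22.8, 19.5, 26.1, 26.7, 25.0, 28.1]

    # Student's t-test: an analytical approximation to the sampling
    # distribution, derived under normality and equal-variance assumptions.
    t_stat, p_value = stats.ttest_ind(yields_a, yields_b)
    print(f"t = {t_stat:.3f}, p = {p_value:.3f}")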

replies(2): >>41081784 #>>41084820 #
ivan_ah ◴[] No.41081784{3}[source]
That was a nice summary.

The data-poor and computation-poor context of old-school statistics definitely biased the methods towards the "recipe" approach scientists are supposed to follow mechanically, where each recipe is some predefined sequence of steps, justified by an analytical approximation to a sampling distribution (given lots of assumptions).

In modern computation-rich days, we can get away from the recipes by using resampling methods (e.g. permutation tests and bootstrap), so we don't need the analytical approximation formulas anymore.
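
For instance, here is a minimal sketch (my illustration, reusing made-up samples and assuming NumPy) of a permutation test and a bootstrap interval replacing the analytical formulas:

    import numpy as np

    rng = np.random.default_rng(0)
    a = np.array([21.5, 24.5, 18.5, 17.2, 14.5, 23.2])
    b = np.array([22.8, 19.5, 26.1, 26.7, 25.0, 28.1])
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])

    # Permutation test: shuffle the group labels to build the null
    # distribution of the difference in means -- no formula needed.
    diffs = []
    for _ in range(10_000):
        perm = rng.permutation(pooled)
        diffs.append(perm[:len(a)].mean() - perm[len(a):].mean())
    p_value = np.mean(np.abs(diffs) >= abs(observed))
    print(f"permutation p = {p_value:.4f}")

    # Bootstrap: resample each group with replacement to get a
    # confidence interval for the same difference.
    boot = [rng.choice(a, len(a)).mean() - rng.choice(b, len(b)).mean()
            for _ in range(10_000)]
    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"95% bootstrap CI for the difference: ({lo:.2f}, {hi:.2f})")

The same machinery works for any statistic (medians, ratios, etc.), which is what makes resampling attractive compared to recipe-specific formulas.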

I think there is still room for small-sample methods, though... it's not like the biological and social sciences are dealing with very large samples.