
437 points Vinnl | 6 comments
explodes ◴[] No.43984193[source]
Wouldn't it be nice if policy changes were accompanied by an A/B testing plan to evaluate their impact? I have always thought so. I have also seen a major pitfall of A/B testing: real people can hand-pick and slice the data to make it sound as positive or negative as they want. Nonetheless, the more data the better.
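The kind of evaluation plan described here could be sketched as a simple two-proportion z-test comparing some outcome rate between a control group (A, no policy) and a treatment group (B, policy applied). This is a minimal illustration, not any specific agency's methodology, and all counts below are fabricated.

```python
from math import sqrt, erf

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for H0: p_a == p_b."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of equal rates.
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF, built from erf.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Fabricated example: 48.0% vs 43.0% of trips delayed in A vs B.
z, p = two_proportion_z_test(480, 1000, 430, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The harder problem, as the replies note, is that a policy rollout rarely gives you comparable A and B groups in the first place.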
replies(9): >>43984911 #>>43984957 #>>43985195 #>>43985261 #>>43985635 #>>43986010 #>>43986433 #>>43990930 #>>43996358 #
sc68cal ◴[] No.43985261[source]
We already had A/B testing of congestion pricing. The A test was NYC without congestion pricing, and it ran for decades.
replies(3): >>43985550 #>>43985682 #>>43990138 #
1. bunderbunder ◴[] No.43985550[source]
That's not an A/B test because it has no way of controlling for broader economic trends over time. How do you figure out if what you're seeing is because of that one thing that changed, or the enormous list of other things that also changed around the same time?

A more valid design would be randomly assigning some cities to institute congestion pricing, and other cities to not have it. Obviously not feasible in practice, but that's at least the kind of thing to strive toward when designing these kinds of studies.
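The random-assignment design described above could be sketched as follows, assuming a hypothetical pool of candidate cities. Randomization is what licenses attributing treatment/control differences to the policy rather than to pre-existing differences between cities.

```python
import random

# Hypothetical candidate units; real city selection would be far messier.
cities = ["City A", "City B", "City C", "City D", "City E", "City F"]

random.seed(42)  # fixed seed only to make the illustration reproducible

shuffled = random.sample(cities, k=len(cities))
treatment = shuffled[: len(shuffled) // 2]   # institute congestion pricing
control = shuffled[len(shuffled) // 2 :]     # leave pricing unchanged

print("treatment:", treatment)
print("control:  ", control)
```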

replies(3): >>43985801 #>>43991066 #>>43994453 #
2. jannyfer ◴[] No.43985801[source]
That would be a bad design for an A/B study (and NYC congestion pricing is not a "study" anyway), because cities are few, unalike, and differ in an enormous list of other ways. What NYC equivalent would you pick?

In any case, not every policy change needs to be an academic exercise.

replies(1): >>43986301 #
3. bunderbunder ◴[] No.43986301[source]
Yup, that is indeed a part of the problem. You'll notice I did say, "Obviously not feasible in practice."

I've got a textbook on field experiments that refers to these kinds of questions as FUQs, an acronym for "Fundamentally Unanswerable Questions". You can collect suggestive evidence, but firmly establishing cause and effect is something you've just got to let go of.

4. JumpCrisscross ◴[] No.43991066[source]
> randomly assigning some cities to institute congestion pricing, and other cities to not have it

Cities are stupidly heterogeneous. These data wouldn't be more meaningful than comparing cities with congestion pricing to those without (or comparing cities against their own pre-congestion-pricing eras).

replies(1): >>44006238 #
5. sorcerer-mar ◴[] No.43994453[source]
Everyone knows how you can conduct good experiments in a land of frictionless spherical cows.
6. bunderbunder ◴[] No.44006238[source]
What you're telling me here is that you aren't aware of what the randomization is for in randomized controlled trials.

"Our treatment units are stupidly heterogeneous" is exactly the problem it solves. A century's worth of developing increasingly sophisticated statistical techniques for making do without random assignment has thus far failed to accomplish anything more than provisional mitigations that are notoriously easy to use incorrectly in practice.
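The point about randomization handling heterogeneity can be shown with a small simulation: units with wildly different baselines, a constant true treatment effect, and random assignment. Despite the heterogeneity, the difference in group means recovers the true effect. All numbers are illustrative.

```python
import random

random.seed(0)
TRUE_EFFECT = -5.0      # hypothetical drop in some congestion score
N_UNITS = 100_000       # many units so the estimate is tight

# "Stupidly heterogeneous" baselines: every unit is very different.
baselines = [random.uniform(0, 100) for _ in range(N_UNITS)]

# Random assignment: shuffle the indices, treat the first half.
idx = list(range(N_UNITS))
random.shuffle(idx)
treated = set(idx[: N_UNITS // 2])

outcomes = [b + (TRUE_EFFECT if i in treated else 0.0)
            for i, b in enumerate(baselines)]

mean_t = sum(outcomes[i] for i in treated) / len(treated)
mean_c = sum(outcomes[i] for i in range(N_UNITS)
             if i not in treated) / (N_UNITS - len(treated))
print(f"estimated effect: {mean_t - mean_c:+.2f} (true: {TRUE_EFFECT:+.1f})")
```

With only a handful of units (a few cities), the same estimator would be far noisier, which is the practical bind the thread is circling.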