437 points Vinnl | 2 comments
explodes ◴[] No.43984193[source]
Wouldn't it be nice if policy changes were accompanied by an A/B testing plan to evaluate their impact? I have always thought so. I have also seen a major pitfall of A/B testing: people can hand-pick and slice the data to make the result sound as positive or negative as they want. Nonetheless, the more data the better.
replies(9): >>43984911 #>>43984957 #>>43985195 #>>43985261 #>>43985635 #>>43986010 #>>43986433 #>>43990930 #>>43996358 #
Calwestjobs ◴[] No.43985195[source]
test A - before

test B - after

What are you talking about?

replies(3): >>43985239 #>>43985257 #>>43985425 #
shadowgovt ◴[] No.43985257[source]
Generally, that's considered to introduce confounding factors on the time axis ("did we see improvement because we changed something, or because flu season hit and people stayed home?") that you'd prefer to mitigate by running your A and B simultaneously.

But in the absence of the ability to run them simultaneously, "A is before and B is after" can be a fine proxy. Of course, if B is worse, it'd be nice if you could only subject, say, 5% of your population to it before you just slam the slider to 100% and hit everyone with it.
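
To make the time-axis confound concrete, here's a rough Python sketch. Everything in it is made up for illustration: the `compare_designs` function, the baseline of 100, the noise level, and the "flu season" shift of -5 are assumptions, not data from any real policy. It simulates a policy with a known true effect and estimates that effect two ways: before vs. after, and with both arms running in the same period.

```python
import random
from statistics import mean


def compare_designs(n=10_000, true_effect=0.0, seasonal_shift=-5.0, seed=0):
    """Return (before_after_estimate, concurrent_estimate) of the policy effect.

    All numbers are hypothetical: an outcome metric with baseline 100,
    Gaussian noise with standard deviation 10, and a seasonal shift that
    applies only to the later time period.
    """
    rng = random.Random(seed)
    baseline, noise = 100.0, 10.0

    def sample(level):
        # Draw n noisy observations centered on the given level.
        return [rng.gauss(level, noise) for _ in range(n)]

    # Design 1: A is "before", B is "after".
    # Only the "after" period is hit by the seasonal shift.
    before = sample(baseline)
    after = sample(baseline + seasonal_shift + true_effect)
    before_after = mean(after) - mean(before)

    # Design 2: A and B run simultaneously, in the later period.
    # Both arms get the seasonal shift, so it cancels in the difference.
    control = sample(baseline + seasonal_shift)
    treated = sample(baseline + seasonal_shift + true_effect)
    concurrent = mean(treated) - mean(control)

    return before_after, concurrent


if __name__ == "__main__":
    ba, cc = compare_designs()
    print(f"before/after estimate: {ba:+.2f}")
    print(f"concurrent estimate:   {cc:+.2f}")
```

With `true_effect=0.0` the policy does nothing, yet the before/after estimate should land near the seasonal shift, while the concurrent estimate should land near zero. It's a toy model, and it says nothing about whether a concurrent split is feasible for a given policy.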

replies(1): >>43986182 #
1. Calwestjobs ◴[] No.43986182[source]
Yes, but how the hell does he propose to A/B test a "whole Manhattan" policy? Build another Manhattan just for the test? That makes no sense. The whole of Manhattan is what matters, not 5% of it, so a 5% rollout is out. An A/B test can only be done for things that affect one person at a time, a GUI for example: a big group under test, but the effect is on individuals.

At such a big scale, an A/B test is a tool to deceive, not a way to reach the right conclusion.

replies(1): >>43987238 #
2. shadowgovt ◴[] No.43987238[source]
It is, indeed, much easier to do A/B testing online in environments you control than IRL.

(Purely hypothetically: one could identify 10% of the island as operating under the new rules and compare outcomes. This is politically fraught on multiple levels and also gives messy spatial results.)