
817 points | dynm | 1 comment
mg No.43307263
This is great. The author defines their own metrics, runs their own A/B tests, and publishes their interpretation plus the raw data. Imagine a world where all health blogging was like that.

Personally, I have not published any results yet, but I have been doing this type of experiment for 4 years now and have collected 48,874 data points so far. I built a simple system to do it in Vim:

https://www.gibney.org/a_syntax_for_self-tracking

I also built a bunch of tooling to analyze the data.
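For flavor, here is a toy aggregator for a hypothetical one-line-per-entry log. The format (date, time, tag, value) is my invention for illustration; the actual syntax is described at the link above.

    # Toy aggregator for a hypothetical plain-text self-tracking log.
    # The line format is invented; see the linked post for the real syntax.
    from collections import Counter

    log = [
        "2024-03-01 08:15 coffee 1",
        "2024-03-01 22:40 sleep 7.5",
        "2024-03-02 08:05 coffee 2",
    ]

    totals = Counter()
    for line in log:
        date, time, tag, value = line.split()
        totals[tag] += float(value)

    print(totals)  # Counter({'sleep': 7.5, 'coffee': 3.0})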

I think that mankind could greatly benefit from more people doing randomized studies on their own. Especially if we find a way to collectively interpret the data.

So I really applaud the author for conducting this and especially for providing the raw data.

Reading through the article and the comments here on HN, I wish there were more focus on the interpretation of the experiment. Pretty much all the comments here seem to be anecdotal.

Let's look at the author's interpretation. Personally, I find that part a bit short.

They calculated 4 p-values and write:

    Technically, I did find two significant results.
I wonder what "Technically" means here. Are there "significant results" that are "better" than just "technically significant results"?
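One possible reading (my guess, not the author's wording): with four p-values computed, a raw p just under .05 is only "technically" significant, because a multiple-comparisons correction raises the bar. A minimal sketch in Python, with invented p-values:

    # Bonferroni correction for 4 tests. These p-values are made up
    # for illustration; they are not the article's numbers.
    p_values = [0.03, 0.04, 0.20, 0.60]
    alpha = 0.05
    cutoff = alpha / len(p_values)  # adjusted threshold: 0.0125
    for p in p_values:
        verdict = "significant" if p < cutoff else "not significant"
        print(f"p={p}: {verdict} after correction")

Under a correction like that, marginal results stop being significant, which may be what "technically" gestures at.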

Then they continue:

    Of course, I don’t think this
    means I’ve proven theanine is harmful.
So what does it mean? What was the goal of collecting the data? What would the interpretation have been if the data had shown a significant positive effect of theanine?

It's great that they offer the raw data. I look forward to taking a look at it later today.

gus_massa No.43320433
>> Technically, I did find two significant results.

The problem is that he is comparing two very different things: the anxiety level when he took the pill and the level one hour after that. So it's not surprising that they are different. Let's imagine a very, very stupid experiment where the problem is more obvious.

Does Coca-Cola or Pepsi improve luck? N=1,000,000, double-blind randomized controlled trial.

1) Each subject flips a coin: tails=0, heads=1.

2) They drink a glass of soda, 50% Coke or 50% Pepsi, poured out of sight so that neither the subject nor the experimenter knows which one.

3) They roll a die (an ordinary D6).

Results:

* Average before Coke = 0.5002

* Average after Coke = 3.5005

* Average before Pepsi = 0.5004

* Average after Pepsi = 3.5003

So the conclusion is that Coke improves the average (p<1E-a-lot) and Pepsi improves the average (p<1E-a-lot). Both are "technically" statistically significant, but that's an artifact of a horrible experimental design.

Unsurprisingly, the difference between the averages after drinking Coke and after drinking Pepsi is not statistically significant (p well above .05).

(I'm too lazy to run a simulation now, but it's not difficult to get realistic averages and p-values.)
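A quick sketch of that simulation (my code, not the commenter's; N and the seed are arbitrary):

    # Simulate the Coke/Pepsi thought experiment: coin flip before,
    # D6 roll after, drink assigned at random.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    N = 1_000_000
    drink = rng.integers(0, 2, N)   # 0 = Coke, 1 = Pepsi
    before = rng.integers(0, 2, N)  # coin flip: tails=0, heads=1
    after = rng.integers(1, 7, N)   # die roll: 1..6

    for name, mask in [("Coke", drink == 0), ("Pepsi", drink == 1)]:
        _, p = stats.ttest_rel(after[mask], before[mask])
        print(f"{name}: before={before[mask].mean():.4f} "
              f"after={after[mask].mean():.4f} p={p:.3g}")

    # The comparison that actually matters: after-Coke vs. after-Pepsi.
    _, p = stats.ttest_ind(after[drink == 0], after[drink == 1])
    print(f"Coke vs. Pepsi (after): p={p:.3g}")

On a typical run, both before/after p-values are effectively zero, while the Coke-versus-Pepsi p-value lands wherever chance puts it, usually well above .05.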

In conclusion, the useful result is the comparison of anxiety levels after taking the two treatments, not the before-versus-after difference for each one.

As the article says:

>> So I propose a new rule: Blind trial or GTFO.