mg (No.43307263)
This is great. The author defines their own metrics, is doing their own A/B tests and publishes their interpretation plus the raw data. Imagine a world where all health blogging was like that.

Personally, I have not published any results yet, but I have been doing this type of experiment for 4 years now and have collected 48,874 data points so far. I built a simple system to do it in Vim:

https://www.gibney.org/a_syntax_for_self-tracking

I also built a bunch of tooling to analyze the data.
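
For illustration, here is a minimal sketch of what such tooling might look like. The line format below is an assumed example, not the actual syntax from the linked post, and the file name is made up:

    # Sketch only. Assumed line format (hypothetical, not the linked syntax):
    #   2024-05-01 07:30 sleep_quality 7
    #   2024-05-01 08:00 theanine 200
    from collections import defaultdict
    from datetime import datetime

    def parse_log(path):
        """Parse a plain-text self-tracking log into {event: [(timestamp, value), ...]}."""
        series = defaultdict(list)
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                date_str, time_str, event, value = line.split()
                ts = datetime.strptime(date_str + " " + time_str, "%Y-%m-%d %H:%M")
                series[event].append((ts, float(value)))
        return series

    print(parse_log("tracking.txt")["theanine"][:3])  # hypothetical file name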

I think that mankind could greatly benefit from more people doing randomized studies on their own. Especially if we find a way to collectively interpret the data.

So I really applaud the author for conducting this and especially for providing the raw data.

Reading through the article and the comments here on HN, I wish there was more focus on the interpretation of the experiment. Pretty much all comments here seem to be anecdotal.

Let's look at the author's interpretation. Personally, I find that part a bit short.

They calculated 4 p-values and write:

    Technically, I did find two significant results.
I wonder what "Technically" means here. Are there "significant results" that are "better" than just "technically significant results"?
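
One reading: with four outcomes tested, an uncorrected p < 0.05 can be "technically" significant and still fail a family-wise correction. A small sketch with placeholder p-values (not the study's actual numbers):

    # Placeholder p-values for illustration; not the study's numbers.
    p_values = [0.03, 0.04, 0.20, 0.60]
    alpha = 0.05

    def holm_bonferroni(p_values, alpha=0.05):
        """Which hypotheses survive Holm's step-down correction for multiple tests."""
        m = len(p_values)
        order = sorted(range(m), key=lambda i: p_values[i])
        reject = [False] * m
        for rank, i in enumerate(order):
            if p_values[i] <= alpha / (m - rank):
                reject[i] = True
            else:
                break  # once one test fails, all larger p-values fail too
        return reject

    print(holm_bonferroni(p_values, alpha))  # here: nothing survives the correction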

Then they continue:

    Of course, I don’t think this means I’ve proven theanine is harmful.
So what does it mean? What was the goal of collecting the data? What would the interpretation have been if the data had shown a significant positive effect of theanine?

It's great that they offer the raw data. I look forward to taking a look at it later today.

matthewdgreen (No.43308318)
This is an N=1 trial. Dressing your N=1 trial up with lots of pseudo controls and pseudo blinding and data collection does not make it better. In fact: putting this much effort into any medication trial makes it much more likely that you’re going to be incentivized to find effects that don’t exist. I think it’s nice that the author admits that they found nothing, but statistically, worthless drugs show effects in much better-designed trials than this one: it’s basically a coin toss.
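
To make the statistical point concrete (a generic illustration, not a model of this particular design): simulate a drug with no effect, rated daily and tested on several outcomes without correction, and count how often something comes out "significant".

    # Generic illustration: null effect, noisy daily ratings, several outcomes,
    # no multiple-comparison correction. All parameters are made up.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_days, n_outcomes, n_trials, alpha = 200, 4, 2000, 0.05

    trials_with_a_hit = 0
    for _ in range(n_trials):
        treated = rng.integers(0, 2, n_days).astype(bool)    # coin-flip assignment per day
        hit = False
        for _ in range(n_outcomes):
            ratings = rng.normal(5.0, 2.0, n_days)            # pure noise: the drug does nothing
            _, p = stats.ttest_ind(ratings[treated], ratings[~treated])
            hit = hit or (p < alpha)
        trials_with_a_hit += hit

    print(trials_with_a_hit / n_trials)  # roughly 1 - 0.95**4, i.e. about one trial in five
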
robwwilliams (No.43308440)
Complete injustice to this lovely study. Why do you say unblinded? Why do you insult a time-series study as “dressing up with lots of data”? Would you rather see less data? Or are you volunteering to be test subject #2? Show us how to do it right, Dr. M.!

In my opinion this is an exemplary N=1 study that is well designed and thoughtfully executed. It deserves accolades, not derision. And the author even recognizes possible improvements.

Unlike most large, high-N clinical trials, this is a high-resolution longitudinal trial, and it is perfectly “controlled” for genetic differences (there are none), well controlled for environment, and there is only one evaluator.

Compare this to the messy and mostly useless massive studies of human depression reviewed by Jonathan Flint.

https://pubmed.ncbi.nlm.nih.gov/36702864/

jrootabega (No.43309575)
If he said unblinded at some point, it could have been because the study author looked into the cup too soon to determine which substance had been taken. The subject should have had no knowledge of what was taken until the entire 16-month trial was over.

We should avoid extreme polarization of our judgments in general. The study deserves some amount of praise for things it did somewhat well (like the method of blinding, which is clever, but not applicable to everyone), and criticism for things it did not do well, such as designing one's own study methodology for one's own mood. That alone will affect the results. Simply RUNNING an experiment can affect your mood because it's interesting (or maybe even frustrating). The subject probably felt pride and satisfaction whenever they used their pill selection technique, which could improve mood on its own. Neither accolades nor complete derision are appropriate, although trying to claim too strong a result from this study is kinda deserving of derision if you claim to be science-minded.

The study was well-meaning and displayed cleverness.
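
As an aside on the early-unblinding point: one way to keep the assignment sealed until the end (a sketch of an assumed protocol, not what the author did) is to pre-generate the schedule, leave the file unread, and publish only a hash up front so it can be verified afterwards.

    # Sketch of a sealed, pre-committed schedule; not the author's protocol.
    import hashlib
    import json
    import secrets

    def make_sealed_schedule(n_doses, path="assignments.json"):
        """Write the dose schedule to a file that stays unread until the trial ends,
        and return a hash to publish up front as a commitment."""
        schedule = [secrets.choice(["theanine", "placebo"]) for _ in range(n_doses)]
        with open(path, "w") as f:
            json.dump(schedule, f)
        return hashlib.sha256(json.dumps(schedule).encode()).hexdigest()

    print("publish now, open assignments.json only after the last rating:",
          make_sealed_schedule(450))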

robwwilliams (No.43310475)
And that is exactly the point made in the target post by the author. He explicitly raised that criticism himself. Double kudos for self-criticism. You will not find many conventional science publications pointing out: “Shucks, we could have done this better”.
jrootabega (No.43310551)
The ancestor post is neither a "Complete injustice" nor "derision" nor an "insult", and it doesn't warrant a hostile mocking reply. Its tone could have been gentler, but it wasn't that bad. And the study doesn't really deserve "accolades", it deserves to be recognized for whatever it does well. Such polarization of tone and vocabulary doesn't accomplish much, and I'll even propose that it actually prevents good things from happening. It is good that the author is aware of, and acknowledges, the problems in the study. What other studies and journals have done wrong doesn't make the author or study more deserving of praise.

Also, you asked why he said "unblinded", and I think you now have the answer to that.

robwwilliams (No.43316199)
Yes, perhaps. But please tell me you have read the original post. It is thoughtful, self-deprecatory, careful, well analyzed, and upfront about limitations and possible improvements.

Re-reading it, such a negative critique of a solid home-brew experiment is unwarranted. There are several words here that raise red flags:

>This is an N=1 trial. Dressing your N=1 trial up with lots of pseudo controls and pseudo blinding and data collection does not make it better. In fact: putting this much effort into any medication trial makes it much more likely that you’re going to be incentivized to find effects that don’t exist. I think it’s nice that the author admits that they found nothing, but statistically, worthless drugs show effects in much better-designed trials than this one: it’s basically a coin toss.