256 points hirundo | 4 comments
1. fauxpause_ ◴[] No.35520905[source]
So I’m just skimming the paper linked in the article. There is a chart showing scores per year, stratified by education.

The overall trend is down, sure. But… the year-by-year variance across education groups seems to be highly correlated. For example, in 2011 there’s a positive bump in performance for grad students and high school students alike. And in 2014 every group dropped.

Like… come on. That seems like an enormous red flag for the validity of the measurement. The year-by-year variance (normalized against the trend) should be random across groups unless something was specifically different about that year. Presuming people didn’t all briefly get smarter in 2011, it’s hard to believe every group genuinely tested better.

The only plausible answer is that the test was easier in 2011, right? And if that’s true, how confident can we really be that the overall decline isn’t just testing variance?

Edit: that is to say, per the central limit theorem, each subgroup’s sample mean for a given year should be normally distributed around the trend prediction. Take the reverse Flynn decline as the null hypothesis. Then in any given year we should expect some groups to land below the trend prediction and some above it, independently of one another. Since we instead see all groups deviate in the same direction by about the same amount, we should infer that our data is not a random sampling of scores described by the mean of that trend line. In fact, the scores seem to be better described by unseen factors that are specific to each year.
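To make the argument concrete, here’s a toy simulation (all numbers invented, not the paper’s data): under the null that yearly deviations from each group’s trend are pure sampling noise, the detrended residuals of two education groups should be roughly uncorrelated; a shared per-year factor (e.g. “the 2011 test was easier”) makes them correlate.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(2006, 2019)
trend = -0.2 * (years - years[0])              # assumed reverse-Flynn decline
year_effect = rng.normal(0, 2.0, len(years))   # shared year-specific factor

def detrended_residuals(shared):
    """One group's yearly mean scores minus its own fitted linear trend."""
    scores = 100 + trend + rng.normal(0, 0.3, len(years))  # sampling noise
    if shared:
        scores = scores + year_effect          # same bump/dip hits every group
    slope, intercept = np.polyfit(years, scores, 1)
    return scores - (slope * years + intercept)

# Two groups that share the year factor vs. two that don't:
r_shared = np.corrcoef(detrended_residuals(True), detrended_residuals(True))[0, 1]
r_indep = np.corrcoef(detrended_residuals(False), detrended_residuals(False))[0, 1]
print(f"shared year factor: r={r_shared:.2f}; independent noise: r={r_indep:.2f}")
```

The correlated-residuals pattern the parent describes looks like the first case, not the second.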

replies(2): >>35521779 #>>35526220 #
2. ◴[] No.35521779[source]
3. theptip ◴[] No.35526220[source]
Is the test different year-on-year? I thought they were somewhat standardized instruments (or at least drawn randomly from standard question pools). Not my area of expertise, though.
replies(1): >>35526751 #
4. fauxpause_ ◴[] No.35526751[source]
I’m not clear on that either. I think they use different tests but try to calibrate them to be similar year over year with a shared question bank?
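For what a shared question bank could buy you, here’s a toy sketch of anchor-item equating (all names and numbers invented, not necessarily how any real IQ study does it): items common to both years measure the cohort’s real ability shift, which then lets you separate that shift from a change in difficulty of the form-specific items.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
ability_2010 = rng.normal(0.0, 1, n)    # 2010 cohort's true ability
ability_2011 = rng.normal(-0.3, 1, n)   # 2011 cohort, genuinely lower

def noise():
    return rng.normal(0, 0.2, n)

# Anchor items are identical across years (fixed difficulty);
# each year's form also has unique items, and 2011's happen to be easier.
anchor_2010 = ability_2010 + noise()
anchor_2011 = ability_2011 + noise()
unique_2010 = ability_2010 + noise()
unique_2011 = ability_2011 + 0.5 + noise()   # +0.5 = easiness drift

# The anchors estimate the real ability shift; whatever extra difference
# shows up on the unique items is attributed to form difficulty.
ability_shift = anchor_2011.mean() - anchor_2010.mean()
form_easiness = (unique_2011.mean() - unique_2010.mean()) - ability_shift
equated_2011 = unique_2011 - form_easiness   # now comparable to 2010 scores
```

If the calibration is done well, a year where “everyone tested better” should get absorbed into `form_easiness` rather than showing up as a real ability bump, which is exactly what the chart in question seems to argue didn’t happen.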