256 points hirundo | 18 comments
1. Galanwe ◴[] No.35519824[source]
Can someone actually explain how IQ tests work? By work, I mean how are the tests engineered, and the results computed.

A long time ago someone explained to me that IQ tests were actually drafted from a very large pool of (regularly updated) questions, from which the statistically significant ones were extracted to form a _core symposium_ of questions to sample from. Also, the IQ score itself was normalized to be normally distributed and centered at 100.

With this understanding, I was under the impression that IQ was a relative measure, at a specific point in time, of one's placement in the distribution.

Which meant to me that IQ cannot "drop" across a population, the mean will always be 100. And IQ scores cannot be compared on a time series basis, since they are only cross sectional measures.

Is that all wrong? Is there some truth to it?

replies(7): >>35519844 #>>35520383 #>>35520430 #>>35520609 #>>35520643 #>>35520745 #>>35537495 #
2. zaptheimpaler ◴[] No.35519844[source]
There is a raw score underlying any given IQ test that is an absolute value. It might just be as simple as the number of questions you get right. When testing a population, these scores form a normal distribution. We then scale the raw scores so that the mean/median or center of the distribution becomes an IQ of 100. So the raw scores can be compared across time and can vary, even though the IQ cannot as you said.
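
To make the scaling step concrete, here is a minimal sketch in Python. It assumes the raw score is simply the number of correct answers and uses the common mean-100, SD-15 convention; the function name `to_iq` and the sample scores are made up for illustration, not taken from any real test.

```python
from statistics import mean, pstdev

def to_iq(raw_score, norm_sample, center=100.0, spread=15.0):
    """Rescale a raw score onto an IQ-style scale using a norming sample."""
    mu = mean(norm_sample)
    sigma = pstdev(norm_sample)   # population SD of the norming sample
    z = (raw_score - mu) / sigma  # distance from the sample mean, in SD units
    return center + spread * z

# Hypothetical raw scores (questions answered correctly) from a norming sample.
norm_sample = [20, 25, 30, 35, 40]

print(to_iq(30, norm_sample))  # the sample mean maps to 100.0
print(to_iq(40, norm_sample))  # above the mean, so above 100
```

The raw scores (20 to 40 here) stay on their own absolute scale; only the rescaled value is pinned to 100 at the center.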
replies(1): >>35519897 #
3. Galanwe ◴[] No.35519897[source]
Hum, but my understanding was that the whole point of the normalization was that the raw scores are not in a scale that is meaningful outside of the symposium which they were placed in?

Does it really make sense to compare raw scores from different tests? If that were the case then the normalization step would be useless, we would have an absolute measure of intelligence.

replies(2): >>35520091 #>>35520576 #
4. whatshisface ◴[] No.35520091{3}[source]
A new test that was substantially different from an earlier test would have to be compared against it to make sure it was measuring the same stuff, and I have seen a few studies checking how well different tests were correlated.
5. btilly ◴[] No.35520383[source]
An IQ score is a measure of where you stand relative to the population at the point in time when the test was normalized.

That point of time is somewhere in the past. And when tests are renormalized, there is a conversion of "this score on the old test is that score on the new test". This allows for comparisons of IQ over time, across different versions of the same test. This is how the Flynn effect was first discovered.

Tests usually have solid conversions between them, so you can compare across different tests as well. Such conversions allow more verification of the Flynn effect.

If you go back far enough, you will find tests for children measuring mental age vs physical age, taking a ratio, and multiplying by 100. They fell on a distribution that was close enough to normal centered at 100 with a standard deviation of 15 or 16 that adult tests were developed to match them.
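
The ratio described above is simple enough to write down directly. This is only a sketch of that historical formula; the function name `ratio_iq` and the example ages are illustrative.

```python
def ratio_iq(mental_age, chronological_age):
    """Historical ratio IQ: mental age over chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return 100 * mental_age / chronological_age

print(ratio_iq(12, 10))  # 120.0: a 10-year-old performing like a typical 12-year-old
print(ratio_iq(8, 10))   # 80.0: a 10-year-old performing like a typical 8-year-old
```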

Since we don't know all the factors behind the Flynn effect in the first place, we also don't know all the factors behind why it might be reversing now.

6. outlace ◴[] No.35520430[source]
My understanding at a high level is that an IQ test is basically made by generating a bunch of cognitive tasks that putatively test different aspects of cognitive functioning (e.g. verbal reasoning, visuospatial reasoning, attention, working memory, etc.). For most people, performance on one type of cognitive task (e.g. verbal reasoning) is highly correlated with performance on the other types of cognitive tasks.

This allows you to model the test as a hierarchical statistical model with some general intelligence factor (denoted G) at the top and then specific cognitive tasks branch off from there. You can then infer what G is just by statistical inference on the "branches" (the performance on the individual cognitive tasks); similar to how you might infer someone's height if you only had access to their leg and arm lengths, as these are highly correlated with each other and also with height.
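
A toy simulation can show the idea. This is not a fitted hierarchical model: it assumes a single hidden factor with made-up loadings and noise, and it uses the average of standardized task scores as a crude stand-in for proper factor analysis. All names and numbers are illustrative.

```python
import random
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (pstdev(xs) * pstdev(ys))

def zscores(xs):
    """Standardize a list to mean 0 and SD 1."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]

rng = random.Random(0)
n_people, n_tasks = 2000, 4
loading, noise_sd = 0.8, 0.6  # assumed values, chosen only for illustration

# Simulate a hidden general factor and four task scores that each depend on it.
g = [rng.gauss(0, 1) for _ in range(n_people)]
tasks = [[loading * gi + rng.gauss(0, noise_sd) for gi in g]
         for _ in range(n_tasks)]

# The tasks correlate with each other only because they share the hidden factor.
pairwise = [pearson(tasks[i], tasks[j])
            for i in range(n_tasks) for j in range(i + 1, n_tasks)]

# Crude estimate of the factor: average each person's standardized task scores.
standardized = [zscores(t) for t in tasks]
estimate = [mean(person) for person in zip(*standardized)]

print(min(pairwise))         # every pair of tasks is positively correlated
print(pearson(estimate, g))  # the composite tracks the hidden factor closely
```

In real data the hidden factor is never observed, so the last line is only possible in a simulation; that is the part the statistical model has to infer.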

I believe IQ scores are always population normed to have a mean of 100 but unnormalized scores are likely available to compare across time.

replies(1): >>35520983 #
7. zaptheimpaler ◴[] No.35520576{3}[source]
Well yes, if you look at the paper summary they are comparing the scores for an overlapping set of questions, all from one question bank used in many tests, if I understand correctly.
8. tjnaylor ◴[] No.35520609[source]
https://www.youtube.com/watch?v=UBc7qBS1Ujo&t=5877s

It's a 2 hour video essay that covers psychometry generally as a lens to understanding a book called the Bell Curve that was a flashpoint for questions about the validity of IQ science generally (but most especially how it applies to race). It took a good chunk of my Sunday to get through it, but it was really enjoyable and gave me a ton of insight into a ton of buzzwords and studies I had heard of but couldn't really dig my teeth into.

I think with this topic, getting a briefer, less nuanced summary than something like this would be a mistake because of how much misunderstanding of these topics permeates popular culture. The video also provides a number of studies and books to keep going beyond the relatively brief 2 hours of content it provides.

It uses the famed/infamous book The Bell Curve as a case study and delves into how IQ tests were originally created, how they are updated, how the term heritability, when used in genetics, means something that is sometimes counter-intuitive compared to the definition used in popular culture (for example, whether someone wears earrings has high heritability, whereas having 2 arms has effectively zero heritability), the statistical meaningfulness of factorization (G-factor) of domains of IQ into a single numerical value, how these domains came to be defined, the current state of understanding regarding the local vs environmental source of IQ for individuals, how the Flynn effect was observed, etc.

But to answer a bit of your earlier question. When IQ tests are created, they create a set of questions, test it on a sample group, and set the average value to 100 and higher/lower scores depending on what the distribution of correct answers is. The Flynn effect was noticed because researchers saw that while the average for new tests is always set at 100, people scoring 100 on a more recent test were generally scoring even higher on previous years' tests. The article on the reverse Flynn effect is a little bit sensationalist because, as it mentions, while some areas (like spatial reasoning) are improving, others are apparently starting to get lower. This calls into question a bit the idea of a G-factor, which is an assumption that there is a common factor of intelligence that covaries across all IQ domains (spatial reasoning, reaction time, etc.) and which is the theoretical reasoning behind IQ being meaningfully represented as a single numerical value rather than a multi-dimensional value.

replies(3): >>35520879 #>>35521488 #>>35525655 #
9. geraldalewis ◴[] No.35520643[source]
I think this video by `Shaun` is informative and well-researched (articles and books are cited). It's long; the creator isn't especially succinct, but it's as entertaining as the material allows, and the subject's complex enough to warrant a video that's a couple of hours long. https://youtu.be/UBc7qBS1Ujo
replies(1): >>35521561 #
10. ramblenode ◴[] No.35520745[source]
IQ is a construct representing intelligence. The aim is to refine this construct so that it is a) externally consistent with what constitutes our understanding of human intelligence and b) internally consistent so that various things we deem high or low IQ are consistent with other things we deem high or low IQ. The things here are items, which are question:answer pairs.

IQ is an example of factor analysis [0] where an unobserved "general intelligence factor" g is derived from observed items. The items are chosen so that responses correlate with each other and with g (basically, if the same person took two IQ tests with different questions then the score should be about the same). There may be some intermediate factors like verbal ability or spatial ability. The items here will be chosen so that they correlate only with items of the same factor--e.g. verbal items correlate with verbal items but do not correlate with spatial items.

A raw score on the test is not meaningful; individuals are compared against the population of test takers to determine their rank in the population. First, raw scores of the population are normalized so that the mean is 100 and the standard deviation is 15 (this is arbitrary; it's just the scale they use). Then an individual test taker can be compared with the population on this scale. The percentile rank can be obtained directly from the IQ score and vice versa.
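
As a rough illustration of going between IQ and percentile rank, here is a sketch that assumes scores follow an exact normal distribution with mean 100 and SD 15. Real tests use published norm tables, so treat the helper names and the outputs as an idealization.

```python
from statistics import NormalDist

# Idealized IQ scale: normal with mean 100 and standard deviation 15.
IQ_SCALE = NormalDist(mu=100, sigma=15)

def iq_to_percentile(iq):
    """Percentage of the norming population expected to score at or below `iq`."""
    return 100 * IQ_SCALE.cdf(iq)

def percentile_to_iq(percentile):
    """IQ score sitting at the given percentile rank (0 < percentile < 100)."""
    return IQ_SCALE.inv_cdf(percentile / 100)

print(round(iq_to_percentile(100), 1))  # 50.0
print(round(iq_to_percentile(115), 1))  # 84.1
print(round(iq_to_percentile(130), 1))  # 97.7
print(round(percentile_to_iq(50), 1))   # 100.0
```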

You have some questions about the shifting mean of 100. In practice the normed distribution is computed from a norming group rather than recomputing the norm after every test. A particular population can shift from the norming group (either over time or because it's a group with different characteristics), which is where things like the Flynn Effect come from. So a lot depends on the norming group.
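
Here is a small sketch of how a fixed norming group lets the mean drift away from 100. The raw scores are invented, and `make_scorer` is a hypothetical helper, not how any particular test publisher does it.

```python
from statistics import mean, pstdev

def make_scorer(norming_raw_scores):
    """Return a function mapping raw scores to IQ using a fixed norming group."""
    mu = mean(norming_raw_scores)
    sigma = pstdev(norming_raw_scores)
    return lambda raw: 100 + 15 * (raw - mu) / sigma

# Hypothetical raw scores; the later cohort gets 5 more questions right on average.
norming_group = [40, 45, 50, 55, 60]
later_cohort = [45, 50, 55, 60, 65]

old_norms = make_scorer(norming_group)
new_norms = make_scorer(later_cohort)

# Scored against the old norming group, the later cohort averages above 100.
print(round(mean([old_norms(r) for r in later_cohort]), 1))  # 110.6

# Once the test is renormed on the later cohort, its average is 100 again.
print(round(mean([new_norms(r) for r in later_cohort]), 1))  # 100.0
```

So the mean is only guaranteed to be 100 for the norming group itself; anyone scored against older norms can land somewhere else.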

I hope that answered some of your questions.

[0] https://en.wikipedia.org/wiki/Factor_analysis

replies(1): >>35533798 #
11. geraldalewis ◴[] No.35520931{3}[source]
He went through the trouble of citing his sources, you should too.
replies(1): >>35521492 #
12. geraldalewis ◴[] No.35520983[source]
> For most people, performance on one type of cognitive task (e.g. verbal reasoning) is highly correlated with performance on the other types of cognitive tasks.

It's cool that that's your understanding, but you're wading into territory that gets people sterilized and killed. I have a lot of trust in science, but not here. This video was informative for me: https://youtu.be/UBc7qBS1Ujo

For example, here's an IQ test; let's say it's given to 5,000,000 Canadians, and 3,000 Texans (1):

  * What is the capital city of Canada?
  * Which Canadian province is the largest by land area?
  * Who is considered the "Father of Medicare" in Canada?
  * Name the two official languages of Canada.
  * Which Canadian team won the Stanley Cup in 2017?
  * What is the national sport of Canada?
  * What is the name of Canada's national anthem?
  * Name the famous Canadian dish made with fries, cheese curds, and gravy.
  * Who was the first Prime Minister of Canada?
  * In which Canadian city is the CN Tower located?

I would expect a normal distribution for Canadians, but for the Texans to score in the bottom quintile (regardless of "general intelligence").

(1) I asked ChatGPT to come up with this test.

(edit: formatting)

replies(1): >>35521056 #
13. chongli ◴[] No.35521056{3}[source]
This is why IQ tests switched to Raven’s progressive matrices [1] a long time ago. RPM avoids these issues of cultural bias.

[1] https://en.wikipedia.org/wiki/Raven's_Progressive_Matrices?w...

14. tjnaylor ◴[] No.35521116{3}[source]
Do you have a specific example(s) of nonsense being presented in the video?
replies(1): >>35521494 #
15. watwut ◴[] No.35525655[source]
Let's start with The Bell Curve not being actual science.
replies(1): >>35529038 #
16. tjnaylor ◴[] No.35529038{3}[source]
The video pretty thoroughly establishes just that.
17. naijaboiler ◴[] No.35533798[source]
IQ measures something, but that something is definitely not intelligence, no matter what people try to tell you. Let's all just stop pretending it measures intelligence.
18. MagicMoonlight ◴[] No.35537495[source]
It’s a hard test that doesn’t require specific knowledge and has a short time limit to prevent bruteforcing.

The questions are in difficulty order and they’re all worth the same so because of the time limit you won’t be able to answer them all correctly. The quicker you are at solving problems, the higher the score you will get.

Your score then shows how you compare to the population percentage-wise, and that's your IQ. The average score would give you an average IQ.