
Learning to Learn

(kevin.the.li)
320 points jklm | 1 comments
dinobones ◴[] No.41910980[source]
I've been wanting to try this approach for learning a language.

In English, for example, learning the 800 most common words lets you understand 75% of the language: https://www.bbc.com/news/world-44569277.

I'd love to start fresh on a new language, take 800 new words, try to learn 10 a day, and see where I get after 3 months. Can I really understand 75% of text if I have perfect recall of those 800 words?

replies(9): >>41911047 #>>41911098 #>>41911173 #>>41911321 #>>41911390 #>>41912168 #>>41912979 #>>41913128 #>>41924943 #
joshdavham ◴[] No.41911321[source]
> Can I really understand 75% of text if I have perfect recall of those 800 words?

This thing you're talking about is called 'word coverage'. It's the percentage of words you know in a given text. I've created lots of word coverage graphs in the past, and, as research has shown, you won't really understand much until your word coverage reaches the high 90s. The famous number for being able to read English texts extensively is a word coverage of around 98%. And while it depends on the text, to reach 98% you generally need to know roughly the 5k most frequent words in a language.

Funny enough, when you understand 75% of the words in a text, you subjectively feel like you're understanding like 10% of what's going on.
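For anyone who wants to measure this on their own reading material, here is a minimal sketch of word coverage as defined above: the share of tokens in a text that appear in your known-word set. The function name `word_coverage`, the whitespace tokenization, and the toy sentence are all assumptions for illustration; real texts need proper tokenization and probably lemmatization.

```python
def word_coverage(tokens, known_words):
    """Return the fraction of tokens that appear in known_words.

    Counts every occurrence (token), not every distinct word (type),
    which is how coverage is described in the comment above.
    """
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


# Toy example: naive whitespace split, purely illustrative.
tokens = "the cat sat on the mat and the dog sat on the rug".split()
known_words = {"the", "sat", "on"}

print(f"coverage: {word_coverage(tokens, known_words):.0%}")
```

Counting tokens rather than distinct words is why a small set of very frequent words can push coverage up quickly while comprehension lags behind.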

replies(2): >>41913229 #>>41917894 #
creamyhorror ◴[] No.41917894[source]
Yep, 75% coverage is too low for significant comprehension. You normally need 95% for decent comprehension and 98% for comfortable reading.

In Japanese (my target language), it seems you need something like the 15,000 most frequent words (depending on how you define a word) for 98% coverage. At 12,000 words it becomes viable to read with some comprehension and semi-frequent dictionary lookups.

Also, interestingly, you need about twice as many words in Japanese as in English to reach 87% coverage:

"It has been reported that 2,000 high-frequent English words cover 87% of tokens (Nation, 1990). In case of Japanese, 4,024 SUWs are required to cover 87% of tokens." (Text Readability and Word Distribution in Japanese, Satoshi Sato)
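Figures like "2,000 words for 87%" come from ranking words by frequency and adding up their token counts until the target is reached. A rough sketch of that calculation is below, assuming the corpus is already tokenized. The name `words_needed` is hypothetical, and the frequency ranking here comes from the same text being measured, which will look more optimistic than using a separate reference frequency list.

```python
from collections import Counter


def words_needed(tokens, target):
    """Return how many of the most frequent word types are needed
    to cover at least `target` (0..1) of the tokens, or None if the
    token list is empty.
    """
    counts = Counter(tokens)
    total = len(tokens)
    covered = 0
    # most_common() yields (word, count) pairs, highest count first.
    for rank, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered / total >= target:
            return rank
    return None


# Toy example: naive whitespace split, purely illustrative.
tokens = "the cat sat on the mat and the dog sat on the rug".split()
for target in (0.30, 0.60, 1.00):
    print(f"{target:.0%} coverage needs {words_needed(tokens, target)} word(s)")
```

Running the same function over an English and a Japanese corpus (with a tokenizer appropriate to each) is one way to compare how many words each language needs for a given coverage level.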

replies(1): >>41920000 #
1. joshdavham ◴[] No.41920000[source]
You sound like my kind of nerd!

You might wanna check out this analysis I did last week: https://cij-analysis.streamlit.app/

I do a little bit of Japanese word coverage analysis in it, among other things.