Happy to answer any questions here. I kept my analysis really high level for a general audience but since this is HN, we can get a bit nerdy :D
I think in a language with a lot of similar sounds or even homophones, longer words are easier. For a beginner Chinese speaker that knows both words, hearing "chē" will probably be ambiguous, but "chūzūchē" will be parsed immediately.
I don’t think the ‘longer equals harder’ pattern holds for every language. I actually reached out to the head teacher at CIJ when I first made this analysis and she said the same.
Many of the beginner videos make use of visual hints, as you say (images, props, etc.), and none of these were taken into account in my analysis.
I do think it could be cool to do a 'visual' analysis of CI in the future where you attempt to measure how much context is present (or not) in each video and see what insights you could draw from that.
I will note that the transcripts (and parsing scripts) are not included in the repo. The transcripts are not my intellectual property, so I can't share them (and the parsing scripts are a bit of a dumpster fire).
Avoiding unknown vocabulary, or including just a small amount that can be inferred from context; avoiding rare grammatical rules; avoiding stuffing too many clauses into sentences, keeping them short.
Just as a language has a large vocabulary of which only a subset is common, the same holds for its grammar rules. Some are used only in very formal or erudite speech and writing. Also, just as your active vocab is smaller than the vocab you understand, the same goes for grammar: you don't wield as many constructs as you understand.
Semantically, avoiding obscure cultural references and culturally rooted, non-obvious metaphors, figures of speech, or idioms.
Avoiding difficult topics. E.g. "I have a pen" vs. explaining Karl Popper's critique of logical positivism.
It's much easier to acquire the "household" dialect of a language than to be able to understand news about politics, scientific papers, or literary essays.
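For the nerdier readers: the vocabulary and sentence-length criteria above are the easy ones to put numbers on. Here is a minimal sketch, assuming you have a plain-text transcript and a set of words the learner already knows. The function name `ci_metrics` and the `known_words` set are hypothetical, made up for illustration, and this is not the parsing approach used in the analysis discussed in this thread. The tokenizer is deliberately crude (it splits on non-letters, so "don't" becomes two tokens) and it will not work for languages written without spaces, such as Chinese or Japanese, unless you add a word segmenter.

```python
import re

def ci_metrics(text, known_words):
    """Return (unknown_ratio, avg_sentence_length) for a transcript.

    unknown_ratio: fraction of tokens not found in known_words.
    avg_sentence_length: mean number of tokens per sentence.
    """
    # Crude sentence split on ., ! and ? (assumption: Latin-style punctuation).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Runs of Unicode letters, lowercased; digits and underscores are excluded.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens or not sentences:
        return 0.0, 0.0
    unknown = sum(1 for t in tokens if t not in known_words)
    return unknown / len(tokens), len(tokens) / len(sentences)

# Hypothetical learner vocabulary and a toy transcript.
known = {"i", "have", "a", "pen", "the", "is"}
ratio, avg_len = ci_metrics("I have a pen. The pen is red.", known)
print(ratio, avg_len)
```

A low unknown ratio and short average sentence length would only be rough proxies. They say nothing about grammar rarity, idioms, topic difficulty, or visual context, which are the harder parts to measure.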