456 points ph4evers | 2 comments

I've been working on a little side project that combines Duolingo-like listening comprehension exercises with real content.

Every video is transcribed, which gives much better transcripts than the closed captions. I keep only the high-quality transcripts, and afterwards an LLM selects only plausible segments for the exercises. This seems to work well for quality control and to be reliable enough for these short exercises.
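
For readers curious what a two-stage filter like this could look like, here is a minimal sketch. It is not the project's actual code. The `Segment` type, the confidence score, the thresholds, the word-count limits, and the `is_plausible` callback (a stand-in for the LLM check) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    text: str
    confidence: float  # hypothetical transcription confidence in [0, 1]


def mean_confidence(segments: List[Segment]) -> float:
    """Average confidence over a transcript; 0.0 for an empty transcript."""
    if not segments:
        return 0.0
    return sum(s.confidence for s in segments) / len(segments)


def select_exercise_segments(
    segments: List[Segment],
    is_plausible: Callable[[str], bool],  # stand-in for the LLM judgment
    min_transcript_confidence: float = 0.9,  # assumed threshold
    min_words: int = 3,  # assumed lower bound for a "short exercise"
    max_words: int = 12,  # assumed upper bound for a "short exercise"
) -> List[Segment]:
    # Stage 1: reject the whole transcript if its overall quality is too low.
    if mean_confidence(segments) < min_transcript_confidence:
        return []
    # Stage 2: keep only short segments that the plausibility check accepts.
    return [
        s
        for s in segments
        if min_words <= len(s.text.split()) <= max_words and is_plausible(s.text)
    ]
```

In practice `is_plausible` would wrap a call to whichever LLM is used. Keeping it as a plain callback means the filtering logic can be tested without a model.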

Would love your thoughts!

1. Miraltar ◴[] No.43544697[source]
That's neat! I did run into an issue on the Finnish challenge, though: when I drag the (correct) word "koho", it turns into the (incorrect) word "koko". I thought I had misclicked and tried the whole challenge again, but I reproduced it despite being very careful.
replies(1): >>43545191 #
2. ph4evers ◴[] No.43545191[source]
Thanks for trying, and sorry about that. I thought the videos for Finnish were at a decent enough level (I only checked one). I'm afraid the transcription quality for Finnish is not on par yet. I'll add a warning for the smaller languages, and hopefully the models will improve.