
456 points ph4evers | 2 comments

I've been working on a little side project that combines Duolingo-like listening comprehension exercises with real content.

Every video is transcribed to get much better transcripts than the closed captions. I filter for high-quality transcripts, and afterwards an LLM selects only plausible segments for the exercises. This seems to work well for quality control and to be reliable enough for these short exercises.
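
To make the filtering step concrete, here is a rough Python sketch of that kind of two-stage selection. It is not the project's actual code: the `Segment` type, the `confidence` score, the thresholds, and the `is_plausible` callback (a stand-in for the LLM check) are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    text: str
    confidence: float  # hypothetical transcript quality score in [0, 1]


def select_segments(
    segments: List[Segment],
    is_plausible: Callable[[str], bool],  # stand-in for the LLM judgement
    min_confidence: float = 0.9,  # assumed threshold, not from the post
    min_words: int = 3,
    max_words: int = 12,
) -> List[Segment]:
    """Keep short, high-confidence segments that the plausibility check accepts."""
    selected = []
    for seg in segments:
        n_words = len(seg.text.split())
        if seg.confidence < min_confidence:
            continue  # stage 1: drop low-quality transcript segments
        if not (min_words <= n_words <= max_words):
            continue  # keep exercises short enough to type out
        if is_plausible(seg.text):
            selected.append(seg)  # stage 2: the (stubbed) LLM check passed
    return selected


def dummy_plausible(text: str) -> bool:
    # Toy replacement for the LLM: accept only text that ends like a sentence.
    return text.strip().endswith((".", "?", "!"))


candidates = [
    Segment("Ik ga morgen naar de markt.", 0.97),
    Segment("uh dus ja", 0.95),
    Segment("Het regent vandaag heel hard.", 0.50),
    Segment("Waar is het station", 0.99),
]
kept = select_segments(candidates, dummy_plausible)
```

Running the cheap confidence and length checks first means the expensive LLM call only sees segments that already passed them.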

Would love your thoughts!

1. Vinnl No.43544186
I just tried the Dutch (my native language) version, and it looks neat, but at some point it asked me to type Emmeloord, which is a small town in the Netherlands. That would be very challenging for someone learning the language without being relatively familiar with the Netherlands, so maybe you can tell the LLM to avoid names?
2. ph4evers No.43544330
Hah, thanks for the suggestion. I'll make it stricter!
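
For what it's worth, one cheap way to catch cases like Emmeloord before (or in addition to) asking the LLM would be a capitalization heuristic. The sketch below is only an illustration under stated assumptions: `contains_probable_name` is a hypothetical helper, it flags any capitalized word that is not the first word of a sentence, it misses names at the start of a sentence, and it would not work for languages like German where all nouns are capitalized.

```python
import re


def contains_probable_name(text: str) -> bool:
    """Heuristic: a capitalized word that is not sentence-initial is probably a name."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for sentence in sentences:
        # Words are runs of letters, optionally joined by an apostrophe or hyphen.
        words = re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", sentence)
        for word in words[1:]:  # skip the first word, which is always capitalized
            if word[0].isupper():
                return True
    return False


flagged = contains_probable_name("Ik woon in Emmeloord.")
clean = contains_probable_name("Ik ga morgen naar de markt. Het regent.")
```

Segments that get flagged could simply be skipped, or passed to the LLM with an explicit instruction to reject place and person names.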