
470 points ph4evers | 1 comment

I've been working on a little side project that combines Duolingo-like listening comprehension exercises with real content.

Every video is transcribed to get much better transcripts than the closed captions. I filter for high-quality transcripts, and afterwards an LLM selects only plausible segments for the exercises. This seems to work well for quality control and seems reliable enough for these short exercises.
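Roughly, the two-stage filter could be sketched like this. It is only an illustration, not the actual code: the `Transcript` type, the `quality` score, the `min_quality` threshold, and the `is_plausible` callback (a stand-in for the LLM call) are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Transcript:
    video_id: str
    quality: float       # hypothetical transcript quality score in [0, 1]
    segments: List[str]  # candidate exercise segments from the transcript


def select_exercise_segments(
    transcripts: List[Transcript],
    is_plausible: Callable[[str], bool],  # stand-in for the LLM check
    min_quality: float = 0.9,             # assumed threshold, not from the project
) -> List[Tuple[str, str]]:
    """Return (video_id, segment) pairs that pass both filter stages."""
    selected = []
    for transcript in transcripts:
        # Stage 1: drop whole transcripts below the quality threshold.
        if transcript.quality < min_quality:
            continue
        # Stage 2: keep only the segments the plausibility check accepts.
        for text in transcript.segments:
            if is_plausible(text):
                selected.append((transcript.video_id, text))
    return selected
```

Passing the plausibility check in as a function keeps the sketch independent of any particular LLM API; a simple heuristic can stand in for it when testing.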

Would love your thoughts!

1. mitthrowaway2 No.43550708

I tried Japanese; the YouTube video that autoplayed had its timing slightly off, so instead of あたまも疲れました I only heard まも疲れました. It was pretty confusing, but fortunately the answer was displayed right in the video because the video itself had its text spelled out.

https://app.fluentsubs.com/exercises/cm8v909oq00fj9x1kztl1ez...