
456 points ph4evers | 1 comment

I've been working on a little side project that combines Duolingo-like listening comprehension exercises with real content.

Every video is transcribed to get much better transcripts than the closed captions. I filter for high-quality transcripts, and afterwards an LLM selects only plausible segments for the exercises. This seems to work well for quality control and to be reliable enough for these short exercises.
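
A minimal sketch of that two-stage filter, assuming each transcribed segment carries some quality score. The `Segment` fields, the thresholds, and `toy_judge` are all hypothetical: the post does not say how transcripts are scored or which LLM is used, so the LLM step is represented by a pluggable callable.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    text: str
    start: float       # seconds
    end: float         # seconds
    confidence: float  # hypothetical transcript-quality score in [0, 1]


def filter_segments(
    segments: List[Segment],
    is_plausible: Callable[[str], bool],
    min_confidence: float = 0.9,
    min_words: int = 4,
    max_words: int = 15,
) -> List[Segment]:
    """Keep short, high-confidence segments that the judge accepts."""
    kept = []
    for seg in segments:
        n_words = len(seg.text.split())
        if seg.confidence < min_confidence:
            continue  # stage 1: drop low-quality transcript segments
        if not (min_words <= n_words <= max_words):
            continue  # keep exercises short
        if is_plausible(seg.text):  # stage 2: stand-in for the LLM check
            kept.append(seg)
    return kept


def toy_judge(text: str) -> bool:
    # Placeholder for an LLM call (assumption): accept only text that
    # ends like a complete sentence.
    return text.strip().endswith((".", "?", "!"))


segments = [
    Segment("Ik ga morgen naar de markt.", 0.0, 2.1, 0.97),
    Segment("eh dus ja", 2.1, 2.9, 0.95),
    Segment("Waar kom je vandaan?", 3.0, 4.2, 0.55),
    Segment("Wil je een kopje koffie drinken?", 4.5, 6.0, 0.93),
]
selected = filter_segments(segments, toy_judge)
print([s.text for s in selected])
```

Running the cheap score and length checks first means the (presumably slower and costlier) LLM judgment is only spent on segments that already pass the basic filter.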

Would love your thoughts!

JimmyBuckets ◴[] No.43544136[source]
Awesome idea! Do you plan to add Portuguese soon? I found it surprising that Dutch is in there before it, given that Dutch has far fewer speakers. Was this related to the amount of content available?
replies(1): >>43544295 #
ph4evers ◴[] No.43544295[source]
Thanks! And yes, I'll add it soon. I'm Dutch, so I could validate the videos.

> Was this related to the amount of content available?

Yes, Portuguese is available in the app, but I only transcribe the Easy Portuguese videos for now, so I don't have a lot of content available at the moment.

replies(1): >>43544342 #
rlf_dev ◴[] No.43544342[source]
I checked the Portuguese content available, and you should clarify that it's Brazilian Portuguese (and change the flag to the Brazilian one so it isn't misleading).
replies(1): >>43544379 #
ph4evers ◴[] No.43544379[source]
Yes, will do that. Thanks for the suggestion!