(george.mand.is)

740 points georgemandis | 5 comments | 25 Jun 25 13:17 UTC

georgemandis [25 Jun 25 13:17 UTC] No.44376990

I was trying to summarize a 40-minute talk with OpenAI’s transcription API, but it was too long. So I sped it up with ffmpeg to fit within the 25-minute cap. It worked quite well (at up to 3x speed) and was cheaper and faster, so I wrote about it.

Felt like a fun trick worth sharing. There’s a full script and cost breakdown.
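
For readers who want the gist without opening the write-up, here is a minimal sketch of the idea. It is not the script from the post. It assumes the 25-minute cap mentioned above, uses ffmpeg's `atempo` audio filter, and uses made-up file names (`talk.m4a`, `talk_fast.m4a`). The helper names are hypothetical. It chains `atempo` stages for factors above 2.0 because some ffmpeg builds limit a single stage to the 0.5 to 2.0 range.

```python
def required_speed(duration_min: float, cap_min: float) -> float:
    """Smallest speedup factor that brings duration_min down to cap_min."""
    if cap_min <= 0:
        raise ValueError("cap_min must be positive")
    # Never slow the audio down: 1.0 means "already short enough".
    return max(1.0, duration_min / cap_min)


def atempo_chain(speed: float) -> str:
    """Build an ffmpeg audio filter string for a speedup of `speed`.

    Factors above 2.0 are split into chained stages (2.0 * 2.0 * ... * rest)
    so the command also works on builds that cap one atempo stage at 2.0.
    """
    if speed < 1.0:
        raise ValueError("this sketch only handles speedups (speed >= 1.0)")
    stages = []
    remaining = speed
    while remaining > 2.0:
        stages.append(2.0)
        remaining /= 2.0
    stages.append(remaining)
    return ",".join(f"atempo={s:g}" for s in stages)


def ffmpeg_command(src: str, dst: str, speed: float) -> list[str]:
    """Argument list suitable for subprocess.run(); nothing is executed here."""
    return ["ffmpeg", "-i", src, "-filter:a", atempo_chain(speed), dst]


if __name__ == "__main__":
    # A 40-minute talk against a 25-minute cap needs at least 1.6x.
    speed = required_speed(40, 25)
    print(" ".join(ffmpeg_command("talk.m4a", "talk_fast.m4a", speed)))
```

`required_speed` only gives the minimum factor needed to fit under the cap. Going faster than that (the post reports good results up to 3x) trades transcription accuracy for cost, so it is worth spot-checking the output.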

replies(1): >>44378167 #

bravesoul2 [25 Jun 25 15:03 UTC] No.44378167

>>44376990 #

You could have kept quiet and started a cheaper-than-OpenAI transcription business :)

replies(4): >>44378890 #>>44379081 #>>44379840 #>>44380550 #

1. ilyakaminsky [25 Jun 25 18:39 UTC] No.44380550

>>44378167 #

I've already done that [1]. A fraction of the price, 24-hour limit per file, and speedup tricks like the OP's are welcome. :)

[1] https://speechischeap.com

replies(2): >>44382158 #>>44393218 #

2. bravesoul2 [25 Jun 25 21:55 UTC] No.44382158

>>44380550 (TP) #

Nice. I don't expect you to spill the beans, but is it doing OK (some customers)?

Just wondering if I can build a retirement out of APIs :)

replies(1): >>44384932 #

3. ilyakaminsky [26 Jun 25 07:04 UTC] No.44384932

>>44382158 #

It's sustainable, but not enough to retire on at this point.

> Just wondering if I can build a retirement out of APIs :)

I think it's possible, but you need to find a way to add value beyond the commodity itself (e.g., audio classification and speaker diarization in my case).

4. satvikpendem [27 Jun 25 02:21 UTC] No.44393218

>>44380550 (TP) #

Can it do real-time transcription with diarization? I'm looking for that for a product feature I'm working on. So far I've seen Speechmatics do this well, but I haven't heard of others.

replies(1): >>44501440 #

5. ilyakaminsky [08 Jul 25 16:16 UTC] No.44501440

>>44393218 #

Not yet. The gains in efficiency come from optimizing the speedup factor. Real-time audio cannot be processed any faster than 1× by definition.
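
To make the arithmetic behind that concrete, here is a rough sketch under simple assumptions: billing is strictly per minute of submitted audio, and the function names are hypothetical rather than part of any real API. With per-minute billing, sped-up audio is billed for duration / speed minutes, so the saving is 1 - 1/speed. A live stream is stuck at speed = 1, which means no saving.

```python
def billed_minutes(duration_min: float, speed: float) -> float:
    """Minutes of audio actually submitted after speeding up by `speed`."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return duration_min / speed


def savings_fraction(speed: float) -> float:
    """Fraction of the per-minute bill saved relative to 1x playback."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return 1.0 - 1.0 / speed


if __name__ == "__main__":
    for speed in (1.0, 2.0, 3.0):
        minutes = billed_minutes(40, speed)
        saved = savings_fraction(speed)
        print(f"{speed:g}x: {minutes:.1f} billed minutes, {saved:.0%} saved")
```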

OpenAI charges by the minute, so speed up your audio