137 points rezivor | 16 comments

Hey HN! Built this because I was tired of waiting hours for transcription services and didn't want to upload sensitive recordings to the cloud.

  Real metrics from my M1 Max: 4.5hr video file transcribed in 3 minutes 32
  seconds. Works completely offline.
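
For scale, the quoted figure works out to roughly 76x faster than real time. A quick check of that arithmetic, using only the two numbers stated above (the variable names are illustrative):

```python
# Speed relative to real time, from the figures quoted in the post.
audio_seconds = 4.5 * 3600      # 4.5-hour video file
wall_seconds = 3 * 60 + 32      # 3 minutes 32 seconds of processing

speedup = audio_seconds / wall_seconds
print(f"{speedup:.1f}x real time")
```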

  First 5 HN users who click the button on the page get it free. It's literally a promo code straight to the App Store.


  Key differences vs Rev/Otter:
  - No 2-hour file limits (handles any length)
  - Timecodes stay accurate on long files (no drift from chunking)
  - Supports MP3, WAV, MP4, MOV, M4A, FLAC
  - Exports to SRT, VTT, JSON, PDF, DOCX, CSV, Markdown
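
The post does not say how the app keeps timecodes accurate, so the following is only a minimal sketch of one common approach: keep each chunk's absolute start offset in the source file and add it to the chunk-local segment times before writing SRT. The function names `srt_timestamp` and `to_srt`, and the `(start, end, text)` segment shape, are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments, chunk_offset=0.0):
    """Render (start, end, text) segments as SRT cues.

    Times in `segments` are local to one chunk; `chunk_offset` is the
    chunk's absolute start in the source file (an assumed input).
    """
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(start + chunk_offset)} --> "
            f"{srt_timestamp(end + chunk_offset)}\n"
            f"{text}\n"
        )
    return "\n".join(cues)


# A segment 2.5 s into a chunk that starts one hour into the file.
print(to_srt([(0.0, 2.5, "hello")], chunk_offset=3600.0))
```

Because every cue is computed from an absolute offset rather than by accumulating chunk durations, a rounding error in one chunk does not carry over to later ones.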

  Built for macOS. Happy to answer questions!
1. nubg ◴[] No.45591597[source]
How does it compare to MacWhisper?
replies(1): >>45591639 #
2. rezivor ◴[] No.45591639[source]
MacWhisper crashes at about an hour of context. This uses smart, invisible regex in the text generation pipeline, which makes it fast. Bonus: there is no context limit.
replies(7): >>45591738 #>>45591977 #>>45592072 #>>45593177 #>>45594771 #>>45594799 #>>45595635 #
3. barapa ◴[] No.45591738[source]
Smart invisible regex makes it fast and prevents it from crashing? What does that mean?
4. grosswait ◴[] No.45591977[source]
I've done 3+ hours with MacWhisper without issue. One downside is that the transcription is not real time. Can Scriber Pro do realtime?
replies(1): >>45596835 #
5. fl_rn_st ◴[] No.45592072[source]
"Smart, invisible regex" sounds like a lot of bs... could you give a more technical explanation?

Also, the Whisper model doesn't really have a context window: it already segments the audio with a certain amount of overlap between the chunks. I really have a hard time understanding what you are trying to say here.
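
To make the segmentation point concrete, here is a rough sketch of fixed-size windows with overlap. Whisper works on 30-second windows; the 5-second overlap and the helper name `chunk_windows` are illustrative assumptions, not how Whisper or MacWhisper actually choose chunk boundaries.

```python
def chunk_windows(total_seconds, window=30.0, overlap=5.0):
    """Yield (start, end) windows that cover the audio with some overlap."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be in [0, window)")
    step = window - overlap
    start = 0.0
    while start < total_seconds:
        end = min(start + window, total_seconds)
        yield (start, end)
        if end >= total_seconds:
            break  # the last window reached the end of the audio
        start += step


# A 70-second clip is covered by three overlapping windows.
print(list(chunk_windows(70, window=30.0, overlap=5.0)))
```

With this kind of scheme each window is bounded in size, so a longer file means more windows rather than a larger one.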

replies(1): >>45592208 #
6. rezivor ◴[] No.45592208{3}[source]
Whisper will fail most of the time (edit: originally said > 99%) at lengths over 90 minutes, and fairly often at over one hour.
replies(4): >>45592339 #>>45592617 #>>45593059 #>>45595625 #
7. fl_rn_st ◴[] No.45592339{4}[source]
This is just plain wrong. I have my own Whisper App in the AppStore (on iOS, with very limited memory capacity) and there are no problems at all with longer Audio / Video files.
replies(1): >>45592579 #
8. rezivor ◴[] No.45592579{5}[source]
I've never had whisper complete a single attempt at anything over 75 min.
9. saaaaaam ◴[] No.45592617{4}[source]
This is absolutely not my experience. I regularly (weekly at least) use whisper for 90-120 minute pieces of content and only rarely have problems.
10. gcr ◴[] No.45593059{4}[source]
I’ve used whisper-cpp on 5-hour podcasts without problems.

Would also love to hear what you mean by “smart invisible regex,” sounds like AI slop to me.

11. gcr ◴[] No.45593177[source]
What do you mean context limit?

Neither whisper nor MacWhisper has any context limit.

12. CharlesW ◴[] No.45594771[source]
> MacWhisper crashes at about an hour of context.

This is not true. (I've been a MacWhisper user since 2023. I have hit two bugs during that time, which the author addressed quickly.)

13. fady0 ◴[] No.45594799[source]
I am a MacWhisper Pro user, and I successfully transcribed and translated a 15-hour course inside the app without any issues.
14. pmarreck ◴[] No.45595625{4}[source]
Can't really declare that without declaring which whisper model in particular you are referring to, as there are a number of them
15. pmarreck ◴[] No.45595635[source]
> Smart invisible regex

I've never heard a regex person speak this way of a regex.

Please tell me you didn't vibecode the regex... one of the areas it's still not good at

16. KPGv2 ◴[] No.45596835{3}[source]
I haven't worked in a while with transcription, but whisper.cpp itself (which I assume is the underlying tech behind MacWhisper) does realtime transcription on my MBP with an M1 Pro chip. When I first started writing my last completed novel, I fired it up and just started telling the story to test it out. Realtime.

That was back in 2023. I assume things work better now.