
612 points meetpateltech | 2 comments
Ninjinka ◴[] No.42951897[source]
Pricing is CRAZY.

Audio input is $0.70 per million tokens on 2.0 Flash, $0.075 for 2.0 Flash-Lite and 1.5 Flash.

For gpt-4o-mini-audio-preview, it's $10 per million tokens of audio input.
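
To put numbers on the gap, here is a rough sketch that computes the per-token price ratios from the figures quoted above. The dictionary keys are just labels for this example, the prices are the ones stated in this comment rather than anything fetched from a pricing page, and the helper names are made up for illustration. It compares price per token only; providers may tokenize a second of audio differently, so this is not a per-minute comparison.

```python
# USD per million audio-input tokens, as quoted in the comment above.
# Keys are illustrative labels, not verified API model identifiers.
PRICES = {
    "gemini-2.0-flash": 0.70,
    "gemini-2.0-flash-lite": 0.075,
    "gemini-1.5-flash": 0.075,
    "gpt-4o-mini-audio-preview": 10.00,
}


def price_ratio(expensive: str, cheap: str) -> float:
    """Return how many times more `expensive` costs per token than `cheap`."""
    return PRICES[expensive] / PRICES[cheap]


def cost_usd(model: str, tokens: int) -> float:
    """Return the cost in USD of `tokens` audio-input tokens for `model`."""
    return PRICES[model] * tokens / 1_000_000


if __name__ == "__main__":
    baseline = "gpt-4o-mini-audio-preview"
    for model in PRICES:
        if model != baseline:
            ratio = price_ratio(baseline, model)
            print(f"{baseline} costs {ratio:.1f}x as much per token as {model}")
```

By these figures, the gap between the cheapest and most expensive options is more than two orders of magnitude per token.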

replies(2): >>42952141 #>>42952542 #
1. KTibow ◴[] No.42952542[source]
The increase is likely because 1.5 Flash was actually cheaper than all other STT services. I wrote about this a while ago at https://ktibow.github.io/blog/geminiaudio/.
replies(1): >>42953271 #
2. radeeyate ◴[] No.42953271[source]
I feel that the audio capabilities of the Gemini models go beyond plain STT. If you give one of them something like a song, it can give you information about it.