Pricing is CRAZY.
Audio input is $0.70 per million tokens on Gemini 2.0 Flash, and $0.075 per million on 2.0 Flash-Lite and 1.5 Flash.
For gpt-4o-mini-audio-preview, it's $10 per million tokens of audio input.
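To put the gap in numbers, here is a rough sketch of the per-token comparison, using only the prices quoted above. The dictionary keys are illustrative labels I made up for this example, not necessarily the exact API model IDs, and the helper names are hypothetical. It compares price per token only; providers may tokenize a second of audio differently, so this is not a per-minute comparison.

```python
# USD per million audio-input tokens, as quoted in this comment.
# Keys are illustrative labels (assumption), not verified API model IDs.
PRICE_PER_MILLION = {
    "gemini-2.0-flash": 0.70,
    "gemini-2.0-flash-lite": 0.075,
    "gemini-1.5-flash": 0.075,
    "gpt-4o-mini-audio-preview": 10.00,
}


def cost_usd(model: str, tokens: int) -> float:
    """Cost in USD of sending `tokens` audio-input tokens to `model`."""
    return PRICE_PER_MILLION[model] * tokens / 1_000_000


def price_ratio(expensive: str, cheap: str) -> float:
    """How many times more `expensive` costs per token than `cheap`."""
    return PRICE_PER_MILLION[expensive] / PRICE_PER_MILLION[cheap]


if __name__ == "__main__":
    baseline = "gpt-4o-mini-audio-preview"
    for model in PRICE_PER_MILLION:
        if model != baseline:
            ratio = price_ratio(baseline, model)
            print(f"{baseline} costs {ratio:.1f}x more per token than {model}")
```

On these quoted prices that works out to roughly 14.3x versus 2.0 Flash and roughly 133.3x versus Flash-Lite and 1.5 Flash.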
replies(2):
https://ai.google.dev/gemini-api/docs/audio?lang=rest#techni...