
566 points PaulHoule | 5 comments
JimDabell ◴[] No.44490396[source]
Pricing:

US$0.000001 per output token ($1/M tokens)

US$0.00000025 per input token ($0.25/M tokens)

https://platform.inceptionlabs.ai/docs#models
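
A quick sketch of what those listed rates work out to per request (the token counts below are made-up examples, not anything from the docs):

```python
# Mercury rates quoted above: $0.25/M input tokens, $1/M output tokens.
INPUT_RATE = 0.25 / 1_000_000   # USD per input token
OUTPUT_RATE = 1.00 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single request at the listed rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# e.g. a 2,000-token prompt with a 500-token completion:
print(f"${request_cost(2_000, 500):.6f}")  # $0.001000
```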

replies(1): >>44490656 #
asaddhamani ◴[] No.44490656[source]
The pricing is a little on the high side. For a performance-sensitive application I'm working on, I tried Mercury and Groq (Llama 3.1 8B, Llama 4 Scout); the performance was neck-and-neck, but the pricing was way better for Groq.

But I'll be following diffusion models closely, and I hope we get some good open source ones soon. Excited about their potential.

replies(1): >>44492609 #
1. tripplyons ◴[] No.44492609[source]
Good to know. I didn't realize how good the pricing is on Groq!
replies(2): >>44493536 #>>44497444 #
2. tlack ◴[] No.44493536[source]
If your application is price-sensitive, check out DeepInfra.com - they have a variety of models in the pennies-per-million-tokens range. Not quite as fast as Mercury, Groq, or SambaNova, though.

(I have no affiliation with this company aside from being a happy customer the last few years)

replies(1): >>44498128 #
3. sexeriy237 ◴[] No.44497444[source]
You're getting the savings by shifting the pollution of the datacenter onto a largely black community and choking them out.
replies(1): >>44498135 #
4. asaddhamani ◴[] No.44498128[source]
DeepInfra is amazing in terms of price, really: they have the Qwen3 embedding model for $0.002 per M tokens. That's an order of magnitude cheaper than most alternatives, with better benchmark scores. But the P99 latency is high and the variance is huge, which is problematic for latency-sensitive workloads; if they can fix that, using them will be a no-brainer. DeepInfra does tend to have the lowest prices of any API provider.
5. JimDabell ◴[] No.44498135[source]
Are you confusing the AI company Groq with xAI, Elon Musk’s AI company that has a model called Grok?