
167 points xnx | 4 comments
tripplyons No.44527652
For those who aren't aware, OpenAI has a very similar batch mode (50% discount if you wait up to 24 hours): https://platform.openai.com/docs/api-reference/batch

It's nice to see competition in this space. AI is getting cheaper and cheaper!
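For the curious, a minimal sketch of what submitting a job to OpenAI's batch endpoint looks like, assuming the official `openai` Python SDK (v1.x). The network calls are commented out since they need a real API key; the JSONL line format shown is the documented one for `/v1/chat/completions` batch requests, and the model name is just a placeholder.

```python
import json

def batch_line(custom_id: str, prompt: str, model: str = "gpt-4o-mini") -> dict:
    """One line of the batch input JSONL: a tagged chat-completion request."""
    return {
        "custom_id": custom_id,  # your own ID, echoed back in the output file
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {"model": model,
                 "messages": [{"role": "user", "content": prompt}]},
    }

# Build the JSONL payload locally.
lines = [batch_line(f"req-{i}", p) for i, p in enumerate(["Hello", "World"])]
jsonl = "\n".join(json.dumps(line) for line in lines)

# Then (requires an API key; shapes per the Batch API docs):
# from openai import OpenAI
# client = OpenAI()
# f = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")
# batch = client.batches.create(input_file_id=f.id,
#                               endpoint="/v1/chat/completions",
#                               completion_window="24h")  # the 24h window is the 50%-off tier
```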

replies(4): >>44528108 #>>44528444 #>>44528451 #>>44532342 #
1. fantispug No.44528108
Yes, this seems to be a common capability - Anthropic and Mistral have something very similar as do resellers like AWS Bedrock.

I guess it lets them better utilise their hardware in quiet times throughout the day. It's interesting they all picked 50% discount.

replies(3): >>44528237 #>>44529423 #>>44532883 #
2. qrian No.44528237
Bedrock has a batch mode, but only for Claude 3.5, which is about a year old and so not very useful.
3. calaphos No.44529423
Inference throughput scales really well with larger batch sizes (at the cost of latency): arithmetic intensity rises with batch size, and decoding is almost always memory-bandwidth limited.
4. briangriffinfan No.44532883
50% is my personal threshold for a discount going from not worth it to worth it.