
167 points | xnx | 1 comment
tripplyons No.44527652
For those who aren't aware, OpenAI has a very similar batch mode (50% discount if you wait up to 24 hours): https://platform.openai.com/docs/api-reference/batch

It's nice to see competition in this space. AI is getting cheaper and cheaper!
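As a concrete illustration of the batch mode mentioned above: OpenAI's Batch API takes a JSONL file with one request object per line, each carrying a `custom_id` for matching results later. The sketch below only builds that payload locally (model name and prompts are placeholders); actually submitting it requires uploading the file and creating a batch via the API, which is omitted here.

```python
import json

# Placeholder prompts; in practice these would be your real workload.
prompts = ["Summarize photosynthesis.", "Explain TCP slow start."]

lines = []
for i, prompt in enumerate(prompts):
    lines.append(json.dumps({
        "custom_id": f"request-{i}",       # caller-chosen ID for matching results
        "method": "POST",
        "url": "/v1/chat/completions",     # endpoint the batched requests target
        "body": {
            "model": "gpt-4o-mini",        # placeholder model name
            "messages": [{"role": "user", "content": prompt}],
        },
    }))

# One request per line, as the Batch API expects.
batch_jsonl = "\n".join(lines)
print(len(batch_jsonl.splitlines()))
```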

replies(4): >>44528108 >>44528444 >>44528451 >>44532342
fantispug No.44528108
Yes, this seems to be a common capability - Anthropic and Mistral have something very similar as do resellers like AWS Bedrock.

I guess it lets them better utilise their hardware during quiet times throughout the day. It's interesting that they all picked the same 50% discount.

replies(3): >>44528237 >>44529423 >>44532883
calaphos No.44529423
Inference throughput scales really well with larger batch sizes (at the cost of latency), because arithmetic intensity rises with batch size and decoding is almost always memory-bandwidth limited.
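The batching argument can be made concrete with a toy roofline calculation: for one fp16 weight matrix of shape (d, d), a batch of b token-vectors performs about 2·b·d² FLOPs while reading roughly the same 2·d² bytes of weights once, so FLOPs per byte moved grow almost linearly with b until activations or the compute roofline dominate. The numbers below are illustrative, not measured on any real GPU.

```python
def arithmetic_intensity(batch: int, d: int = 4096) -> float:
    """Toy FLOPs-per-byte estimate for a batched (batch, d) x (d, d) matmul."""
    flops = 2 * batch * d * d       # multiply-accumulate FLOPs for the batch
    weight_bytes = 2 * d * d        # fp16 weights, read once per batch
    act_bytes = 2 * 2 * batch * d   # fp16 activations, input and output
    return flops / (weight_bytes + act_bytes)

# Intensity climbs nearly linearly with batch size while weights dominate traffic.
for b in (1, 8, 64):
    print(b, round(arithmetic_intensity(b), 2))
```

At batch size 1 the intensity is about 1 FLOP per byte (pure weight streaming), which is why single-stream decoding leaves the GPU's compute units mostly idle and why providers can sell off-peak batched capacity cheaply.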