
167 points xnx | 6 comments
1. anupj No.44530084
Batch Mode for the Gemini API feels like Google’s way of asking, “What if we made AI more affordable and slower, but at massive scale?” Now you can process 10,000 prompts like “Summarize each customer review in one line” for half the cost, provided you’re willing to wait until tomorrow for the results.
replies(4): >>44530624 >>44531272 >>44533342 >>44534982
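
The workflow being described, concretely: write one request per line to a JSONL file, upload it, create a batch job at half the interactive price, then poll until results land (Google targets a 24-hour turnaround). Below is a minimal, untested sketch against the google-genai Python SDK; the exact method names, JSONL schema, and model name are assumptions based on the launch docs, not a verified implementation.

    import json
    import time

    from google import genai

    client = genai.Client()  # reads GEMINI_API_KEY from the environment

    # 1. One GenerateContent request per line, keyed so results can be
    #    matched back to their inputs later.
    reviews = ["Great blender, a bit loud.", "Arrived broken, support was slow."]
    with open("batch_requests.jsonl", "w") as f:
        for i, review in enumerate(reviews):
            f.write(json.dumps({
                "key": f"review-{i}",
                "request": {"contents": [{"parts": [
                    {"text": "Summarize this customer review in one line: " + review}
                ]}]},
            }) + "\n")

    # 2. Upload the request file and create the batch job.
    uploaded = client.files.upload(file="batch_requests.jsonl")
    job = client.batches.create(model="gemini-2.5-flash", src=uploaded.name)

    # 3. Poll until the job settles (may take up to 24 hours), then
    #    download the results JSONL.
    while job.state.name not in ("JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED"):
        time.sleep(300)
        job = client.batches.get(name=job.name)

    if job.state.name == "JOB_STATE_SUCCEEDED":
        print(client.files.download(file=job.dest.file_name).decode("utf-8"))
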
2. dist-epoch No.44530624
Most LLM providers have batch mode. Not sure why you are calling them out.
replies(1): >>44534996
3. diggan No.44531272
> Now you can process 10,000 prompts like “Summarize each customer review in one line” for half the cost, provided you’re willing to wait until tomorrow for the results.

Sounds like a great option to have available? Not every task I use LLMs for needs an immediate response, and if I weren't using local models for those things, a 50% discount in exchange for waiting up to a day sounds like a fine tradeoff.

4. XTXinverseXTY No.44533342
This is an extremely common use case.

Reading your comment history: are you an LLM?

https://news.ycombinator.com/item?id=44531907

https://news.ycombinator.com/item?id=44531868

5. okdood64 No.44534982
I don't understand the point you're making. This has been a commonly used offering since cloud blew up.

https://aws.amazon.com/ec2/spot/
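
The analogy: spot capacity is spare compute sold at a steep discount in exchange for interruptibility, the same latency-for-price trade batch inference makes. A hedged boto3 sketch of requesting it (the AMI ID is a placeholder, and instance type and sizing are arbitrary):

    import boto3

    ec2 = boto3.client("ec2")

    # Ask for discounted, interruptible spot capacity instead of on-demand.
    resp = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # placeholder AMI
        InstanceType="m5.large",
        MinCount=1,
        MaxCount=1,
        InstanceMarketOptions={
            "MarketType": "spot",
            "SpotOptions": {"SpotInstanceType": "one-time"},
        },
    )
    print(resp["Instances"][0]["InstanceId"])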

6. okdood64 No.44534996
I'll take it further: regular cloud compute has had batch workload capabilities at cheaper rates since forever, too.