So, if anybody else is frustrated and not finding anything online about this, here are a few things I learned, specifically for structured output generation (which is a main use case for batching) - the individual request JSON should resolve to this:
```json { "request": { "contents": [ { "parts": [ { "text": "Give me the main output please" } ] } ], "system_instruction": { "parts": [ { "text": "You are a main output maker." } ] }, "generation_config": { "response_mime_type": "application/json", "response_json_schema": { "type": "object", "properties": { "output1": { "type": "string" }, "output2": { "type": "string" } }, "required": [ "output1", "output2" ] } } }, "metadata": { "key": "my_id" } } ```
To get actual structured output, don't just do `generation_config.response_schema`, you need to include the mime-type, and the key should be `response_json_schema`. Any other combination will either throw opaque errors or won't trigger Structured Output (and will contain the usual LLM intros "I'm happy to do this for you...").
So you upload a .jsonl file with the above JSON, and then you try to submit it for a batch job. If something is wrong with your file, you'll get a "400" and no other info. If something is wrong with the request submission you'll get a 400 with "Invalid JSON payload received. Unknown name \"file_name\" at 'batch.input_config.requests': Cannot find field."
I got the above error endless times when trying their exact sample code: ``` BATCH_INPUT_FILE='files/123456' # File ID curl https://generativelanguage.googleapis.com/v1beta/models/gemi... \ -X POST \ -H "x-goog-api-key: $GEMINI_API_KEY" \ -H "Content-Type:application/json" \ -d "{ 'batch': { 'display_name': 'my-batch-requests', 'input_config': { 'requests': { 'file_name': ${BATCH_INPUT_FILE} } } } }" ```
Finally got the job submission working via the python api (`file_batch_job = client.batches.create()`), but remember, if something is wrong with the file you're submitting, they won't tell you what, or how.