
507 points by martinald | 1 comment
WhitneyLand ◴[] No.45055133[source]
Model context limits are not “artificial” as claimed.

The largest context window a model can offer at a given quality level depends on the context length it was pretrained with, as well as on the specific fine-tuning techniques applied afterward.

It’s not simply a matter of considering increased costs.

replies(1): >>45055226 #
Der_Einzige ◴[] No.45055226[source]
Context extension methods exist and work. Please educate yourself about these rather than confidently saying wrong things.
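
For what it's worth, one concrete example of such a method is linear RoPE position interpolation (Chen et al., 2023). A minimal sketch, assuming a standard rotary-embedding setup; the head dimension and context lengths here are illustrative, not taken from any particular model:

    import torch

    def rope_angles(seq_len: int, head_dim: int, base: float = 10000.0,
                    scale: float = 1.0) -> torch.Tensor:
        # Rotation angles for RoPE. scale > 1 compresses positions so a
        # longer sequence maps back into the position range the model
        # saw during pretraining (linear position interpolation).
        inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
        positions = torch.arange(seq_len).float() / scale
        return torch.outer(positions, inv_freq)  # (seq_len, head_dim // 2)

    # Illustrative: a model pretrained at 4k context, extended to 16k by
    # interpolating positions (in practice followed by a short fine-tune).
    trained_ctx, target_ctx = 4096, 16384
    angles = rope_angles(target_ctx, head_dim=128, scale=target_ctx / trained_ctx)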
replies(1): >>45074750 #
WhitneyLand ◴[] No.45074750[source]
Not sure what you’re disagreeing with? Context window size limits are not artificial. It takes real time/money/resources to increase them.

There are a few ways to approach the problem, and I already mentioned two: pre-training on longer context lengths, and fine-tuning techniques like LongRoPE.

Inference-time context extension tricks I didn't mention because the papers I've seen suggest they often come with quality problems or unfavorable tradeoffs.
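
One widely cited training-free trick, for reference, is "NTK-aware" RoPE base rescaling. A minimal sketch using the commonly quoted heuristic exponent; the numbers are illustrative:

    def ntk_scaled_base(base: float, head_dim: int, scale: float) -> float:
        # "NTK-aware" RoPE scaling: instead of compressing positions,
        # enlarge the frequency base so low-frequency dimensions
        # interpolate while high-frequency ones stay near pretrained behavior.
        return base * scale ** (head_dim / (head_dim - 2))

    # Illustrative: extend a model's context 4x at inference time.
    new_base = ntk_scaled_base(10000.0, head_dim=128, scale=4.0)  # ~40890

Even a simple rescaling like this tends to show the quality degradation I'm talking about once the extension factor gets large.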

There’s no magic way around these limits; it’s a real engineering problem.