
321 points jhunter1016 | 2 comments | | HN request time: 0s | source
Roark66 ◴[] No.41878594[source]
>OpenAI plans to lose $5 billion this year

Let that sink in for anyone who has incorporated ChatGPT into their work routine to the point where their normal skills start to atrophy. Imagine in two years' time OpenAI goes bust and MS gets all the IP. Now you can't really do your work without ChatGPT, but its cost has been raised to what it really costs to run. Maybe $2k per month per person? And you get about 1h of use per day for that money, too...

I've been saying for ages that being a luddite and abstaining from using AI is not the answer (no one is tilling the fields with oxen anymore either). But it is crucial to at the very least retain locally 50% of the capability that hosted models like ChatGPT offer.

replies(20): >>41878631 #>>41878635 #>>41878683 #>>41878699 #>>41878717 #>>41878719 #>>41878725 #>>41878727 #>>41878813 #>>41878824 #>>41878984 #>>41880860 #>>41880934 #>>41881556 #>>41881938 #>>41882059 #>>41883046 #>>41883088 #>>41883171 #>>41885425 #
sebzim4500 ◴[] No.41878719[source]
The marginal cost of inference per token is lower than what OpenAI charges you (IIRC about half the price); they make a loss because of the enormous costs of R&D and training new models.
replies(4): >>41878823 #>>41878875 #>>41878927 #>>41879029 #
1. diggan ◴[] No.41878823[source]
Did OpenAI publish concrete numbers regarding this, or where are you getting this data from?
replies(1): >>41881067 #
2. lukeschlather ◴[] No.41881067[source]
https://news.ycombinator.com/item?id=41833287

This says 506 tokens/second for Llama 405B on a machine with 8x H200s, which you can rent for about $4/GPU-hour, so probably $40/hour for a server with enough GPUs. At that rate it can do ~1.8M tokens per hour. OpenAI charges $10/1M output tokens for GPT-4o. (Input tokens and cached tokens are cheaper, but these are just ballpark estimates.) So if GPT-4o were 405B, it might cost ~$20/1M output tokens to serve.
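The back-of-envelope math above can be sketched as a few lines of arithmetic. The throughput and rental figures are the comment's own assumptions (506 tokens/s on an 8x H200 node, padded to roughly $40/hour for the whole server), not measured numbers:

```python
# Ballpark inference cost for a 405B model, per the figures in the comment.
tokens_per_second = 506          # claimed Llama 405B throughput on 8x H200
node_cost_per_hour = 40.0        # ~$4/GPU-hour x 8, rounded up for the full server

tokens_per_hour = tokens_per_second * 3600
cost_per_million = node_cost_per_hour / (tokens_per_hour / 1e6)

print(f"{tokens_per_hour / 1e6:.2f}M tokens/hour")      # 1.82M tokens/hour
print(f"${cost_per_million:.2f} per 1M output tokens")  # $21.96 per 1M output tokens
```

So the ~$20/1M figure is just hourly server cost divided by hourly token output, rounded down.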

Now, OpenAI is a little vague, but they have implied that GPT-4o is actually only 60B-80B parameters. So they're probably selling it with a reasonable profit margin, assuming it can be served for about $5/1M output tokens at approximately 100B parameters.

And even if they were selling it at cost, I wouldn't be worried, because a couple of years from now Nvidia will release H300s that are at least 30% more efficient, and that will cause a profit margin to materialize without raising prices. So if I have a use case that works with today's models, I will be able to rent the same thing a year or two from now for roughly the same price.