    321 points jhunter1016 | 20 comments

    Roark66 ◴[] No.41878594[source]
    >OpenAI plans to lose $5 billion this year

    Let that sink in for anyone who has incorporated ChatGPT into their work routines to the point that their normal skills start to atrophy. Imagine in two years' time OpenAI goes bust and MS gets all the IP. Now you can't really do your work without ChatGPT, but its price has been raised to reflect what it really costs to run. Maybe $2k per month per person? And you get about 1h of use per day for the money too...

    I've been saying for ages that being a Luddite and abstaining from using AI is not the answer (no one is tilling the fields with oxen anymore either). But it is crucial to at the very least retain, locally, 50% of the capability that hosted models like ChatGPT offer.

    replies(20): >>41878631 #>>41878635 #>>41878683 #>>41878699 #>>41878717 #>>41878719 #>>41878725 #>>41878727 #>>41878813 #>>41878824 #>>41878984 #>>41880860 #>>41880934 #>>41881556 #>>41881938 #>>41882059 #>>41883046 #>>41883088 #>>41883171 #>>41885425 #
    1. sebzim4500 ◴[] No.41878719[source]
    The marginal cost of inference per token is lower than what OpenAI charges you (IIRC roughly half the price); they make a loss because of the enormous costs of R&D and of training new models.
    replies(4): >>41878823 #>>41878875 #>>41878927 #>>41879029 #
    2. diggan ◴[] No.41878823[source]
    Did OpenAI publish concrete numbers regarding this, or where are you getting this data from?
    replies(1): >>41881067 #
    3. ignoramous ◴[] No.41878875[source]
    > The marginal cost of inference per token is lower than what OpenAI charges you

    Unlike most Gen AI shops, OpenAI also incurs a heavy cost for training base models gunning for SoTA, which involves drawing power from a literal nuclear reactor inside data centers.

    replies(2): >>41878936 #>>41878996 #
    4. tempusalaria ◴[] No.41878927[source]
    It’s not clear this is true, because the reported numbers don’t disaggregate paid subscription revenue (almost certainly massively gross-profit positive) vs free usage (certainly negative) vs API revenue (probably gross-profit negative).

    Most of their revenue is the subscription stuff, which makes it highly likely they lose money per token on the API (not surprising, as they are in a price war with Google et al).

    If you have an enterprise ChatGPT sub, you have to consume around 5 million tokens a month to match the cost of using the API for GPT-4o. At 100 words per minute, that is 35 days of continuous typing, which shows how ridiculous the cost of the API is compared to the subscription.
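
    A quick back-of-the-envelope check of that arithmetic (a sketch: the ~$50/month seat price is implied by the 5M-token figure at $10/1M output tokens, and it assumes one token per typed word, which real tokenizers only roughly match):

        # Rough check of the subscription-vs-API break-even claim above.
        # Inputs are the commenter's ballpark figures, not OpenAI's published numbers.
        seat_cost_per_month = 50.0        # $/month, implied by the 5M-token figure
        api_price_per_token = 10.0 / 1e6  # $/token for GPT-4o output via the API

        # Tokens per month at which the API would cost as much as the seat
        break_even_tokens = seat_cost_per_month / api_price_per_token
        print(f"break-even: {break_even_tokens / 1e6:.0f}M tokens/month")  # ~5M

        # Time to type that many words at 100 words/minute (1 word ~ 1 token here)
        minutes = break_even_tokens / 100
        print(f"continuous typing: {minutes / 60 / 24:.0f} days")  # ~35 days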

    replies(1): >>41881150 #
    5. fransje26 ◴[] No.41878936[source]
    > from a literal nuclear reactor inside data centers.

    No.

    replies(1): >>41880823 #
    6. candiddevmike ◴[] No.41878996[source]
    > literal nuclear reactor inside data centers

    This is fascinating to think about. I wonder what kind of shielding/environmental controls/other changes you'd need for this to actually work. Would a rack-sized SMR be contained enough not to impact anything? Would datacenter operators/workers need to follow NRC guidance?

    replies(3): >>41880937 #>>41881030 #>>41882203 #
    7. ◴[] No.41879029[source]
    8. Tostino ◴[] No.41880823{3}[source]
    Their username is fitting though.
    replies(1): >>41881514 #
    9. talldayo ◴[] No.41880937{3}[source]
    I think the simple answer is that it doesn't make sense. Nuclear power plants generate a byproduct that inherently limits the performance of computers: heat. Having a cooling system, reactor, or turbine located inside a datacenter is immediately rendered pointless, because you end up managing two competing thermal systems at once. There is no reason to localize a reactor inside a datacenter when you could locate it elsewhere and pipe the generated electricity in via preexisting high-voltage lines.
    replies(1): >>41882190 #
    10. ◴[] No.41881030{3}[source]
    11. lukeschlather ◴[] No.41881067[source]
    https://news.ycombinator.com/item?id=41833287

    This says 506 tokens/second for Llama 405B on a machine with 8x H200s, which you can rent for about $4/GPU/hour, so probably $40/hour for a server with enough GPUs. At that rate it can do ~1.8M tokens per hour. OpenAI charges $10/1M output tokens for GPT-4o (input tokens and cached tokens are cheaper, but this is just a ballpark estimate). So if GPT-4o were 405B parameters, it might cost ~$20/1M output tokens to serve.

    Now, OpenAI is a little vague, but they have implied that GPT-4o is actually only 60B-80B parameters. Scaling the estimate above down with parameter count, a model of approximately 100B parameters might cost around $5/1M output tokens to serve, so they're probably selling it at $10/1M with a reasonable profit margin.

    And even if they were selling it at cost, I wouldn't be worried, because a couple of years from now Nvidia will release H300s that are at least 30% more efficient, and that will cause a profit margin to materialize without raising prices. So if I have a use case that works with today's models, I will be able to rent the same capability a year or two from now for roughly the same price.
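
    The chain of estimates above as a runnable sketch (the throughput, rental rate, and parameter counts are the commenter's ballpark figures, not published numbers):

        # Ballpark serving-cost estimate from the figures quoted above.
        throughput_tps = 506         # Llama 405B tokens/sec on 8x H200 (quoted benchmark)
        server_cost_per_hour = 40.0  # $/hour at ~$4/GPU/hour for 8 GPUs, plus overhead

        tokens_per_hour = throughput_tps * 3600  # ~1.8M tokens/hour
        cost_405b = server_cost_per_hour / (tokens_per_hour / 1e6)
        print(f"405B serving cost: ~${cost_405b:.0f}/1M output tokens")  # ~$22

        # If serving cost scales roughly linearly with parameter count,
        # a ~100B model is about 4x cheaper to run than a 405B one.
        cost_100b = cost_405b * 100 / 405
        print(f"~100B serving cost: ~${cost_100b:.0f}/1M output tokens")  # ~$5

        openai_price = 10.0  # $/1M output tokens for GPT-4o
        print(f"implied gross margin: {1 - cost_100b / openai_price:.0%}")  # ~46%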

    12. seizethecheese ◴[] No.41881150[source]
    In summary, the original point of this thread is wrong. There’s essentially no future where these tools disappear or become unavailable at a reasonable cost for consumers. Much more likely is that they get way better.
    replies(2): >>41883125 #>>41884310 #
    13. ignoramous ◴[] No.41881514{4}[source]
    Bully.

    I wrote "inside" to mean that those mini reactors (300MW+) are meant to be used solely for the DCs.

    (noun: https://www.collinsdictionary.com/dictionary/english-thesaur... / https://en.wikipedia.org/wiki/Heterosemy)

    Replace it with "nearby" if that makes you feel good about anyone's username.

    replies(1): >>41882539 #
    14. kergonath ◴[] No.41882190{4}[source]
    > Nuclear power plants generate a byproduct that inherently limits the performance of computers; heat.

    The reactor does not need to be in the datacenter. It can be a couple of hundred metres away; bog-standard cables would be perfectly able to move the electrons. Whether the cables are 20m or 200m long does not matter much.

    You’re right though, putting them in the same building as a datacenter still makes no sense.

    15. kergonath ◴[] No.41882203{3}[source]
    It makes zero sense to build them in datacenters, and I don’t know of any safety authority that would allow deploying reactors without serious protection measures, which would at the very least impose a separate, dedicated building.

    At some point it does make sense to have a small reactor powering a local datacenter or two, however. Licensing would still not be trivial.

    16. Tostino ◴[] No.41882539{5}[source]
    You are right; that wasn't a charitable reading of your comment. I should have kept it to myself.

    Sorry for being rude.

    17. jazzyjackson ◴[] No.41883125{3}[source]
    I mean, it used to be that I could get an Uber across Manhattan for $5.

    From my view, chatbots are still in the "selling dollars for 90 cents" category of product; of course it sells like discounted hotcakes...

    replies(2): >>41883329 #>>41887013 #
    18. seizethecheese ◴[] No.41883329{4}[source]
    … this is conflating two things: marginal and average cost/revenue. They are very, very different.
    19. tempusalaria ◴[] No.41884310{3}[source]
    Definitely they will.

    OpenAI’s potential issue is that if Google offers tokens at a 10% gross margin, OpenAI won’t be able to offer API tokens at a positive gross margin at all. Their only real chance is building a big subscription business. There is no way they can compete with a hyperscaler on API cost in the long run.
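
    A toy illustration of that squeeze (the per-token costs here are hypothetical, chosen only to show the mechanics):

        # Hypothetical unit costs per 1M tokens; not real figures for either company.
        google_cost = 4.0  # $/1M tokens, assuming hyperscaler-scale efficiency
        openai_cost = 5.0  # $/1M tokens, assuming higher costs without that scale

        # A 10% gross margin means price = cost / (1 - margin)
        google_price = google_cost / (1 - 0.10)  # ~$4.44/1M tokens

        # Matching Google's price would put OpenAI's gross margin underwater
        openai_margin = 1 - openai_cost / google_price
        print(f"OpenAI margin at Google's price: {openai_margin:.0%}")  # -12%, a loss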

    20. sebzim4500 ◴[] No.41887013{4}[source]
    The difference is that Uber was making a loss on those journeys, whereas OpenAI isn't making a loss on ChatGPT subscriptions.

    They make a loss overall because they spend a ton on R&D.