Anthropic raises $13B Series F

(www.anthropic.com)
585 points meetpateltech | 16 comments
llamasushi ◴[] No.45105325[source]
The compute moat is getting absolutely insane. We're basically at the point where you need a small country's GDP just to stay in the game for one more generation of models.

What gets me is that this isn't even a software moat anymore - it's literally just whoever can get their hands on enough GPUs and power infrastructure. TSMC and the power companies are the real kingmakers here. You can have all the talent in the world but if you can't get 100k H100s and a dedicated power plant, you're out.

Wonder how much of this $13B is just prepaying for compute vs actual opex. If it's mostly compute, we're watching something weird happen - like the privatization of Manhattan Project-scale infrastructure. Except instead of enriching uranium we're computing gradient descents lol

The wildest part is we might look back at this as cheap. GPT-4 training was what, $100M? GPT-5/Opus-4 class probably $1B+? At this rate GPT-7 will need its own sovereign wealth fund

replies(48): >>45105396 #>>45105412 #>>45105420 #>>45105480 #>>45105535 #>>45105549 #>>45105604 #>>45105619 #>>45105641 #>>45105679 #>>45105738 #>>45105766 #>>45105797 #>>45105848 #>>45105855 #>>45105915 #>>45105960 #>>45105963 #>>45105985 #>>45106070 #>>45106096 #>>45106150 #>>45106272 #>>45106285 #>>45106679 #>>45106851 #>>45106897 #>>45106940 #>>45107085 #>>45107239 #>>45107242 #>>45107347 #>>45107622 #>>45107915 #>>45108298 #>>45108477 #>>45109495 #>>45110545 #>>45110824 #>>45110882 #>>45111336 #>>45111695 #>>45111885 #>>45111904 #>>45111971 #>>45112441 #>>45112552 #>>45113827 #
jayd16 ◴[] No.45105619[source]
In this imaginary timeline where initial investments keep increasing this way, how long before we see a leak shutter a company? Once the model is out, no one would pay for it, right?
replies(6): >>45105704 #>>45105708 #>>45105778 #>>45105857 #>>45106040 #>>45112321 #
1. wmf ◴[] No.45105778[source]
You can't run Claude on your PC; you need servers. Companies that have that kind of hardware are not going to touch a pirated model. And the next model will be out in a few months anyway.
replies(1): >>45106298 #
2. jayd16 ◴[] No.45106298[source]
If it was worth it, you'd see some easy self-hostable package, no? And by definition, it's profitable to self-host or these AI companies are in trouble.
replies(3): >>45107146 #>>45107372 #>>45109892 #
3. quotemstr ◴[] No.45107146[source]
Does your "self-hostable package" come with its own electric substation?
replies(1): >>45108582 #
4. serf ◴[] No.45107372[source]
I think this misunderstands the scale of these models.

And honestly I don't think a lot of these companies would turn a profit on pure utility -- the electric and water companies don't advertise the way these groups do; I think that probably means something.

replies(1): >>45108646 #
5. jayd16 ◴[] No.45108582{3}[source]
You're saying that's needed for inference?
6. jayd16 ◴[] No.45108646{3}[source]
What's the scale for inference? Is it truly that immense? Can you ballpark what you think would make such a thing impossible?

> the electric and water company doesn't advertise like these groups do

I'm trying to understand what you mean here. In the US these utilities usually operate as monopolies, so there's no point in advertising. Cell service has plenty of advertising, though.

7. tick_tock_tick ◴[] No.45109892[source]
You need 100+ GB of RAM and a top-of-the-line GPU to run legacy models at home. Maybe, if you push it, that setup will let you handle 2, maybe 3, people. You think anyone is going to make money on that vs $20 a month to Anthropic?
replies(3): >>45112200 #>>45112210 #>>45112761 #
9. jayd16 ◴[] No.45112210{3}[source]
Can you explain to me where Anthropic (or its investors) expects to be making money if that's what it actually costs to run this stuff?
replies(1): >>45112785 #
10. lelanthran ◴[] No.45112761{3}[source]
> You need 100+ GB of RAM and a top-of-the-line GPU to run legacy models at home. Maybe, if you push it, that setup will let you handle 2, maybe 3, people.

This doesn't seem correct. I run legacy models with only slightly reduced performance on 32GB RAM with a 12GB VRAM GPU right now. BTW, that's not an expensive setup.
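
A minimal sketch of what that kind of local setup looks like, assuming llama-cpp-python and a quantized GGUF model (the model file below is a placeholder, not a recommendation):

    # Local inference on a modest GPU via llama-cpp-python (pip install llama-cpp-python).
    # Assumes a quantized GGUF model small enough for ~12GB of VRAM; the path is hypothetical.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",  # placeholder file
        n_gpu_layers=-1,  # offload every layer to the GPU if it fits
        n_ctx=4096,       # context window; larger needs more VRAM
    )

    out = llm("Q: Why might someone self-host an LLM?\nA:", max_tokens=256, stop=["Q:"])
    print(out["choices"][0]["text"])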

> You think anyone is going to make money on that vs $20 a month to Anthropic?

Why does it have to be run as a profit-making machine for other users? At home it can run as a useful service for the entire household. After all, we're not talking about specialised coding agents using this[1], just normal user requests.

====================================

[1] For an outlay of $1k for a new GPU I can run a reduced-performance coding LLM. Once again, when it's only me using it, the economics work out. I don't need the agent to be fully autonomous because I'm not vibe coding - I can take the reduced-performance output, fix it, and use it.

replies(2): >>45118444 #>>45122107 #
11. lelanthran ◴[] No.45112785{4}[source]
> Can you explain to me where Anthropic (or its investors) expects to be making money if that's what it actually costs to run this stuff?

Not the GP (in fact I just replied to GP, disagreeing with them), but I think that economies of scale kick in when you are provisioning M GPUs for N users and both M and N are large.

When you are provisioning for N=1 (a single user), then M=1 is the minimum you need, which makes it very expensive per user. When N=5 and M is still 1, then the cost per user is roughly a fifth of the original single-user cost.
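
To put rough numbers on that (everything here is invented for illustration; real amortised GPU costs will differ):

    # Cost per user when N users share M GPUs -- illustrative figures only.
    GPU_MONTHLY_COST = 1500.0  # assumed amortised hardware + power for one GPU

    def cost_per_user(m_gpus: int, n_users: int) -> float:
        return m_gpus * GPU_MONTHLY_COST / n_users

    print(cost_per_user(1, 1))           # 1500.0 -- provisioned for a single user
    print(cost_per_user(1, 5))           # 300.0  -- five users sharing one GPU
    print(cost_per_user(1000, 500_000))  # 3.0    -- the regime providers aim for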

12. jayd16 ◴[] No.45118444{4}[source]
Plus, when you're hosting it yourself, you can be reckless with what you feed it. Pricing in the privacy gain, it seems like self-hosting would be worth the effort/cost.
13. tick_tock_tick ◴[] No.45122107{4}[source]
Just your GPU, not counting the rest of the system, costs 4 years of subscription, and with the sub you get the new models, which your existing hardware will likely not be able to run at all.

It's closer to $3k to build a machine that you can reasonably use, which is 12 whole years of subscription. It's not hard to see why no one is doing it.

replies(1): >>45124259 #
14. lelanthran ◴[] No.45124259{5}[source]
> Just your GPU, not counting the rest of the system, costs 4 years of subscription

With my existing setup for non-coding tasks (the GPU is a 3060 12GB, which I bought before I wanted local LLM inference but now use for that purpose anyway), the GPU alone was a once-off ~$350 cost (https://www.newegg.com/gigabyte-windforce-oc-gv-n3060wf2oc-1...).

It gives me literally unlimited requests, not pseudo-unlimited as I get from ChatGPT, Claude and Gemini.

> and with the sub you get the new models, which your existing hardware will likely not be able to run at all.

I'm not sure about that. Why wouldn't new LLM models run on a 4-year-old GPU? Wasn't a primary selling point of the newer models that they use less computation for inference?

Now, of course there are limitations, but for non-coding usage (of which there is a lot) this cheap setup appears to be fine.

> It's closer to $3k to build a machine that you can reasonably use, which is 12 whole years of subscription. It's not hard to see why no one is doing it.

But there are people doing it. Lots, actually, and not just for research purposes. With costs apparently still falling, each passing month makes it more viable to self-host, not less.

The calculus looks even better when you have a small group (say 3-5 developers) needing inference for an agent; then you can get a 5060 Ti with 16GB RAM for slightly over $1000. The limited RAM means it won't perform as well, but at that performance the agent will still be capable of writing 90% of boilerplate, making edits, etc.

These companies (Anthropic, OpenAI, etc) are at the bottom of the value chain, because they are selling tokens, not solutions. When you can generate your own tokens continuously 24x7, does it matter if you generate at half the speed?
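
The break-even arithmetic on the numbers in this thread is easy to check (a sketch that ignores electricity and assumes the $20/month price holds):

    # Months of a $20/month subscription covered by a given hardware outlay.
    SUB_PER_MONTH = 20.0

    def breakeven_months(hardware_cost: float) -> float:
        return hardware_cost / SUB_PER_MONTH

    print(breakeven_months(350.0))   # 17.5 months -- 3060-class GPU
    print(breakeven_months(1000.0))  # 50 months   -- 5060 Ti-class card
    print(breakeven_months(3000.0))  # 150 months  -- a full $3k machine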

replies(1): >>45124355 #
15. tick_tock_tick ◴[] No.45124355{6}[source]
> does it matter if you generate at half the speed?

Yes, massively. It's not even linear: 1/2 speed is probably 1/8 or less of the value of "full speed". It's going to be even more pronounced as "full speed" gets faster.

replies(1): >>45125212 #
16. lelanthran ◴[] No.45125212{7}[source]
> Yes, massively. It's not even linear: 1/2 speed is probably 1/8 or less of the value of "full speed". It's going to be even more pronounced as "full speed" gets faster.

I don't think that's true for most use-cases (content generation, including artwork, code/software, reading material, summarising, etc). Something that takes a day without an LLM might take only 30 minutes with GPT-5 (artwork), or maybe one hour with Claude Code.

Does the user really care that their full-day artwork task is now one hour and not 30 minutes? Or that their full-day coding task is now only two hours, and not one hour?
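
A toy calculation makes the point (invented numbers): if a task takes 8 hours by hand and 1 hour at full speed, then half speed means 2 hours, and you still keep 6 of the 7 hours saved:

    # How much of the time saving survives at half speed? Illustrative numbers only.
    manual_hours = 8.0       # doing the task by hand
    full_speed_hours = 1.0   # with a full-speed hosted model
    half_speed_hours = 2 * full_speed_hours

    saving_full = manual_hours - full_speed_hours  # 7.0 hours saved
    saving_half = manual_hours - half_speed_hours  # 6.0 hours saved
    print(saving_half / saving_full)               # ~0.86: most of the value remains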

After all, from day one of the ChatGPT release, literally no one complained that it was too slow (and it was much slower than it is now).

Right now no one is asking for faster token generation; everyone is asking for more accurate solutions, even at the expense of speed.