255 points tbruckner | 15 comments

adam_arthur ◴[] No.37420461[source]
Even a linear growth rate of average RAM capacity would obviate the need to run current SOTA LLMs remotely in short order.

Historically average RAM has grown far faster than linear, and there really hasn't been anything pressing manufacturers to push the envelope here in the past few years... until now.

It could be that LLM model sizes keep increasing such that we continue to require cloud consumption, but I suspect the sizes will not increase as quickly as hardware for inference.
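As a rough back-of-envelope for that claim (the quantization level and the absence of KV-cache/activation overhead are my assumptions for illustration, not figures from this thread):

```python
# Rough sketch: RAM needed just to hold an LLM's weights for local inference.
# Real runtimes add KV-cache and activation overhead on top of this.
def weights_ram_gb(params_billions, bits_per_weight):
    bytes_per_weight = bits_per_weight / 8
    return params_billions * 1e9 * bytes_per_weight / 1e9  # GB

# A 70B-parameter model (e.g. LLaMA 2 70B):
print(weights_ram_gb(70, 16))  # fp16: 140.0 GB -> server territory today
print(weights_ram_gb(70, 4))   # 4-bit quantized: 35.0 GB -> high-end consumer RAM
```

At 4-bit quantization a 70B model's weights already fit in a well-equipped desktop, which is why even linear RAM growth suffices if model sizes plateau.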

Given how useful GPT-4 is already, maybe one more iteration would unlock the vast majority of practical use cases.

I think people will be surprised that consumers ultimately end up benefiting far more from LLMs than the providers. There's not going to be much moat or differentiation to defend margins; it's more of a race to the bottom on pricing.

replies(8): >>37420537 #>>37420948 #>>37421196 #>>37421214 #>>37421497 #>>37421862 #>>37421945 #>>37424918 #
1. cs702 ◴[] No.37421497[source]
I agree: no one has any technological advantage when it comes to LLMs anymore. Some companies, like OpenAI, may have other advantages, like an ecosystem of developers. But most of the gobs of money that so many companies have burned to train giant proprietary models are unlikely to see any payback.

What I think will happen is that more companies will come to the realization it's in their best interest to open their giant models. The cost of training all those giant models is already a sunk cost. If there's no profit to be made by keeping a model proprietary, why not open it to gain or avoid losing mind-share, and to mess with competitors' plans?

First, it was LLaMA, with up to 65B params, opened against Meta's wishes. Then, it was LLaMA 2, with up to 70B params, opened by Meta on purpose, to mess with Google's and Microsoft/OpenAI's plans. Now, it's Falcon 180B. Like you, I'm wondering, what comes next?

replies(4): >>37421627 #>>37422256 #>>37424763 #>>37429907 #
2. bugglebeetle ◴[] No.37421627[source]
I think it’s the opposite. Models will become more commoditized and closed/invisible as the basis of other service offerings. Apple isn’t going to start offering general API access to the model they’re training, but will bake it into a bunch of stuff and maybe give platform developers limited access. Meta will probably continue to drive the commoditization train because they have a killer ML/AI team, but the same thing will likely happen there once it’s the basis for a service that generates money.
replies(2): >>37422273 #>>37422892 #
3. foobiekr ◴[] No.37422256[source]
The cost isn’t a sunk cost at all. These models need to be trained and retrained as data sets grow. Putting aside historical cutoff points, there’s a lot of data, and kinds of data, not currently used, and the costs even to train the current models are incredible.

I think you guys are missing a massive technical consideration, which is cost: training cost and serving cost. As with everything else in tech, outside of the bubble created by ZIRP over the last decade and a half (a bubble in which two generations of tech workers never learned this important lesson), costs matter and are a primary driver of a technology’s success.

If you attached dollar costs to these models above, if the data was available, you’d quickly discover who (if anyone) has a sustainable business model and who doesn’t.

A sustainable model is what determines long term whether a technology is available and whether that leads to further improvement (and increasing sustainability/financial value).

replies(1): >>37422898 #
4. foobiekr ◴[] No.37422273[source]
This. We haven’t even entered the get-serious monetization era.

Now that the infinite free money pump has been turned down a bunch, we’re going to see what reality looks like.

replies(1): >>37430155 #
5. cs702 ◴[] No.37422892[source]
Actually, we're saying the same thing: Models are becoming more commoditized, so profits will accrue, not to those companies who say they have the "best" models, but to the companies that have other kinds of advantages. When it comes to LLMs, no one has a technological advantage.
6. adam_arthur ◴[] No.37422898[source]
GPT-4 cost on the order of $100 million, per Sam Altman.

This is orders of magnitude lower than many companies' and governments' R&D budgets. It's easily financeable by thousands of independently wealthy people and organizations. It's easily financeable by VC money. This is far cheaper than many other startups or product initiatives that have been tried. There are very likely to be many organizations that build models for the specific purpose of open-sourcing the result; the Falcon and LLaMA models are already proof enough of this.

Costs to train equivalent models may increase in the short term due to race towards GPU consumption raising costs... but compute will get cheaper in aggregate over time due to improving compute tech.

And once the model is built, it is largely a sunk cost, yes. All that needs to happen is for a single SoTA model to be made open to completely negate any advantage a competitor has. Monetization from LLMs will be driven by focused application of the models, not by providing an interface to a general model. High-quality data holds more value than the resulting model.

Not every query requires timely data, either. Incorporating new data into an existing model is likely to be cheaper than retraining the model from scratch, though that's just speculation on my end.
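For scale, the reported ~$100M figure is consistent with a simple rented-compute cost model; every number below is an illustrative assumption, not a disclosed one:

```python
# Hypothetical back-of-envelope: training cost ~ GPU-count * days * hourly rate.
# All figures are assumed for illustration only.
def training_cost_usd(gpu_count, days, usd_per_gpu_hour):
    return gpu_count * days * 24 * usd_per_gpu_hour

# e.g. 20,000 GPUs for ~90 days at an assumed $2.30/GPU-hour:
cost = training_cost_usd(20_000, 90, 2.30)
print(f"${cost / 1e6:.0f}M")  # lands in the ballpark of the reported ~$100M
```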

replies(1): >>37429443 #
7. lambda_garden ◴[] No.37424763[source]
> LLaMA, with up to 65B params, opened against Meta's wishes

They sure didn't try very hard to secure it. I wonder if it was their strategy all along.

replies(1): >>37426416 #
8. AnthonyMouse ◴[] No.37426416[source]
I suspect this was the goal of some of the people inside the company, but imposing some nominal terms on it was the price of getting it through the bureaucracy, or perhaps it was required by some agreement covering a mostly irrelevant but still-present subset of the original model.

Then the inevitable occurred and made it obvious that the restrictions were both impractical to enforce and counterproductive, so they released a new one with fewer of them.

9. foobiekr ◴[] No.37429443{3}[source]
I think you are overestimating companies' R&D budgets. Very few tech companies - even large ones - have R&D budgets in the $10B+ range, let alone $100B. Most of the Fortune 100 doesn't even reach $10B.
replies(1): >>37430039 #
10. mistymountains ◴[] No.37429907[source]
Cool it with the italics.
11. danielbln ◴[] No.37430039{4}[source]
Where do you get $100B from?
replies(1): >>37435993 #
12. 6510 ◴[] No.37430155{3}[source]
Okay, I'll tell you. You need to start a startup that sets up a good number of cameras at manual-labor jobs. Most of the footage will be completely useless, but every day you hit a once-in-a-day event, every week a once-in-a-week event, every month, every year, every decade, etc.! Then the guy who's worked there for 40 years whacks pipe 224 with a hammer, 50 cm from the outlet, and production resumes.

The footage can be aggressively pruned to fit on the disk.

When the robot is delivered in 2033 it can easily figure out, from the footage, all these weird and rare edge cases.

The difference will be like that between a competent but new employee and someone with 10 years of experience.

I can see the Tesla bots disassembling the production line already. Or do you think it won't happen?
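The pruning idea can be sketched roughly like this; the anomaly scoring is entirely hypothetical, since the comment doesn't specify how "useless" footage would be identified:

```python
# Hypothetical sketch of the pruning idea: keep footage segments around rare
# events, discard routine stretches. How segments are scored as anomalous is
# an assumption; any detector returning a truthy value would slot in here.
def prune(segments, is_anomalous, context=2):
    """Keep any segment within `context` positions of an anomalous one."""
    keep = set()
    for i, seg in enumerate(segments):
        if is_anomalous(seg):
            keep.update(range(max(0, i - context), min(len(segments), i + context + 1)))
    return [segments[i] for i in sorted(keep)]

# e.g. segments scored 0 (routine) or 1 (rare event):
print(prune([0, 0, 0, 1, 0, 0, 0, 0], bool))  # keeps the event plus nearby context
```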

replies(2): >>37454779 #>>37495611 #
13. foobiekr ◴[] No.37435993{5}[source]
"orders of magnitude"
14. Aerbil313 ◴[] No.37454779{4}[source]
Transformers can and do forget.
15. checkyoursudo ◴[] No.37495611{4}[source]
Assuming that this would work, which I am fine with granting for purposes of discussion, how does this method ever let you build anything new? Or make use of advances in production methods? Or completely reconfigure a production line because of some regulatory requirement?