
1311 points msoad | 2 comments
detrites ◴[] No.35393558[source]
The pace of collaborative OSS development on these projects is amazing, but the rate of optimisations being achieved is almost unbelievable. What has everyone been doing wrong all these years *cough* sorry, I mean to say, weeks?

Ok I answered my own question.

kmeisthax ◴[] No.35393921[source]
>What has everyone been doing wrong all these years

So it's important to note that all of these improvements are the kinds of things that are cheap to run on top of a pretrained model. The recent headline developments in large language models, by contrast, have been the product of hundreds of thousands of dollars in rented compute time. Once you start putting six figures into a pile of model weights, that becomes a capital cost the business either needs to recuperate or turn into a competitive advantage. So nobody who scales up to this point releases their model weights.
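
To make "six figures" concrete, here's a rough back-of-the-envelope sketch in Python. Every number in it (GPU count, training duration, rental rate) is a hypothetical assumption, not a figure from any particular training run:

    gpus = 512                # rented A100-class accelerators (assumption)
    days = 14                 # wall-clock pretraining time (assumption)
    usd_per_gpu_hour = 1.50   # bulk cloud rental rate (assumption)

    gpu_hours = gpus * days * 24
    cost = gpu_hours * usd_per_gpu_hour
    print(f"{gpu_hours:,} GPU-hours = ${cost:,.0f}")  # 172,032 GPU-hours = $258,048

Fine-tuning or quantizing the resulting weights, by contrast, fits on a single machine, which is why the cheap follow-on improvements move so fast once weights exist.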

The model in question - LLaMA - isn't even a public model. It leaked and people copied[0] it. But because such a large model leaked, now people can actually work on iterative improvements again.

Unfortunately, we don't really have a way for the FOSS community to pool together that much money to buy compute from cloud providers. Contributions-in-kind through distributed computing (e.g. a "GPT@home" project) would require significant changes to training methodology[1]. Further compounding this, the state of the art is now effectively a trade secret. Exact training code isn't always available, and OpenAI has even gone so far as to refuse to say anything about GPT-4's architecture or training set to prevent open replication.
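
One concrete reason a "GPT@home" scheme needs new training methodology: naive data-parallel training synchronizes a full copy of the gradients every optimizer step. A rough sketch of that arithmetic, where the model size and upload bandwidth are hypothetical assumptions:

    params = 7e9            # a 7B-parameter model (assumption)
    bytes_per_grad = 2      # fp16 gradients
    upload_mbps = 100       # optimistic home upload speed (assumption)

    grad_gb = params * bytes_per_grad / 1e9                         # ~14 GB per step
    sync_minutes = params * bytes_per_grad * 8 / (upload_mbps * 1e6) / 60
    print(f"~{grad_gb:.0f} GB per sync, ~{sync_minutes:.0f} min per step")

At nearly twenty minutes of communication per step, volunteers' GPUs would sit idle almost all the time; that is the kind of thing gradient compression or federated-style methods would have to solve.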

[0] I'm avoiding the use of the verb "stole" here, not just because I support filesharing, but because copyright law likely does not protect AI model weights alone.

[1] AI training has very high minimum requirements just to get in the door. If your GPU has 12GB of VRAM and your model plus gradients require 13GB, you can't train the model at all. CPUs don't have this limitation, but they are ridiculously inefficient for any training task. There are techniques like ZeRO that give pagefile-like partitioning of training state to GPU training, but they require additional engineering.
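
To put numbers on that footnote, here's a rough VRAM estimate for conventional mixed-precision training with Adam. The breakdown (fp16 weights and gradients, fp32 optimizer state, activations ignored) is a common rule of thumb, not an exact accounting:

    def training_vram_gb(params_billion):
        n = params_billion * 1e9
        weights   = 2 * n   # fp16 weights:   2 bytes/param
        gradients = 2 * n   # fp16 gradients: 2 bytes/param
        adam      = 8 * n   # fp32 momentum + variance: 4 + 4 bytes/param
        return (weights + gradients + adam) / 1e9  # activations not included

    print(training_vram_gb(7))     # ~84 GB for a 7B model
    print(training_vram_gb(0.75))  # ~9 GB: roughly the largest that fits in 12 GB

ZeRO helps precisely by sharding those three buckets (optimizer state, gradients, and eventually the weights themselves) across devices or host memory, which is where the extra engineering comes in.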

chii ◴[] No.35396273[source]
> Exact training code isn't always available, and OpenAI has even gone so far as to refuse to say anything about GPT-4's architecture or training set to prevent open replication.

This is why I think the patent and copyright system is a failure. The idea was that having laws protecting information like this would advance the progress of science.

It doesn't, because look at how many more advances the illegally leaked model has produced, in far less time. The laws protecting IP merely give a moat to incumbents.

breck ◴[] No.35400870[source]
> The laws protecting IP merely give a moat to incumbents.

Yes. These laws are bad. We could fix this with a two-line change:

    Section 1. Article I, Section 8, Clause 8 of this Constitution is hereby repealed.
    Section 2. Congress shall make no law abridging the right of the people to publish information.
kmeisthax ◴[] No.35401622[source]
Abolishing the copyright clause would not solve this problem because OpenAI is not leveraging copyright or patents. They're just not releasing anything.

To fix this, you'd need to ban trade secrecy entirely. As in, if you have some kind of invention or creative work, you must publish sufficient information to replicate it "in a timely manner". This would be one of those absolutely insane schemes that only a villain in an Ayn Rand book would come up with.

breck ◴[] No.35404345[source]
> Abolishing the copyright clause would not solve this problem because OpenAI is not leveraging copyright or patents. They're just not releasing anything.

The problem is: how in the world is ChatGPT so good compared to the average human being? The answer is that human beings (except for the 1%) have their left hands tied behind their backs because of copyright law.