Llama.cpp 30B runs with only 6GB of RAM now (github.com)
1311 points by msoad | 1 comment | 31 Mar 23 20:37 UTC
cubefox | 31 Mar 23 21:35 UTC | No. 35393976
>>35393284 (OP)
I don't understand. I thought each parameter was 16 bits (two bytes), which would predict a minimum of 60GB of RAM for a 30-billion-parameter model. Not 6GB.
replies(2): >>35394470, >>35394590
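[ed: the arithmetic behind the 60GB figure in the comment above, as a quick sketch; it assumes fp16 weights only and ignores activation and KV-cache memory.]

```python
# Back-of-the-envelope memory estimate for a 30B-parameter model
# stored as 16-bit (fp16) weights, ignoring activations and KV cache.
params = 30e9          # 30 billion parameters
bytes_per_param = 2    # 16 bits = 2 bytes

total_gb = params * bytes_per_param / 1e9
print(f"{total_gb:.0f} GB")   # -> 60 GB
```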
gamegoblin | 31 Mar 23 22:28 UTC | No. 35394590
>>35393976
Parameters have been quantized down to 4 bits per parameter, and not all parameters are needed at the same time.
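[ed: a rough sketch of what 4-bit quantization does to the raw weight footprint. The block layout assumed below (32 weights per block, 4 bits each, one fp16 scale per block) is modeled on llama.cpp's Q4_0 format but should be treated as an assumption; the point is the roughly 4x reduction, and mmap-ing the file means only pages actually touched count against resident RAM.]

```python
# Rough weight-storage estimate for a 30B model under 4-bit blockwise
# quantization (assumed layout: 32 weights per block, 4 bits each,
# plus one fp16 scale per block -- roughly llama.cpp's Q4_0 scheme).
params = 30e9
block_size = 32
bits_per_weight = 4
scale_bytes_per_block = 2          # one fp16 scale per block

blocks = params / block_size
weight_bytes = params * bits_per_weight / 8
scale_bytes = blocks * scale_bytes_per_block

total_gb = (weight_bytes + scale_bytes) / 1e9
print(f"{total_gb:.1f} GB")        # -> ~16.9 GB on disk

# With the model file mmap-ed, pages are loaded lazily, so resident
# RAM can stay well below the full file size -- which is how the
# headline 6GB figure becomes plausible.
```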