Llama.cpp 30B runs with only 6GB of RAM now
(github.com)
1311 points
msoad
| 1 comment |
31 Mar 23 20:37 UTC
1.
dvt
01 Apr 23 00:20 UTC
No.
35395606
>>35393284 (OP)
This seems suspiciously like a bug (either in inference or in mmap reporting), as these models are not sparse enough for the savings to come from anywhere viable.
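A rough back-of-the-envelope check of the point above, as a sketch only. It assumes 4-bit quantized weights and takes "30B" to mean exactly 30 billion parameters; neither figure is stated in the comment, and the function name is made up for illustration. Under those assumptions, a dense model that reads every weight for each token needs far more than 6 GB of weights resident.

```python
# Hypothetical helper: size of a dense model's weights in bytes.
def dense_weight_bytes(n_params: int, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8


GIB = 1024 ** 3

N_PARAMS = 30_000_000_000  # assumption: "30B" taken as exactly 30e9 parameters
BITS_PER_WEIGHT = 4        # assumption: 4-bit quantization, not stated in the thread
REPORTED_GIB = 6           # the figure from the thread title

weights_gib = dense_weight_bytes(N_PARAMS, BITS_PER_WEIGHT) / GIB

print(f"estimated weights: {weights_gib:.2f} GiB")
print(f"reported usage:    {REPORTED_GIB} GiB")
print(f"ratio:             {weights_gib / REPORTED_GIB:.2f}x")
```

If the weights really are dense and all of them are touched during inference, the gap between the two numbers has to be explained by how memory is being counted, not by the model needing less data.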