1311 points msoad | 1 comments
arthurcolle ◴[] No.35395744[source]
"How much RAM did you shave off last week?"

"Oh, you know, like 12-18GB"

"Haha shut the fuck up, how much RAM did you shave off last week"

"12-18GB"

"Let me tell you what - you show me your commits right now, if you shaved off 12-18GB of RAM last week I quit my job right now and come work for you"

https://www.youtube.com/watch?v=TxHITqC5rxE

replies(1): >>35397178 #
PragmaticPulp ◴[] No.35397178[source]
Maybe not so fast. Other users are reporting that it’s not actually running properly in environments with limited RAM. The reduced memory usage might be more of a reporting misunderstanding, not an actual reduction in memory usage.
replies(1): >>35400418 #
lostmsu ◴[] No.35400418[source]
It will run; it will just have to reread the model from disk for every new token.
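
A rough way to see what "reread the model for every new token" costs: if every token needs one full pass over the weights, the generation rate is bounded by read bandwidth divided by model size. The sketch below is only illustrative. The 20 GB model size and the ~7 GB/s NVMe figure are assumptions, not numbers from this thread; the DDR4 figure is the theoretical peak for two DDR4-3200 channels.

```python
GB = 1e9  # decimal gigabytes, as storage and memory bandwidth are usually quoted


def tokens_per_second(model_bytes: float, read_bytes_per_s: float) -> float:
    """Upper bound on tokens/s if each token requires reading all weights once."""
    return read_bytes_per_s / model_bytes


# Assumed model size for illustration only (roughly a 4-bit quantized ~30B model).
model_size = 20 * GB

# Assumed or theoretical peak read bandwidths, in bytes per second.
sources = {
    "PCIe Gen 4 x4 NVMe SSD (assumed ~7 GB/s sequential)": 7 * GB,
    "dual-channel DDR4-3200 (2 x 25.6 GB/s peak)": 51.2 * GB,
}

for name, bandwidth in sources.items():
    rate = tokens_per_second(model_size, bandwidth)
    print(f"{name}: at most {rate:.2f} tokens/s")
```

This ignores page-cache hits, compute time, and random-access overhead, so real numbers would differ; it only shows why sustained read speed is the figure that matters here.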
replies(1): >>35413645 #
Szpadel ◴[] No.35413645[source]
With NVMe Gen 4 SSDs this might not be that big an issue, and it is certainly much cheaper than investing in RAM.
replies(1): >>35415612 #
lostmsu ◴[] No.35415612[source]
I don't believe the consumer ones actually have the sustained sequential read speed to saturate Gen 4.
replies(2): >>35418877 #>>35572979 #
1. Robotbeat ◴[] No.35572979[source]
PCIe Gen 5 is ~4GB/s per lane, and AMD Genoa chips have 128 such lanes. That means on the order of 500GB/s aggregate throughput, which is comparable to the aggregate theoretical throughput of the 12-channel DDR5 RAM of the Genoa CPUs.

In other words, with enough data interleaving between enough NVMe SSDs, you should have SSD throughput of the same order of magnitude as the system RAM.

The weights are static, so it’s just reads.
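
The arithmetic behind that comparison can be sketched as follows. The per-lane figure is derived from the PCIe 5.0 signaling rate (32 GT/s with 128b/130b encoding, per direction); DDR5-4800 is assumed as the memory speed, since the comment does not name one. Both totals are theoretical peaks, not measured throughput.

```python
# PCIe 5.0: 32 GT/s per lane, 128b/130b encoding, 8 bits per byte.
PCIE5_GB_PER_S_PER_LANE = 32 * (128 / 130) / 8  # ~3.94 GB/s per direction
PCIE_LANES = 128                                # lanes on a Genoa socket, per the comment

# DDR5: each channel moves 8 bytes (64 bits) per transfer.
DDR5_MEGATRANSFERS_PER_S = 4800                 # assumption: DDR5-4800
DDR5_BYTES_PER_TRANSFER = 8
DDR5_CHANNELS = 12

pcie_total = PCIE5_GB_PER_S_PER_LANE * PCIE_LANES
ddr_total = DDR5_MEGATRANSFERS_PER_S * 1e6 * DDR5_BYTES_PER_TRANSFER * DDR5_CHANNELS / 1e9

print(f"PCIe Gen 5 x{PCIE_LANES}: {pcie_total:.1f} GB/s")
print(f"DDR5-{DDR5_MEGATRANSFERS_PER_S} x{DDR5_CHANNELS} channels: {ddr_total:.1f} GB/s")
print(f"ratio (PCIe / DDR5): {pcie_total / ddr_total:.2f}")
```

Reaching the PCIe figure in practice would require every lane to be populated with drives that can sustain the link rate, which is the point raised earlier in the thread about consumer SSDs.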