
262 points by rain1 | 1 comment
dale_glass No.44442315
How big are those in terms of size on disk and VRAM size?

Something like 1.61B just doesn't mean much to me, since I don't know much about the guts of LLMs. But I'm curious how that translates to computer hardware: what specs would I need to run these? What could I run now, what would require spending some money, and what might I hope to run in a decade?

ethan_smith No.44450773
As a rule of thumb, each billion parameters takes about 2GB of VRAM in FP16 (2 bytes per parameter), so a 7B model needs ~14GB, 70B needs ~140GB, and the 405B models need ~810GB just for the weights. Quantization cuts this by 2-4x: 8-bit halves it, and 4-bit models use only ~0.5GB per billion parameters. Actual usage runs somewhat higher once you add the KV cache and activations.
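
To make the arithmetic concrete, here's a minimal sketch (a hypothetical helper, not from any particular library) that estimates weight-only VRAM from a fixed bytes-per-parameter figure for each precision; it deliberately ignores KV cache, activations, and runtime overhead:

    # Weight-only VRAM estimate: parameter count x bytes per parameter.
    # Ignores KV cache, activations, and framework overhead, which add more.
    BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

    def weight_vram_gb(params_billions: float, dtype: str = "fp16") -> float:
        """GB of VRAM needed just to hold the weights."""
        return params_billions * BYTES_PER_PARAM[dtype]

    for size in (1.61, 7, 70, 405):
        print(f"{size}B: FP16 ~{weight_vram_gb(size):g} GB, "
              f"4-bit ~{weight_vram_gb(size, 'int4'):g} GB")

Running it prints lines like "7B: FP16 ~14 GB, 4-bit ~3.5 GB", which matches the rule of thumb above.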