
Zamba2-7B (www.zyphra.com)
282 points | by dataminer | 1 comment
nox101 No.41847916
What is magic about 7B? Why not 8B, 9B, or 11.234B? Is 7B some power of 2 reinterpreted?
replies(2): >>41848051 >>41848104
1. ikeashark No.41848051
I believe it comes from the original LLaMA papers, where those sizes were chosen because each one fits nicely on one of the standard ML compute GPUs.

Model size + overhead (context length, KV cache, etc.); rough arithmetic in the sketch below:

7B: 13 GB - fits on a T4 (16 GB).
13B: 26 GB - fits on a V100 (32 GB).
30B: 65 GB - fits on an A100 (80 GB).
65B: 131 GB - fits on 2x A100 (160 GB).

That's it really.
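
To see where those GB figures come from: at fp16/bf16 each parameter takes about 2 bytes, and the LLaMA-1 checkpoints are nominally 6.7B, 13.0B, 32.5B, and 65.2B parameters. A minimal Python sketch of that arithmetic (the parameter counts and the 2 bytes/param assumption are mine, not from the comment above; KV cache and activation overhead come on top of the weights):

    # Back-of-the-envelope GPU memory estimate for fp16/bf16 weights.
    BYTES_PER_PARAM = 2  # assumption: fp16/bf16 storage

    # Nominal LLaMA-1 parameter counts for each advertised size.
    models = {
        "7B": 6.7e9,
        "13B": 13.0e9,
        "30B": 32.5e9,
        "65B": 65.2e9,
    }

    for name, params in models.items():
        weights_gb = params * BYTES_PER_PARAM / 1e9
        print(f"{name}: ~{weights_gb:.0f} GB of weights "
              f"(plus KV cache / activation overhead)")

That is also why the headroom matters: a 7B model at ~13 GB of weights leaves only a few GB on a 16 GB T4 for context and activations.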