←back to thread

1311 points msoad | 1 comments | | HN request time: 0.202s | source
Show context
cubefox ◴[] No.35393976[source]
I don't understand. I thought each parameter was 16 bit (two bytes) which would predict minimally 60GB of RAM for a 30 billion parameter model. Not 6GB.
replies(2): >>35394470 #>>35394590 #
1. gamegoblin ◴[] No.35394590[source]
Parameters have been quantized down to 4 bits per parameter, and not all parameters are needed at the same time.