torginus:
This sounds like compression with extra steps. What makes this technique particular to LLM weights rather than general-purpose data?
pornel:
Weights in neural networks don't always need to be precise. Not all weights are equally useful to the network. There seems to be a lot of redundancy that can be replaced with approximations.
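As a rough illustration of that tolerance (a toy sketch, not anything from the paper; the layer size and grid step below are arbitrary), coarsely rounding a random linear layer's weights only shifts its output by a few percent:

    import numpy as np

    # Toy demo: round a random "layer" of weights to a coarse grid and
    # measure how much the layer's output actually moves.
    rng = np.random.default_rng(0)
    W = rng.standard_normal((256, 256)) / 16      # weights of a small linear layer
    x = rng.standard_normal(256)                  # an input activation vector

    step = W.std() / 4                            # coarse grid, roughly 4-5 bits
    W_q = np.round(W / step) * step               # crude uniform quantization

    y, y_q = W @ x, W_q @ x
    print(np.linalg.norm(y - y_q) / np.linalg.norm(y))   # relative output error: a few percent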

This technique seems a bit similar to lossy image compression that replaces exact pixels with a combination of pre-defined patterns (the DCT in JPEG), but here the patterns come not from a cosine function but from a pseudo-random one.
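A minimal sketch of that idea (my own illustration, not the actual algorithm; the block size, basis count, and least-squares fit are all assumptions): store only a seed plus a few coefficients per block, and regenerate the pseudo-random basis from the seed at load time.

    import numpy as np

    def compress_block(weights, seed, num_basis=4):
        # Fit a block of weights with a few pseudo-random basis vectors that
        # can be regenerated from the seed, so only seed + coefficients are stored.
        rng = np.random.default_rng(seed)
        basis = rng.standard_normal((weights.size, num_basis))
        coeffs, *_ = np.linalg.lstsq(basis, weights, rcond=None)
        return seed, coeffs

    def decompress_block(seed, coeffs, block_size):
        rng = np.random.default_rng(seed)
        basis = rng.standard_normal((block_size, coeffs.size))
        return basis @ coeffs                     # approximate reconstruction

    block = np.random.default_rng(0).standard_normal(64)
    seed, coeffs = compress_block(block, seed=1234)
    approx = decompress_block(seed, coeffs, block.size)
    print(np.linalg.norm(block - approx) / np.linalg.norm(block))

With only a handful of basis vectors per 64-weight block the fit is coarse; the premise is exactly the point above, that networks tolerate this kind of approximation error.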

It may also beat simple quantization because the added noise acts as dithering, breaking up the bands created by combinations of quantized numbers.
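Rough sketch of the dithering effect (the step size and uniform noise here are arbitrary choices for illustration):

    import numpy as np

    def quantize_plain(x, step=0.1):
        return np.round(x / step) * step              # hard rounding: visible bands

    def quantize_dithered(x, step=0.1, seed=0):
        noise = np.random.default_rng(seed).uniform(-0.5, 0.5, x.shape)
        return np.round(x / step + noise) * step      # noise randomizes the rounding direction

    x = np.linspace(0.0, 0.5, 11)
    print(quantize_plain(x))      # a smooth ramp snaps onto a few repeated levels
    print(quantize_dithered(x))   # same levels, but the errors are decorrelated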