This sounds like compression with extra steps. What makes this technique specific to LLM weights rather than general-purpose data?
replies(2):
This technique seems somewhat similar to lossy image compression, which replaces exact pixels with a combination of predefined patterns (the DCT in JPEG), except that here the patterns come from a pseudo-random function rather than a cosine function.
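To make the analogy concrete, here is a rough sketch of approximating a block of numbers with a few seed-generated patterns. This is not the method under discussion, just a toy under my own assumptions: the patterns are +/-1 vectors, the fit is a greedy matching pursuit, and the names `basis`, `encode`, and `decode` are made up for illustration. The point is that only `(seed, coefficient)` pairs need to be stored, because the decoder can regenerate each pattern from its seed.

```python
import random

def basis(seed, n):
    """Regenerate a pseudo-random +/-1 pattern from its seed (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def encode(block, n_terms=4, n_seeds=256):
    """Greedy matching pursuit: at each step, keep the seed whose pattern
    removes the most squared error from the current residual."""
    n = len(block)
    residual = list(block)
    code = []
    for _ in range(n_terms):
        best = None
        for seed in range(n_seeds):
            b = basis(seed, n)
            # Least-squares coefficient for one pattern; b.b == n for +/-1 entries.
            coeff = sum(r * x for r, x in zip(residual, b)) / n
            gain = coeff * coeff * n  # reduction in squared error
            if best is None or gain > best[0]:
                best = (gain, seed, coeff)
        _, seed, coeff = best
        b = basis(seed, n)
        residual = [r - coeff * x for r, x in zip(residual, b)]
        code.append((seed, coeff))
    return code

def decode(code, n):
    """Rebuild the approximation from (seed, coefficient) pairs alone."""
    out = [0.0] * n
    for seed, coeff in code:
        b = basis(seed, n)
        out = [o + coeff * x for o, x in zip(out, b)]
    return out

def sq_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Toy "weights": 8 Gaussian values (an assumption, not real model data).
rng = random.Random(0)
block = [rng.gauss(0.0, 1.0) for _ in range(8)]
code = encode(block)
approx = decode(code, len(block))
print("error with 4 patterns:", sq_error(block, approx))
print("error with no patterns:", sq_error(block, [0.0] * len(block)))
```

JPEG's fixed cosine basis needs no search at all, whereas this toy has to try many seeds per block, so the encoder does more work in exchange for a very compact code.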
Its edge over simple quantization may also come just from the added noise, which acts as dithering and breaks up the bands created by combinations of quantized numbers.
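A quick way to see the dithering effect in isolation (a generic illustration, not a claim about how the technique works internally; the step size, the value 0.3, and the function names are arbitrary choices of mine): plain rounding maps the same input to the same level every time, while adding uniform noise before rounding makes the average of many quantized values track the true value.

```python
import random

def quantize(x, step):
    """Plain uniform quantization: snap to the nearest multiple of step."""
    return step * round(x / step)

def dithered_quantize(x, step, rng):
    """Add uniform noise of one step's width before rounding (simple dither)."""
    return step * round(x / step + rng.uniform(-0.5, 0.5))

rng = random.Random(0)
step = 1.0
x = 0.3          # sits between the quantization levels 0 and 1
trials = 10000

plain = [quantize(x, step) for _ in range(trials)]
dithered = [dithered_quantize(x, step, rng) for _ in range(trials)]

mean_plain = sum(plain) / trials        # always the same level
mean_dithered = sum(dithered) / trials  # close to x on average

print("plain mean:", mean_plain)
print("dithered mean:", mean_dithered)
```

Each dithered value is individually noisier, but the error no longer lines up in bands, which is the property that could matter when many quantized numbers are summed.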