It sounds like they basically find the stretch of a pseudo-random sequence that comes closest to the desired data, then store just the random seed plus the corrections (which are small, so they take less space).
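Something like this, if I'm reading it right. This is a minimal sketch of the mechanism, not the paper's actual scheme: the brute-force seed search, the Gaussian generator, and the 4-bit residual are all my own assumptions to make it concrete.

```python
import numpy as np

def compress_block(target: np.ndarray, n_seeds: int = 1024):
    """Find the seed whose pseudo-random block is closest to `target`,
    then keep the seed plus a coarsely quantized residual."""
    best_seed, best_err, best_block = None, np.inf, None
    for seed in range(n_seeds):  # naive exhaustive search (my assumption)
        block = np.random.default_rng(seed).standard_normal(target.shape)
        err = np.sum((target - block) ** 2)
        if err < best_err:
            best_seed, best_err, best_block = seed, err, block
    residual = target - best_block
    # Corrections are small, so a few bits each suffice (4-bit here).
    scale = float(np.abs(residual).max()) / 7 if residual.any() else 1.0
    q_residual = np.round(residual / scale).astype(np.int8)  # in [-7, 7]
    return best_seed, scale, q_residual

def decompress_block(seed, scale, q_residual):
    """Regenerate the pseudo-random block from the seed, add corrections."""
    block = np.random.default_rng(seed).standard_normal(q_residual.shape)
    return block + q_residual.astype(np.float32) * scale

w = np.random.default_rng(123).standard_normal(64)  # stand-in weight block
seed, scale, q = compress_block(w)
print(np.mean((w - decompress_block(seed, scale, q)) ** 2))
```

The win, if there is one, is that the seed is a handful of bytes and the residual quantizes much more cheaply than the raw values would.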
Pretty fascinating from an information theory point of view. Surprising that it works at all. Is this, like, the JPEG of uniformly distributed, uncorrelated data?
We don't know. They basically look for sequences that happen to approximate NN weights well, much the way sinusoidal basis functions approximate "natural" images well but fail on graphics with hard edges.
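To make the sinusoid analogy concrete, here's a toy numpy experiment (entirely my own, nothing from the paper): keep only the k largest Fourier coefficients and compare reconstruction error for a smooth signal versus a hard edge.

```python
import numpy as np

def topk_reconstruction_error(signal: np.ndarray, k: int) -> float:
    """MSE after keeping only the k largest-magnitude Fourier coefficients."""
    coeffs = np.fft.rfft(signal)
    idx = np.argsort(np.abs(coeffs))[:-k]  # indices of everything but top-k
    coeffs[idx] = 0
    approx = np.fft.irfft(coeffs, n=len(signal))
    return float(np.mean((signal - approx) ** 2))

n = 256
t = np.arange(n) / n
smooth = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 7 * t)
edge = np.where(t < 0.5, -1.0, 1.0)  # hard edge, like graphics

print(topk_reconstruction_error(smooth, k=8))  # near zero
print(topk_reconstruction_error(edge, k=8))    # noticeably worse (Gibbs)
```

The smooth signal collapses into a few coefficients; the edge leaks energy across the whole spectrum, so a truncated sinusoidal basis can't represent it cheaply.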