
171 points by pizza | 1 comment | source
visarga ◴[] No.43600416[source]
Very interesting trick: using a dictionary of basis vectors that are quickly computed from a seed, so the dictionary itself needs no storage. But the result is still the same 3- or 4-bit quantization, with only a slight improvement. Their tiles are small, just 8 or 12 weights, which is why the compression doesn't go very far. It would have been great if this trick pushed quantization below 1 bit/weight, but that would require longer tiles. I wonder what the limits are if we used a larger reservoir of cheap entropy as part of the neural net architecture, even during training.

Congrats to Apple and Meta; it makes sense that they did the research, since this will go toward efficient serving of LLMs on phones. And it's very easy to implement.
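For readers unfamiliar with the idea, here is a minimal sketch of the seeded-dictionary trick described above, not the paper's exact algorithm: each small tile of weights is approximated as a low-bit linear combination of pseudo-random basis vectors regenerated from a stored seed, so only the seed and the quantized coefficients are kept. The tile size, basis rank, seed search range, and coefficient bit-width below are illustrative assumptions.

  import numpy as np

  TILE = 8          # weights per tile (assumed)
  RANK = 4          # basis vectors per dictionary (assumed)
  SEEDS = 256       # candidate seeds searched per tile (assumed)
  COEFF_BITS = 3    # bits per stored coefficient (assumed)

  def basis_from_seed(seed: int) -> np.ndarray:
      """Regenerate a TILE x RANK pseudo-random basis from the seed alone."""
      rng = np.random.default_rng(seed)
      return rng.standard_normal((TILE, RANK))

  def quantize(x: np.ndarray, bits: int) -> np.ndarray:
      """Uniformly quantize coefficients to the given bit-width."""
      levels = 2 ** bits
      scale = np.max(np.abs(x)) + 1e-12
      q = np.round(x / scale * (levels // 2 - 1))
      return q * scale / (levels // 2 - 1)

  def compress_tile(tile: np.ndarray):
      """Pick the seed whose basis gives the best low-bit least-squares fit."""
      best = None
      for seed in range(SEEDS):
          U = basis_from_seed(seed)
          coeffs, *_ = np.linalg.lstsq(U, tile, rcond=None)
          q = quantize(coeffs, COEFF_BITS)
          err = np.linalg.norm(U @ q - tile)
          if best is None or err < best[0]:
              best = (err, seed, q)
      return best[1], best[2]   # store only the seed and quantized coefficients

  def decompress_tile(seed: int, coeffs: np.ndarray) -> np.ndarray:
      """At inference time: regenerate the basis from the seed, apply coefficients."""
      return basis_from_seed(seed) @ coeffs

  if __name__ == "__main__":
      rng = np.random.default_rng(0)
      tile = rng.standard_normal(TILE)
      seed, coeffs = compress_tile(tile)
      approx = decompress_tile(seed, coeffs)
      print("seed:", seed, "reconstruction error:", np.linalg.norm(approx - tile))

The point of the sketch is that the per-tile storage cost is just the seed index plus RANK quantized coefficients, while the basis vectors themselves are "free" entropy recomputed on the fly.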

replies(2): >>43600451 #>>43601616 #
1. samus ◴[] No.43601616[source]
It should definitely be worth it, because you could reuse databases of sequence-to-seed mappings for all future models.