Has anyone done a comprehensive analysis of exactly how much quantization affects the quality of model output? I haven't seen anything beyond people running a quantized model and being impressed (or not) by a few sample outputs.
I would be very curious about some contrastive benchmarks between a quantized and non-quantized version of the same model.