Leo Boytsov 12 months ago
🧵Perhaps everything you need to know about compression of generative models.
1. It's hard to remove more than 50% of the parameters without a noticeable loss in quality.
2. Compression is achieved via a combination of sparsification, distillation, and (optionally) quantization.
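A minimal sketch of two of these ingredients, assuming unstructured magnitude pruning for sparsification and symmetric per-tensor int8 quantization. Both are illustrative choices, not something stated above, and the function names are hypothetical. Distillation is left out because it needs a training loop.

```python
import numpy as np


def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out roughly the `sparsity` fraction of smallest-magnitude weights."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    magnitudes = np.abs(weights)
    # The k-th smallest magnitude is the cut-off; ties at the cut-off are pruned too.
    threshold = np.partition(magnitudes.ravel(), k - 1)[k - 1]
    return weights * (magnitudes > threshold)


def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization: map [-max_abs, max_abs] onto [-127, 127]."""
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from int8 codes."""
    return q.astype(np.float32) * scale


# Toy weight matrix standing in for one layer of a model (assumed data).
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)

pruned = magnitude_prune(w, sparsity=0.5)  # the 50% mark from point 1
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)

print("fraction of zeros:", float(np.mean(pruned == 0)))
print("max abs quantization error:", float(np.max(np.abs(restored - pruned))))
```

On a toy random matrix this only shows the mechanics. It says nothing about how much quality a real generative model loses at a given sparsity.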