Leo Boytsov about 1 year ago
🧵 Perhaps everything you need to know about compressing generative models.
1. It's hard to remove more than 50% of the parameters without a noticeable loss in quality.
2. Compression is achieved via a combination of sparsification, distillation, and (optionally) quantization.
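
To make two of these ingredients concrete, here is a toy sketch of sparsification (as magnitude pruning at 50%) followed by symmetric int8 quantization. It is not from the thread: the function names, the weight values, and the choice of magnitude pruning with a single shared scale are all illustrative assumptions. Distillation is left out because it needs a training loop.

```python
# Toy sketch (illustrative assumptions, not the thread's method):
# magnitude pruning followed by symmetric int8 quantization
# on a flat list of weights.

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest |w|."""
    k = int(len(weights) * sparsity)  # number of weights to drop
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only for illustration.
weights = [0.8, -0.05, 0.3, -1.27, 0.01, 0.6, -0.2, 0.02]
pruned = magnitude_prune(weights, sparsity=0.5)  # drop half the parameters
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
```

In this sketch the pruned weights stay exactly zero after quantization, and each surviving weight is recovered to within half a quantization step (`scale / 2`). Real pipelines usually fine-tune or distill after pruning to win back quality, which this toy example does not attempt.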