Layer-Wise Data-Free CNN Compression

We present an efficient method for compressing a trained neural network without using any data. Our data-free method requires 14x-450x fewer FLOPs than comparable state-of-the-art methods. We break the problem of data-free network compression into a number of independent layer-wise compressions. We show how to efficiently generate layer-wise training data, and how to precondition the network to maintain accuracy during layer-wise compression. We show state-of-the-art performance on MobileNetV1 for data-free low-bit-width quantization. We also show state-of-the-art performance on data-free…
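The layer-wise decomposition described above can be sketched in a minimal form: for each layer independently, generate synthetic inputs, quantize the layer's weights, and measure how well the compressed layer reproduces the original layer's outputs. The function name, the symmetric per-tensor quantization scheme, and the Gaussian synthetic inputs below are illustrative assumptions for this sketch, not the paper's actual algorithm.

```python
import numpy as np

def quantize_layer(W, X, n_bits=4):
    """Quantize one linear layer's weights W and report the
    layer-wise output MSE on synthetic inputs X (data-free)."""
    levels = 2 ** (n_bits - 1) - 1              # e.g. 7 for 4-bit symmetric
    scale = np.abs(W).max() / levels            # per-tensor symmetric scale (assumed scheme)
    Wq = np.clip(np.round(W / scale), -levels, levels) * scale
    # Compare original vs. quantized layer outputs on the synthetic batch
    err = np.mean((X @ W.T - X @ Wq.T) ** 2)
    return Wq, err

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 32))    # one layer's weight matrix
X = rng.normal(size=(256, 32))   # synthetic inputs stand in for real data
Wq, err = quantize_layer(W, X, n_bits=4)
```

Because each layer is compressed against its own reconstruction objective, the per-layer problems are independent and can be solved in parallel, which is one source of the FLOP savings the abstract claims over methods that fine-tune the whole network end to end.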