Layer-Wise Data-Free CNN Compression

We present an efficient method for compressing a trained neural network without using any data. Our data-free method requires 14x-450x fewer FLOPs than comparable state-of-the-art methods. We break the problem of data-free network compression into a number of independent layer-wise compressions. We show how to efficiently generate layer-wise training data, and how to precondition the network to maintain accuracy during layer-wise compression. We show state-of-the-art performance on MobileNetV1 for data-free low-bit-width quantization. We also show state-of-the-art performance on data-free…
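The layer-wise decomposition described above can be sketched in a minimal form: for each layer independently, generate synthetic inputs, quantize the layer's weights, and measure how well the compressed layer reproduces the original layer's outputs. The function name, the symmetric per-tensor quantization scheme, and the Gaussian synthetic inputs below are illustrative assumptions for this sketch, not the paper's actual algorithm.

```python
import numpy as np

def quantize_layer(W, X, n_bits=4):
    """Quantize one linear layer's weights W and report the
    layer-wise output MSE on synthetic inputs X (data-free)."""
    levels = 2 ** (n_bits - 1) - 1              # e.g. 7 for 4-bit symmetric
    scale = np.abs(W).max() / levels            # per-tensor symmetric scale (assumed scheme)
    Wq = np.clip(np.round(W / scale), -levels, levels) * scale
    # Compare original vs. quantized layer outputs on the synthetic batch
    err = np.mean((X @ W.T - X @ Wq.T) ** 2)
    return Wq, err

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 32))    # one layer's weight matrix
X = rng.normal(size=(256, 32))   # synthetic inputs stand in for real data
Wq, err = quantize_layer(W, X, n_bits=4)
```

Because each layer is compressed against its own reconstruction objective, the per-layer problems are independent and can be solved in parallel, which is one source of the FLOP savings the abstract claims over methods that fine-tune the whole network end to end.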