Combining Compressions for Multiplicative Size Scaling on Natural Language Tasks
Quantization, knowledge distillation, and magnitude pruning are among the most popular methods for neural network compression in NLP. Independently, these methods reduce model size and can accelerate inference, but their relative benefit and combinatorial interactions have not been rigorously studied. For each of the eight possible subsets of these techniques, we compare accuracy vs. …
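To make the combinations concrete: with three techniques, each either applied or not, there are 2³ = 8 subsets. Below is a minimal sketch, assuming PyTorch, of composing two of them (magnitude pruning followed by post-training dynamic quantization) on a toy model. The layer sizes and the 30% sparsity level are illustrative choices, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy stand-in for an NLP classifier head; sizes are illustrative.
model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 2))

# Magnitude pruning: zero out the 30% of weights with smallest |w|.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the pruning mask into the tensor

# Post-training dynamic quantization: int8 weights for Linear layers.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

print(quantized)
```

Knowledge distillation would be applied separately at training time (a student optimized against a teacher's soft targets), so it composes with either or both of the post-training steps above.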