When most people think of using machine learning (ML) with audio data, the use case that usually comes to mind…
Like two valedictorians, SimInsights and Photomath tell stories worth hearing about how AI is advancing education. SimInsights in Irvine, Calif.,…
Voice trigger detection is an important task, which enables activating a voice assistant when a target user speaks a keyword…
We present a differentiable rendering framework for material and lighting estimation from multi-view images and a reconstructed geometry. In the…
All-neural, end-to-end ASR systems gained rapid interest from the speech recognition community. Such systems convert speech input to text units…
Quantization, knowledge distillation, and magnitude pruning are among the most popular methods for neural network compression in NLP. Independently, these…
We introduce CVNets, a high-performance open-source library for training deep neural networks for visual recognition tasks, including classification, detection, and…
Virtual assistants make use of automatic speech recognition (ASR) to help users answer entity-centric queries. However, spoken entity recognition is…
Machine learning models are trained to minimize the mean loss for a single metric, and thus typically do not consider…
A key algorithm for understanding the world is material segmentation, which assigns a label (metal, glass, etc.) to each pixel.…