ai/ml

Mel Spectrogram Inversion with Stable Pitch

Vocoders are models capable of transforming a low-dimensional spectral representation of an audio signal, typically the mel spectrogram, to a…

2 years ago

GAUDI: A Neural Architect for Immersive 3D Scene Generation

We introduce GAUDI, a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be…

2 years ago

Digitizing Smell: Using Molecular Maps to Understand Odor

Posted by Richard C. Gerkin, Google Research, and Alexander B. Wiltschko, Google. Did you ever try to measure a smell?…

2 years ago

Detect audio events with Amazon Rekognition

When most people think of using machine learning (ML) with audio data, the use case that usually comes to mind…

2 years ago

Model Teachers: Startups Make Schools Smarter With Machine Learning

Like two valedictorians, SimInsights and Photomath tell stories worth hearing about how AI is advancing education. SimInsights in Irvine, Calif.,…

2 years ago

Improving Voice Trigger Detection with Metric Learning

Voice trigger detection is an important task, which enables activating a voice assistant when a target user speaks a keyword…

2 years ago

NeILF: Neural Incident Light Field for Material and Lighting Estimation

We present a differentiable rendering framework for material and lighting estimation from multi-view images and a reconstructed geometry. In the…

2 years ago

Integrating Categorical Features in End-To-End ASR

All-neural, end-to-end ASR systems gained rapid interest from the speech recognition community. Such systems convert speech input to text units…

2 years ago

Combining Compressions for Multiplicative Size Scaling on Natural Language Tasks

Quantization, knowledge distillation, and magnitude pruning are among the most popular methods for neural network compression in NLP. Independently, these…

2 years ago

CVNets: High Performance Library for Computer Vision

We introduce CVNets, a high-performance open-source library for training deep neural networks for visual recognition tasks, including classification, detection, and…

2 years ago