Categories: FAANG

Corpus Synthesis for Zero-shot ASR Domain Adaptation using Large Language Models

While Automatic Speech Recognition (ASR) systems are widely used in many real-world applications, they often do not generalize well to new domains and need to be finetuned on data from these domains. However, target-domain data is usually not readily available in many scenarios. In this paper, we propose a new strategy for adapting ASR models to new target domains without any text or speech from those domains. To accomplish this, we propose a novel data synthesis pipeline that uses a Large Language Model (LLM) to generate a target domain text corpus, and a state-of-the-art controllable speech…
AI Generated Robotic Content

Recent Posts

Statistical Methods for Evaluating LLM Performance

The large language model (LLM) has become a cornerstone of many AI applications.

3 hours ago

Getting started with computer use in Amazon Bedrock Agents

Computer use is a breakthrough capability from Anthropic that allows foundation models (FMs) to visually…

3 hours ago

OpenAI’s strategic gambit: The Agents SDK and why it changes everything for enterprise AI

OpenAI's new API and Agents SDK consolidate a previously fragmented complex ecosystem into a unified,…

4 hours ago

Under Trump, AI Scientists Are Told to Remove ‘Ideological Bias’ From Powerful Models

A directive from the National Institute of Standards and Technology eliminates mention of “AI safety”…

4 hours ago

Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models

Speech foundation models, such as HuBERT and its variants, are pre-trained on large amounts of…

1 day ago

How GoDaddy built a category generation system at scale with batch inference for Amazon Bedrock

This post was co-written with Vishal Singh, Data Engineering Leader at Data & Analytics team…

1 day ago