Categories: FAANG

Corpus Synthesis for Zero-shot ASR Domain Adaptation using Large Language Models

While Automatic Speech Recognition (ASR) systems are widely used in many real-world applications, they often do not generalize well to new domains and need to be finetuned on data from these domains. However, target-domain data is usually not readily available in many scenarios. In this paper, we propose a new strategy for adapting ASR models to new target domains without any text or speech from those domains. To accomplish this, we propose a novel data synthesis pipeline that uses a Large Language Model (LLM) to generate a target domain text corpus, and a state-of-the-art controllable speech…
AI Generated Robotic Content

Recent Posts

Fine-tuning SDXL with childhood pictures → audio-reactive geometries – [Experiment]

After a deeply introspective and emotional journey, I fine-tuned SDXL using old family album pictures…

7 hours ago

Beyond Accuracy: 5 Metrics That Actually Matter for AI Agents

AI agents , or autonomous systems powered by agentic AI, have reshaped the current landscape…

7 hours ago

Apple Workshop on Reasoning and Planning 2025

Reasoning and planning are the bedrock of intelligent AI systems, enabling them to plan, interact,…

7 hours ago

MediaFM: The Multimodal AI Foundation for Media Understanding at Netflix

Avneesh Saluja, Santiago Castro, Bowei Yan, Ashish RastogiIntroductionNetflix’s core mission is to connect millions of members…

7 hours ago

Scaling data annotation using vision-language models to power physical AI systems

Critical labor shortages are constraining growth across manufacturing, logistics, construction, and agriculture. The problem is…

7 hours ago

Start Your Surround Sound Journey With $50 off This Klipsch Soundbar

This soundbar is just the beginning, with the option to add wireless bookshelf speakers or…

8 hours ago