
Adaptive Knowledge Distillation for Device-Directed Speech Detection

Device-directed speech detection (DDSD) is a binary classification task that separates the user's queries to a voice assistant (VA) from background speech or side conversations. This is important for achieving a naturalistic user experience. To this end, we propose knowledge distillation (KD) to enhance DDSD accuracy while ensuring efficient deployment. Specifically, we introduce a novel adaptive KD method that transfers knowledge from the general representations of a large, pre-trained ASR acoustic encoder (teacher). We apply task-specific adapters, on top of the (frozen) teacher encoder, trained…
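To make the setup concrete, below is a minimal PyTorch sketch of the general pattern the abstract describes: a frozen teacher encoder with a small task-specific adapter providing soft targets for a compact student that performs binary DDSD classification. The module sizes, adapter design, pooling, and loss weighting are illustrative assumptions, not the configuration from the paper.

```python
# Sketch: distill a frozen teacher (plus a small task adapter) into a compact
# student for binary device-directed speech detection (DDSD).
# All dimensions, the adapter design, and the loss weighting are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Adapter(nn.Module):
    """Residual bottleneck adapter applied on top of frozen teacher features."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(F.relu(self.down(x)))


class DDSDHead(nn.Module):
    """Pools frame-level features and emits a single directedness logit."""
    def __init__(self, dim: int):
        super().__init__()
        self.classifier = nn.Linear(dim, 1)

    def forward(self, feats):            # feats: (batch, time, dim)
        pooled = feats.mean(dim=1)        # simple mean pooling over time
        return self.classifier(pooled).squeeze(-1)


def kd_step(teacher_encoder, adapter, teacher_head, student,
            batch, labels, alpha=0.5, temperature=2.0):
    """One training step: hard-label BCE plus soft-target KD from the teacher."""
    with torch.no_grad():                 # the teacher encoder stays frozen
        teacher_feats = teacher_encoder(batch)
    teacher_logit = teacher_head(adapter(teacher_feats))

    student_logit = student(batch)

    hard_loss = F.binary_cross_entropy_with_logits(student_logit, labels.float())
    # Soft-target loss on temperature-scaled teacher probabilities (binary case).
    soft_loss = F.binary_cross_entropy_with_logits(
        student_logit / temperature,
        torch.sigmoid(teacher_logit.detach() / temperature),
    )
    return alpha * hard_loss + (1.0 - alpha) * soft_loss
```

In this sketch only the adapter, the teacher's classification head, and the student receive gradients; the pre-trained acoustic encoder is kept frozen, matching the abstract's description of task-specific adapters trained on top of a frozen teacher.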