
Joint Speech Transcription and Translation: Pseudo-Labeling with Out-of-Distribution Data

Self-training has been shown to help address data scarcity in many domains, including vision, speech, and language. Specifically, self-training, or pseudo-labeling, assigns labels to unlabeled data and adds those examples to the training pool. In this work, we investigate and use pseudo-labeling for a recently proposed novel setup: joint transcription and translation of speech, which suffers from an absence of sufficient parallel data resources. We show that under such data-deficient circumstances, the unlabeled data can significantly vary in domain from the supervised data, which results in…
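The generic self-training loop the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's actual method: the `model` interface, the confidence threshold, and the toy data shapes are all assumptions made for the example.

```python
# Minimal sketch of self-training (pseudo-labeling): train on labeled data,
# label the unlabeled pool, and fold confident predictions back into training.
# The model interface and threshold below are illustrative assumptions.

def self_train(model, labeled, unlabeled, rounds=3, threshold=0.9):
    """Iteratively pseudo-label unlabeled examples and retrain.

    `model` is assumed to provide fit(pairs) and
    predict(x) -> (label, confidence).
    """
    train_pool = list(labeled)
    for _ in range(rounds):
        model.fit(train_pool)
        still_unlabeled = []
        for x in unlabeled:
            label, conf = model.predict(x)
            if conf >= threshold:          # keep only confident pseudo-labels
                train_pool.append((x, label))
            else:
                still_unlabeled.append(x)  # retry in a later round
        unlabeled = still_unlabeled
    model.fit(train_pool)                  # final pass on the enlarged pool
    return model, train_pool
```

In practice a confidence threshold (or an equivalent filtering heuristic) matters: the abstract's point is that when the unlabeled pool is out-of-distribution relative to the supervised data, naively adding pseudo-labels can hurt rather than help.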