Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization

Recent advances in deep learning and automatic speech recognition have boosted the accuracy of end-to-end speech recognition to a new level. However, recognition of personal content such as contact names remains a challenge. In this work, we present a personalization solution for an end-to-end system based on connectionist temporal classification. Our solution uses a class-based language model, in which a general language model provides modeling of the context for named entity classes, and personal named entities are compiled in a separate finite-state transducer. We further introduce a…