
Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization

Recent advances in deep learning and automatic speech recognition have boosted the accuracy of end-to-end speech recognition to a new level. However, recognition of personal content such as contact names remains a challenge. In this work, we present a personalization solution for an end-to-end system based on connectionist temporal classification (CTC). Our solution uses a class-based language model, in which a general language model models the context around named-entity classes, and personal named entities are compiled into a separate finite-state transducer. We further introduce a…
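To make the class-based language model idea concrete, here is a minimal toy sketch. It is not the authors' implementation: the paper compiles personal entities into a finite-state transducer and decodes CTC output, whereas this sketch uses a plain Python set and a hand-written bigram table. The names `GENERAL_BIGRAMS`, `PERSONAL_CONTACTS`, `CLASS_TOKEN`, `tag_contacts`, and `score` are all hypothetical, and the probabilities are made up for illustration. It shows only the factorization: the general model scores the context around a class token, and a per-user entity list scores the entity itself.

```python
import math

CLASS_TOKEN = "$CONTACT"      # hypothetical class symbol for contact names
FLOOR = math.log(1e-6)        # back-off log-probability for unseen bigrams

# Hypothetical general LM: bigram log-probabilities over words and the class token.
GENERAL_BIGRAMS = {
    ("<s>", "call"): math.log(0.5),
    ("call", CLASS_TOKEN): math.log(0.4),
    (CLASS_TOKEN, "</s>"): math.log(0.6),
}

# Hypothetical per-user entity list, standing in for the personal FST.
PERSONAL_CONTACTS = {("joan", "smith"), ("ravi",)}


def tag_contacts(words, contacts):
    """Replace the longest matching contact span at each position with the class token."""
    max_len = max((len(c) for c in contacts), default=0)
    tagged, matches, i = [], 0, 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            if tuple(words[i:i + n]) in contacts:
                tagged.append(CLASS_TOKEN)
                matches += 1
                i += n
                break
        else:
            # No contact starts here; keep the word as-is.
            tagged.append(words[i])
            i += 1
    return tagged, matches


def score(words, contacts, bigrams):
    """Log-probability = general-LM score of the tagged sequence
    plus a uniform in-class score for each matched entity."""
    tagged, matches = tag_contacts(words, contacts)
    seq = ["<s>"] + tagged + ["</s>"]
    logprob = sum(bigrams.get(pair, FLOOR) for pair in zip(seq, seq[1:]))
    if matches:
        logprob += matches * math.log(1.0 / len(contacts))
    return tagged, logprob


tagged, logprob = score(["call", "joan", "smith"], PERSONAL_CONTACTS, GENERAL_BIGRAMS)
print(tagged, logprob)
```

Because the entity list is separate from the general model, swapping in another user's contacts changes only `PERSONAL_CONTACTS`. The general bigram table stays untouched, which is the property the class-based design is meant to provide.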
