Categories: FAANG

Integrating Categorical Features in End-To-End ASR

All-neural, end-to-end ASR systems gained rapid interest from the speech recognition community. Such systems convert speech input to text units using a single trainable neural network model. E2E models require large amounts of paired speech text data that is expensive to obtain. The amount of data available varies across different languages and dialects. It is critical to make use of all these data so that both low resource languages and high resource languages can be improved. When we want to deploy an ASR system for a new application domain, the amount of domain specific training data is…
AI Generated Robotic Content

Recent Posts

Tried longer videos with WAN 2.2 Animate

I altered the workflow a little bit from my previous post (using Hearmeman's Animate v2…

18 hours ago

10 Python One-Liners for Generating Time Series Features

Time series data normally requires an in-depth understanding in order to build effective and insightful…

18 hours ago

Evaluating Evaluation Metrics — The Mirage of Hallucination Detection

Hallucinations pose a significant obstacle to the reliability and widespread adoption of language models, yet…

18 hours ago

Announcing new capabilities in Vertex AI Training for large-scale training

Building and scaling generative AI models demands enormous resources, but this process can get tedious.…

18 hours ago

MiniMax-M2 is the new king of open source LLMs (especially for agentic tool calling)

Watch out, DeepSeek and Qwen! There's a new king of open source large language models…

19 hours ago

Elon Musk’s Grokipedia Pushes Far-Right Talking Points

The new AI-powered Wikipedia competitor falsely claims that pornography worsened the AIDS epidemic and that…

19 hours ago