Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation

The neural transducer is an end-to-end model for automatic speech recognition (ASR). While the model is well-suited for streaming ASR, the training process remains challenging. During training, the memory requirements may quickly exceed the capacity of state-of-the-art GPUs, limiting batch size and sequence lengths. In this work, we analyze the time and space complexity of a typical transducer training setup. We propose a memory-efficient training method that computes the transducer loss and gradients sample by sample. We present optimizations to increase the efficiency and parallelism of the…
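
To make the memory argument concrete, here is a rough back-of-the-envelope sketch, not the authors' implementation. It assumes the dominant cost is the joint network output of shape (batch, T, U + 1, V) stored as float32, and it ignores gradients, encoder activations, and any other buffers. The function names, the batch of lengths, and the vocabulary size are all hypothetical values chosen for illustration.

```python
def joint_tensor_bytes(batch, frames, labels, vocab, bytes_per_value=4):
    """Bytes for one joint output of shape (batch, frames, labels + 1, vocab)."""
    return batch * frames * (labels + 1) * vocab * bytes_per_value


def batched_peak_bytes(lengths, vocab, bytes_per_value=4):
    """Whole batch at once: every sample is padded to the longest T and U."""
    max_t = max(t for t, _ in lengths)
    max_u = max(u for _, u in lengths)
    return joint_tensor_bytes(len(lengths), max_t, max_u, vocab, bytes_per_value)


def sample_wise_peak_bytes(lengths, vocab, bytes_per_value=4):
    """One sample at a time: only the largest single sample is held in memory."""
    return max(
        joint_tensor_bytes(1, t, u, vocab, bytes_per_value) for t, u in lengths
    )


# Hypothetical batch of (acoustic frames T, label tokens U) pairs.
lengths = [(500, 50), (300, 20), (200, 30), (100, 10)]
vocab = 1000  # assumed output vocabulary size

batched = batched_peak_bytes(lengths, vocab)
sample_wise = sample_wise_peak_bytes(lengths, vocab)
print(f"batched:     {batched / 2**20:.1f} MiB")
print(f"sample-wise: {sample_wise / 2**20:.1f} MiB")
```

Under these assumptions the batched tensor grows with the batch size times the longest sequence in the batch, while the sample-wise peak depends only on the single largest sample. That is the intuition behind computing the loss and gradients one sample at a time.
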
