
Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation

The neural transducer is an end-to-end model for automatic speech recognition (ASR). While the model is well-suited for streaming ASR, the training process remains challenging. During training, the memory requirements may quickly exceed the capacity of state-of-the-art GPUs, limiting batch size and sequence lengths. In this work, we analyze the time and space complexity of a typical transducer training setup. We propose a memory-efficient training method that computes the transducer loss and gradients sample by sample. We present optimizations to increase the efficiency and parallelism of the…
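To make the memory argument concrete, here is a rough back-of-the-envelope sketch, not the authors' implementation. It assumes the dominant tensor is the joint-network output, which for a transducer has one vocabulary-sized vector per (acoustic frame, label position) pair, and it assumes 32-bit floats. The function names, the example lengths, and the vocabulary size are all hypothetical, and the estimate ignores encoder and prediction-network activations as well as gradient buffers.

```python
def joint_logits_bytes(frames, labels, vocab, bytes_per_value=4):
    """Bytes for one joint-network output of shape (frames, labels + 1, vocab)."""
    return frames * (labels + 1) * vocab * bytes_per_value


def batched_peak_bytes(lengths, vocab, bytes_per_value=4):
    """Whole batch at once: every sample is padded to the longest T and U."""
    max_frames = max(t for t, _ in lengths)
    max_labels = max(u for _, u in lengths)
    per_sample = joint_logits_bytes(max_frames, max_labels, vocab, bytes_per_value)
    return len(lengths) * per_sample


def sample_wise_peak_bytes(lengths, vocab, bytes_per_value=4):
    """One sample at a time: only the largest single sample is held at once.

    Assumption: the logits of one sample are freed before the next is processed.
    """
    return max(joint_logits_bytes(t, u, vocab, bytes_per_value) for t, u in lengths)


# Hypothetical batch: (acoustic frames, label tokens) per utterance.
example_lengths = [(500, 60), (300, 40), (120, 15), (80, 10)]
example_vocab = 1024  # hypothetical output vocabulary size

batched = batched_peak_bytes(example_lengths, example_vocab)
sample_wise = sample_wise_peak_bytes(example_lengths, example_vocab)
print(f"batched:     {batched / 2**20:.1f} MiB")
print(f"sample-wise: {sample_wise / 2**20:.1f} MiB")
```

Under these assumptions the batched estimate grows with the batch size and with the padding to the longest utterance, while the sample-wise estimate depends only on the single largest sample, which is the intuition behind computing the loss and gradients sample by sample.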