Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
Recent advances in deep learning and automatic speech recognition have boosted the accuracy of end-to-end speech recognition to a new level. However, recognition of personal content such as contact names remains a challenge. In this work, we present a personalization solution for an end-to-end system based on connectionist temporal classification. Our solution uses a class-based language model, in which a general language model models the context for named-entity classes, and personal named entities are compiled into a separate finite-state transducer. We further introduce a…
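The class-based scheme described in this abstract can be sketched minimally as follows. This is an illustrative toy, not the paper's implementation: the general LM is stood in by a fixed list of hypotheses containing a class token, and the personal finite-state transducer is stood in by a plain list of contact names; all identifiers are hypothetical.

```python
# Toy sketch of class-based LM personalization: the general LM emits a
# class token (here "<contact>") wherever a personal named entity may
# occur, and a per-user entity list (standing in for the personal FST)
# is spliced in at decode time. All names are illustrative.

GENERAL_LM_HYPOTHESES = [
    "call <contact> now",
    "text <contact> about dinner",
]

# Per-user stand-in for the personal FST: just a list of contact names.
PERSONAL_ENTITIES = ["alice", "bob smith"]

def expand_class_tokens(hypothesis, entities, token="<contact>"):
    """Replace the class token with each personal entity, mimicking
    on-the-fly FST replacement during decoding."""
    if token not in hypothesis:
        return [hypothesis]
    return [hypothesis.replace(token, entity, 1) for entity in entities]

expanded = [h for hyp in GENERAL_LM_HYPOTHESES
            for h in expand_class_tokens(hyp, PERSONAL_ENTITIES)]
print(expanded)
```

In a real system this expansion happens by composing the class-based LM with the personal FST (e.g. an OpenFst-style `Replace` operation) rather than by string substitution.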
While federated learning (FL) has recently emerged as a promising approach to training machine learning models, it has seen only preliminary exploration in the domain of automatic speech recognition (ASR). Moreover, FL does not inherently guarantee user privacy and requires the use of differential privacy (DP) for…
Self-supervised features are typically used in place of filter-bank features in speaker verification models. However, these models were originally designed to ingest filter-bank features as inputs, so training them on self-supervised features assumes that both feature types require the same amount of learning for the task. In this work, we…
Posted by Cat Armato, Program Manager, Google This week, the 23rd Annual Conference of the International Speech Communication Association (INTERSPEECH 2022) is being held in Incheon, South Korea, representing one of the world’s most extensive conferences on research and technology of spoken language understanding and processing. Over 2,000 experts in…