PolyNorm: Few-Shot LLM-Based Text Normalization for Text-to-Speech
Text Normalization (TN) is a key preprocessing step in Text-to-Speech (TTS) systems, converting written forms into their canonical spoken equivalents. Traditional TN systems can achieve high accuracy, but they require substantial engineering effort, are difficult to scale, and make it hard to extend language coverage, particularly in low-resource settings. We propose PolyNorm, a prompt-based approach to TN using …