Switchboard-Affect: Emotion Perception Labels from Conversational Speech
Understanding the nuances of speech emotion dataset curation and labeling is essential for assessing the potential of speech emotion recognition (SER) models in real-world applications. Most training and evaluation datasets contain acted or pseudo-acted speech (e.g., podcast speech) in which emotion expressions may be exaggerated or otherwise intentionally modified. Furthermore, datasets labeled based on crowd perception often …