Approaching Multi-Lingual Emotion Recognition from Speech - On Language Dependency of Acoustic/Prosodic Features for Anger Detection

Polzehl, Tim; Schmitt, Alexander; Metze, Florian

doi:10.1184/R1/6473054.v1

journal contribution

posted on 2010-05-01, 00:00 authored by Tim Polzehl, Alexander Schmitt, Florian Metze

This paper reports on the mono- and cross-lingual performance of different acoustic and/or prosodic features. We analyze how to define an optimal set of features when building a multilingual emotion classification system, i.e. a system that can handle more than a single input language. Because we find that cross-lingual emotion recognition suffers from low recognition rates, we analyze our features on both an American English and a German database. Both databases contain speech of real-life users calling into interactive voice response (IVR) platforms. After calculating performance scores for cross-lingual decoding, i.e. when an emotion classification system is confronted with a language it has not been trained on, we report on different strategies for building a single feature space that is capable of dealing with both languages. We estimate the relative importance of different features for different languages by looking at their distributions, their classification scores and their rank in terms of information gain ratio. Finally, we construct a feature space on the joint data, replacing two formerly separate systems with a single one. We obtain a bilingual emotion recognition system which performs as well as the monolingual systems on the test data.
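The abstract ranks features by information gain ratio but does not say how the features were discretized or which tooling was used. As a minimal sketch of the general technique only, not the authors' implementation, the following Python computes the information gain ratio of already-discretized features and ranks them. The function names, the feature names and the four-utterance toy data are all hypothetical and made up for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain_ratio(feature_bins, labels):
    """Information gain of a discretized feature divided by its split information."""
    total = len(labels)
    # Group the class labels by the bin the feature value falls into.
    groups = {}
    for bin_value, label in zip(feature_bins, labels):
        groups.setdefault(bin_value, []).append(label)
    # Expected class entropy that remains once the feature bin is known.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    # Split information penalizes features that fragment the data into many bins.
    split_info = entropy(feature_bins)
    return gain / split_info if split_info > 0 else 0.0


# Toy data (made up): four utterances with two binned features each.
labels = ["anger", "anger", "neutral", "neutral"]
features = {
    "pitch_bin": ["high", "high", "low", "low"],    # separates the classes perfectly
    "energy_bin": ["high", "low", "high", "low"],   # carries no class information
}

# Rank features from most to least informative.
ranking = sorted(
    features,
    key=lambda name: information_gain_ratio(features[name], labels),
    reverse=True,
)
print(ranking)
```

Computing such a ranking separately per language and then comparing the ranks is one way to see which features are language dependent. Continuous acoustic measurements would first need a binning step, which is an assumption outside this sketch.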

History

Date

2010-05-01

Keywords

emotion recognition anger detection IVR speech IGR acoustic prosodic features speech processing

Licence

In Copyright
