Emotion Recognition using Imperfect Speech Recognition

Metze, Florian; Batliner, Anton; Eyben, Florian; Polzehl, Tim; Schuller, Bjorn; Steidl, Stefan

doi:10.1184/R1/6473348.v1

journal contribution

posted on 2010-09-01, 00:00 authored by Florian Metze, Anton Batliner, Florian Eyben, Tim Polzehl, Bjorn Schuller, Stefan Steidl

This paper investigates the use of speech-to-text methods for assigning an emotion class to a given speech utterance. Previous work shows that an emotion extracted from text can convey evidence complementary to the information extracted by classifiers based on spectral or other non-linguistic features. As speech-to-text usually requires significantly more computational effort, in this study we investigate the degree of speech-to-text accuracy needed for reliable detection of emotions from an automatically generated transcription of an utterance. We evaluate the use of hypotheses in both training and testing, and compare several classification approaches on the same task. Our results show that emotion recognition performance stays roughly constant as long as word accuracy does not fall below a reasonable value, making the use of speech-to-text viable for training of emotion classifiers based on linguistics.

History

Date

2010-09-01

Keywords

speech-to-text; emotion detection; meta-data extraction; rich transcription; children’s speech

Licence

In Copyright
