Carnegie Mellon University
Browse

Analysis of Gender Normalization using MLP and VTLN Features

Download (93.07 kB)
journal contribution
posted on 2010-09-01, 00:00 authored by Thomas Schaaf, Florian MetzeFlorian Metze

This paper analyzes the capability of multilayer perceptron frontends to perform speaker normalization. We find the context decision tree to be a very useful tool to assess the speaker normalization power of different frontends. We introduce a gender question into the training of the phonetic context decision tree. After the context clustering the gender specific models are counted. We compare this for the following frontends: (1) Bottle-Neck (BN) with and without vocal tract length normalization (VTLN), (2) standard MFCC, (3) stacking of multiple MFCC frames with linear discriminant analysis (LDA). We find the BN-frontend to be even more effective in reducing the number of gender questions than VTLN. From this we conclude that a Bottle-Neck frontend is more effective for gender normalization. Combining VTLN and BN-features reduces the number of gender specific models further.

History

Publisher Statement

Copyright 2010 ISCA

Date

2010-09-01

Usage metrics

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC