Towards Integrated Acoustic Models for Speech Synthesis

Muthukumar, Prasanna Kumar

doi:10.1184/R1/21644852.v1

thesis

posted on 2022-12-02, 21:21 authored by Prasanna Kumar Muthukumar

All Statistical Parametric Speech Synthesizers consist of a linear pipeline of components. In this view, the synthesizer has a top-down structure in which data fed into it goes to the front end, then to the prediction algorithm, then to waveform generation, and so on until the speech is finally constructed. Each component in this pipeline naïvely receives a stream of numbers from the preceding component and emits a stream of numbers for the next one in line, with little to no knowledge of what happens elsewhere in the pipeline. In this thesis, I argue against this “Markovian” structure and instead propose an Integrated structure, in which every component in the system influences, and is in turn influenced by, every other component. This thesis describes four sets of experiments that move towards this idea. The first uses lexical information to improve waveform generation algorithms. The second tries to increase the interaction between prediction algorithms and waveform generation. The third attempts to derive phonemes and phonetic information automatically from the speech rather than from the text. The last, and probably the hardest, describes an idea for an evaluation metric that pays attention to multiple components of the synthesizer rather than focusing on just one.

History

Date

2016-05-03

Degree Type

Dissertation

Department

Language Technologies Institute

Degree Name

Doctor of Philosophy (PhD)

Advisor(s)

Alan W Black

Keywords

Statistical Parametric Speech Synthesizer; Integrated structure; waveform generation algorithms; Speech Synthesis; Natural Language Processing

Licence

In Copyright
