Login (DCU Staff Only)
Login (DCU Staff Only)

DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU

Advanced Search

Glottal source parametrisation by multi-estimate fusion

Li, Haoxuan (2013) Glottal source parametrisation by multi-estimate fusion. PhD thesis, Dublin City University.

Abstract
Glottal source information has been proven useful in many applications such as speech synthesis, speaker characterisation, voice transformation and pathological speech diagnosis. However, currently no single algorithm can extract reliable glottal source estimates across a wide range of speech signals. This thesis describes an investigation into glottal source parametrisation, including studies, proposals and evaluations on glottal waveform extraction, glottal source modelling by Liljencrants-Fant (LF) model fitting and a new multi-estimate fusion framework. As one of the critical steps in voice source parametrisation, glottal waveform extraction techniques are reviewed. A performance study is carried out on three existing glottal inverse filtering approaches and results confirm that no single algorithm consistently outperforms others and provide a reliable and accurate estimate for different speech signals. The next step is modelling the extracted glottal flow. To more accurately estimate the glottal source parameters, a new time-domain LF-model fitting algorithm by extended Kalman filter is proposed. The algorithm is evaluated by comparing it with a standard time-domain method and a spectral approach. Results show the proposed fitting method is superior to existing fitting methods. To obtain accurate glottal source estimates for different speech signals, a multi-estimate (ME) fusion framework is proposed. In the framework different algorithms are applied in parallel to extract multiple sets of LF-model estimates which are then combined by quantitative data fusion. The ME fusion approach is implemented and tested in several ways. The novel fusion framework is shown to be able to give more reliable glottal LF-model estimates than any single algorithm.
Metadata
Item Type:Thesis (PhD)
Date of Award:November 2013
Refereed:No
Supervisor(s):Scaife, Ronan and O'Brien, Darragh
Uncontrolled Keywords:speech processing; speech synthesis; speaker transformation
Subjects:Computer Science > Computational linguistics
Engineering > Signal processing
Engineering > Acoustical engineering
DCU Faculties and Centres:DCU Faculties and Schools > Faculty of Engineering and Computing > School of Electronic Engineering
Use License:This item is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 3.0 License. View License
ID Code:17917
Deposited On:02 Dec 2013 14:32 by Ronan Scaife . Last Modified 19 Jul 2018 14:58
Documents

Full text available as:

[thumbnail of Li_Haoxuan_PhD_2013.pdf]
Preview
PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
2MB
Downloads

Downloads

Downloads per month over past year

Archive Staff Only: edit this record