A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis

Tomoki TODA; Keiichi TOKUDA

doi:10.1093/ietisy/e90-d.5.816

Summary:

This paper describes a novel parameter generation algorithm for an HMM-based speech synthesis technique. The conventional algorithm generates a parameter trajectory of static features that maximizes the likelihood of a given HMM for the parameter sequence consisting of the static and dynamic features under an explicit constraint between those two features. The generated trajectory is often excessively smoothed due to the statistical processing. Using the over-smoothed speech parameters usually causes muffled sounds. In order to alleviate the over-smoothing effect, we propose a generation algorithm considering not only the HMM likelihood maximized in the conventional algorithm but also a likelihood for a global variance (GV) of the generated trajectory. The latter likelihood works as a penalty for the over-smoothing, i.e., a reduction of the GV of the generated trajectory. The result of a perceptual evaluation demonstrates that the proposed algorithm causes considerably large improvements in the naturalness of synthetic speech.
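The idea in the summary can be illustrated with a small numerical sketch. What follows is not the authors' implementation: it assumes a one-dimensional static feature, a single delta window (a central difference clamped at the edges), diagonal Gaussian statistics, a single Gaussian over the global variance, and a plain gradient ascent with backtracking. The function names (`build_window_matrix`, `mlpg`, `mlpg_gv`), the weight on the HMM term, and all numeric values are hypothetical choices made only for illustration.

```python
import numpy as np

def build_window_matrix(T):
    """Stack static and delta rows so that W @ c = [static; delta]."""
    W_delta = np.zeros((T, T))
    for t in range(T):
        # Assumed delta window: 0.5 * (c[t+1] - c[t-1]), clamped at the edges.
        W_delta[t, max(t - 1, 0)] -= 0.5
        W_delta[t, min(t + 1, T - 1)] += 0.5
    return np.vstack([np.eye(T), W_delta])

def mlpg(W, mu, prec):
    """Conventional generation: maximise the Gaussian likelihood of W @ c."""
    A = W.T @ (prec[:, None] * W)
    b = W.T @ (prec * mu)
    return np.linalg.solve(A, b)

def global_variance(c):
    """Variance of the static trajectory over the whole utterance."""
    return np.mean((c - c.mean()) ** 2)

def objective(c, W, mu, prec, gv_mean, gv_var, weight):
    """Weighted HMM log-likelihood plus GV log-likelihood (constants dropped)."""
    r = mu - W @ c
    hmm_ll = -0.5 * np.sum(prec * r * r)
    gv_ll = -0.5 * (global_variance(c) - gv_mean) ** 2 / gv_var
    return weight * hmm_ll + gv_ll

def mlpg_gv(W, mu, prec, gv_mean, gv_var, weight, steps=200, lr=0.1):
    """Gradient ascent on the combined objective, started from the MLPG result."""
    c = mlpg(W, mu, prec)
    T = len(c)
    best = objective(c, W, mu, prec, gv_mean, gv_var, weight)
    for _ in range(steps):
        grad_hmm = W.T @ (prec * (mu - W @ c))
        grad_gv = (-(global_variance(c) - gv_mean) / gv_var
                   * (2.0 / T) * (c - c.mean()))
        grad = weight * grad_hmm + grad_gv
        step = lr
        while step > 1e-12:
            # Backtracking: accept only steps that increase the objective.
            cand = c + step * grad
            val = objective(cand, W, mu, prec, gv_mean, gv_var, weight)
            if val > best:
                c, best = cand, val
                break
            step *= 0.5
        else:
            break  # no improving step found
    return c

# Toy data: the model means describe a half-amplitude (over-smoothed) sine,
# while the GV statistics describe a trajectory with larger variance.
T = 50
W = build_window_matrix(T)
smoothed = 0.5 * np.sin(np.linspace(0.0, 2.0 * np.pi, T))
mu = W @ smoothed            # static and delta means
prec = np.ones(2 * T)        # unit precisions (assumed)
gv_mean, gv_var = 0.5, 0.01  # assumed GV Gaussian
weight = 1.0 / (2 * T)       # assumed balance between the two terms

c_ml = mlpg(W, mu, prec)
c_gv = mlpg_gv(W, mu, prec, gv_mean, gv_var, weight)
print("GV conventional:", global_variance(c_ml))
print("GV with GV term:", global_variance(c_gv))
```

In this toy setting the conventional solution reproduces the low-variance mean trajectory, while the GV term pulls the variance of the generated trajectory back toward `gv_mean` at a small cost in HMM likelihood, which is the trade-off the summary describes.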

Publication: IEICE TRANSACTIONS on Information Vol.E90-D No.5 pp.816-824

Publication Date: 2007/05/01

Online ISSN: 1745-1361

DOI: 10.1093/ietisy/e90-d.5.816

Type of Manuscript: PAPER

Category: Speech and Hearing

Tomoki TODA, Keiichi TOKUDA, "A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis" in IEICE TRANSACTIONS on Information, vol. E90-D, no. 5, pp. 816-824, May 2007, doi: 10.1093/ietisy/e90-d.5.816.
URL: https://globals.ieice.org/en_transactions/information/10.1093/ietisy/e90-d.5.816/_p

@ARTICLE{e90-d_5_816,
author={Tomoki TODA and Keiichi TOKUDA},
journal={IEICE TRANSACTIONS on Information},
title={A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis},
year={2007},
volume={E90-D},
number={5},
pages={816--824},
abstract={This paper describes a novel parameter generation algorithm for an HMM-based speech synthesis technique. The conventional algorithm generates a parameter trajectory of static features that maximizes the likelihood of a given HMM for the parameter sequence consisting of the static and dynamic features under an explicit constraint between those two features. The generated trajectory is often excessively smoothed due to the statistical processing. Using the over-smoothed speech parameters usually causes muffled sounds. In order to alleviate the over-smoothing effect, we propose a generation algorithm considering not only the HMM likelihood maximized in the conventional algorithm but also a likelihood for a global variance (GV) of the generated trajectory. The latter likelihood works as a penalty for the over-smoothing, i.e., a reduction of the GV of the generated trajectory. The result of a perceptual evaluation demonstrates that the proposed algorithm causes considerably large improvements in the naturalness of synthetic speech.},
keywords={},
doi={10.1093/ietisy/e90-d.5.816},
ISSN={1745-1361},
month={May},
}

TY - JOUR
TI - A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis
T2 - IEICE TRANSACTIONS on Information
SP - 816
EP - 824
AU - Tomoki TODA
AU - Keiichi TOKUDA
PY - 2007
DO - 10.1093/ietisy/e90-d.5.816
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E90-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2007
AB - This paper describes a novel parameter generation algorithm for an HMM-based speech synthesis technique. The conventional algorithm generates a parameter trajectory of static features that maximizes the likelihood of a given HMM for the parameter sequence consisting of the static and dynamic features under an explicit constraint between those two features. The generated trajectory is often excessively smoothed due to the statistical processing. Using the over-smoothed speech parameters usually causes muffled sounds. In order to alleviate the over-smoothing effect, we propose a generation algorithm considering not only the HMM likelihood maximized in the conventional algorithm but also a likelihood for a global variance (GV) of the generated trajectory. The latter likelihood works as a penalty for the over-smoothing, i.e., a reduction of the GV of the generated trajectory. The result of a perceptual evaluation demonstrates that the proposed algorithm causes considerably large improvements in the naturalness of synthetic speech.
ER -