This paper describes a novel parameter generation algorithm for an HMM-based speech synthesis technique. The conventional algorithm generates a parameter trajectory of static features that maximizes the likelihood of a given HMM for the parameter sequence consisting of the static and dynamic features under an explicit constraint between those two features. The generated trajectory is often excessively smoothed due to the statistical processing. Using the over-smoothed speech parameters usually causes muffled sounds. In order to alleviate the over-smoothing effect, we propose a generation algorithm considering not only the HMM likelihood maximized in the conventional algorithm but also a likelihood for a global variance (GV) of the generated trajectory. The latter likelihood works as a penalty for the over-smoothing, i.e., a reduction of the GV of the generated trajectory. The result of a perceptual evaluation demonstrates that the proposed algorithm causes considerably large improvements in the naturalness of synthetic speech.
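The idea in the abstract can be illustrated with a small one-dimensional sketch. It is not the authors' implementation: the function names, the clamped delta window, the diagonal variances, the weight `w = 1 / (2 * T)` on the HMM term, and the use of plain gradient ascent with a backtracking line search are all assumptions made for this illustration. The sketch first maximizes the HMM likelihood alone over static and delta features, then adds a Gaussian log-likelihood on the global variance (GV) of the trajectory.

```python
def delta(c):
    """Dynamic feature 0.5 * (c[t+1] - c[t-1]), with the ends clamped (assumed window)."""
    T = len(c)
    return [0.5 * (c[min(t + 1, T - 1)] - c[max(t - 1, 0)]) for t in range(T)]

def global_variance(c):
    """Variance of the trajectory over time (the GV of a 1-D trajectory)."""
    m = sum(c) / len(c)
    return sum((x - m) ** 2 for x in c) / len(c)

def hmm_loglik(c, mu_s, var_s, mu_d, var_d):
    """Gaussian log-likelihood of static and delta features, up to a constant."""
    ls = sum((x - m) ** 2 / v for x, m, v in zip(c, mu_s, var_s))
    ld = sum((x - m) ** 2 / v for x, m, v in zip(delta(c), mu_d, var_d))
    return -0.5 * (ls + ld)

def hmm_grad(c, mu_s, var_s, mu_d, var_d):
    """Gradient of hmm_loglik with respect to the static trajectory c."""
    T, d = len(c), delta(c)
    g = [-(c[t] - mu_s[t]) / var_s[t] for t in range(T)]
    for t in range(T):
        r = (d[t] - mu_d[t]) / var_d[t]
        g[min(t + 1, T - 1)] -= 0.5 * r  # d[t] depends on c[t+1] with +0.5
        g[max(t - 1, 0)] += 0.5 * r      # d[t] depends on c[t-1] with -0.5
    return g

def gv_loglik(c, mu_v, var_v):
    """Gaussian log-likelihood of the GV, up to a constant."""
    return -0.5 * (global_variance(c) - mu_v) ** 2 / var_v

def gv_grad(c, mu_v, var_v):
    """Gradient of gv_loglik with respect to c."""
    T = len(c)
    m = sum(c) / T
    k = -(global_variance(c) - mu_v) / var_v
    return [k * 2.0 * (x - m) / T for x in c]

def ascend(c, f, grad, iters=300):
    """Gradient ascent with a backtracking (Armijo) line search."""
    c, best = list(c), f(c)
    for _ in range(iters):
        g = grad(c)
        gg = sum(x * x for x in g)
        s = 1.0
        while s > 1e-12:
            trial = [ci + s * gi for ci, gi in zip(c, g)]
            val = f(trial)
            if val > best + 1e-4 * s * gg:
                break
            s *= 0.5
        else:
            return c  # no improving step found: treat as converged
        c, best = trial, val
    return c

def generate(mu_s, var_s, mu_d, var_d, gv=None, init=None, iters=300):
    """Generate a trajectory; gv=(mu_v, var_v) enables the GV term."""
    T = len(mu_s)
    w = 1.0 / (2 * T) if gv is not None else 1.0  # assumed balance weight

    def f(c):
        val = w * hmm_loglik(c, mu_s, var_s, mu_d, var_d)
        return val + (gv_loglik(c, *gv) if gv is not None else 0.0)

    def grad(c):
        g = [w * x for x in hmm_grad(c, mu_s, var_s, mu_d, var_d)]
        if gv is not None:
            g = [a + b for a, b in zip(g, gv_grad(c, *gv))]
        return g

    return ascend(init if init is not None else mu_s, f, grad, iters)

if __name__ == "__main__":
    T = 20
    mu_s = [0.0 if (t // 5) % 2 == 0 else 1.0 for t in range(T)]  # toy state means
    args = (mu_s, [1.0] * T, [0.0] * T, [0.1] * T)
    c_ml = generate(*args)  # HMM likelihood only
    c_gv = generate(*args, gv=(global_variance(mu_s), 0.01), init=c_ml)
    print(global_variance(c_ml), global_variance(c_gv))
```

In this toy setup the GV-aware trajectory starts from the likelihood-only one, so the GV term acts purely as a correction that pushes the variance of the trajectory back toward its target, which is the penalty against over-smoothing that the abstract describes.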
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Tomoki TODA, Keiichi TOKUDA, "A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis" in IEICE TRANSACTIONS on Information, vol. E90-D, no. 5, pp. 816-824, May 2007, doi: 10.1093/ietisy/e90-d.5.816.
Abstract: This paper describes a novel parameter generation algorithm for an HMM-based speech synthesis technique. The conventional algorithm generates a parameter trajectory of static features that maximizes the likelihood of a given HMM for the parameter sequence consisting of the static and dynamic features under an explicit constraint between those two features. The generated trajectory is often excessively smoothed due to the statistical processing. Using the over-smoothed speech parameters usually causes muffled sounds. In order to alleviate the over-smoothing effect, we propose a generation algorithm considering not only the HMM likelihood maximized in the conventional algorithm but also a likelihood for a global variance (GV) of the generated trajectory. The latter likelihood works as a penalty for the over-smoothing, i.e., a reduction of the GV of the generated trajectory. The result of a perceptual evaluation demonstrates that the proposed algorithm causes considerably large improvements in the naturalness of synthetic speech.
URL: https://globals.ieice.org/en_transactions/information/10.1093/ietisy/e90-d.5.816/_p
@ARTICLE{e90-d_5_816,
author={Tomoki TODA and Keiichi TOKUDA},
journal={IEICE TRANSACTIONS on Information},
title={A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis},
year={2007},
volume={E90-D},
number={5},
pages={816--824},
abstract={This paper describes a novel parameter generation algorithm for an HMM-based speech synthesis technique. The conventional algorithm generates a parameter trajectory of static features that maximizes the likelihood of a given HMM for the parameter sequence consisting of the static and dynamic features under an explicit constraint between those two features. The generated trajectory is often excessively smoothed due to the statistical processing. Using the over-smoothed speech parameters usually causes muffled sounds. In order to alleviate the over-smoothing effect, we propose a generation algorithm considering not only the HMM likelihood maximized in the conventional algorithm but also a likelihood for a global variance (GV) of the generated trajectory. The latter likelihood works as a penalty for the over-smoothing, i.e., a reduction of the GV of the generated trajectory. The result of a perceptual evaluation demonstrates that the proposed algorithm causes considerably large improvements in the naturalness of synthetic speech.},
keywords={},
doi={10.1093/ietisy/e90-d.5.816},
ISSN={1745-1361},
month={May},
}
TY - JOUR
TI - A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis
T2 - IEICE TRANSACTIONS on Information
SP - 816
EP - 824
AU - Tomoki TODA
AU - Keiichi TOKUDA
PY - 2007
DO - 10.1093/ietisy/e90-d.5.816
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E90-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2007
AB - This paper describes a novel parameter generation algorithm for an HMM-based speech synthesis technique. The conventional algorithm generates a parameter trajectory of static features that maximizes the likelihood of a given HMM for the parameter sequence consisting of the static and dynamic features under an explicit constraint between those two features. The generated trajectory is often excessively smoothed due to the statistical processing. Using the over-smoothed speech parameters usually causes muffled sounds. In order to alleviate the over-smoothing effect, we propose a generation algorithm considering not only the HMM likelihood maximized in the conventional algorithm but also a likelihood for a global variance (GV) of the generated trajectory. The latter likelihood works as a penalty for the over-smoothing, i.e., a reduction of the GV of the generated trajectory. The result of a perceptual evaluation demonstrates that the proposed algorithm causes considerably large improvements in the naturalness of synthetic speech.
ER -