Introducing pronunciation models into decoding has been proven to be benefit to LVCSR. In this paper, a discriminative pronunciation modeling method is presented, within the framework of the Minimum Phone Error (MPE) training for HMM/GMM. In order to bring the pronunciation models into the MPE training, the auxiliary function is rewritten at word level and decomposes into two parts. One is for co-training the acoustic models, and the other is for discriminatively training the pronunciation models. On Mandarin conversational telephone speech recognition task, compared to the baseline using a canonical lexicon, the discriminative pronunciation models reduced the absolute Character Error Rate (CER) by 0.7% on LDC test set, and with the acoustic model co-training, 0.8% additional CER decrease had been achieved.
Meixu SONG
Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences
Jielin PAN
Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences
Qingwei ZHAO
Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences
Yonghong YAN
Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Meixu SONG, Jielin PAN, Qingwei ZHAO, Yonghong YAN, "Discriminative Pronunciation Modeling Using the MPE Criterion" in IEICE TRANSACTIONS on Information,
vol. E98-D, no. 3, pp. 717-720, March 2015, doi: 10.1587/transinf.2014EDL8212.
Abstract: Introducing pronunciation models into decoding has been proven to be benefit to LVCSR. In this paper, a discriminative pronunciation modeling method is presented, within the framework of the Minimum Phone Error (MPE) training for HMM/GMM. In order to bring the pronunciation models into the MPE training, the auxiliary function is rewritten at word level and decomposes into two parts. One is for co-training the acoustic models, and the other is for discriminatively training the pronunciation models. On Mandarin conversational telephone speech recognition task, compared to the baseline using a canonical lexicon, the discriminative pronunciation models reduced the absolute Character Error Rate (CER) by 0.7% on LDC test set, and with the acoustic model co-training, 0.8% additional CER decrease had been achieved.
URL: https://globals.ieice.org/en_transactions/information/10.1587/transinf.2014EDL8212/_p
Copy
@ARTICLE{e98-d_3_717,
author={Meixu SONG, Jielin PAN, Qingwei ZHAO, Yonghong YAN, },
journal={IEICE TRANSACTIONS on Information},
title={Discriminative Pronunciation Modeling Using the MPE Criterion},
year={2015},
volume={E98-D},
number={3},
pages={717-720},
abstract={Introducing pronunciation models into decoding has been proven to be benefit to LVCSR. In this paper, a discriminative pronunciation modeling method is presented, within the framework of the Minimum Phone Error (MPE) training for HMM/GMM. In order to bring the pronunciation models into the MPE training, the auxiliary function is rewritten at word level and decomposes into two parts. One is for co-training the acoustic models, and the other is for discriminatively training the pronunciation models. On Mandarin conversational telephone speech recognition task, compared to the baseline using a canonical lexicon, the discriminative pronunciation models reduced the absolute Character Error Rate (CER) by 0.7% on LDC test set, and with the acoustic model co-training, 0.8% additional CER decrease had been achieved.},
keywords={},
doi={10.1587/transinf.2014EDL8212},
ISSN={1745-1361},
month={March},}
Copy
TY - JOUR
TI - Discriminative Pronunciation Modeling Using the MPE Criterion
T2 - IEICE TRANSACTIONS on Information
SP - 717
EP - 720
AU - Meixu SONG
AU - Jielin PAN
AU - Qingwei ZHAO
AU - Yonghong YAN
PY - 2015
DO - 10.1587/transinf.2014EDL8212
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E98-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2015
AB - Introducing pronunciation models into decoding has been proven to be benefit to LVCSR. In this paper, a discriminative pronunciation modeling method is presented, within the framework of the Minimum Phone Error (MPE) training for HMM/GMM. In order to bring the pronunciation models into the MPE training, the auxiliary function is rewritten at word level and decomposes into two parts. One is for co-training the acoustic models, and the other is for discriminatively training the pronunciation models. On Mandarin conversational telephone speech recognition task, compared to the baseline using a canonical lexicon, the discriminative pronunciation models reduced the absolute Character Error Rate (CER) by 0.7% on LDC test set, and with the acoustic model co-training, 0.8% additional CER decrease had been achieved.
ER -