This paper presents an inversion algorithm for dynamic Bayesian networks (DBNs) for robust speech recognition, termed DBNI, which generalizes hidden Markov model inversion (HMMI). As the dual of expectation-maximization (EM)-based model re-estimation, DBNI recovers the 'uncontaminated' speech by moving the input noisy speech toward the Gaussian means in the maximum likelihood (ML) sense, given DBN models trained on clean speech. The algorithm thus combines the expressive power of DBNs with the noise-removal capability of model inversion. Experiments on the Aurora 2.0 database show that the hidden feature model (a typical DBN for speech recognition) combined with the DBNI algorithm achieves superior performance in terms of word error rate reduction.
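The inversion step the abstract describes, iteratively moving a noisy feature vector toward the clean-model Gaussian means in the ML sense with the model parameters held fixed, can be sketched on a simplified stand-in model. This is a hedged illustration, not the paper's method: a diagonal-covariance Gaussian mixture replaces the full DBN emission structure, and the function name `invert_feature` is hypothetical.

```python
import numpy as np

def invert_feature(x, weights, means, variances, n_iter=10):
    """EM on the feature vector x with the clean-trained model fixed:
    each iteration pulls x toward a precision-weighted average of the
    Gaussian means, the ML point estimate given the posteriors.

    weights: (K,) mixture weights; means, variances: (K, D) diagonal Gaussians.
    """
    x = np.array(x, dtype=float)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each mixture component for x
        log_p = (np.log(weights)
                 - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                 - 0.5 * np.sum((x - means) ** 2 / variances, axis=1))
        log_p -= log_p.max()          # stabilize before exponentiating
        gamma = np.exp(log_p)
        gamma /= gamma.sum()
        # M-step for x: precision-weighted average of the component means
        prec = gamma[:, None] / variances          # (K, D) per-dim precisions
        x = (prec * means).sum(axis=0) / prec.sum(axis=0)
    return x
```

With a two-component mixture, a noisy vector near one component converges to that component's mean, which is the "moving the input toward the Gaussian means" behavior the abstract refers to; the paper's DBNI performs the analogous update using DBN state posteriors over whole utterances.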
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Lei XIE, Hongwu YANG, "Dynamic Bayesian Network Inversion for Robust Speech Recognition" in IEICE TRANSACTIONS on Information and Systems,
vol. E90-D, no. 7, pp. 1117-1120, July 2007, doi: 10.1093/ietisy/e90-d.7.1117.
URL: https://globals.ieice.org/en_transactions/information/10.1093/ietisy/e90-d.7.1117/_p
@ARTICLE{e90-d_7_1117,
  author={Lei XIE and Hongwu YANG},
  journal={IEICE TRANSACTIONS on Information and Systems},
  title={Dynamic Bayesian Network Inversion for Robust Speech Recognition},
  year={2007},
  volume={E90-D},
  number={7},
  pages={1117--1120},
  doi={10.1093/ietisy/e90-d.7.1117},
  ISSN={1745-1361},
  month=jul,
}
TY  - JOUR
TI  - Dynamic Bayesian Network Inversion for Robust Speech Recognition
T2  - IEICE TRANSACTIONS on Information and Systems
SP  - 1117
EP  - 1120
AU  - Lei XIE
AU  - Hongwu YANG
PY  - 2007
DO  - 10.1093/ietisy/e90-d.7.1117
JO  - IEICE TRANSACTIONS on Information and Systems
SN  - 1745-1361
VL  - E90-D
IS  - 7
Y1  - 2007/07
ER  -