We propose a noise suppression method based on multi-model compositions and multi-pass search. In real environments, input speech for speech recognition includes many kinds of noise signals. To obtain good recognized candidates, suppressing many kinds of noise signals at once and finding target speech is important. Before noise suppression, to find speech and noise label sequences, we introduce multi-pass search with acoustic models including many kinds of noise models and their compositions, their n-gram models, and their lexicon. Noise suppression is frame-synchronously performed using the multiple models selected by recognized label sequences with time alignments. We evaluated this method using the E-Nightingale task, which contains voice memoranda spoken by nurses during actual work at hospitals. The proposed method obtained higher performance than the conventional method.
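The frame-synchronous step described in the abstract can be illustrated with a minimal sketch. It assumes the search has already produced a time-aligned label sequence, and it swaps in plain spectral subtraction with per-label noise means in place of the paper's multi-model compositions. The names `Segment`, `labels_per_frame`, `suppress`, `noise_means`, and the label strings are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical aligned segment: (label, start_frame, end_frame_exclusive).
Segment = Tuple[str, int, int]


def labels_per_frame(segments: List[Segment], n_frames: int) -> List[str]:
    """Expand a time-aligned label sequence into one label per frame."""
    # Assumption: frames not covered by any segment are treated as clean speech.
    labels = ["speech"] * n_frames
    for label, start, end in segments:
        for t in range(max(start, 0), min(end, n_frames)):
            labels[t] = label
    return labels


def suppress(
    frames: List[List[float]],
    segments: List[Segment],
    noise_means: Dict[str, List[float]],
    floor: float = 0.01,
) -> List[List[float]]:
    """Subtract the noise mean selected by each frame's label.

    Spectral subtraction with a spectral floor is used here only as a
    stand-in for the suppression step; it is not the paper's method.
    """
    labels = labels_per_frame(segments, len(frames))
    cleaned = []
    for frame, label in zip(frames, labels):
        noise = noise_means.get(label)
        if noise is None:
            # No noise model for this label: pass the frame through unchanged.
            cleaned.append(list(frame))
            continue
        cleaned.append([max(x - n, floor * x) for x, n in zip(frame, noise)])
    return cleaned


if __name__ == "__main__":
    frames = [[4.0, 4.0], [4.0, 4.0], [4.0, 4.0]]      # toy power spectra
    segments = [("speech+buzz", 1, 3)]                  # toy alignment
    noise_means = {"speech+buzz": [1.0, 5.0]}           # toy noise model
    print(suppress(frames, segments, noise_means))
```

The point of the sketch is only the control flow: the label active at each frame decides which noise model is applied to that frame, so several noise types can be handled within one utterance.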
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Takatoshi JITSUHIRO, Tomoji TORIYAMA, Kiyoshi KOGURE, "Noise Suppression Based on Multi-Model Compositions Using Multi-Pass Search with Multi-Label N-gram Models" in IEICE TRANSACTIONS on Information,
vol. E91-D, no. 3, pp. 402-410, March 2008, doi: 10.1093/ietisy/e91-d.3.402.
URL: https://globals.ieice.org/en_transactions/information/10.1093/ietisy/e91-d.3.402/_p
@ARTICLE{e91-d_3_402,
author={Takatoshi JITSUHIRO and Tomoji TORIYAMA and Kiyoshi KOGURE},
journal={IEICE TRANSACTIONS on Information},
title={Noise Suppression Based on Multi-Model Compositions Using Multi-Pass Search with Multi-Label N-gram Models},
year={2008},
volume={E91-D},
number={3},
pages={402-410},
abstract={We propose a noise suppression method based on multi-model compositions and multi-pass search. In real environments, input speech for speech recognition includes many kinds of noise signals. To obtain good recognized candidates, suppressing many kinds of noise signals at once and finding target speech is important. Before noise suppression, to find speech and noise label sequences, we introduce multi-pass search with acoustic models including many kinds of noise models and their compositions, their n-gram models, and their lexicon. Noise suppression is frame-synchronously performed using the multiple models selected by recognized label sequences with time alignments. We evaluated this method using the E-Nightingale task, which contains voice memoranda spoken by nurses during actual work at hospitals. The proposed method obtained higher performance than the conventional method.},
keywords={},
doi={10.1093/ietisy/e91-d.3.402},
ISSN={1745-1361},
month={March},}
TY - JOUR
TI - Noise Suppression Based on Multi-Model Compositions Using Multi-Pass Search with Multi-Label N-gram Models
T2 - IEICE TRANSACTIONS on Information
SP - 402
EP - 410
AU - Takatoshi JITSUHIRO
AU - Tomoji TORIYAMA
AU - Kiyoshi KOGURE
PY - 2008
DO - 10.1093/ietisy/e91-d.3.402
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E91-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2008
AB - We propose a noise suppression method based on multi-model compositions and multi-pass search. In real environments, input speech for speech recognition includes many kinds of noise signals. To obtain good recognized candidates, suppressing many kinds of noise signals at once and finding target speech is important. Before noise suppression, to find speech and noise label sequences, we introduce multi-pass search with acoustic models including many kinds of noise models and their compositions, their n-gram models, and their lexicon. Noise suppression is frame-synchronously performed using the multiple models selected by recognized label sequences with time alignments. We evaluated this method using the E-Nightingale task, which contains voice memoranda spoken by nurses during actual work at hospitals. The proposed method obtained higher performance than the conventional method.
ER -