Committee-Based Active Learning for Speech Recognition

Yuzo HAMANAKA; Koichi SHINODA; Takuya TSUTAOKA; Sadaoki FURUI; Tadashi EMORI; Takafumi KOSHINAKA

doi:10.1587/transinf.E94.D.2015

Summary:

We propose a committee-based method of active learning for large vocabulary continuous speech recognition. Multiple recognizers are trained in this approach, and the recognition results obtained from these are used for selecting utterances. Those utterances whose recognition results differ the most among recognizers are selected and transcribed. Progressive alignment and voting entropy are used to measure the degree of disagreement among recognizers on the recognition result. Our method was evaluated by using 191-hour speech data in the Corpus of Spontaneous Japanese. It proved to be significantly better than random selection. It only required 63 h of data to achieve a word accuracy of 74%, while standard training (i.e., random selection) required 103 h of data. It also proved to be significantly better than conventional uncertainty sampling using word posterior probabilities.
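The selection criterion described in the summary can be illustrated with a minimal sketch. It assumes the recognizers' hypotheses have already been aligned to equal length (the progressive alignment step is not implemented here) and that positions where a recognizer emitted no word carry a placeholder token. The names `GAP`, `vote_entropy`, `utterance_disagreement`, and `select_utterances` are hypothetical, and averaging the entropy over positions is an assumption of this sketch, not necessarily the normalization used in the paper.

```python
import math
from collections import Counter

# Hypothetical placeholder for an aligned position where a recognizer emitted no word.
GAP = "<gap>"


def vote_entropy(votes):
    """Entropy of the committee's votes at one aligned word position.

    votes: one word (or GAP) per recognizer.
    Returns 0.0 when all recognizers agree and log(K) when all K disagree.
    """
    k = len(votes)
    counts = Counter(votes)
    return sum(-(c / k) * math.log(c / k) for c in counts.values())


def utterance_disagreement(aligned_hyps):
    """Mean vote entropy over all aligned positions of one utterance.

    aligned_hyps: one equal-length word list per recognizer.
    Averaging over positions is an assumption made for this sketch.
    """
    n_positions = len(aligned_hyps[0])
    if any(len(h) != n_positions for h in aligned_hyps):
        raise ValueError("hypotheses must be aligned to the same length")
    if n_positions == 0:
        return 0.0
    columns = zip(*aligned_hyps)  # one tuple of votes per position
    return sum(vote_entropy(col) for col in columns) / n_positions


def select_utterances(pool, n):
    """Return the ids of the n utterances the committee disagrees on most.

    pool: dict mapping utterance id -> aligned hypotheses.
    """
    ranked = sorted(pool, key=lambda u: utterance_disagreement(pool[u]), reverse=True)
    return ranked[:n]
```

In use, the selected utterances would be sent for manual transcription and added to the training set before the recognizers are retrained; utterances on which every recognizer agrees score 0 and are selected last.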

Publication: IEICE TRANSACTIONS on Information Vol.E94-D No.10 pp.2015-2023

Publication Date: 2011/10/01

Online ISSN: 1745-1361

DOI: 10.1587/transinf.E94.D.2015

Type of Manuscript: PAPER

Category: Speech and Hearing

Cite this

Yuzo HAMANAKA, Koichi SHINODA, Takuya TSUTAOKA, Sadaoki FURUI, Tadashi EMORI, Takafumi KOSHINAKA, "Committee-Based Active Learning for Speech Recognition" in IEICE TRANSACTIONS on Information, vol. E94-D, no. 10, pp. 2015-2023, October 2011, doi: 10.1587/transinf.E94.D.2015.
URL: https://globals.ieice.org/en_transactions/information/10.1587/transinf.E94.D.2015/_p

@ARTICLE{e94-d_10_2015,
author={Yuzo HAMANAKA and Koichi SHINODA and Takuya TSUTAOKA and Sadaoki FURUI and Tadashi EMORI and Takafumi KOSHINAKA},
journal={IEICE TRANSACTIONS on Information},
title={Committee-Based Active Learning for Speech Recognition},
year={2011},
volume={E94-D},
number={10},
pages={2015-2023},
abstract={We propose a committee-based method of active learning for large vocabulary continuous speech recognition. Multiple recognizers are trained in this approach, and the recognition results obtained from these are used for selecting utterances. Those utterances whose recognition results differ the most among recognizers are selected and transcribed. Progressive alignment and voting entropy are used to measure the degree of disagreement among recognizers on the recognition result. Our method was evaluated by using 191-hour speech data in the Corpus of Spontaneous Japanese. It proved to be significantly better than random selection. It only required 63 h of data to achieve a word accuracy of 74%, while standard training (i.e., random selection) required 103 h of data. It also proved to be significantly better than conventional uncertainty sampling using word posterior probabilities.},
keywords={},
doi={10.1587/transinf.E94.D.2015},
ISSN={1745-1361},
month={October}
}

TY  - JOUR
TI  - Committee-Based Active Learning for Speech Recognition
T2  - IEICE TRANSACTIONS on Information
SP  - 2015
EP  - 2023
AU  - Yuzo HAMANAKA
AU  - Koichi SHINODA
AU  - Takuya TSUTAOKA
AU  - Sadaoki FURUI
AU  - Tadashi EMORI
AU  - Takafumi KOSHINAKA
PY  - 2011
DO  - 10.1587/transinf.E94.D.2015
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E94-D
IS  - 10
JA  - IEICE TRANSACTIONS on Information
Y1  - October 2011
AB  - We propose a committee-based method of active learning for large vocabulary continuous speech recognition. Multiple recognizers are trained in this approach, and the recognition results obtained from these are used for selecting utterances. Those utterances whose recognition results differ the most among recognizers are selected and transcribed. Progressive alignment and voting entropy are used to measure the degree of disagreement among recognizers on the recognition result. Our method was evaluated by using 191-hour speech data in the Corpus of Spontaneous Japanese. It proved to be significantly better than random selection. It only required 63 h of data to achieve a word accuracy of 74%, while standard training (i.e., random selection) required 103 h of data. It also proved to be significantly better than conventional uncertainty sampling using word posterior probabilities.
ER  -