Active Learning Using Phone-Error Distribution for Speech Modeling

Hiroko MURAKAMI, Koichi SHINODA, Sadaoki FURUI

Summary:

We propose an active learning framework for speech recognition that reduces the amount of data required for acoustic modeling. The framework consists of two steps. First, we obtain a phone-error distribution using an acoustic model estimated from transcribed speech data. Then, from a text corpus, we select a sentence whose phone-occurrence distribution is close to the phone-error distribution and collect its speech data. We repeat this process to increase the amount of transcribed speech data. We applied this framework to speaker adaptation and acoustic model training. Our evaluation results showed that it significantly reduced the amount of transcribed data required while maintaining the same level of accuracy.
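
The selection step described in the summary can be illustrated with a minimal sketch. The summary does not say how closeness between the two distributions is measured, so the Kullback-Leibler divergence used here is an assumption, as are the additive smoothing constant and the names `normalize`, `kl_divergence`, and `select_sentence`. It is not the authors' implementation.

```python
import math
from collections import Counter


def normalize(counts, phones, eps=1e-6):
    """Turn raw phone counts into a smoothed probability distribution.

    The additive smoothing (eps) is an assumption made so that the
    divergence below is always defined; it is not taken from the paper.
    """
    total = sum(counts.get(p, 0) for p in phones) + eps * len(phones)
    return {p: (counts.get(p, 0) + eps) / total for p in phones}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same phone set."""
    return sum(p[k] * math.log(p[k] / q[k]) for k in p)


def select_sentence(error_counts, candidates):
    """Pick the candidate whose phone-occurrence distribution is closest
    to the phone-error distribution.

    error_counts: dict mapping phone -> number of recognition errors.
    candidates:   dict mapping sentence id -> list of phones in it.
    Closeness is measured with KL divergence (an assumed choice).
    """
    phones = sorted(
        set(error_counts) | {ph for seq in candidates.values() for ph in seq}
    )
    target = normalize(error_counts, phones)
    return min(
        candidates,
        key=lambda s: kl_divergence(
            target, normalize(Counter(candidates[s]), phones)
        ),
    )


if __name__ == "__main__":
    # Toy data: phone "a" is misrecognized four times as often as "k".
    errors = {"a": 8, "k": 2}
    corpus = {
        "s1": ["a", "a", "a", "a", "k"],
        "s2": ["k", "k", "k", "a"],
        "s3": ["t", "t"],
    }
    print(select_sentence(errors, corpus))
```

In a full loop, the speech for the selected sentence would be collected and transcribed, the acoustic model re-estimated, and the phone-error counts recomputed before the next selection.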

Publication
IEICE TRANSACTIONS on Information and Systems Vol.E95-D No.10 pp.2486-2494
Publication Date
2012/10/01
Online ISSN
1745-1361
DOI
10.1587/transinf.E95.D.2486
Type of Manuscript
PAPER
Category
Speech and Hearing
