Phonology and Morphology Modeling in a Very Large Vocabulary Hungarian Dictation System

Mate SZARVAS, Sadaoki FURUI

Summary :

This article introduces a novel approach to modeling phonology and morphosyntax in morpheme-based speech recognizers. The proposed methods are evaluated on a Hungarian newspaper dictation task that requires modeling over 1 million different word forms. The architecture of the recognition system is based on the weighted finite-state transducer (WFST) paradigm. The vocabulary units used in the system are morpheme-based in order to provide sufficient coverage of the large number of word forms resulting from affixation and compounding. Besides the basic pronunciation model and the morpheme N-gram language model, we evaluate a novel phonology model and a novel stochastic morphosyntactic language model (SMLM). Thanks to the flexible transducer-based architecture of the system, these new components are integrated seamlessly with the basic modules, with no need to modify the decoder itself. We compare the phoneme, morpheme, and word error rates as well as the sizes of the recognition networks in two configurations: one uses only the N-gram model, while the other uses the combined model. The proposed SMLM reduces the morpheme error rate by 1.7% to 7.2% relative to the baseline trigram system. The proposed phonology model reduces the error rate by 8.32%. The morpheme error rate of the best configuration is 18%, and the best word error rate is 22.3%.
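The summary reports error rates at the morpheme and word level and gives improvements as relative reductions. The sketch below shows one common way such figures are computed: an edit-distance error rate over token sequences, plus a relative-reduction helper. It is an illustration only, not the authors' scoring tool. The function names, the example utterance, and the 20.0/18.0 figures are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Error rate in percent: edit distance divided by reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline, improved):
    """Relative error-rate reduction in percent."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical example: "a házakban" (in the houses) misrecognized as
# "a házakból" (from the houses). Only the final suffix morpheme differs.
ref_morphemes = ["a", "ház", "ak", "ban"]
hyp_morphemes = ["a", "ház", "ak", "ból"]
ref_words = ["a", "házakban"]
hyp_words = ["a", "házakból"]

morpheme_er = error_rate(ref_morphemes, hyp_morphemes)  # 1 error in 4 morphemes
word_er = error_rate(ref_words, hyp_words)              # 1 error in 2 words
```

In this toy case a single wrong suffix makes the whole word wrong, so the word error rate exceeds the morpheme error rate, which matches the direction of the gap between the 18% and 22.3% figures reported above.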

Publication
IEICE TRANSACTIONS on Information Vol.E87-D No.12 pp.2791-2801
Publication Date
2004/12/01
Type of Manuscript
PAPER
Category
Speech and Hearing
