This paper proposes a low power tone recognition suitable for automatic tonal speech recognizer (ATSR). The tone recognition estimates fundamental frequency (F0) only from vowels by using a new magnitude difference function (MDF), called vowel-MDF. Accordingly, the number of operations is considerably reduced. In order to apply the tone recognition in portable electronic equipment, the tone recognition is designed using parallel and pipeline architecture. Due to the pipeline and parallel computations, the architecture achieves high throughput and consumes low power. In addition, the architecture is able to reduce the number of input frames depending on vowels, making it more adaptable depending on the maximum number of frames. The proposed architecture is evaluated with words selected from voice activation for GPS systems, phone dialing options, and words having the same phoneme but different tones. In comparison with the autocorrelation method, the experimental results show 35.7% reduction in power consumption and 27.1% improvement of tone recognition accuracy (110 words comprising 187 syllables). In comparison with ATSR without the tone recognition, the speech recognition accuracy indicates 25.0% improvement of ATSR with tone recognition (2,250 training data and 45 testing words).
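For readers unfamiliar with magnitude difference functions, the following is a minimal sketch of F0 estimation with the classic average magnitude difference function (AMDF). It is not the authors' vowel-MDF, whose definition is not given in the abstract, and it does not model the parallel or pipelined hardware. The function names (`amdf`, `estimate_f0`), the 80-400 Hz search range, and the 5% tolerance used to prefer the shortest matching lag are all assumptions made for illustration.

```python
import math


def amdf(frame, lag):
    """Average magnitude difference of a frame against itself shifted by `lag` samples."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def estimate_f0(frame, sample_rate, f0_min=80.0, f0_max=400.0, rel_tol=0.05):
    """Estimate F0 in Hz as sample_rate / lag, where lag is the shortest lag
    whose AMDF lies within rel_tol * (largest AMDF) of the smallest AMDF.

    Preferring the shortest such lag avoids picking a multiple of the true
    period (an octave error), since those lags also give near-zero AMDF.
    """
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    scores = {lag: amdf(frame, lag) for lag in range(lag_min, lag_max + 1)}
    floor = min(scores.values())
    tol = rel_tol * max(scores.values())
    best_lag = min(lag for lag, s in scores.items() if s <= floor + tol)
    return sample_rate / best_lag


# Synthetic stand-in for a voiced (vowel) frame: a 200 Hz sine, 40 ms at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(320)]
f0 = estimate_f0(frame, sample_rate)
print(f0)
```

The AMDF needs only subtractions and absolute values, whereas autocorrelation needs a multiplication per sample pair, which is one general reason difference-based estimators are attractive for low-power designs.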
Jirabhorn CHAIWONGSAI
King Mongkut's University of Technology Thonburi
Werapon CHIRACHARIT
King Mongkut's University of Technology Thonburi
Kosin CHAMNONGTHAI
King Mongkut's University of Technology Thonburi
Yoshikazu MIYANAGA
Hokkaido University
Kohji HIGUCHI
The University of Electro-Communications
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Jirabhorn CHAIWONGSAI, Werapon CHIRACHARIT, Kosin CHAMNONGTHAI, Yoshikazu MIYANAGA, Kohji HIGUCHI, "A Low Power Tone Recognition for Automatic Tonal Speech Recognizer," in IEICE TRANSACTIONS on Fundamentals, vol. E96-A, no. 6, pp. 1403-1411, June 2013, doi: 10.1587/transfun.E96.A.1403.
Abstract: This paper proposes a low power tone recognition suitable for automatic tonal speech recognizer (ATSR). The tone recognition estimates fundamental frequency (F0) only from vowels by using a new magnitude difference function (MDF), called vowel-MDF. Accordingly, the number of operations is considerably reduced. In order to apply the tone recognition in portable electronic equipment, the tone recognition is designed using parallel and pipeline architecture. Due to the pipeline and parallel computations, the architecture achieves high throughput and consumes low power. In addition, the architecture is able to reduce the number of input frames depending on vowels, making it more adaptable depending on the maximum number of frames. The proposed architecture is evaluated with words selected from voice activation for GPS systems, phone dialing options, and words having the same phoneme but different tones. In comparison with the autocorrelation method, the experimental results show 35.7% reduction in power consumption and 27.1% improvement of tone recognition accuracy (110 words comprising 187 syllables). In comparison with ATSR without the tone recognition, the speech recognition accuracy indicates 25.0% improvement of ATSR with tone recognition (2,250 training data and 45 testing words).
URL: https://globals.ieice.org/en_transactions/fundamentals/10.1587/transfun.E96.A.1403/_p
@ARTICLE{e96-a_6_1403,
author={Jirabhorn CHAIWONGSAI and Werapon CHIRACHARIT and Kosin CHAMNONGTHAI and Yoshikazu MIYANAGA and Kohji HIGUCHI},
journal={IEICE TRANSACTIONS on Fundamentals},
title={A Low Power Tone Recognition for Automatic Tonal Speech Recognizer},
year={2013},
volume={E96-A},
number={6},
pages={1403--1411},
abstract={This paper proposes a low power tone recognition suitable for automatic tonal speech recognizer (ATSR). The tone recognition estimates fundamental frequency (F0) only from vowels by using a new magnitude difference function (MDF), called vowel-MDF. Accordingly, the number of operations is considerably reduced. In order to apply the tone recognition in portable electronic equipment, the tone recognition is designed using parallel and pipeline architecture. Due to the pipeline and parallel computations, the architecture achieves high throughput and consumes low power. In addition, the architecture is able to reduce the number of input frames depending on vowels, making it more adaptable depending on the maximum number of frames. The proposed architecture is evaluated with words selected from voice activation for GPS systems, phone dialing options, and words having the same phoneme but different tones. In comparison with the autocorrelation method, the experimental results show 35.7% reduction in power consumption and 27.1% improvement of tone recognition accuracy (110 words comprising 187 syllables). In comparison with ATSR without the tone recognition, the speech recognition accuracy indicates 25.0% improvement of ATSR with tone recognition (2,250 training data and 45 testing words).},
keywords={},
doi={10.1587/transfun.E96.A.1403},
ISSN={1745-1337},
month={June},
}
TY - JOUR
TI - A Low Power Tone Recognition for Automatic Tonal Speech Recognizer
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1403
EP - 1411
AU - Jirabhorn CHAIWONGSAI
AU - Werapon CHIRACHARIT
AU - Kosin CHAMNONGTHAI
AU - Yoshikazu MIYANAGA
AU - Kohji HIGUCHI
PY - 2013
DO - 10.1587/transfun.E96.A.1403
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E96-A
IS - 6
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - June 2013
AB - This paper proposes a low power tone recognition suitable for automatic tonal speech recognizer (ATSR). The tone recognition estimates fundamental frequency (F0) only from vowels by using a new magnitude difference function (MDF), called vowel-MDF. Accordingly, the number of operations is considerably reduced. In order to apply the tone recognition in portable electronic equipment, the tone recognition is designed using parallel and pipeline architecture. Due to the pipeline and parallel computations, the architecture achieves high throughput and consumes low power. In addition, the architecture is able to reduce the number of input frames depending on vowels, making it more adaptable depending on the maximum number of frames. The proposed architecture is evaluated with words selected from voice activation for GPS systems, phone dialing options, and words having the same phoneme but different tones. In comparison with the autocorrelation method, the experimental results show 35.7% reduction in power consumption and 27.1% improvement of tone recognition accuracy (110 words comprising 187 syllables). In comparison with ATSR without the tone recognition, the speech recognition accuracy indicates 25.0% improvement of ATSR with tone recognition (2,250 training data and 45 testing words).
ER -