A Deep Learning-Based Approach to Non-Intrusive Objective Speech Intelligibility Estimation

Deokgyu YUN, Hannah LEE, Seung Ho CHOI

Summary :

This paper proposes a deep learning-based, non-intrusive method for objective speech intelligibility estimation that uses a recurrent neural network (RNN) with a long short-term memory (LSTM) structure. Conventional non-intrusive estimation methods, such as the standard P.563, perform poorly and inconsistently, especially across varied noise and reverberation environments. The proposed method trains the LSTM RNN model parameters using the short-time objective intelligibility (STOI) measure as the target; STOI is a standard intrusive intelligibility estimation method that requires the reference speech signal. The input to the LSTM RNN is the mel-frequency cepstral coefficient (MFCC) vector, and the output is the frame-wise STOI value. Experimental results show that the proposed objective intelligibility estimation method outperforms the conventional standard P.563 in various noisy and reverberant environments.
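The input/output mapping described in the summary can be illustrated with a minimal sketch of an LSTM forward pass that turns a sequence of MFCC vectors into one score per frame. This is not the authors' implementation: the class name `TinyLSTMRegressor`, the 13-dimensional MFCC input, the hidden size of 8, the sigmoid output unit, the random untrained weights, and the averaging of frame scores into an utterance score are all assumptions made for illustration. In a real system the weights would presumably be trained so that each output matches the frame-wise STOI value computed with the reference signal.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMRegressor:
    """One LSTM layer plus a sigmoid output unit (hypothetical sizes)."""

    def __init__(self, n_in=13, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [x_t, h_{t-1}].
        self.W = {g: init_matrix(n_hidden, n_in + n_hidden, rng) for g in "ifoc"}
        self.b = {g: [0.0] * n_hidden for g in "ifoc"}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b_out = 0.0

    def forward(self, frames):
        """Return one score in (0, 1) per input frame."""
        h = [0.0] * self.n_hidden  # hidden state
        c = [0.0] * self.n_hidden  # cell state
        scores = []
        for x in frames:
            z = list(x) + h
            pre = {
                g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
                for g in "ifoc"
            }
            i_gate = [sigmoid(v) for v in pre["i"]]
            f_gate = [sigmoid(v) for v in pre["f"]]
            o_gate = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            # Standard LSTM state update.
            c = [f * c_prev + i * g for f, c_prev, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(cv) for o, cv in zip(o_gate, c)]
            # Assumed output layer: a sigmoid keeps the estimate in (0, 1).
            out = sum(w * hv for w, hv in zip(self.w_out, h)) + self.b_out
            scores.append(sigmoid(out))
        return scores


# Stand-in data: 5 frames of 13 random values in place of real MFCCs.
rng = random.Random(1)
mfcc_frames = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(5)]

model = TinyLSTMRegressor()
frame_scores = model.forward(mfcc_frames)
# Assumed aggregation: mean of the frame-wise estimates.
utterance_score = sum(frame_scores) / len(frame_scores)
```

Because the network sees only the degraded signal's features at inference time, no reference speech is needed once training is done, which is what makes the estimator non-intrusive. The scores produced here are meaningless until the weights are trained against STOI targets.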

Publication
IEICE TRANSACTIONS on Information Vol.E101-D No.4 pp.1207-1208
Publication Date
2018/04/01
Publicized
2018/01/09
Online ISSN
1745-1361
DOI
10.1587/transinf.2017EDL8225
Type of Manuscript
LETTER
Category
Speech and Hearing

Authors

Deokgyu YUN
  Seoul National University of Science and Technology
Hannah LEE
  Seoul National University of Science and Technology
Seung Ho CHOI
  Seoul National University of Science and Technology
