An Improved Supervised Speech Separation Method Based on Perceptual Weighted Deep Recurrent Neural Networks

Wei HAN, Xiongwei ZHANG, Meng SUN, Li LI, Wenhua SHI

Summary:

In this letter, we propose a novel speech separation method based on a perceptually weighted deep recurrent neural network (DRNN) that incorporates the masking properties of the human auditory system. In the supervised training stage, we first use the clean target (label) speech of two different speakers to calculate two perceptual weighting matrices. These matrices are then used to weight the mean squared error between the network outputs and the reference features of the two clean speech signals, so that the two speech signals can mask each other. Experimental results on the TSP speech corpus demonstrate that the proposed approach achieves significant improvements over state-of-the-art methods under different mixing cases.
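
The weighted training objective described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the summary does not say how the perceptual weighting matrices are computed or how the two error terms are combined, so the function name `perceptually_weighted_mse`, its arguments, and the simple sum of the two weighted terms are all hypothetical. The weighting matrices `w1` and `w2` are taken as precomputed inputs.

```python
import numpy as np


def perceptually_weighted_mse(est1, est2, ref1, ref2, w1, w2):
    """Weighted mean squared error over two separated sources.

    All arguments are arrays of shape (frames, frequency_bins).
    est1, est2: network outputs for speaker 1 and speaker 2.
    ref1, ref2: reference features of the two clean speech signals.
    w1, w2:     perceptual weighting matrices (assumed precomputed;
                their derivation is not given in the summary).
    """
    # Element-wise weighting of the squared error for each speaker.
    err1 = w1 * (est1 - ref1) ** 2
    err2 = w2 * (est2 - ref2) ** 2
    # Assumption: the two weighted terms are simply added.
    return float(np.mean(err1) + np.mean(err2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (4, 8)  # toy example: 4 frames, 8 frequency bins
    ref1, ref2 = rng.random(shape), rng.random(shape)
    est1, est2 = rng.random(shape), rng.random(shape)
    uniform = np.ones(shape)
    # With uniform weights the loss reduces to the ordinary MSE sum.
    print(perceptually_weighted_mse(est1, est2, ref1, ref2, uniform, uniform))
```

With all-ones weights this reduces to the unweighted sum of the two mean squared errors, which makes the effect of any non-uniform weighting easy to compare against a plain MSE baseline.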

Publication
IEICE TRANSACTIONS on Fundamentals Vol.E100-A No.2 pp.718-721
Publication Date
2017/02/01
Online ISSN
1745-1337
DOI
10.1587/transfun.E100.A.718
Type of Manuscript
LETTER
Category
Speech and Hearing

Authors

Wei HAN
  PLA University of Science and Technology
Xiongwei ZHANG
  PLA University of Science and Technology
Meng SUN
  PLA University of Science and Technology
Li LI
  PLA University of Science and Technology
Wenhua SHI
  PLA University of Science and Technology
