DNN-Based Voice Activity Detection with Multi-Task Learning

Tae Gyoon KANG, Nam Soo KIM

Summary:

Recently, notable improvements in the voice activity detection (VAD) problem have been achieved by adopting several machine learning techniques. Among them, the deep neural network (DNN), which learns the mapping between noisy speech features and the corresponding voice activity status through its deep hidden structure, has been one of the most popular. In this letter, we propose a novel approach that enhances the robustness of the DNN in mismatched noise conditions using a multi-task learning (MTL) framework. In the proposed algorithm, a feature enhancement task for speech features is trained jointly with the conventional VAD task. The experimental results show that the DNN with the proposed framework outperforms the conventional DNN-based VAD algorithm.
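
The joint training idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the single shared hidden layer, the layer sizes, the tanh activation, the weighting factor `alpha`, and all function and parameter names are assumptions made for illustration only. The sketch shows one shared representation feeding two output heads, a VAD head trained with cross-entropy and a feature enhancement head trained with mean squared error against clean features, with the two losses summed into one objective.

```python
import numpy as np

def init_params(rng, n_in, n_hidden, n_feat):
    """Randomly initialise one shared hidden layer and two task heads (hypothetical sizes)."""
    s = 0.1
    return {
        "W_h": rng.normal(0.0, s, (n_in, n_hidden)), "b_h": np.zeros(n_hidden),
        "W_vad": rng.normal(0.0, s, (n_hidden, 1)), "b_vad": np.zeros(1),
        "W_enh": rng.normal(0.0, s, (n_hidden, n_feat)), "b_enh": np.zeros(n_feat),
    }

def forward(params, noisy):
    """Map noisy features to (speech probability per frame, enhanced features)."""
    # Shared hidden representation used by both tasks.
    h = np.tanh(noisy @ params["W_h"] + params["b_h"])
    # VAD head: sigmoid output, one probability per frame.
    logit = h @ params["W_vad"] + params["b_vad"]
    p_speech = 1.0 / (1.0 + np.exp(-logit[:, 0]))
    # Enhancement head: linear regression onto clean features.
    enhanced = h @ params["W_enh"] + params["b_enh"]
    return p_speech, enhanced

def mtl_loss(params, noisy, vad_labels, clean, alpha=0.5):
    """Return (total, vad_loss, enh_loss); alpha is an assumed task weight."""
    p_speech, enhanced = forward(params, noisy)
    eps = 1e-12  # guards against log(0)
    vad_loss = -np.mean(
        vad_labels * np.log(p_speech + eps)
        + (1.0 - vad_labels) * np.log(1.0 - p_speech + eps)
    )
    enh_loss = np.mean((enhanced - clean) ** 2)
    return vad_loss + alpha * enh_loss, vad_loss, enh_loss

# Toy usage on synthetic data (random numbers, not real speech features).
rng = np.random.default_rng(0)
clean = rng.normal(size=(8, 5))
noisy = clean + 0.3 * rng.normal(size=(8, 5))
labels = rng.integers(0, 2, size=8).astype(float)
params = init_params(rng, n_in=5, n_hidden=16, n_feat=5)
total, vad_loss, enh_loss = mtl_loss(params, noisy, labels, clean)
```

Setting `alpha` to zero in this sketch reduces the objective to the single-task VAD loss, which corresponds to the conventional DNN-based VAD baseline that the summary compares against.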

Publication
IEICE TRANSACTIONS on Information Vol.E99-D No.2 pp.550-553
Publication Date
2016/02/01
Publicized
2015/10/26
Online ISSN
1745-1361
DOI
10.1587/transinf.2015EDL8168
Type of Manuscript
LETTER
Category
Speech and Hearing

Authors

Tae Gyoon KANG
  Seoul National University
Nam Soo KIM
  Seoul National University
