Statistical Model-Based VAD Algorithm with Wavelet Transform

Yoon-Chang LEE, Sang-Sik AHN

  • Full Text Views

    0

  • Cite this

Summary :

This paper presents a new statistical model-based voice activity detection (VAD) algorithm in the wavelet domain to improve the performance in non-stationary environments. Due to the efficient time-frequency localization and the multi-resolution characteristics of the wavelet representations, the wavelet transforms are quite suitable for processing non-stationary signals such as speech. To utilize the fact that the wavelet packet is very efficient approximation of discrete Fourier transform and has built-in de-noising capability, we first apply wavelet packet decomposition to effectively localize the energy in frequency space, use spectral subtraction, and employ matched filtering to enhance the SNR. Since the conventional wavelet-based spectral subtraction eliminates the low-power speech signal in onset and offset regions and generates musical noise, we derive an improved multi-band spectral subtraction. On the other hand, noticing that fixed threshold cannot follow fluctuations of time varying noise power and the inability to adapt to a time-varying environment severely limits the VAD performance, we propose a statistical model-based VAD algorithm in wavelet domain with an adaptive threshold. We perform extensive computer simulations and compare with the conventional algorithms to demonstrate performance improvement of the proposed algorithm under various noise environments.

Publication
IEICE TRANSACTIONS on Fundamentals Vol.E89-A No.6 pp.1594-1600
Publication Date
2006/06/01
Publicized
Online ISSN
1745-1337
DOI
10.1093/ietfec/e89-a.6.1594
Type of Manuscript
Special Section PAPER (Special Section on Papers Selected from 2005 International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC2005))
Category

Authors

Keyword

FlyerIEICE has prepared a flyer regarding multilingual services. Please use the one in your native language.