The estimation of the power spectral density (PSD) of noise is crucial for retrieving speech in noisy environments. In this study, we propose a novel method for estimating the non-white noise PSD from noisy speech on the basis of a generalized gamma distribution and the minimum mean square error (MMSE) approach. Because of the highly non-stationary nature of speech, deriving its actual spectral probability density function (PDF) using conventional modeling techniques is difficult. On the other hand, spectral components of noise are more stationary than those of speech and can be represented more accurately by a generalized gamma PDF. The generalized gamma PDF can be adapted to optimally match the actual distribution of the noise spectral amplitudes observed at each frequency bin utilizing two real-time updated parameters, which are calculated in each frame based on the moment matching method. The MMSE noise PSD estimator is derived on the basis of the generalized gamma PDF and Gaussian PDF models for noise and speech spectral amplitudes, respectively. Combined with an improved Wiener filter, the proposed noise PSD estimation method exhibits the best performance compared with the minimum statistics, weighted noise estimation, and MMSE-based noise PSD estimation methods in terms of both subjective and objective measures.
Xin DANG
Tianjin Polytechnic University, Shizuoka University
Takayoshi NAKAI
Shizuoka University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Xin DANG, Takayoshi NAKAI, "Noise Power Spectral Density Estimation Using the Generalized Gamma Probability Density Function and Minimum Mean Square Error" in IEICE TRANSACTIONS on Fundamentals,
vol. E97-A, no. 3, pp. 820-829, March 2014, doi: 10.1587/transfun.E97.A.820.
Abstract: The estimation of the power spectral density (PSD) of noise is crucial for retrieving speech in noisy environments. In this study, we propose a novel method for estimating the non-white noise PSD from noisy speech on the basis of a generalized gamma distribution and the minimum mean square error (MMSE) approach. Because of the highly non-stationary nature of speech, deriving its actual spectral probability density function (PDF) using conventional modeling techniques is difficult. On the other hand, spectral components of noise are more stationary than those of speech and can be represented more accurately by a generalized gamma PDF. The generalized gamma PDF can be adapted to optimally match the actual distribution of the noise spectral amplitudes observed at each frequency bin utilizing two real-time updated parameters, which are calculated in each frame based on the moment matching method. The MMSE noise PSD estimator is derived on the basis of the generalized gamma PDF and Gaussian PDF models for noise and speech spectral amplitudes, respectively. Combined with an improved Wiener filter, the proposed noise PSD estimation method exhibits the best performance compared with the minimum statistics, weighted noise estimation, and MMSE-based noise PSD estimation methods in terms of both subjective and objective measures.
URL: https://globals.ieice.org/en_transactions/fundamentals/10.1587/transfun.E97.A.820/_p
@ARTICLE{e97-a_3_820,
author={Xin DANG and Takayoshi NAKAI},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Noise Power Spectral Density Estimation Using the Generalized Gamma Probability Density Function and Minimum Mean Square Error},
year={2014},
volume={E97-A},
number={3},
pages={820--829},
abstract={The estimation of the power spectral density (PSD) of noise is crucial for retrieving speech in noisy environments. In this study, we propose a novel method for estimating the non-white noise PSD from noisy speech on the basis of a generalized gamma distribution and the minimum mean square error (MMSE) approach. Because of the highly non-stationary nature of speech, deriving its actual spectral probability density function (PDF) using conventional modeling techniques is difficult. On the other hand, spectral components of noise are more stationary than those of speech and can be represented more accurately by a generalized gamma PDF. The generalized gamma PDF can be adapted to optimally match the actual distribution of the noise spectral amplitudes observed at each frequency bin utilizing two real-time updated parameters, which are calculated in each frame based on the moment matching method. The MMSE noise PSD estimator is derived on the basis of the generalized gamma PDF and Gaussian PDF models for noise and speech spectral amplitudes, respectively. Combined with an improved Wiener filter, the proposed noise PSD estimation method exhibits the best performance compared with the minimum statistics, weighted noise estimation, and MMSE-based noise PSD estimation methods in terms of both subjective and objective measures.},
keywords={},
doi={10.1587/transfun.E97.A.820},
ISSN={1745-1337},
month={March}
}
TY - JOUR
TI - Noise Power Spectral Density Estimation Using the Generalized Gamma Probability Density Function and Minimum Mean Square Error
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 820
EP - 829
AU - Xin DANG
AU - Takayoshi NAKAI
PY - 2014
DO - 10.1587/transfun.E97.A.820
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E97-A
IS - 3
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - March 2014
AB - The estimation of the power spectral density (PSD) of noise is crucial for retrieving speech in noisy environments. In this study, we propose a novel method for estimating the non-white noise PSD from noisy speech on the basis of a generalized gamma distribution and the minimum mean square error (MMSE) approach. Because of the highly non-stationary nature of speech, deriving its actual spectral probability density function (PDF) using conventional modeling techniques is difficult. On the other hand, spectral components of noise are more stationary than those of speech and can be represented more accurately by a generalized gamma PDF. The generalized gamma PDF can be adapted to optimally match the actual distribution of the noise spectral amplitudes observed at each frequency bin utilizing two real-time updated parameters, which are calculated in each frame based on the moment matching method. The MMSE noise PSD estimator is derived on the basis of the generalized gamma PDF and Gaussian PDF models for noise and speech spectral amplitudes, respectively. Combined with an improved Wiener filter, the proposed noise PSD estimation method exhibits the best performance compared with the minimum statistics, weighted noise estimation, and MMSE-based noise PSD estimation methods in terms of both subjective and objective measures.
ER -