This paper is concerned with improving noise-robustness of a multi-SNR multi-band speaker identification system by introducing automatic adjustment of subband likelihood recombination weights. The adjustment is performed on the basis of subband power calculated from the noise observed just before the speech starts in the input signal. To evaluate the noise-robustness of this system, text-independent speaker identification experiments were conducted on speech data corrupted with noises recorded in five environments: "bus," "car," "office," "lobby," and "restaurant". It was found that the present method reduces the identification error by 15.9% compared with the multi-SNR multi-band method with equal recombination weights at 0 dB SNR. The performance of the present method was compared with a clean fullband method in which a speaker model training is performed on clean speech data, and spectral subtraction is applied to the input signal in the speaker identification stage. When the clean fullband method without spectral subtraction is taken as a baseline, the multi-SNR multi-band method with automatic adjustment of recombination weights attained 56.8% error reduction on average, while the average error reduction rate of the clean fullband method with spectral subtraction was 11.4% at 0 dB SNR.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Kenichi YOSHIDA, Kazuyuki TAKAGI, Kazuhiko OZEKI, "Automatic Adjustment of Subband Likelihood Recombination Weights for Improving Noise-Robustness of a Multi-SNR Multi-Band Speaker Identification System" in IEICE TRANSACTIONS on Information,
vol. E87-D, no. 11, pp. 2453-2459, November 2004, doi: .
Abstract: This paper is concerned with improving noise-robustness of a multi-SNR multi-band speaker identification system by introducing automatic adjustment of subband likelihood recombination weights. The adjustment is performed on the basis of subband power calculated from the noise observed just before the speech starts in the input signal. To evaluate the noise-robustness of this system, text-independent speaker identification experiments were conducted on speech data corrupted with noises recorded in five environments: "bus," "car," "office," "lobby," and "restaurant". It was found that the present method reduces the identification error by 15.9% compared with the multi-SNR multi-band method with equal recombination weights at 0 dB SNR. The performance of the present method was compared with a clean fullband method in which a speaker model training is performed on clean speech data, and spectral subtraction is applied to the input signal in the speaker identification stage. When the clean fullband method without spectral subtraction is taken as a baseline, the multi-SNR multi-band method with automatic adjustment of recombination weights attained 56.8% error reduction on average, while the average error reduction rate of the clean fullband method with spectral subtraction was 11.4% at 0 dB SNR.
URL: https://globals.ieice.org/en_transactions/information/10.1587/e87-d_11_2453/_p
Copy
@ARTICLE{e87-d_11_2453,
author={Kenichi YOSHIDA, Kazuyuki TAKAGI, Kazuhiko OZEKI, },
journal={IEICE TRANSACTIONS on Information},
title={Automatic Adjustment of Subband Likelihood Recombination Weights for Improving Noise-Robustness of a Multi-SNR Multi-Band Speaker Identification System},
year={2004},
volume={E87-D},
number={11},
pages={2453-2459},
abstract={This paper is concerned with improving noise-robustness of a multi-SNR multi-band speaker identification system by introducing automatic adjustment of subband likelihood recombination weights. The adjustment is performed on the basis of subband power calculated from the noise observed just before the speech starts in the input signal. To evaluate the noise-robustness of this system, text-independent speaker identification experiments were conducted on speech data corrupted with noises recorded in five environments: "bus," "car," "office," "lobby," and "restaurant". It was found that the present method reduces the identification error by 15.9% compared with the multi-SNR multi-band method with equal recombination weights at 0 dB SNR. The performance of the present method was compared with a clean fullband method in which a speaker model training is performed on clean speech data, and spectral subtraction is applied to the input signal in the speaker identification stage. When the clean fullband method without spectral subtraction is taken as a baseline, the multi-SNR multi-band method with automatic adjustment of recombination weights attained 56.8% error reduction on average, while the average error reduction rate of the clean fullband method with spectral subtraction was 11.4% at 0 dB SNR.},
keywords={},
doi={},
ISSN={},
month={November},}
Copy
TY - JOUR
TI - Automatic Adjustment of Subband Likelihood Recombination Weights for Improving Noise-Robustness of a Multi-SNR Multi-Band Speaker Identification System
T2 - IEICE TRANSACTIONS on Information
SP - 2453
EP - 2459
AU - Kenichi YOSHIDA
AU - Kazuyuki TAKAGI
AU - Kazuhiko OZEKI
PY - 2004
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E87-D
IS - 11
JA - IEICE TRANSACTIONS on Information
Y1 - November 2004
AB - This paper is concerned with improving noise-robustness of a multi-SNR multi-band speaker identification system by introducing automatic adjustment of subband likelihood recombination weights. The adjustment is performed on the basis of subband power calculated from the noise observed just before the speech starts in the input signal. To evaluate the noise-robustness of this system, text-independent speaker identification experiments were conducted on speech data corrupted with noises recorded in five environments: "bus," "car," "office," "lobby," and "restaurant". It was found that the present method reduces the identification error by 15.9% compared with the multi-SNR multi-band method with equal recombination weights at 0 dB SNR. The performance of the present method was compared with a clean fullband method in which a speaker model training is performed on clean speech data, and spectral subtraction is applied to the input signal in the speaker identification stage. When the clean fullband method without spectral subtraction is taken as a baseline, the multi-SNR multi-band method with automatic adjustment of recombination weights attained 56.8% error reduction on average, while the average error reduction rate of the clean fullband method with spectral subtraction was 11.4% at 0 dB SNR.
ER -