Sparse representation has been studied in the field of signal processing as a means of providing a compact form of signal representation. This paper introduces a sparse-representation-based framework for speaker recognition, named Sparse Probabilistic Linear Discriminant Analysis. In this latent variable model, probabilistic linear discriminant analysis is modified to obtain an algorithm for learning overcomplete sparse representations by replacing the Gaussian prior on the factors with a Laplace prior that encourages sparseness. For a given speaker signal, the dictionary obtained from this model has good representational power while supporting optimal discrimination of the classes. An expectation-maximization algorithm is derived to train the model, using a variational approximation to a range of heavy-tailed distributions whose limit is the Laplace. The variational approximation is also used to compute the likelihood ratio score for all speaker trials. This approach performed well on the core-extended conditions of the NIST 2010 Speaker Recognition Evaluation and is competitive with Gaussian Probabilistic Linear Discriminant Analysis in terms of normalized Decision Cost Function and Equal Error Rate.
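As context for the likelihood-ratio scoring the abstract refers to, the following is a minimal sketch of verification scoring under the Gaussian PLDA baseline that the paper compares against, written in a two-covariance formulation (speaker factor with between-class covariance B, residual noise with within-class covariance W). The sparse, Laplace-prior variant requires the paper's variational approximation and is not reproduced here; the function and variable names below are illustrative, not from the paper.

```python
# Sketch of Gaussian PLDA likelihood-ratio scoring (the baseline model,
# not the paper's sparse Laplace-prior variant).
import numpy as np
from scipy.stats import multivariate_normal

def gplda_llr(x1, x2, B, W):
    """Log-likelihood ratio that x1 and x2 come from the same speaker.

    Two-covariance model: x = y + e, with speaker factor y ~ N(0, B)
    and channel/residual noise e ~ N(0, W).
    """
    d = len(x1)
    tot = B + W  # marginal covariance of a single observation
    # Same-speaker hypothesis: [x1; x2] is jointly Gaussian with
    # cross-covariance B (the shared speaker factor).
    joint = np.block([[tot, B], [B, tot]])
    log_same = multivariate_normal.logpdf(
        np.concatenate([x1, x2]), mean=np.zeros(2 * d), cov=joint)
    # Different-speaker hypothesis: x1 and x2 are independent.
    log_diff = (multivariate_normal.logpdf(x1, mean=np.zeros(d), cov=tot)
                + multivariate_normal.logpdf(x2, mean=np.zeros(d), cov=tot))
    return log_same - log_diff

# Toy check: with strong speaker variability, identical vectors should
# score higher than opposed ones.
B = 4.0 * np.eye(2)
W = 1.0 * np.eye(2)
x = np.array([1.0, -0.5])
print(gplda_llr(x, x, B, W) > gplda_llr(x, -x, B, W))
```

The trial is accepted when the log-likelihood ratio exceeds a threshold chosen to optimize the evaluation metric (e.g., the normalized Decision Cost Function mentioned in the abstract).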
Hai YANG
Chinese Academy of Sciences
Yunfei XU
Chinese Academy of Sciences
Qinwei ZHAO
Chinese Academy of Sciences
Ruohua ZHOU
Chinese Academy of Sciences
Yonghong YAN
Chinese Academy of Sciences
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Hai YANG, Yunfei XU, Qinwei ZHAO, Ruohua ZHOU, Yonghong YAN, "Speaker Recognition Using Sparse Probabilistic Linear Discriminant Analysis" in IEICE TRANSACTIONS on Fundamentals,
vol. E96-A, no. 10, pp. 1938-1945, October 2013, doi: 10.1587/transfun.E96.A.1938.
URL: https://globals.ieice.org/en_transactions/fundamentals/10.1587/transfun.E96.A.1938/_p
@ARTICLE{e96-a_10_1938,
author={Hai YANG and Yunfei XU and Qinwei ZHAO and Ruohua ZHOU and Yonghong YAN},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Speaker Recognition Using Sparse Probabilistic Linear Discriminant Analysis},
year={2013},
volume={E96-A},
number={10},
pages={1938-1945},
doi={10.1587/transfun.E96.A.1938},
ISSN={1745-1337},
month={October},}
TY - JOUR
TI - Speaker Recognition Using Sparse Probabilistic Linear Discriminant Analysis
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1938
EP - 1945
AU - Hai YANG
AU - Yunfei XU
AU - Qinwei ZHAO
AU - Ruohua ZHOU
AU - Yonghong YAN
PY - 2013
DO - 10.1587/transfun.E96.A.1938
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E96-A
IS - 10
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - October 2013
ER -