Supervised Single-Channel Speech Separation via Sparse Decomposition Using Periodic Signal Models

Makoto NAKASHIZUKA; Hiroyuki OKUMURA; Youji IIGUNI

doi:10.1587/transfun.E95.A.853

Summary:

In this paper, we propose a method for supervised single-channel speech separation through sparse decomposition using periodic signal models. The proposed method decomposes a signal into a set of periodic signals under a sparsity penalty. To achieve separation, the decomposed periodic signals must be assigned to their corresponding sources. For this assignment, we use K-means clustering to group the decomposed periodic signals into as many clusters as there are speakers. After clustering, each cluster is assigned to its corresponding speaker using codebooks learned in advance. In separation experiments, we compare our method with MaxVQ, which performs separation in the frequency spectrum domain. The experimental results, measured by signal-to-distortion ratio, show that the proposed sparse decomposition method is comparable to the frequency-domain approach and has a lower computational cost for assigning speech components.
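
The clustering and assignment steps described in the summary can be sketched roughly as follows. This is an illustration only, not the authors' implementation. It assumes the sparse decomposition has already produced component waveforms and that each component comes with a feature vector; the summary does not say which features the paper uses. The function names, the farthest-point initialization, the centroid-to-codebook distortion measure, and the exhaustive search over one-to-one cluster-to-speaker mappings are all hypothetical choices made for this sketch.

```python
from itertools import permutations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(features, k, n_iter=50):
    """Plain K-means; returns one cluster label per feature vector.

    Assumption: deterministic farthest-point initialization, chosen here
    only to keep the sketch reproducible.
    """
    centroids = [list(features[0])]
    while len(centroids) < k:
        far = max(features, key=lambda f: min(sq_dist(f, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(f, centroids[c])) for f in features
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels


def assign_clusters(features, labels, codebooks):
    """Map each cluster to one speaker by minimum total codebook distortion.

    Assumption: a cluster is summarized by its mean feature vector and is
    compared against the nearest codeword of each speaker's codebook.
    """
    k = len(codebooks)
    centroids = []
    for c in range(k):
        members = [f for f, lab in zip(features, labels) if lab == c]
        centroids.append(mean_vector(members) if members else None)

    def distortion(c, s):
        if centroids[c] is None:
            return 0.0
        return min(sq_dist(centroids[c], code) for code in codebooks[s])

    best = min(
        permutations(range(k)),
        key=lambda perm: sum(distortion(c, perm[c]) for c in range(k)),
    )
    return list(best)  # best[c] is the speaker index for cluster c


def separate(components, features, codebooks):
    """Sum the component waveforms assigned to each speaker."""
    k = len(codebooks)
    labels = kmeans(features, k)
    speaker_of = assign_clusters(features, labels, codebooks)
    outputs = [[0.0] * len(components[0]) for _ in range(k)]
    for wave, lab in zip(components, labels):
        target = outputs[speaker_of[lab]]
        for i, value in enumerate(wave):
            target[i] += value
    return outputs
```

Calling `separate(components, features, codebooks)` with one codebook per speaker returns one waveform per speaker, in codebook order. The permutation search is only practical for a small number of speakers, which matches the two-speaker setting typical of single-channel separation experiments.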

Publication: IEICE TRANSACTIONS on Fundamentals Vol.E95-A No.5 pp.853-866

Publication Date: 2012/05/01

Online ISSN: 1745-1337

DOI: 10.1587/transfun.E95.A.853

Type of Manuscript: PAPER

Category: Engineering Acoustics

Makoto NAKASHIZUKA, Hiroyuki OKUMURA, Youji IIGUNI, "Supervised Single-Channel Speech Separation via Sparse Decomposition Using Periodic Signal Models" in IEICE TRANSACTIONS on Fundamentals, vol. E95-A, no. 5, pp. 853-866, May 2012, doi: 10.1587/transfun.E95.A.853.
URL: https://globals.ieice.org/en_transactions/fundamentals/10.1587/transfun.E95.A.853/_p

@ARTICLE{e95-a_5_853,
author={Makoto NAKASHIZUKA and Hiroyuki OKUMURA and Youji IIGUNI},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Supervised Single-Channel Speech Separation via Sparse Decomposition Using Periodic Signal Models},
year={2012},
volume={E95-A},
number={5},
pages={853--866},
abstract={In this paper, we propose a method for supervised single-channel speech separation through sparse decomposition using periodic signal models. The proposed separation method employs sparse decomposition, which decomposes a signal into a set of periodic signals under a sparsity penalty. In order to achieve separation through sparse decomposition, the decomposed periodic signals have to be assigned to the corresponding sources. For the assignment of the periodic signal, we introduce clustering using a K-means algorithm to group the decomposed periodic signals into as many clusters as the number of speakers. After the clustering, each cluster is assigned to its corresponding speaker using preliminarily learnt codebooks. Through separation experiments, we compare our method with MaxVQ, which performs separation on the frequency spectrum domain. The experimental results in terms of signal-to-distortion ratio show that the proposed sparse decomposition method is comparable to the frequency domain approach and has less computational costs for assignment of speech components.},
keywords={},
doi={10.1587/transfun.E95.A.853},
ISSN={1745-1337},
month={May}
}

TY - JOUR
TI - Supervised Single-Channel Speech Separation via Sparse Decomposition Using Periodic Signal Models
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 853
EP - 866
AU - Makoto NAKASHIZUKA
AU - Hiroyuki OKUMURA
AU - Youji IIGUNI
PY - 2012
DO - 10.1587/transfun.E95.A.853
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E95-A
IS - 5
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - May 2012
AB - In this paper, we propose a method for supervised single-channel speech separation through sparse decomposition using periodic signal models. The proposed separation method employs sparse decomposition, which decomposes a signal into a set of periodic signals under a sparsity penalty. In order to achieve separation through sparse decomposition, the decomposed periodic signals have to be assigned to the corresponding sources. For the assignment of the periodic signal, we introduce clustering using a K-means algorithm to group the decomposed periodic signals into as many clusters as the number of speakers. After the clustering, each cluster is assigned to its corresponding speaker using preliminarily learnt codebooks. Through separation experiments, we compare our method with MaxVQ, which performs separation on the frequency spectrum domain. The experimental results in terms of signal-to-distortion ratio show that the proposed sparse decomposition method is comparable to the frequency domain approach and has less computational costs for assignment of speech components.
ER -