IEICE globals.ieice.org Site

Author Search Result

[Author] Kazunori MANO(6hit)


Design of a Robust LSP Quantizer for a High-Quality 4-kbit/s CELP Speech Coder
Yusuke HIWASAKI Kazunori MANO Kazutoshi YASUNAGA Toshiyuki MORII Hiroyuki EHARA Takao KANEKO

PAPER-Speech and Hearing

Vol: E87-D No: 6 Page(s): 1496-1506
This paper presents an efficient LSP quantizer implementation for low bit-rate coders. The major feature of the quantizer is that it uses a truncated cepstral distance criterion for the code selection procedure. This approach has generally been considered too computationally costly. We utilized the quantizer with a moving-average predictor, two-stage-split vector quantizer and delayed decision. We have investigated the optimal parameter settings in this case and incorporated the quantizer thus obtained into an ITU-T 4-kbit/s speech coding candidate algorithm with a bit budget of 21 bits. The objective performance is better than that with a conventional weighted mean-square criterion, while the complexity is still kept to a reasonable level. The paper also describes the codebook design and techniques that were employed to achieve robustness in noisy channel conditions.
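The delayed-decision idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes a plain two-stage quantizer with an M-best search, and it uses squared error as a stand-in for the truncated cepstral distance; the moving-average predictor and the split structure are omitted. The function names and the tiny codebooks are hypothetical and are not taken from the paper.

```python
def sq_err(a, b):
    """Squared Euclidean error (stand-in for the paper's cepstral criterion)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_stage_search(target, stage1, stage2, n_keep):
    """Return (i, j, error) for the best stage1[i] + stage2[j] among candidates.

    Delayed decision: the n_keep closest first-stage entries are all kept,
    and the final choice is made only after the second stage is searched.
    """
    ranked = sorted(range(len(stage1)), key=lambda i: sq_err(target, stage1[i]))
    best = None
    for i in ranked[:n_keep]:
        # The second stage quantizes what the first stage left over.
        residual = [t - c for t, c in zip(target, stage1[i])]
        for j, cv in enumerate(stage2):
            err = sq_err(residual, cv)
            if best is None or err < best[2]:
                best = (i, j, err)
    return best


# Hypothetical 2-dimensional codebooks, for illustration only.
target = [1.0, 1.0]
stage1 = [[0.75, 0.5], [0.5, 0.5]]
stage2 = [[0.5, 0.5], [0.0, 0.0]]

greedy = two_stage_search(target, stage1, stage2, n_keep=1)
delayed = two_stage_search(target, stage1, stage2, n_keep=2)
print(greedy)   # (0, 0, 0.0625)
print(delayed)  # (1, 0, 0.0)
```

In this toy case the first-stage entry that looks best on its own leads to a worse final result than the runner-up, which is the situation delayed decision is meant to catch. The search cost grows with `n_keep`.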
Pitch Synchronous Innovation CELP (PSI-CELP)
Takehiro MORIYA Satoshi MIKI Kazunori MANO Hitoshi OHMURO

LETTER

Vol: E76-A No: 7 Page(s): 1177-1180
A speech coding scheme at 3.6 kbit/s has been proposed. The scheme is based on CELP (Code Excited Linear Prediction) with pitch synchronous innovation, which means that the random codevectors, as well as the adaptive codevectors, have pitch periodicity. The quality is comparable to that of the 6.7 kbit/s VSELP coder for the Japanese cellular radio standard.
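One simple way to give a random codevector pitch periodicity is to repeat its first pitch-lag samples across the subframe. The sketch below assumes that construction; the abstract does not spell out the exact procedure, and the function name, the sample values, and the lag are hypothetical.

```python
def pitch_synchronize(codevector, lag):
    """Repeat the first `lag` samples of `codevector` over its whole length."""
    if lag <= 0:
        raise ValueError("lag must be positive")
    n = len(codevector)
    if lag >= n:
        # The pitch period is longer than the subframe: nothing to repeat.
        return list(codevector)
    return [codevector[i % lag] for i in range(n)]


# Hypothetical 8-sample random codevector and a pitch lag of 3 samples.
random_cv = [0.3, -1.2, 0.8, 0.1, -0.5, 0.9, -0.7, 0.4]
periodic_cv = pitch_synchronize(random_cv, lag=3)
print(periodic_cv)  # [0.3, -1.2, 0.8, 0.3, -1.2, 0.8, 0.3, -1.2]
```

The resulting vector has the same period as the adaptive codebook contribution, so both excitation components share the pitch structure of voiced speech.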
Vector Quantization of Speech Spectrum Based on the VQ-VAE Embedding Space Learning by GAN Technique
Tanasan SRIKOTR Kazunori MANO

PAPER-Speech and Hearing, Digital Signal Processing

Publicized: 2021/09/30 Vol: E105-A No: 4 Page(s): 647-654
The spectral envelope parameter has a significant effect on vocoder quality. The Vector Quantized Variational AutoEncoder (VQ-VAE) has recently emerged as a state-of-the-art end-to-end quantization method based on deep learning. This paper proposes a new technique, called VQ-VAE-EMGAN, that improves the embedding space learning of VQ-VAE with a Generative Adversarial Network in order to quantize the spectral envelope parameter. In the experiments, we designed the quantizer for the spectral envelope parameters of the WORLD vocoder extracted from 16 kHz speech waveforms. The results show that, compared with the conventional VQ-VAE, the proposed technique reduced the Log Spectral Distortion (LSD) by around 0.5 dB and increased the PESQ score by around 0.17 on average over four target bit rates.
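For readers unfamiliar with the LSD figure quoted above, the following is a minimal sketch of one common per-frame definition: the root-mean-square difference, in dB, between two power spectra. The abstract does not state which variant the authors used, so the function name, the `eps` guard, and the sample spectra are assumptions.

```python
import math


def log_spectral_distortion(ref_power, est_power, eps=1e-12):
    """RMS difference in dB between two power spectra of one frame."""
    if not ref_power or len(ref_power) != len(est_power):
        raise ValueError("spectra must be non-empty and of equal length")
    total = 0.0
    for p, q in zip(ref_power, est_power):
        # eps (an assumption) avoids taking the logarithm of zero.
        diff_db = 10.0 * math.log10((p + eps) / (q + eps))
        total += diff_db ** 2
    return math.sqrt(total / len(ref_power))


# Hypothetical spectra: the estimate is 10 times the reference in every bin.
reference = [1.0, 2.0, 4.0]
estimate = [10.0, 20.0, 40.0]
lsd = log_spectral_distortion(reference, estimate)
print(round(lsd, 3))  # 10.0
```

A sentence-level score is usually obtained by averaging this per-frame value over all frames.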
Coding of LSP Parameters Using Interframe Moving Average Prediction and Multi-Stage Vector Quantization
Hitoshi OHMURO Takehiro MORIYA Kazunori MANO Satoshi MIKI

LETTER

Vol: E76-A No: 7 Page(s): 1181-1183
This letter proposes an LSP quantizing method which uses interframe correlation of the parameters. The quantized parameters are represented as a moving average of code vectors. Using this method, LSP parameters are quantized efficiently and the degradation of decoded parameters caused by bit errors affects only a few following frames.
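The limited error propagation described here follows from the moving-average structure: each decoded frame depends only on the current and a fixed number of past code vectors. The sketch below shows this under assumed values; the weights, the vector dimension, and the code vectors are hypothetical and are not taken from the letter.

```python
def ma_decode(codevectors, weights):
    """Decode each frame as a weighted sum of the current and past code vectors."""
    dim = len(codevectors[0])
    decoded = []
    for n in range(len(codevectors)):
        frame = [0.0] * dim
        for k, w in enumerate(weights):
            if n - k < 0:
                break  # no history before the first frame
            for d in range(dim):
                frame[d] += w * codevectors[n - k][d]
        decoded.append(frame)
    return decoded


# Hypothetical MA weights: current frame plus two past frames.
weights = [0.5, 0.3, 0.2]

# Eight frames of received code vectors; a bit error corrupts frame 2.
clean = [[1.0, 2.0] for _ in range(8)]
corrupted = [list(v) for v in clean]
corrupted[2] = [5.0, -1.0]

ok = ma_decode(clean, weights)
bad = ma_decode(corrupted, weights)
affected = [n for n in range(len(ok)) if ok[n] != bad[n]]
print(affected)  # [2, 3, 4]
```

With three weights, the error is confined to the corrupted frame and the two frames after it. A recursive (autoregressive) predictor would instead feed the error back into every later frame.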
FOREWORD
Kazunori MANO

FOREWORD

Vol: E86-D No: 1 Page(s): 1-2
Noise Post-Processing for Low Bit-Rate CELP Coders
Hiroyuki EHARA Kazutoshi YASUNAGA Koji YOSHIDA Yusuke HIWASAKI Kazunori MANO Takao KANEKO

PAPER-Speech and Hearing

Vol: E87-D No: 6 Page(s): 1507-1516
This paper presents a newly developed noise post-processing (NPP) algorithm and the results of several tests demonstrating its subjective performance. This NPP algorithm is designed to improve the subjective performance of low bit-rate code excited linear prediction (CELP) decoding under background noise conditions. The NPP algorithm is based on a stationary noise generator and improves the subjective quality of noisy signal input. A backward adaptive detector defines noisy input signal frames from decoded LSF, energy, and pitch parameters. The noise generator estimates and produces stationary noise signals using past line spectral frequency (LSF) and energy parameters. The stationary noise generator has a frame erasure concealment (FEC) scheme designed for stationary noise signals and therefore improves the speech decoder's robustness for frame erasure under background noise conditions. The algorithm has been applied to the following CELP decoders: 1) a candidate algorithm of the ITU-T 4-kbit/s speech coding standard and 2) existing ITU-T standards, the G.729 and G.723.1 series. In both cases, NPP improved the subjective performance of the baseline decoders. Improvements of approximately 0.25 CMOS (CCR MOS: comparison category rating mean opinion score) and around 0.2-0.8 DMOS (DCR MOS: degradation category rating mean opinion score) were demonstrated in the results of our subjective tests when applied to the 4-kbit/s decoder and G.729/G.723.1 decoders respectively. Other test results show that NPP improves the subjective performance of a G.729 decoder by around 0.45 in DMOS under both error-free and frame-erasure conditions, and a further improvement of around 0.2 DMOS is achieved by the FEC scheme in the noise generator.
