Noritaka SHIGEI Hiromi MIYAJIMA Michiharu MAEDA
Learning algorithms for Vector Quantization (VQ) are categorized into two types: batch learning and incremental learning. Incremental learning is more useful than batch learning because, unlike batch learning, it can be performed either on-line or off-line. In this paper, we develop effective incremental learning methods by using Stochastic Relaxation (SR) techniques, which were originally developed for batch learning. It has been shown that, for batch learning, SR techniques can provide good global optimization without greatly increasing the computational cost. We empirically investigate effective implementations of SR for incremental learning. Specifically, we consider five types of SR methods: ISR1, ISR2, ISR3, WSR1 and WSR2. The ISRs and WSRs add noise to input and weight vectors, respectively, and differ in when the perturbed input or weight vectors are used in learning. These SR methods are applied to three types of incremental learning: K-means, Neural-Gas (NG) and Kohonen's Self-Organizing Map (SOM). We comprehensively evaluate these combinations in terms of accuracy and computation time. Our simulation results show that K-means with ISR3 is the most effective overall among these combinations and is superior to the conventional NG method, which is known as an excellent method.
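As an illustration of the input-perturbation idea behind the ISR methods, the following minimal Python sketch adds annealed noise to each training vector before the winner search in on-line K-means. The noise and learning-rate schedules are illustrative assumptions, not the paper's ISR1-ISR3 definitions, which differ in when the perturbed vectors are used.

import numpy as np

def kmeans_isr(data, n_codewords, n_steps, sigma0=0.5, seed=0):
    """Incremental (on-line) K-means with an ISR-style perturbation:
    noise is added to each training input before the winner search,
    and the noise level is annealed toward zero over time."""
    rng = np.random.default_rng(seed)
    # Initialize the codebook from randomly chosen training vectors.
    codebook = data[rng.choice(len(data), n_codewords, replace=False)].copy()
    for t in range(n_steps):
        x = data[rng.integers(len(data))]
        sigma = sigma0 * (1.0 - t / n_steps)      # annealed noise level
        x_noisy = x + rng.normal(0.0, sigma, x.shape)
        winner = np.argmin(np.sum((codebook - x_noisy) ** 2, axis=1))
        eta = 0.1 * (1.0 - t / n_steps)           # decaying learning rate
        codebook[winner] += eta * (x - codebook[winner])
    return codebook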
Ching-Chih KUO Wen-Thong CHANG
By modelling the quantization error as additive white noise in the transform domain, a Wiener filter is used to reduce quantization noise for DCT-coded images in the DCT domain. Instead of deriving the spectrum of the transform coefficients, a DPCM loop is used to whiten the quantized DCT coefficients. The DPCM loop predicts the mean of each coefficient. By subtracting the mean, the quantized DCT coefficient is converted into the sum of a prediction error and quantization noise. After the DPCM loop, the prediction error can be assumed to be uncorrelated, which simplifies the design of the subsequent Wiener filter. The Wiener filter is applied to remove the quantization noise and restore the prediction error. The original coefficient is reconstructed by adding the DPCM-predicted mean to the restored prediction error. To increase the prediction accuracy, the decimated DCT coefficients in each subband are interpolated from the overlapped blocks.
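As a rough illustration, the per-coefficient Wiener gain can be sketched as follows in Python; the arrays gather the same DCT position across blocks, and the variance estimates are simplifying assumptions rather than the paper's exact derivation.

import numpy as np

def wiener_restore(coeff_q, mean_pred, q_step):
    """Restore one DCT coefficient position gathered across blocks.
    coeff_q:   quantized coefficients at this position
    mean_pred: DPCM-predicted means at the same position
    q_step:    quantizer step size for this coefficient position"""
    e_noisy = coeff_q - mean_pred              # prediction error + noise
    var_q = q_step ** 2 / 12.0                 # uniform quantization noise
    var_e = max(np.var(e_noisy) - var_q, 0.0)  # estimated error variance
    gain = var_e / (var_e + var_q) if var_e + var_q > 0 else 0.0
    return mean_pred + gain * e_noisy          # restored coefficients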
Jeng-Shyang PAN Min-Tsang SUNG Hsiang-Cheh HUANG Bin-Yih LIAO
A new scheme for watermarking based on vector quantization (VQ) over a binary symmetric channel is proposed. The VQ indices are optimized with a genetic algorithm; simulation results not only demonstrate effective transmission of the watermarked image but also reveal the robustness of the extracted watermark.
Yusuke HIWASAKI Kazunori MANO Kazutoshi YASUNAGA Toshiyuki MORII Hiroyuki EHARA Takao KANEKO
This paper presents an efficient LSP quantizer implementation for low bit-rate coders. The major feature of the quantizer is that it uses a truncated cepstral distance criterion for the code selection procedure, an approach that has generally been considered too computationally costly. We combined the quantizer with a moving-average predictor, a two-stage split vector quantizer and delayed decision. We investigated the optimal parameter settings for this configuration and incorporated the resulting quantizer into an ITU-T 4-kbit/s speech coding candidate algorithm with a bit budget of 21 bits. The objective performance is better than that obtained with a conventional weighted mean-square criterion, while the complexity is kept to a reasonable level. The paper also describes the codebook design and the techniques employed to achieve robustness under noisy channel conditions.
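For illustration, a minimal Python sketch of a truncated cepstral distance criterion follows. It assumes the LSP vectors have already been converted to LPC coefficients a_k (with H(z) = 1/(1 - sum_k a_k z^-k)); the recursion is the standard LPC-to-cepstrum formula, and the truncation length of 16 is an illustrative choice, not the paper's setting.

import numpy as np

def lpc_to_cepstrum(a, n_cep):
    """LPC coefficients a[0..p-1] to cepstral coefficients c[1..n_cep]
    via the standard recursion for H(z) = 1 / (1 - sum a_k z^-k)."""
    p = len(a)
    c = np.zeros(n_cep + 1)
    for m in range(1, n_cep + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]

def truncated_cepstral_distance(a1, a2, n_cep=16):
    """Squared cepstral distance truncated to the first n_cep terms,
    used as the code-selection criterion."""
    c1, c2 = lpc_to_cepstrum(a1, n_cep), lpc_to_cepstrum(a2, n_cep)
    return float(np.sum((c1 - c2) ** 2))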
Young-Ho SEO Soon-Young CHOI Sung-Ho PARK Dong-Wook KIM
This paper proposes a watermarking algorithm for images that assumes DWT (Discrete Wavelet Transform)-based image compression. To reduce the amount of computation, the algorithm selects the watermarking positions using a threshold table that is statistically established by computing the energy correlation of the corresponding wavelet coefficients. Because the watermarking process is designed to operate in parallel with the compression process, the proposed algorithm can run in real time whenever the image compression process does. The algorithm also mitigates the loss of the watermark and the reduction in compression ratio caused by the quantization and Huffman coding steps; this is done by considering the sign of the coefficients and the change in their values during watermarking. A visually recognizable pattern, such as a binary image, is used as the watermark. The experimental results show that the proposed algorithm satisfies robustness and imperceptibility, the major requirements of watermarking.
The artifacts of low bit-rate quantization in images cannot be removed satisfactorily by known methods. We propose decomposing images into HSI and LSI (higher- and lower-significance images), followed by subsampling and reconstruction of the LSI. Experiments show a significant improvement in image quality compared with other methods.
Ching-Tang HSIEH Eugene LAI Wan-Chen CHEN
This paper presents some effective methods for improving the performance of a speaker identification system. Based on the multiresolution property of the wavelet transform, the input speech signal is decomposed into various frequency subbands in order not to spread noise distortions over the entire feature space. To capture the characteristics of the vocal tract, the linear predictive cepstral coefficients (LPCC) of the lower frequency subband at each decomposition level are calculated. In addition, a hard-threshold technique is applied to the lower frequency subband at each decomposition level to eliminate the effect of noise interference. Furthermore, cepstral-domain feature vector normalization is applied to all computed features in order to provide similar parameter statistics in all acoustic environments. In order to effectively utilize all these multiband speech features, we propose a modified vector quantization model as the identifier. This model uses a multilayer concept to eliminate the interference among the multiband speech features and then uses principal component analysis (PCA) to evaluate the codebooks, capturing a more detailed distribution of the speaker's phoneme characteristics. The proposed method is evaluated using the KING speech database for text-independent speaker identification. Experimental results show that the recognition performance of the proposed method is better than that of vector quantization (VQ) and the Gaussian mixture model (GMM) using full-band LPCC and mel-frequency cepstral coefficient (MFCC) features in both clean and noisy environments. Satisfactory performance is also achieved in low-SNR environments.
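As a small illustration of the multiresolution front end, the Python sketch below recursively halves the band with a Haar lowpass step; the actual wavelet basis and decomposition depth are not specified in this abstract, so Haar is an assumption made here for simplicity.

import numpy as np

def haar_analysis(signal, levels):
    """Recursively split a signal into frequency subbands with the
    Haar wavelet; returns the lowpass approximation at each level,
    from which subband LPCC features would then be computed."""
    approximations = []
    s = np.asarray(signal, dtype=float)
    for _ in range(levels):
        n = len(s) // 2 * 2                             # even length
        pairs = s[:n].reshape(-1, 2)
        s = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)  # lowpass half-band
        approximations.append(s)
    return approximations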
Kwang-deok SEO Seong-cheol HEO Soon-kak KWON Jae-kyoon KIM
In this paper, we propose a dynamic bit-rate reduction scheme for transcoding an MPEG-1 bitstream into an MPEG-4 simple profile bitstream with a typical bit-rate of 384 kbps. A significant reduction in the bit-rate is achieved by combining the processes of requantization and frame-skipping. Conventional requantization methods for a homogeneous transcoder cannot be used directly for a heterogeneous transcoder due to the mismatch in the quantization parameters between the MPEG-1 and MPEG-4 syntax and the difference in compression efficiency between MPEG-1 and MPEG-4. Accordingly, to solve these problems, a new requantization method is proposed for an MPEG-1 to MPEG-4 transcoder, consisting of R-Q (rate-quantization) modeling with simple feedback and an adjustment of the quantization parameters to compensate for the different coding efficiencies of MPEG-1 and MPEG-4. For bit-rate reduction by frame-skipping, an efficient method is proposed for estimating the relevant motion vectors from the skipped frames: the conventional FDVS (forward dominant vector selection) method is improved to reflect the effect of the macroblock types in the skipped frames. Simulation results demonstrate that the proposed method, combining requantization and frame-skipping, generates a transcoded MPEG-4 bitstream that is much closer to the desired low bit-rate than the conventional method, along with superior objective quality.
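A minimal Python sketch of baseline FDVS is given below; for simplicity it places the current macroblock at the grid origin, and it omits the paper's improvement of accounting for the macroblock types in the skipped frames.

import numpy as np

def fdvs(mv_cur, mv_skipped, mb_size=16):
    """Forward dominant vector selection for one macroblock.
    mv_cur:     (dx, dy) motion vector of the current macroblock,
                pointing into the skipped frame
    mv_skipped: dict {(mb_x, mb_y): (dx, dy)} of the skipped frame's
                motion vectors, indexed by macroblock grid position"""
    x0, y0 = mv_cur                  # referenced region, MB at origin
    best_mb, best_overlap = None, -1
    for (mx, my), mv in mv_skipped.items():
        # Overlap area between the referenced region and this macroblock.
        ox = max(0, min(x0 + mb_size, (mx + 1) * mb_size) - max(x0, mx * mb_size))
        oy = max(0, min(y0 + mb_size, (my + 1) * mb_size) - max(y0, my * mb_size))
        if ox * oy > best_overlap:
            best_overlap, best_mb = ox * oy, (mx, my)
    dom = mv_skipped[best_mb]        # dominant vector: largest overlap
    # Compose the motion vector across the skipped frame.
    return (mv_cur[0] + dom[0], mv_cur[1] + dom[1])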
Zhibin PAN Koji KOTANI Tadahiro OHMI
The conventional vector quantization (VQ) encoding method of full search (FS) is computationally very heavy, but it achieves the best PSNR. In order to speed up the encoding process, many fast search methods have been developed. Based on the concept of multi-resolution, FS-equivalent fast search methods using a mean-type pyramid data structure have already been proposed. In this Letter, an enhanced sum pyramid data structure is suggested to further improve search efficiency, which benefits from (1) exact computation in integer form, (2) one more 2-dimensional resolution and (3) an optimal pair-selecting way of constructing the new resolution. Experimental results show that many codewords can be rejected efficiently by using this added resolution, which features a lower dimension and an earlier position in the difference-check order.
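To show the underlying rejection principle (not the Letter's enhanced structure), the Python sketch below uses partial sums of a 4x4 block and the lower bound sum_i (sx_i - sc_i)^2 / n_i <= ||x - c||^2, where n_i is the number of pixels under pyramid node i; a codeword is rejected as soon as any level's bound reaches the current best distance.

import numpy as np

def build_sum_pyramid(v, block=4):
    """Sum pyramid of a (block*block)-dim vector (e.g., a 4x4 image
    block): integer sums at 1-, 4-, and 16-element resolutions."""
    v = np.asarray(v).reshape(block, block)
    lvl1 = v.reshape(2, 2, 2, 2).sum(axis=(1, 3))  # 2x2 partial sums
    return v.sum(), lvl1, v

def reject(px, pc, d_min, block=4):
    """Return True if the codeword with pyramid pc can be rejected
    against the input pyramid px without a full distance check."""
    k = block * block
    if (px[0] - pc[0]) ** 2 / k >= d_min:
        return True                                # rejected at top level
    if np.sum((px[1] - pc[1]) ** 2) / (k // 4) >= d_min:
        return True                                # rejected at 2x2 level
    return False                                   # full check still needed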
A novel modified midtread quantizer is proposed for number-controlled oscillator frequency quantization in digital phase-locked loops (DPLLs). We show that DPLLs employing the proposed quantizer provide significantly improved cycle-slip performance compared to those employing conventional midtread or midrise quantizers, especially when the number of quantization bits is small and the magnitude of the input signal frequency normalized by the quantization interval is less than 0.5.
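For reference, the two conventional characteristics that the proposed quantizer is compared against can be sketched as follows in Python (the paper's modification itself is not reproduced here):

import numpy as np

def midtread(x, delta):
    """Midtread quantizer: zero is a reconstruction level."""
    return delta * np.round(x / delta)

def midrise(x, delta):
    """Midrise quantizer: zero falls on a decision boundary."""
    return delta * (np.floor(x / delta) + 0.5)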
Ahmed SWILEM Kousuke IMAMURA Hideo HASHIMOTO
In this paper, we propose two fast codebook generation algorithms for entropy-constrained vector quantization. The first algorithm uses an angular constraint to reduce the search area and accelerate the search process in codebook design; it employs the projection angles of the vectors onto a reference line. The second algorithm uses a suitable hyperplane to partition the codebook and image data. Both algorithms significantly accelerate the codebook design process. Experimental results are presented for image block data and show that the new algorithms perform better than previously known methods.
Shu-Chuan CHU John F. RODDICK Zhe-Ming LU Jeng-Shyang PAN
This paper presents a novel digital image watermarking algorithm based on the labeled bisecting clustering technique. Each cluster is labeled either '0' or '1' based on the labeling key. Each input image block is then assigned to the nearest codeword or cluster centre whose label is equal to the watermark bit. The watermark extraction can be performed blindly. The proposed method is robust to JPEG compression and some spatial-domain processing operations. Simulation results demonstrate the effectiveness of the proposed algorithm.
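A minimal Python sketch of the labeled-codeword embedding and blind extraction steps, under the assumption of Euclidean nearest-neighbor assignment:

import numpy as np

def embed_bit(block, codebook, labels, bit):
    """Replace an image block with the nearest codeword whose label
    matches the watermark bit (blind-extractable embedding)."""
    candidates = np.where(labels == bit)[0]
    d = np.sum((codebook[candidates] - block) ** 2, axis=1)
    return codebook[candidates[np.argmin(d)]]

def extract_bit(block, codebook, labels):
    """Blind extraction: the label of the overall nearest codeword."""
    return labels[np.argmin(np.sum((codebook - block) ** 2, axis=1))]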
Zhe-Ming LU Wen XING Dian-Guo XU Sheng-He SUN
This Letter presents a novel VQ-based digital image watermarking method. By modifying the conventional GLA algorithm, a codeword-labeled codebook is first generated. Each input image block is then reconstructed by the nearest codeword whose label is equal to the watermark bit. The watermark extraction can be performed blindly. Simulation results show that the proposed method is robust to JPEG compression, vector quantization (VQ) compression and some spatial-domain processing operations.
Digital Subtraction Angiography (DSA) is a technique used to enhance small details in angiogram imaging systems. In this approach, X-ray images of a subject taken after injection are subtracted from a reference X-ray image taken from the same subject before injection. Due to the exponential absorption of X-rays, small details at different depths affect the X-ray images differently. Consequently, image subtraction cannot be applied to the original images without adjustment; the proper modification is to apply some form of logarithmic operation to the images before subtraction. In medical imaging systems, the designer can implement this logarithmic operation either in the analog domain, before digitization of the video signal, or in the digital domain, after analog-to-digital conversion (ADC) of the original video signal. In this paper, the difference between these two approaches is studied and upper bounds for the quantization error in both cases are calculated. Based on this study, the best approach for realizing the logarithmic function is proposed. The overall effects of the two approaches on the inherent signal noise are also addressed.
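A minimal Python sketch of the logarithmic subtraction itself (the eps offset guarding against log(0) is an implementation assumption; the paper's analysis of analog versus digital realizations is not modeled here):

import numpy as np

def dsa_subtract(pre, post, eps=1.0):
    """Digital subtraction angiography with a logarithmic point
    operation: because X-ray absorption is exponential, subtracting
    log-images makes vessel contrast independent of background depth."""
    return np.log(post.astype(float) + eps) - np.log(pre.astype(float) + eps)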
Zhibin PAN Koji KOTANI Tadahiro OHMI
A fast winner search method is proposed that completely separates all codewords in the original codebook into a promising group and an impossible group. The group separation is realized by independently sorting both the L1 and L2 norms. As a result, the necessary search scope that guarantees full-search-equivalent PSNR can be limited to the intersection of the two individual promising groups. The high search efficiency is confirmed by experimental results.
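The two norm bounds behind the group separation can be sketched as follows in Python; the sorting that makes the scope cheap to find in the Letter is omitted here for brevity.

import numpy as np

def search_scope(x, codebook, d_min):
    """Candidate codewords that cannot be ruled out by either norm:
    ||x - c|| >= | ||x|| - ||c|| |           (L2 norm bound)
    ||x - c|| >= |L1(x) - L1(c)| / sqrt(k)   (L1 norm bound, k = dim)
    The promising set is the intersection of the two groups;
    d_min is the current best squared distance."""
    k = len(x)
    r = np.sqrt(d_min)
    l2 = np.linalg.norm(codebook, axis=1)
    l1 = np.sum(np.abs(codebook), axis=1)
    group_l2 = np.abs(l2 - np.linalg.norm(x)) < r
    group_l1 = np.abs(l1 - np.sum(np.abs(x))) < r * np.sqrt(k)
    return np.where(group_l2 & group_l1)[0]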
Toshio FUKUTA Yuuichi HAMASUNA Ichi TAKUMI Masayasu HATA Takahiro NAKANISHI
Given the importance of the traffic on modern communication networks, advanced error correction methods are needed to cope with changes in channel quality. Conventional countermeasures that use high-dimensionality parity codes often fail to provide sufficient error correction capability. We propose a parity code with high dimensionality that is iteratively decoded; it provides better error correcting capability than conventional decoding methods. The proposal uses the steepest descent method to gradually increase the reliability of the code bits and the coherency between parities and code bits. Furthermore, the quantization of the decoding algorithm is discussed, and it is found that decoding with quantization can maintain a high error correcting capability.
Heng-Iang HSU Wen-Whei CHANG Xiaobei LIU Soo Ngee KOH
An approach to minimum mean-squared error (MMSE) decoding for vector quantization over channels with memory is presented. The decoder is based on the Gilbert channel model that allows the exploitation of both intra- and inter-block correlation of bit error sequences. We also develop a recursive algorithm for computing the a posteriori probability of a transmitted index sequence, and illustrate its performance in quantization of Gauss-Markov sources under noisy channel conditions.
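For context, the bursty channel assumed by the decoder can be simulated with a two-state Gilbert model; a minimal Python sketch follows, with illustrative transition and error probabilities (the parameters used in the paper are not given in this abstract).

import numpy as np

def gilbert_channel(bits, p_gb=0.01, p_bg=0.2, e_bad=0.3, seed=0):
    """Pass a 0/1 integer array through a two-state Gilbert channel:
    a Good state with no errors and a Bad state with error rate e_bad;
    p_gb and p_bg are the Good->Bad and Bad->Good transition
    probabilities, producing bursty (correlated) bit errors."""
    rng = np.random.default_rng(seed)
    out, bad = bits.copy(), False
    for i in range(len(bits)):
        bad = rng.random() < (1 - p_bg if bad else p_gb)
        if bad and rng.random() < e_bad:
            out[i] ^= 1                    # flip the bit in the Bad state
    return out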
Zhibin PAN Koji KOTANI Tadahiro OHMI
A fast winner search method for VQ based on a 2-pixel-merging sum pyramid is proposed, which rejects a codeword at an earlier stage to reduce the computational burden. The necessary search scope of promising codewords is meanwhile narrowed by using sorted real sums. The high search efficiency is confirmed by experimental results.
In this paper, a color image compression scheme using a fuzzy Hopfield-model net based on rough-set reasoning is proposed to generate an optimal codebook for Vector Quantization (VQ) in the Discrete Wavelet Transform (DWT) domain. The main purpose is to embed a rough-set learning scheme into the fuzzy Hopfield network to construct a compression system named the Rough Fuzzy Hopfield Net (RFHN). First, a color image is decomposed into a 3-D pyramid structure with various frequency bands. Then the RFHN is used to create different codebooks for the various bands. The energy function of the RFHN is defined in terms of the upper- and lower-bound fuzzy membership grades between training samples and codevectors. Finally, near-global-minimum codebooks in the frequency domain are obtained when the energy function converges to a stable state. Therefore, only 32/N pixels are selected as training samples when a 3N-dimensional color image is used. Simulation results show that the proposed network not only reduces the processing time but also preserves the compression performance.
Shinya FUKUMOTO Noritaka SHIGEI Michiharu MAEDA Hiromi MIYAJIMA
Neural networks for Vector Quantization (VQ), such as K-means, the Neural-Gas (NG) network and Kohonen's Self-Organizing Map (SOM), have been proposed. K-means, which is a "hard-max" approach, converges very fast. The method, however, devotes itself to local search and easily falls into local minima. On the other hand, the NG and SOM methods, which are "soft-max" approaches, have good global search ability. Although NG and SOM come closer to the optimum than K-means does, they converge more slowly. In order to overcome the disadvantages that arise when K-means, NG and SOM are used individually, this paper proposes the hybrid methods NG-K, SOM-K and SOM-NG. NG-K performs NG adaptation during a short period early in the learning process and then performs K-means adaptation for the rest of the process; SOM-K and SOM-NG are constructed analogously. In numerical simulations, including an image compression problem, NG-K and SOM-K exhibit better performance than the other methods.
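As an illustration of the NG-K schedule described above, the following minimal Python sketch runs rank-based NG updates for an initial fraction of the steps and winner-only K-means updates thereafter; the learning-rate and neighborhood schedules and the switching point are illustrative assumptions, not the paper's settings.

import numpy as np

def ng_k(data, n_codewords, n_steps, switch=0.2, seed=0):
    """NG-K hybrid: Neural-Gas (soft-max, rank-based) updates during the
    first fraction `switch` of steps for global search, then plain
    K-means (winner-only) updates for fast local convergence."""
    rng = np.random.default_rng(seed)
    w = data[rng.choice(len(data), n_codewords, replace=False)].copy()
    for t in range(n_steps):
        x = data[rng.integers(len(data))]
        eta = 0.5 * (0.01 / 0.5) ** (t / n_steps)    # decaying learning rate
        if t < switch * n_steps:
            # NG phase: every codeword moves, weighted by its distance
            # rank under a shrinking neighborhood width lam.
            lam = (n_codewords / 2) * 0.01 ** (t / n_steps)
            ranks = np.argsort(np.argsort(np.sum((w - x) ** 2, axis=1)))
            w += eta * np.exp(-ranks / lam)[:, None] * (x - w)
        else:
            # K-means phase: only the winner moves.
            win = np.argmin(np.sum((w - x) ** 2, axis=1))
            w[win] += eta * (x - w[win])
    return w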