Keyword Search Result

[Keyword] quantization (223 hits)

Results 201-220 of 223

  • Destructive Fuzzy Modeling Using Neural Gas Network

    Kazuya KISHIDA  Hiromi MIYAJIMA  Michiharu MAEDA  

     
    PAPER

      Vol:
    E80-A No:9
      Page(s):
    1578-1584

    In order to construct fuzzy systems automatically, many studies have combined fuzzy inference with neural networks. In these studies, fuzzy models using self-organization and vector quantization have been proposed. It is well known that these models construct fuzzy inference rules that effectively represent the distribution of the input data and that are not affected by an increase in the number of input dimensions. In this paper, we propose a destructive fuzzy modeling method using a neural gas network and demonstrate its validity through several numerical examples.
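
    For orientation, the sketch below shows the standard neural gas update rule on which such fuzzy modeling builds: every code vector moves toward each input by an amount that decays with its distance rank. The destructive (rule-deleting) extension of the paper is not reproduced here, and all parameter values are illustrative.

```python
import numpy as np

def neural_gas_step(codebook, x, eps=0.1, lam=2.0):
    """One standard neural-gas update: every code vector moves toward the
    input x by an amount that decays exponentially with its distance rank."""
    dists = np.linalg.norm(codebook - x, axis=1)
    ranks = np.argsort(np.argsort(dists))        # rank 0 = closest code vector
    codebook += eps * np.exp(-ranks / lam)[:, None] * (x - codebook)
    return codebook

# Usage: adapt a small codebook to 2-D Gaussian data (illustrative values).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 2))
for x in rng.normal(size=(500, 2)):
    codebook = neural_gas_step(codebook, x)
```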

  • Fingerprint Compression Using Wavelet Packet Transform and Pyramid Lattice Vector Quantization

    Shohreh KASAEI  Mohamed DERICHE  Boualem BOASHASH  

     
    PAPER

      Vol:
    E80-A No:8
      Page(s):
    1446-1452

    A new compression algorithm for fingerprint images is introduced. A modified wavelet packet scheme, which uses a fixed decomposition structure matched to the statistics of fingerprint images, is used. Based on statistical studies of the subbands, different compression techniques are chosen for different subbands. The decision is based on the effect of each subband on the reconstructed image, taking into account the characteristics of the Human Visual System (HVS). A noise-shaping bit allocation procedure that considers the HVS is then used to assign the bit rate among subbands. Using Lattice Vector Quantization (LVQ), a new technique for determining the largest radius of the lattice and its scaling factor is presented. The design is based on obtaining the smallest possible Expected Total Distortion (ETD) measure for the given bit budget. At low bit rates, for the coefficients with high-frequency content, we propose the Positive-Negative Mean (PNM) algorithm to improve the resolution of the reconstructed image. Furthermore, for the coefficients with low-frequency content, a lossless predictive compression scheme is developed. The proposed algorithm yields a high compression ratio and high reconstructed image quality at a low computational load compared to other available algorithms.
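
    As an illustration of the lattice quantization step only, the following sketch decodes the nearest point of the D_n lattice using the classic Conway-Sloane rule; the paper's choice of lattice, truncation radius, and scaling factor is not reproduced here.

```python
import numpy as np

def nearest_Dn_point(x):
    """Nearest point of the D_n lattice (integer vectors with an even
    coordinate sum), via the classic Conway-Sloane decoder."""
    f = np.round(x)
    if int(f.sum()) % 2 == 0:
        return f
    # Parity is odd: move the worst-rounded coordinate to its
    # second-nearest integer, which flips the parity to even.
    k = int(np.argmax(np.abs(x - f)))
    f[k] += 1.0 if x[k] > f[k] else -1.0
    return f

print(nearest_Dn_point(np.array([0.9, 0.2, 0.2, 0.1])))  # -> [1. 1. 0. 0.]
```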

  • A Memory-Based Parallel Processor for Vector Quantization: FMPP-VQ

    Kazutoshi KOBAYASHI  Masayoshi KINOSHITA  Hidetoshi ONODERA  Keikichi TAMARU  

     
    PAPER-Multi Processors

      Vol:
    E80-C No:7
      Page(s):
    970-975

    We propose a memory-based processor called the Functional Memory Type Parallel Processor for vector quantization (FMPP-VQ). The FMPP-VQ is intended for low bit-rate image compression using vector quantization; it accelerates the nearest neighbor search, in which we look for the vector nearest to an input vector among a large number of code vectors. The FMPP-VQ has as many PEs (processing elements, also called "blocks") as code vectors. Thus, the distances between an input vector and the code vectors are computed simultaneously in every PE, and the minimum of all the distances is searched for in parallel, as in conventional CAMs. The computation time does not depend on the number of code vectors. In this paper, we describe the architecture of the FMPP-VQ in detail, along with its performance and layout density. We designed and fabricated an LSI including four PEs; its test results and a performance estimation are also reported.
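
    A software analogue of the search the chip parallelizes might look as follows; the distance metric is an assumption, since the abstract does not name the one used by the PEs.

```python
import numpy as np

def nearest_codevector(codebook, x):
    """Distances from the input vector to all code vectors are computed at
    once (vectorized here; one per PE in hardware), then the global minimum
    is selected. Manhattan distance is an assumed, hardware-friendly choice."""
    dists = np.abs(codebook - x).sum(axis=1)
    i = int(np.argmin(dists))
    return i, dists[i]

rng = np.random.default_rng(0)
codebook = rng.integers(0, 256, size=(256, 16))   # 256 code vectors, 16-dim
x = rng.integers(0, 256, size=16)
index, dist = nearest_codevector(codebook, x)
```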

  • An Adaptive Learning and Self-Deleting Neural Network for Vector Quantization

    Michiharu MAEDA  Hiromi MIYAJIMA  Sadayuki MURASHIMA  

     
    PAPER-Nonlinear Problems

      Vol:
    E79-A No:11
      Page(s):
    1886-1893

    This paper describes an adaptive neural vector quantization algorithm with a deletion approach for weight (reference) vectors, which we call an adaptive learning and self-deleting algorithm. We first introduce an improved topological neighborhood and an adaptive vector quantization algorithm that depends little on the initial values of the weight vectors. We then present the adaptive learning and self-deleting algorithm, which proceeds as follows: first, many weight vectors are prepared and trained with Kohonen's self-organizing feature map; next, weight vectors are deleted sequentially until a fixed number of them remains, while the algorithm proceeds with competitive learning. Finally, we compare algorithms with neighborhood relations against the proposed one. The proposed algorithm also performs well when the weight vectors are poorly initialized. Experimental results are given to show the effectiveness of the proposed algorithm.
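
    A hedged sketch of the learn-then-delete idea follows: competitive learning on an oversized codebook, with the least-winning vector removed periodically until the fixed number remains. The deletion criterion and schedule are assumptions; the paper's exact rule may differ.

```python
import numpy as np

def learn_and_delete(data, n_init=32, n_final=8, eps=0.05, period=100, seed=0):
    """Competitive learning with sequential deletion. `data` is an (N, d)
    array; the win-count deletion criterion is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    w = data[rng.choice(len(data), n_init, replace=False)].astype(float)
    wins = np.zeros(len(w))
    for t, x in enumerate(data):
        i = int(np.argmin(np.linalg.norm(w - x, axis=1)))  # winner-take-all
        w[i] += eps * (x - w[i])
        wins[i] += 1
        if len(w) > n_final and (t + 1) % period == 0:
            j = int(np.argmin(wins))                       # least-used vector
            w, wins = np.delete(w, j, axis=0), np.delete(wins, j)
    return w
```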

  • Combining Multiple Classifiers in a Hybrid System for High Performance Chinese Syllable Recognition

    Liang ZHOU  Satoshi IMAI  

     
    PAPER-Speech Processing and Acoustics

      Vol:
    E79-D No:11
      Page(s):
    1570-1578

    A multiple classifier system can be a powerful solution for robust pattern recognition, since an appropriate combination of multiple classifiers may reduce errors, provide robustness, and achieve higher performance. In this paper, high-performance Chinese syllable recognition is presented using combinations of multiple classifiers. Chinese syllable recognition is divided into base syllable recognition (disregarding the tones) and recognition of the 4 tones. For base syllable recognition, we used a combination of two multisegment vector quantization (MSVQ) classifiers based on different features (instantaneous and transitional features of speech). For tone recognition, a vector quantization (VQ) classifier was used first and found to be comparable to a multilayer perceptron (MLP) classifier. To obtain robust and better performance, a combination of the distortion-based classifier (VQ) and the discriminant-based classifier (MLP) is proposed. Evaluations were carried out on the standard syllable database CRDB in China, and the experimental results show that combining multiple classifiers with different features or different methodologies can improve recognition performance. Recognition accuracy for base syllables, tones, and tonal syllables is 96.79%, 99.82%, and 96.24%, respectively. Since these results were evaluated on a standard database, they can serve as a benchmark for direct comparison against other approaches.
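
    A minimal sketch of score-level fusion of the two classifier types is shown below; the normalization and the linear weighting are illustrative assumptions, not the paper's exact combination rule.

```python
import numpy as np

def fuse_classifiers(vq_distortions, mlp_posteriors, alpha=0.5):
    """Fuse a distortion-based classifier (lower is better) with a
    discriminant-based one (higher is better). Min-max normalization and
    the weight `alpha` are assumptions for illustration."""
    d = np.asarray(vq_distortions, dtype=float)
    vq_scores = (d.max() - d) / (d.max() - d.min() + 1e-12)  # invert to [0, 1]
    fused = alpha * vq_scores + (1.0 - alpha) * np.asarray(mlp_posteriors)
    return int(np.argmax(fused))

# Usage: pick among 4 tone classes.
tone = fuse_classifiers([3.1, 0.8, 2.5, 2.9], [0.1, 0.6, 0.2, 0.1])
```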

  • A Pattern Vector Quantization Scheme for Mid-range Frequency DCT Coefficients

    Dennis Chileshe MWANSA  Satoshi MIZUNO  Makoto FUJIMURA  Hideo KURODA  

     
    PAPER

      Vol:
    E79-B No:10
      Page(s):
    1452-1458

    In DCT transform coding, it is usually necessary to discard some of the AC coefficients obtained after the transform operation for data compression. Although most of the energy is usually compacted in the few coefficients that are transmitted, there are many instances where the discarded coefficients contain significant information. The absence of these coefficients at the decoder can lead to visible degradation of the reconstructed image, especially around slowly moving objects. We propose a simple but effective method that uses an indirect form of vector quantization to supplement scalar quantization in the transform domain. The distribution pattern of coefficients that fall below a fixed threshold is vector quantized, and an index of the pattern chosen from a codebook is transmitted together with two averages: one for the positive coefficients and one for the negative coefficients. In the reconstruction, the average values are used instead of setting the corresponding coefficients to zero. This is tantamount to quantizing the mid-range frequency coefficients with 1 bit, but the resulting bit rate is much lower. We aim to provide an alternative to traditional vector quantization, which entails computational complexity and large time and memory requirements.
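
    The following sketch illustrates the PNM idea under stated assumptions; the codebook lookup that maps the sign pattern to a transmitted index is omitted.

```python
import numpy as np

def pnm_encode(coeffs, threshold):
    """For coefficients whose magnitude falls below the fixed threshold,
    keep only the sign pattern plus one mean of the positive values and
    one mean of the negative values."""
    below = np.abs(coeffs) < threshold
    pos, neg = below & (coeffs > 0), below & (coeffs < 0)
    pos_mean = float(coeffs[pos].mean()) if pos.any() else 0.0
    neg_mean = float(coeffs[neg].mean()) if neg.any() else 0.0
    return pos, neg, pos_mean, neg_mean

def pnm_decode(coeffs, pos, neg, pos_mean, neg_mean):
    """Reconstruction: the discarded coefficients are replaced by the two
    averages instead of being set to zero."""
    out = coeffs.astype(float).copy()
    out[pos], out[neg] = pos_mean, neg_mean
    return out
```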

  • A Method Quantizing Filter Coefficients with Genetic Algorithm and Simulated Annealing

    Miki HASEYAMA  Yoshihiro AKETA  Hideo KITAJIMA  

     
    PAPER

      Vol:
    E79-A No:8
      Page(s):
    1130-1134

    In this paper, a quantization method that preserves the phase and gain characteristics of a reference filter is proposed. The proposed method uses a genetic algorithm and a simulated annealing algorithm. The objective function used in this method is described with two kinds of weighting functions, one for matching the phase characteristic and one for the gain characteristic; therefore, the quantization accuracy of the gain characteristic is independent of that of the phase characteristic. Further, the proposed algorithm can be applied to any type of filter, because the chromosome encodes only the coefficient values. The efficiency of the proposed algorithm is verified by experiments.
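
    A minimal sketch of such a two-term objective is given below, assuming an FIR structure, squared-error weighting, and unwrapped phase; a GA or SA would then search the quantized coefficient space for the bit pattern minimizing it. The paper's actual weighting functions are not reproduced here.

```python
import numpy as np

def objective(quantized_b, ref_response, omega, w_gain, w_phase):
    """Separately weighted gain and phase errors of a quantized FIR filter
    against a reference frequency response sampled at `omega` (rad/sample)."""
    z = np.exp(-1j * omega)
    h = np.polyval(quantized_b[::-1], z)        # H(e^jw) = sum_k b_k e^{-jwk}
    gain_err = w_gain * (np.abs(h) - np.abs(ref_response)) ** 2
    phase_err = w_phase * (np.angle(h) - np.angle(ref_response)) ** 2
    return float(np.sum(gain_err + phase_err))
```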

  • A Region-Based Adaptive Perceptual Quantization Technique for MPEG Coder*

    Hyun Duk CHO  Sun CHOI  Kyoung Won LIM  Seong Deuk KIM  Jong Beom RA  

     
    PAPER

      Vol:
    E79-D No:6
      Page(s):
    737-742

    A region-based adaptive perceptual quantization technique is proposed for video sequence coding and applied to the MPEG coder. The visibility of coding artifacts in a macroblock (MB) is affected by the perceptual characteristics of neighboring MBs as well as of the MB itself. Therefore, spatial and temporal activities of the MB and its surroundings are used to decide the quantization scaling factor. Compared with the adaptive scheme in the encoding algorithm specified in MPEG-2 Test Model 5 (TM5), the proposed scheme is shown to further improve perceptual quality in video coding.
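
    A hedged sketch of activity-driven scaling in this spirit follows; the blending weights and the TM5-style normalization are illustrative assumptions.

```python
def quant_scale(mb_act, neighbor_acts, temporal_act, frame_mean_act, base_q=16):
    """Busier regions (where artifacts are perceptually masked) get a coarser
    quantizer. The macroblock's own spatial activity is blended with its
    neighbors' and a temporal term, then normalized TM5-style."""
    act = (0.5 * mb_act
           + 0.3 * sum(neighbor_acts) / len(neighbor_acts)
           + 0.2 * temporal_act)
    n_act = (2 * act + frame_mean_act) / (act + 2 * frame_mean_act)  # in [0.5, 2]
    return max(1, min(31, round(base_q * n_act)))  # MPEG quantizer_scale range
```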

  • Multisegment Multiple VQ Codebooks-Based Speaker Independent Isolated-Word Recognition Using Unbiased Mel Cepstrum

    Liang ZHOU  Satoshi IMAI  

     
    PAPER-Speech Processing and Acoustics

      Vol:
    E78-D No:9
      Page(s):
    1178-1187

    In this paper, we propose a new approach to speaker-independent isolated-word speech recognition using multisegment multiple vector quantization (VQ) codebooks. In this approach, words are recognized by means of multisegment multiple VQ codebooks: a separate set of codebooks is designed for each word in the recognition vocabulary by dividing the word equally into multiple segments, the number of which correlates with the number of syllables or phonemes of the word, and designing two individual VQ codebooks, one for instantaneous and one for transitional speech features, for each segment. Using this approach, the influence of within-word coarticulation can be minimized, the time-sequence information of speech can be used, and differences in word length within the vocabulary or variations in speaking rate are adapted to automatically. Moreover, mel-cepstral coefficients based on unbiased estimation of log spectrum (UELS) are used and compared experimentally with LPC-derived mel-cepstral coefficients. Recognition experiments using test databases consisting of 100 Japanese words (Waseda database) and 216 phonetically balanced words (ATR database) confirmed the effectiveness of the new method and the new speech features. The approach is described, its computational complexity and memory requirements are analyzed, and the experimental results are presented.
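
    A minimal single-feature sketch of multisegment VQ scoring is shown below; the paper pairs instantaneous and transitional codebooks per segment, which is omitted here for brevity.

```python
import numpy as np

def word_distortion(frames, segment_codebooks):
    """Split the utterance's (T, d) frame array equally into as many segments
    as the word model has codebooks, quantize each frame with its segment's
    codebook, and sum the distortions; the word model with the lowest total
    distortion wins."""
    total = 0.0
    for seg, cb in zip(np.array_split(frames, len(segment_codebooks)),
                       segment_codebooks):
        for f in seg:
            total += float(np.min(np.linalg.norm(cb - f, axis=1)))
    return total
```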

  • 4 kbps Improved Pitch Prediction CELP Speech Coding with 20 msec Frame

    Masahiro SERIZAWA  Kazunori OZAWA  

     
    PAPER

      Vol:
    E78-D No:6
      Page(s):
    758-763

    This paper proposes a new pitch prediction method for 4 kbps CELP (Code Excited LPC) speech coding with a 20 msec frame, aimed at the future ITU-T 4 kbps speech coding standardization. In conventional CELP speech coding, synthetic speech quality deteriorates rapidly at 4 kbps, especially for female and children's speech with short pitch periods, for which pitch prediction performance is significantly degraded. The main reason is that when the pitch period is shorter than the subframe length, the adaptive codebook usually performs a simple repetition of the past excitation signal based on the estimated lag, rather than true pitch prediction. The proposed pitch prediction method avoids this approximation by utilizing the current subframe excitation codevector signal when the pitch prediction parameters are determined. To further improve performance, a split vector synthesis and perceptual spectral weighting method and a low-complexity perceptual harmonic and spectral weighting method have also been developed. Informal listening test results show that the 4 kbps speech coder with a 20 msec frame, utilizing all of the proposed improvements, scores 0.2 MOS higher than the coder without them.
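
    For reference, the conventional adaptive-codebook approximation that the paper improves on can be sketched as follows:

```python
import numpy as np

def conventional_adaptive_codevector(past_excitation, lag, subframe_len):
    """When the estimated lag is shorter than the subframe, the last `lag`
    samples of the past excitation are simply tiled to fill the subframe,
    instead of performing true pitch prediction involving the current
    subframe's excitation."""
    segment = past_excitation[-lag:]
    reps = -(-subframe_len // lag)               # ceiling division
    return np.tile(segment, reps)[:subframe_len]

# Usage: a 40-sample subframe built from a 25-sample lag.
v = conventional_adaptive_codevector(np.arange(200.0), lag=25, subframe_len=40)
```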

  • Off-Line Handwritten Word Recognition with Explicit Character Juncture Modeling

    Wongyu CHO  Jin H. KIM  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

      Vol:
    E78-D No:2
      Page(s):
    143-151

    In this paper, a new off-line handwritten word recognition method based on explicit modeling of character junctures is presented. A handwritten word is regarded as a sequence of characters and junctures of four types; hence both characters and junctures are explicitly modeled. A handwriting recognition system employing hidden Markov models as the main statistical framework has been developed based on this scheme. An interconnection network of character and ligature models is constructed to model words of indefinite length. This model can ideally describe any form of handwritten word, including discretely spaced words, purely cursive words, and unconstrained words of mixed styles. Also presented are efficient encoding and decoding schemes suitable for this model. The system has shown encouraging performance on a standard USPS database.

  • Askant Vision Architecture Using Warp Model of Hough Transform--For Realizing Dynamic & Central/Peripheral Camera Vision--

    Hiroyasu KOSHIMIZU  Munetoshi NUMADA  Kazuhito MURAKAMI  

     
    PAPER

      Vol:
    E77-D No:11
      Page(s):
    1206-1212

    The warp model of the extended Hough transform (EHT) has been proposed to design the explicit expression of the transform function of the EHT. The warp model is a skewed parameter space (R(µ,ξ), φ(µ,ξ)) of the space (µ,ξ), which is homeomorphic to the original (ρ,θ) parameter space. We note that introducing this skewness of the parameter space defines the angular and positional sensitivity characteristics required in the detection of lines from the pattern space. With the intent of contributing solutions to basic computer vision problems, we theoretically present a dynamic and central-fine/peripheral-coarse camera vision architecture by means of this warp model of the Hough transform. We call this camera vision architecture 'askant vision' by analogy with the human askant glance. In this paper, an outline of the EHT is briefly given by stating three functional conditions that ensure the homeomorphic relation between the (µ,ξ) and (ρ,θ) parameter spaces. After an interpretation of the warp model is presented, a procedure to derive the transform function and a central-coarse/peripheral-fine Hough transform function are introduced. Then, in order to realize a dynamic control mechanism, it is proposed that shifting the origin of the pattern space leads to a sinusoidal modification of the Hough parameter space.

  • M-LCELP Speech Coding at 4kb/s with Multi-Mode and Multi-Codebook

    Kazunori OZAWA  Masahiro SERIZAWA  Toshiki MIYANO  Toshiyuki NOMURA  Masao IKEKAWA  Shin-ichi TAUMI  

     
    PAPER

      Vol:
    E77-B No:9
      Page(s):
    1114-1121

    This paper presents the M-LCELP (Multi-mode Learned Code Excited LPC) speech coder, which has been developed for the next-generation half-rate digital cellular telephone systems. M-LCELP employs the following techniques to achieve high-quality synthetic speech at 4 kb/s with practically reasonable computation and memory requirements: (1) multi-mode and multi-codebook coding to improve coding efficiency, (2) pitch-lag differential coding with pitch tracking to reduce the lag transmission rate, (3) a two-stage jointly designed regular-pulse codebook with a common phase structure in voiced frames, to drastically reduce computation and memory requirements, (4) an efficient vector quantization for LSP parameters, and (5) an adaptive MA-type comb filter to suppress inter-harmonic noise in the excitation signal. The MOS subjective test results demonstrate that the 4.075 kb/s M-LCELP synthetic speech quality is mostly equivalent to that of the North American full-rate standard VSELP coder. The M-LCELP codec requires 18 MOPS of computation and has been implemented using two floating-point DSP chips.

  • Speech Recognition of Isolated Digits Using Simultaneous Generative Histogram

    Yasuhisa HAYASHI  Akio OGIHARA  Kunio FUKUNAGA  

     
    LETTER

      Vol:
    E76-A No:12
      Page(s):
    2052-2054

    We propose a recognition method for HMMs using a simultaneous generative histogram. The proposed method uses the correlation between two features, which is expressed by a simultaneous generative histogram; the output probabilities of the integrated HMM are conditioned on the codeword of the other feature. The proposed method is applied to isolated digit word recognition to confirm its validity.

  • ECG Data Compression by Using Wavelet Transform

    Jie CHEN  Shuichi ITOH  Takeshi HASHIMOTO  

     
    PAPER

      Vol:
    E76-D No:12
      Page(s):
    1454-1461

    A new method for the compression of electrocardiographic (ECG) data is presented. The method is based on the orthonormal wavelet analysis recently developed in applied mathematics. Using the wavelet transform, the original signal is decomposed into a set of sub-signals with different frequency channels corresponding to different physical features of the signal. Using an optimum bit allocation scheme, each decomposed sub-signal is treated according to its contribution to the total reconstruction distortion and to the bit rate. In our experiments, compression ratios (CR) from 13.5:1 to 22.9:1, with corresponding percent rms differences (PRD) between 5.5% and 13.3%, were obtained at a clinically acceptable signal quality. Experimental results show that the proposed method is suitable for the compression of ECG data in the sense of high compression ratio and high speed.
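
    A minimal decompose-then-quantize sketch using the PyWavelets library is shown below; the wavelet, depth, and per-band bit allocation are illustrative assumptions, not the paper's optimized allocation.

```python
import numpy as np
import pywt  # PyWavelets

def compress_ecg(signal, wavelet="db4", level=5, bits=(8, 6, 5, 4, 3, 2)):
    """Wavelet-decompose the ECG signal and uniformly quantize each subband
    with its own bit budget, then reconstruct."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)  # [cA5, cD5, ..., cD1]
    quantized = []
    for band, b in zip(coeffs, bits):
        step = max(float(np.abs(band).max()), 1e-12) / (2 ** (b - 1) - 1)
        quantized.append(np.round(band / step) * step)   # quantize, dequantize
    return pywt.waverec(quantized, wavelet)
```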

  • Coding of LSP Parameters Using Interframe Moving Average Prediction and Multi-Stage Vector Quantization

    Hitoshi OHMURO  Takehiro MORIYA  Kazunori MANO  Satoshi MIKI  

     
    LETTER

      Vol:
    E76-A No:7
      Page(s):
    1181-1183

    This letter proposes an LSP quantization method that uses the interframe correlation of the parameters. The quantized parameters are represented as a moving average of code vectors. Using this method, LSP parameters are quantized efficiently, and the degradation of decoded parameters caused by bit errors affects only a few subsequent frames.
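
    A hedged sketch of the decoder side of such moving-average prediction follows; the MA coefficient values would be design parameters, and none are given in the abstract.

```python
import numpy as np

def ma_decode(code_vectors, ma_coeffs):
    """Each frame's decoded LSP vector is a weighted sum of the current and
    the last few transmitted code vectors, so a bit error can corrupt only
    as many frames as the MA order."""
    order = len(ma_coeffs) - 1
    past = [np.zeros_like(code_vectors[0])] * order      # newest first
    decoded = []
    for cv in code_vectors:
        taps = [cv] + past
        decoded.append(sum(a * v for a, v in zip(ma_coeffs, taps)))
        past = taps[:order]                              # drop the oldest
    return decoded
```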

  • Subband Coding of Super High Definition Images Using Entropy Coded Vector Quantization

    Mitsuru NOMURA  Isao FURUKAWA  Tetsurou FUJII  Sadayasu ONO  

     
    PAPER-Image Coding and Compression

      Vol:
    E75-A No:7
      Page(s):
    861-870

    This paper discusses the bit-rate compression of super high definition still images with subband coding. Super high definition (SHD) images, with a resolution of more than 2048×2048 pixels, are introduced as the next-generation imaging system beyond HDTV. In order to develop bit-rate reduction algorithms, an image evaluation system for super high definition images was assembled. Signal characteristics are evaluated, and the optimum subband analysis/synthesis system for SHD images is identified. Scalar quantization combined with run-length and Huffman coding is introduced as a conventional subband coding algorithm, and its coding performance is evaluated for SHD images. Finally, new coding algorithms based on block Huffman coding and entropy-coded vector quantization are proposed. SNR improvements of 0.5 dB and 1.0 dB are achieved with the proposed block Huffman coding and vector quantization algorithms, respectively.
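
    As a small illustration of the conventional pipeline's run-length stage (the Huffman stage is omitted), under assumed conventions:

```python
def run_length_encode(q_coeffs):
    """Code zero runs in the quantized subband coefficients as (run, value)
    pairs, which a Huffman coder would then entropy-code. The trailing-zeros
    convention is an assumption."""
    pairs, run = [], 0
    for v in q_coeffs:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append((run, 0))    # flush trailing zeros
    return pairs

# Usage: run_length_encode([0, 0, 5, 0, -3, 0, 0, 0]) -> [(2, 5), (1, -3), (3, 0)]
```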

  • An SVQ-HMM Training Method Using Simultaneous Generative Histogram

    Yasuhisa HAYASHI  Satoshi KONDO  Nobuyuki TAKASU  Akio OGIHARA  Shojiro YONEDA  

     
    LETTER

      Vol:
    E75-A No:7
      Page(s):
    905-907

    This study proposes a new training method for hidden Markov models with separate vector quantization (SVQ-HMM) in speech recognition. The proposed method uses the correlation between two different kinds of features, cepstrum and delta-cepstrum, to decrease the number of re-estimations for the two features and thus the total computation time for training the models. The proposed method is applied to Japanese isolated digit recognition.

  • Image Compression and Regeneration by Nonlinear Associative Silicon Retina

    Mamoru TANAKA  Yoshinori NAKAMURA  Munemitsu IKEGAMI  Kikufumi KANDA  Taizou HATTORI  Yasutami CHIGUSA  Hikaru MIZUTANI  

     
    PAPER-Neural Systems

      Vol:
    E75-A No:5
      Page(s):
    586-594

    There are two types of nonlinear associative silicon retinas. One is a sparse Hopfield-type neural network, called an H-type retina, and the other is its dual network, called a DH-type retina. The input information sequences of H-type and DH-type retinas are given at nodes and links as voltages and currents, respectively. The error-correcting capacity (minimum basin of attraction) of H-type and DH-type retinas is determined by the minimum number of links in a cutset and in a loop, respectively. The operating principle of the regeneration is based on the voltage or current distribution of the neural field. The most important nonlinear operation in the retinas is a dynamic quantization that decides the binary value of each neuron's output from its neighbors' values. Also, edges are emphasized by a line process. The compression rates of the H-type and DH-type retinas used in the simulation are 1/8 and (2/3)·(1/8), respectively, where 2/3 and 1/8 are the rates of structural and binarizational compression. The simulation results were interesting and significant enough to justify fabricating a chip.
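
    A hedged software sketch of neighbor-based dynamic quantization follows; the window size and the mean-threshold rule are assumptions about the retina's analog operation.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def dynamic_quantize(image, size=3):
    """Binarize each pixel against the local mean of its neighborhood rather
    than a single global threshold."""
    local_mean = uniform_filter(image.astype(float), size=size)
    return (image > local_mean).astype(np.uint8)
```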

  • Perceptually Transparent Coding of Still Images

    V. Ralph ALGAZI  Todd R. REED  Gary E. FORD  Eric MAURINCOMME  Iftekhar HUSSAIN  Ravindra POTHARLANKA  

     
    PAPER

      Vol:
    E75-B No:5
      Page(s):
    340-348

    The encoding of high-quality and super high definition images requires new approaches to the coding problem. The nature of such images and the applications in which they are used prohibit the introduction of perceptible degradation by the coding process. In this paper, we discuss techniques for the perceptually transparent coding of images. Although technically lossy methods, images encoded and reconstructed using these techniques appear identical to the original images. The reconstructed images can be postprocessed (e.g., enhanced via anisotropic filtering) due to the absence of the structured errors commonly introduced by conventional lossy methods. The compression ratios obtained are substantially higher than those achieved using lossless means.

