Aboul-Ella HASSANIEN Masayuki NAKAJIMA
In this paper, a new snake model for image morphing with semiautomated delineation, based on Hermite interpolation theory, is presented. The snake model is used to specify the correspondence between features in two given images. It allows a user to extract a contour that defines a facial feature, such as the lips, mouth, or profile, by specifying only the endpoints of the contour around the feature to be defined. We assume that the user can specify the endpoints of a curve around the feature, which serve as the extremities of the contour. The proposed method automatically computes the image information around these endpoints, which provides the boundary conditions. The contour is then optimized by taking this information into account near its extremities. During the iterative optimization process, the image forces are turned on progressively from the contour extremities toward the center to define the exact position of the feature. The proposed algorithm helps the user easily define the exact position of a feature and may also reduce the time required to establish the features of an image.
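The Hermite-based contour initialization can be sketched as follows. This is a minimal illustration of cubic Hermite interpolation between the two user-specified endpoints, not the full model; the tangent vectors `m0` and `m1` stand in for the image information that the method computes automatically around the endpoints:

```python
def hermite_point(p0, p1, m0, m1, t):
    """Cubic Hermite interpolation between endpoints p0 and p1
    with tangents m0 and m1, evaluated at parameter t in [0, 1]."""
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return tuple(h00 * a + h10 * b + h01 * c + h11 * d
                 for a, b, c, d in zip(p0, m0, p1, m1))

def initial_contour(p0, p1, m0, m1, n=20):
    """Sample an initial snake contour along the Hermite curve."""
    return [hermite_point(p0, p1, m0, m1, i / (n - 1)) for i in range(n)]
```

The curve interpolates the endpoints exactly, which is what makes the endpoints usable as fixed boundary conditions for the subsequent optimization.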
Takeyasu SAKAI Hiromasa NAGAI Takashi MATSUMOTO
A multi-input floating-gate differential amplifier (FGDA) is proposed that can perform any convolution operation with a differential structure and a feedback loop. All operations are in the voltage mode. Only one terminal is required for the negative feedback, which suppresses distortions due to mismatches of active elements. Possible applications include an intelligent image sensor in which a fully parallel DCT operation can be performed. A prototype chip has been fabricated and is functional; a preliminary test result is reported.
We discuss optimal estimation of the current location of a mobile robot by matching an image of the scene taken by the robot against a model of the environment. We first present a theoretical accuracy bound and then give a method that attains that bound; the method can be viewed as describing the probability distribution of the current location. Using real images, we demonstrate that our method is superior to the naive least-squares method. We also confirm our theoretical predictions by applying the bootstrap procedure.
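The bootstrap step can be illustrated with a minimal sketch: resample the data with replacement and observe the spread of the estimator. The location readings and the use of the sample mean below are hypothetical; the paper applies the procedure to its own location estimator:

```python
import random

def bootstrap_std(samples, estimator, n_boot=1000, seed=0):
    """Estimate the standard error of `estimator` by resampling
    the data with replacement (the bootstrap)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(samples) for _ in samples]
        estimates.append(estimator(resample))
    mean = sum(estimates) / n_boot
    var = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return var ** 0.5

# Hypothetical noisy 1-D "location" readings; estimator is the sample mean.
readings = [4.9, 5.1, 5.0, 4.8, 5.2, 5.0, 4.95, 5.05]
se = bootstrap_std(readings, lambda s: sum(s) / len(s))
```

Comparing such bootstrap spreads against a theoretical bound is one standard way to check whether an estimator actually attains the bound.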
Noise greatly degrades the image quality and the performance of image compression algorithms. This paper presents an approach for the representation and compression of noisy synthetic images. A new concept, the region-based prediction (RBP) model, is first introduced, and the model is then applied to noisy images. In conventional predictive coding techniques, the context for prediction is always composed of individual pixels surrounding the pixel to be processed; the RBP model uses regions instead of individual pixels as the context for prediction. An algorithm implementing RBP is proposed and applied to noisy synthetic images in our experiments. Using RBP to compute the residual data and encoding it, we achieve a bit rate of 1.10 bits/pixel for the noisy synthetic image. The decompressed image achieves a peak SNR of 42.59 dB; compared with a peak SNR of 41.01 dB for the noisy synthetic image, the quality of the decompressed image is thus improved by 1.58 dB in the MSE sense. In contrast to the proposed algorithm, with its improvement in image quality, conventional coding methods can compress the image data only at the expense of lower image quality: at the same bit rate, the JPEG image compression standard provides a peak SNR of 33.17 dB for the noisy synthetic image, and a conventional median filter with a 3×3 window provides a peak SNR of 25.89 dB.
Dong Joong KANG Chang Yong KIM Yang Seok SEO In So KWEON
A discrete dynamic model for delineating contours in 2-D medical images is presented. The active contour is optimized by a dynamic programming algorithm, for which a new constraint with fast and stable convergence properties is introduced. The internal energy of the model depends on the local behavior of the contour, while the external energy is derived from image features. The algorithm rapidly detects both convex and concave objects, even when the image quality is poor.
Hisashi INOUE Akio MIYAZAKI Akihiro YAMAMOTO Takashi KATSURA
In this paper, we propose two methods of digital watermarking for image signals based on the wavelet transform. We classify wavelet coefficients as insignificant or significant by using the zerotree defined in the embedded zerotree wavelet (EZW) algorithm. In the first method, information data are embedded as a watermark in the locations of insignificant coefficients. In the second method, information data are embedded by thresholding and modifying significant coefficients at the coarser scales, in perceptually important spectral components of the image signal. The information data are detected by using the positions of the zerotree roots and the threshold value after the wavelet decomposition of the image in which the data are hidden. Numerical experiments show that the proposed methods can extract the watermark from images that have been degraded by several common signal-processing and geometric procedures.
We propose a new method of progressive transmission of continuous-tone images using a multi-level error-diffusion method. Assuming that the pixels are ordered and the error is diffused to later pixels, multi-level error-diffused images are resolved into a number of bit planes. For an image with 8 bits per pixel, nine bit planes are constructed, and the 2-level, 3-level, 5-level, … error-diffused images are produced by successive use of the bit planes. The original image is finally reconstructed exactly.
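The basic mechanism can be sketched for the 2-level, 1-D case; the paper's method generalizes this to multiple quantization levels over a 2-D image and resolves the results into bit planes:

```python
def error_diffuse_row(row, levels=2):
    """1-D error diffusion: quantize each pixel to the nearest of
    `levels` equally spaced values in [0, 255] and push the
    quantization error onto the next pixel."""
    step = 255 / (levels - 1)
    out, err = [], 0.0
    for v in row:
        v = v + err                      # add the diffused error
        q = round(v / step) * step       # nearest quantizer level
        q = min(255.0, max(0.0, q))
        out.append(q)
        err = v - q                      # diffuse the residual forward
    return out
```

Because the error is carried forward rather than discarded, the local average of the output tracks the local average of the input, which is what makes the coarse levels usable as early stages of a progressive transmission.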
Hiroshi OHYAMA Tadahiko KIMOTO Shin'ichi USUI Toshiaki FUJII Masayuki TANIMOTO
A fractal image coding scheme using classified range regions is proposed. Two classes of range regions, shade and nonshade, are defined: a shade range region is encoded by its average gray level, while a nonshade range region is encoded by IFS parameters. To obtain classified range regions, a two-stage block merging scheme is proposed. Each range region is produced by merging primitive square blocks: shade range regions are obtained at the first stage, and nonshade range regions are obtained from the remaining primitive blocks at the second stage. Furthermore, to increase the variety of region shapes, an 8-directional block merging scheme is defined as an extension of the 4-directional scheme. Two similar schemes for encoding region shapes, corresponding to the 4-directional and 8-directional block merging schemes, are also proposed. Simulation results on a test image demonstrate that the variety of region shapes allows large shade range regions to be extracted efficiently, and that these large shade range regions are more effective than large nonshade range regions in reducing the total amount of code bits with less degradation of reconstructed image quality. The 8-directional merging and coding scheme and the 4-directional scheme show almost the same coding performance, which is better than that of the quad-tree partitioning scheme, and the two schemes achieve almost the same reconstructed image quality.
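The shade/nonshade classification can be sketched as a variance test on a range block; the threshold value below is a hypothetical choice, not taken from the paper:

```python
def classify_block(block, shade_threshold=25.0):
    """Classify a range block as 'shade' (nearly uniform, encoded by
    its average gray level) or 'nonshade' (encoded by IFS parameters)
    based on the variance of its pixels."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    if var <= shade_threshold:
        return "shade", mean          # code: just the average gray level
    return "nonshade", None          # code: search for IFS parameters

flat = [[120, 121, 119, 120]] * 4
textured = [[0, 255, 0, 255], [255, 0, 255, 0]] * 2
```

Merging adjacent primitive blocks that classify the same way is then what produces the large shade regions that the scheme exploits.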
In this paper we introduce a new method for signature pattern recognition that takes advantage of image moment transformations combined with a fuzzy-logic approach. For this purpose, we first model the noise inherently embedded in signature patterns and separate it from environmental effects. Based on the results of this first step, we perform a mapping into the unit circle using the least mean square (LMS) error criterion, to get rid of the variations caused by shifting or scaling. We then derive orientation-invariant moments introduced in earlier reports and study their statistical properties in our particular input space. Next, we define a fuzzy complex space, together with a fuzzy complex similarity measure in this space, and construct a new training algorithm based on the fuzzy learning vector quantization (FLVQ) method. A comparison method is also proposed so that any input pattern can be compared to the learned prototypes through the predefined fuzzy similarity measure. Each set of the above image moments was used by the fuzzy classifier separately, and the misclassifications were counted as a measure of error magnitude. The efficiency of the proposed FLVQ model is shown numerically in comparison with the conventional FLVQs reported so far. Finally, satisfactory results are derived and a comparison is made among the image transformations considered.
A unified source coding method is highly desirable for the many systems that deal with images ranging from 1-bit/pel bi-level documents to SHD (Super High Definition) images with 12 bits/pel for each color component, and progressive coding that allows images to be reconstructed with increasing pixel accuracy or spatial resolution is essential for many applications, including the World Wide Web, medical image archives, digital libraries, pre-press, and quick-look applications. In this paper, we propose a unified continuous-tone and bi-level image coding method with a pyramidal, progressive transmission feature. A hierarchical structure is constructed by interlaced subsampling, and each level of the hierarchy is encoded by DPCM combined with a reduced Markov model. Simulation results show that the proposed method is slightly inferior to JBIG for bi-level image coding but achieves a better lossless compression ratio for gray-level image coding than CREW, which exploits a wavelet transform to construct its hierarchical structure.
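The interlaced subsampling that builds the hierarchy can be sketched as a polyphase split; this toy version uses a single two-by-two interlace level and shows that reconstruction is exact, which is what makes the scheme lossless:

```python
def interlace_decompose(img):
    """Split an image into four interlaced subsampled components
    (even/even, even/odd, odd/even, odd/odd pixel positions).
    The even/even component serves as the coarser hierarchy level."""
    comps = {}
    for dy in (0, 1):
        for dx in (0, 1):
            comps[(dy, dx)] = [row[dx::2] for row in img[dy::2]]
    return comps

def interlace_reconstruct(comps, h, w):
    """Re-interleave the components into the original image exactly."""
    img = [[0] * w for _ in range(h)]
    for (dy, dx), comp in comps.items():
        for i, row in enumerate(comp):
            for j, v in enumerate(row):
                img[2 * i + dy][2 * j + dx] = v
    return img
```

In the full method each component is further encoded by DPCM, with the coarser level available as prediction context for the finer ones.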
Retrieving the unknown parameters of scattering objects from measured field data is the subject of microwave imaging, which is naturally and usually posed as an optimization problem. In this paper, a micro genetic algorithm coupled with a deterministic method is applied to the shape reconstruction of perfectly conducting cylinders. The combined approach, with a very small population as in the micro genetic algorithm, performs much better than conventional large-population genetic algorithms (GAs) in reaching the optimal region. In addition, we propose a criterion for switching from the micro GA to the deterministic optimizer: the micro GA is utilized to effectively locate the vicinity of the global optimum, while the deterministic optimizer is employed to efficiently reach the optimum once inside this region. The combined approach therefore converges to the optimum much faster than the micro GA alone. The proposed approach is first tested on a function optimization problem and then applied to reconstruct perfectly conducting cylinders from both synthetic and real data. Satisfactory results are obtained in both cases, demonstrating the validity and effectiveness of the proposed approach.
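The hybrid strategy can be sketched on a toy 1-D objective. The population size of five, the stall-based switching criterion, and the gradient-descent refinement below are all illustrative assumptions, not the paper's actual operators:

```python
import random

def micro_ga_hybrid(f, grad, lo, hi, seed=0):
    """Micro-GA phase (elitist restarts with a population of 5) locates
    the vicinity of the optimum; when improvement stalls, switch to a
    deterministic gradient descent to refine it."""
    rng = random.Random(seed)
    best = min((rng.uniform(lo, hi) for _ in range(5)), key=f)
    stall = 0
    while stall < 3:                 # switching criterion: 3 stalled restarts
        # restart: keep the elite, re-randomize the rest of the tiny population
        pop = [best] + [rng.uniform(lo, hi) for _ in range(4)]
        cand = min(pop, key=f)
        if f(cand) < f(best) - 1e-6:
            best, stall = cand, 0
        else:
            stall += 1
    # deterministic refinement from the vicinity found by the micro GA
    x = best
    for _ in range(200):
        x -= 0.1 * grad(x)
    return x

x_opt = micro_ga_hybrid(lambda x: (x - 3.0) ** 2, lambda x: 2 * (x - 3.0), -10, 10)
```

The point of the split is that the stochastic phase is cheap at escaping local basins while the deterministic phase converges fast once the right basin is found.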
Supatana AUETHAVEKIAT Kiyoharu AIZAWA Mitsutoshi HATORI
A novel algorithm for improving a compressed image sequence by merging it with a reference image is presented. A high-quality still image of the same scene is used as the reference image, and the degraded images are improved by merging the reference image with them. The amount of merging is controlled by the resemblance between the reference image and the compressed image after motion compensation. Experiments conducted on sequences of JPEG images are reported. The technique requires no prior knowledge of the compression method, so it can be applied to other compression techniques as well.
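The merging step can be sketched block-wise: blend the compressed block with the motion-compensated reference block, weighting the reference more heavily when the two resemble each other. The mean-absolute-difference weighting function below is a hypothetical choice, not the paper's exact control law:

```python
def merge_block(compressed, reference, k=10.0):
    """Blend a compressed block with the (motion-compensated)
    reference block; the blending weight grows with their
    resemblance, measured here by mean absolute difference."""
    n = len(compressed)
    mad = sum(abs(c - r) for c, r in zip(compressed, reference)) / n
    w = 1.0 / (1.0 + mad / k)       # w -> 1 when the blocks resemble each other
    return [w * r + (1.0 - w) * c for c, r in zip(compressed, reference)]
```

When the motion-compensated reference matches the block well, the output is pulled toward the high-quality reference; where they disagree (e.g., scene changes), the compressed data dominates and no false detail is introduced.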
Sanghyun JOO Hisakazu KIKUCHI Shigenobu SASAKI Jaeho SHIN
A zerotree image-coding scheme is introduced that effectively exploits the inter-scale self-similarities found in the octave decomposition by a wavelet transform. A zerotree is useful for efficiently coding wavelet coefficients; its efficiency was proved by Shapiro's EZW. In the EZW coder, wavelet coefficients are symbolized and then entropy-coded for further compression. In this paper, we analyze the symbols produced by the EZW coder and discuss the entropy per symbol. We modify the procedure used for symbol-stream generation to produce lower entropy. First, we modify the fixed relation between a parent and its children used in the EZW coder to raise the probability that a significant parent has significant children. This relation is then flexibly adjusted based on the observation that a significant coefficient is likely to have other significant coefficients in its neighborhood. The three relations are compared in terms of the number of symbols they produce.
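The fixed parent-child relation that the paper sets out to modify can be sketched, together with a zerotree test on a square coefficient array (the DC coefficient's children are handled as a special case, as in EZW):

```python
def children(i, j, size):
    """Children of coefficient (i, j) in the next finer subband of an
    octave wavelet decomposition (the fixed EZW parent-child relation)."""
    if (i, j) == (0, 0):                 # DC coefficient: special case
        cs = [(0, 1), (1, 0), (1, 1)]
    else:
        cs = [(2 * i, 2 * j), (2 * i, 2 * j + 1),
              (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]
    return [(y, x) for y, x in cs if y < size and x < size]

def is_zerotree_root(coeff, i, j, threshold, size):
    """(i, j) is a zerotree root if it and all its descendants are
    insignificant with respect to the threshold."""
    if abs(coeff[i][j]) >= threshold:
        return False
    return all(is_zerotree_root(coeff, y, x, threshold, size)
               for y, x in children(i, j, size))
```

One zerotree-root symbol replaces an entire subtree of insignificant coefficients, which is exactly the redundancy the modified parent-child relations aim to exploit more often.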
A method for visualizing multimodal images as a single monochromatic image is presented, based on a projection pursuit approach to the inverse process of anisotropic diffusion, an image restoration method that enhances contrast at edges. Extending the projection from a linear one to nonlinear sigmoidal functions enhances the contrast further. A deterministic annealing technique is also incorporated into the optimization process to improve the contrast-enhancement ability of the projection. An application of this method to a pair of MRI brain images reveals promising performance, with superior visualization of tissues.
The objective of thinning is to reduce the amount of information in image patterns to the minimum needed for recognition. A thinned image helps the extraction of important features such as end points, junction points, and connections from image patterns. The ultimate goal of parallel thinning algorithms is to minimize the execution time while producing high-quality thinned images. Although much research has been devoted to parallel thinning algorithms, there has been no systematic approach for comparing their execution speed. Several rough comparisons have been made in terms of iteration counts, but such comparisons may be misleading, since the time required per iteration varies from one algorithm to another. This paper proposes a formal method for analyzing the performance of parallel thinning algorithms based on the PRAM (Parallel Random Access Machine) model. The quality of skeletons, robustness to boundary noise, and execution speed are all considered. Six parallel algorithms that show relatively high performance are selected and analyzed with the proposed method. Experiments show that the proposed analysis method is sufficiently accurate to evaluate the performance of parallel thinning algorithms.
The performance of AC plasma displays has improved in brightness and contrast, but significant advances in image quality are still required to reach HDTV quality. In particular, in full-color motion video, motion artifacts and a lack of color depth are still visible in some situations. These motion artifacts are mitigated as the number of subfields increases, usually at the cost of reduced brightness or additional driving circuitry. It therefore remains a major concern to find the optimal subfield configuration through the weighting and ordering of each subfield and the coding of their combinations. To evaluate and reduce motion picture disturbance, we have established a procedure that fully simulates the image quality of displays that use the subfield driving scheme. The simulation features virtually located sensor pixels on the human retina, eye-tracking sensor windows, and a built-in spatial low-pass filter. The model pixelizes the observer's retina like the sensor chip of a CCD camera. An eye-tracking sensor window is assigned to every light emission from the display, to integrate the emissions from one to four adjoining pixels along the trajectory of motion. Through this model, a scene from the original motion picture, free of disturbance, is transformed into a still image with the disturbance simulated. The integration of the light emissions from adjoining pixels through the window also acts as a built-in spatial low-pass filter that secures robust output, in keeping with the MTF of the human eye. The simulation and actual 42-in-diagonal PDPs gave close results under various conditions, showing that the model is simple but reasonable. Through the simulation, general properties of the subfield driving scheme for gray scale have been elucidated; for example, a PWM-like coding offers better performance than an MSB-split coding in many cases. The simulation also characterizes motion picture disturbance as a nonlinear filtering process caused by the dislocation of bit weightings, suggesting that tradeoffs between disturbance and resolution in moving areas are unavoidable.
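The eye-tracking integration at the heart of such a simulation can be sketched in one dimension: the perceived intensity is the sum of subfield emissions collected from whichever pixel the tracking gaze is over at each emission time. The binary-weighted subfields and evenly spaced emission times below are illustrative assumptions, and the sketch omits the spatial low-pass filter:

```python
def perceived_row(frame, weights, times, v):
    """Simulate the perceived intensity for eye tracking of motion at
    v pixels/frame: subfield k's light (weight weights[k], emitted at
    fractional frame time times[k]) is collected from the pixel the
    gaze is over at that emission time."""
    n = len(frame)
    out = []
    for x in range(n):
        total = 0
        for k, (w, t) in enumerate(zip(weights, times)):
            src = int(round(x + v * t)) % n   # position along the trajectory
            if frame[src] & (1 << k):          # is subfield k lit at that pixel?
                total += w
        out.append(total)
    return out
```

With binary weights (weight of subfield k equal to 2^k), a static image (v = 0) is reproduced exactly; moving edges mix bits from neighboring pixels, which is the source of the simulated disturbance.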
Masahiro OKUDA Masaaki IKEHARA Shin-ichi TAKAHASHI
Since IIR filters have lower computational complexity than FIR filters, several design methods for IIR filter banks have been presented in the recent literature. Smith et al. proposed a class of linear-phase IIR filter banks; however, their method restricts the order of the numerator to be odd and has some drawbacks. In this paper, we present two design methods for linear-phase IIR filter banks. One is based on the Lagrange multiplier method and yields IIR filter banks that are optimal in the least-squares sense. In the other approach, IIR filter banks with the maximum number of zeros are derived analytically.
Image classification is an important task in document image analysis and understanding, page-segmentation-based document image compression, and image retrieval. In this paper, we present a new approach for distinguishing textual images from pictorial images using the Kolmogorov complexity (KC) measure with randomly extracted blocks. In this approach, a number of blocks are extracted randomly from a binarized image, and each block is converted into a one-dimensional binary sequence by horizontal or vertical scanning. The complexities of these blocks are then computed, and the mean and standard deviation of the block complexities are used to classify the image as textual or pictorial based on two simple fuzzy rules. Experimental results on a variety of textual and pictorial images show that the KC measure with randomly extracted blocks correctly classified 29 out of 30 images. The performance of our approach, which needs no explicit training process, compares favorably with that of a neural-network-based approach.
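The block-complexity computation can be sketched using compressed length as a practical stand-in for Kolmogorov complexity, which is itself uncomputable; the block size, block count, and use of zlib below are assumptions for illustration, not the paper's exact measure:

```python
import random
import zlib

def block_complexity(bits):
    """Approximate the Kolmogorov complexity of a binary sequence by
    its zlib-compressed length (KC itself is uncomputable)."""
    return len(zlib.compress(bytes(bits), 9))

def image_complexity_stats(image, block=16, n_blocks=20, seed=0):
    """Mean and standard deviation of the complexities of randomly
    extracted blocks, each scanned row-wise into a 1-D binary sequence."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    cs = []
    for _ in range(n_blocks):
        y = rng.randrange(h - block + 1)
        x = rng.randrange(w - block + 1)
        bits = [image[y + dy][x + dx] for dy in range(block) for dx in range(block)]
        cs.append(block_complexity(bits))
    mean = sum(cs) / n_blocks
    std = (sum((c - mean) ** 2 for c in cs) / n_blocks) ** 0.5
    return mean, std
```

Regular, text-like structure compresses well and yields low mean complexity, while pictorial or noisy content yields higher complexity; simple rules on the mean and standard deviation can then separate the two classes.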
Makoto MIYAHARA Takao INO Hideki SHIRAI Shuji TANIHO Ralph ALGAZI
The coming information society will require images at the high end of the quality range. Using a new method that focuses on the assessment words of high-order sensation, we are investigating the physical factors important for the reproduction of a high-level, high-quality sensation in the electronic capture and display of images. We have found a key assessment word, "image depth," that appropriately describes the high-order subjective sensation indispensable to the display of extra-high-quality images. In relation to image depth, we have discovered a new physical factor, and determined the degree of precision required of already known physical factors, for the display of extra-high-quality images. Cross modulation among the R, G, and B signals is the newly discovered physical factor affecting the quality of an electronic display. In addition, we have found that very strict control of distortion in the gamma and step response, and strict suppression of halation in a CRT display, are necessary; aliasing of the displayed images also destroys the sensation of depth. This paper first outlines the overall objective of our work, then considers the physical factors important for extra-high-quality imaging, and describes the specific effects of cross modulation distortion, gamma, step response, halation, and aliasing as they relate to image depth. Finally, the relation of these physical factors to high-order sensation is discussed broadly.
This paper describes a factorization-based algorithm that reconstructs both 3-D object structure and motion from a set of multiple uncalibrated perspective images. The factorization method introduced by Tomasi and Kanade is generally believed to be applicable only under linear approximations of the imaging system. In this paper we show that the method can be extended to truly perspective images if the projective depths are recovered. We establish this fact by interpreting their purely mathematical theory in terms of the projective geometry of the imaging system, thereby giving physical meaning to the parameters involved. We also provide a method to recover the projective depths using the fundamental matrices and epipoles estimated from pairs of images in the image set. Our method is applicable to the general case in which the images are taken not by a single moving camera but by different cameras with individual camera parameters. The experimental results clearly demonstrate the feasibility of the proposed method.
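The projective factorization relation underlying this line of work can be written out explicitly. With m cameras and n points, homogeneous image coordinates x_ij, camera matrices P_i, and homogeneous 3-D points X_j:

```latex
\lambda_{ij}\,\mathbf{x}_{ij} = P_i \mathbf{X}_j,
\qquad i = 1,\dots,m,\quad j = 1,\dots,n,
```

where the scalars \lambda_{ij} are the projective depths. Collecting the scaled measurements into a single matrix,

```latex
W =
\begin{bmatrix}
\lambda_{11}\mathbf{x}_{11} & \cdots & \lambda_{1n}\mathbf{x}_{1n}\\
\vdots & & \vdots\\
\lambda_{m1}\mathbf{x}_{m1} & \cdots & \lambda_{mn}\mathbf{x}_{mn}
\end{bmatrix}
=
\begin{bmatrix} P_1\\ \vdots\\ P_m \end{bmatrix}
\begin{bmatrix} \mathbf{X}_1 & \cdots & \mathbf{X}_n \end{bmatrix},
\qquad \operatorname{rank} W \le 4.
```

Once the depths \lambda_{ij} are recovered (here, from fundamental matrices and epipoles), the rank-4 matrix W factors into motion and shape, just as in the affine Tomasi-Kanade case.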