Yoshifumi SASAKI Michitaka KAMEYAMA
In robot vision systems, enormous computational power is required to perform three-dimensional (3-D) instrumentation and object recognition. However, conventional software implementations require many kinds of complex and irregular operations to achieve accurate 3-D instrumentation and object recognition. In this paper, a VLSI-oriented Model-Based Robot Vision (MBRV) processor is proposed for high-speed and accurate 3-D instrumentation and object recognition. An input image is compared with two-dimensional (2-D) silhouette images which are generated from the 3-D object models by means of perspective projection. Because the MBRV algorithm always gives the candidates for the accurate 3-D instrumentation and object recognition result with simple and regular procedures, it is suitable for implementation as a VLSI processor. A highly parallel architecture is employed in the VLSI processor to reduce the latency between image acquisition and the output of the 3-D instrumentation and object recognition results. As a result, 3-D instrumentation and object recognition can be performed 10000 times faster than on a 28.5 MIPS workstation.
Hsiao-Jing CHEN Yoshiaki SHIRAI
A method is presented to perform image segmentation by accumulatively observing apparent motion in a long image sequence of a dynamic scene. In each image in the sequence, locations are grouped into small patches of approximately uniform optical flow. To reduce the noise in computed flow vectors, a local image motion vector of each patch is computed by averaging flow vectors in the corresponding patches in several images. A segment contains patches belonging to the same 3-D plane in the scene. Initial segments are obtained in the image, and then an attempt to merge or split segments is iterated to update the segments. In order to remove inherent ambiguities in motion-based segmentation, temporal coherence between the local image motion of a patch and the apparent motion of every plane is investigated over a long time. In each image, a patch is grouped into the segment of the plane whose apparent motion is temporally most coherent with the local image motion of the patch. When apparent motions of two planes are temporally incoherent, segments of the planes are retained as individual ones.
Saprangsit MRUETUSATORN Hirotsugu KINOSHITA Yoshinori SAKAI
This paper discusses a new image resolution conversion method which converts not only spatial resolution but also amplitude resolution. This method involves considering impulse responses of image devices and human visual characteristics, and can preserve high image quality. This paper considers a system that digitizes the multilevel input image with high spatial resolution and low amplitude resolution using an image scanner, and outputs the image with low spatial resolution and high amplitude resolution on a CRT display. The algorithm thus reduces the number of pixels while increasing the number of brightness levels. Since a CRT display is chosen as the output device, the distribution of each spot in the display, which is modeled as a Gaussian function, is taken as the impulse response. The output image is then expressed as the summation of various amplitudes of the impulse response. Furthermore, human visual perception, which bears a nonlinear relationship to the spatial frequency component, is simplified and modeled with a cascade combination of low-pass and high-pass filters. The output amplitude is determined so that the error between the output image and the input image, after passing through the visual perception filter, is minimized. According to the results of a simulation, it is shown that image quality can be largely preserved by the proposed method, while significant image information is lost by conventional methods.
Sadayuki HONGO Isamu YOROIZAWA
We propose a fast computation method of stochastic relaxation for the continuous-valued Markov random field (MRF) whose energy function is represented in the quadratic form. In the case of regularization in visual information processing, the probability density function of a state transition can be transformed to a Gaussian function. Therefore, the probabilistic state transition is realized with Gaussian random numbers whose mean value and variance are calculated based on the condition of the input data and the neighborhood. Early visual information processing can be represented with a coupled MRF model which consists of continuity and discontinuity processes. Each of the continuity or discontinuity processes represents a visual property, such as an intensity pattern, or a discontinuity of the continuity process. Since most of the energy functions for early visual information processing can be represented by the quadratic form in the continuity process, the probability density of local computation variables in the continuity process is equivalent to the Gaussian function. If we use this characteristic, it is not necessary for the discrimination function computation to calculate the summation of the probabilities corresponding to all possible states. Therefore, the computation load for the state transition is drastically decreased. Furthermore, if the continuous-valued discontinuity process is introduced, the MRF model can directly represent the strength of discontinuity. Moreover, the discrimination function of this energy function in the discontinuity process, which is linear, can also be calculated without probability summation. In this paper, a fast method for calculating the state transition probability for the continuous-valued MRF in visual information processing is theoretically explained.
Next, initial condition dependency, computation time and dependency on the statistical estimation of the condition are investigated in comparison with conventional methods using the examples of the data restoration for a corrupted square wave and a corrupted one-dimensional slice of a natural image.
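The key step described above, replacing a sum over all candidate states with a single Gaussian draw, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a 1-D continuity process only, with the hypothetical energy E(x) = Σ (x_i − d_i)² / (2·sigma2) + (lam/2) Σ (x_i − x_{i+1})², for which the conditional density of x_i at temperature T is Gaussian. The names `conditional_gaussian` and `relax`, the parameters `sigma2` and `lam`, and the geometric cooling schedule are all assumptions made for illustration.

```python
import random

def conditional_gaussian(x, d, i, sigma2, lam, temperature):
    """Mean and variance of x[i] given its neighbours for the assumed
    quadratic energy (data term plus first-order smoothness term)."""
    neighbours = [x[j] for j in (i - 1, i + 1) if 0 <= j < len(x)]
    # Completing the square in x[i] gives a Gaussian with this precision.
    precision = 1.0 / sigma2 + lam * len(neighbours)
    mean = (d[i] / sigma2 + lam * sum(neighbours)) / precision
    return mean, temperature / precision

def relax(d, sigma2=1.0, lam=2.0, t_start=1.0, t_end=0.01, sweeps=200, seed=0):
    """Stochastic relaxation: each site update is one Gaussian random draw,
    so no summation over all possible states is needed."""
    rng = random.Random(seed)
    x = list(d)  # start from the observed (corrupted) data
    for s in range(sweeps):
        # Geometric cooling from t_start to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (s / max(sweeps - 1, 1))
        for i in range(len(x)):
            mean, var = conditional_gaussian(x, d, i, sigma2, lam, t)
            x[i] = rng.gauss(mean, var ** 0.5)
    return x
```

Calling `relax` on a corrupted square wave, as in the experiments mentioned above, returns a restored sequence of the same length. The discontinuity process is omitted here for brevity.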
Analysis of satellite images requires classification of image objects. Since different categories may have almost the same brightness or features in high-dimensional remote sensing data, many object categories overlap with each other. How to segment the object categories accurately is still an open question. It is widely recognized that the assumptions required by many classification methods (maximum likelihood estimation, etc.) are suspect for textural features based on image pixel brightness. We propose an image feature based neural network approach for the segmentation of AVHRR images. The learning algorithm is a modified backpropagation with gain and weight decay, since feedforward networks using the backpropagation algorithm have been generally successful and enjoy wide popularity. Destructive algorithms that adapt the neural architecture during training have been developed. A classification accuracy of 100% is reached for a validation data set. The classification result is compared with those of Kohonen's LVQ and a pixel-by-pixel method based on the basic backpropagation algorithm. Visual investigation of the result images shows that our method not only distinguishes categories with similar signatures very well, but is also robust to noise.
Koji NAKAMAE Hirohisa TANAKA Hideharu KUBOTA Hiromu FUJITA
A method to improve the efficiency of dynamic fault imaging (DFI) by fully utilizing the CAD data in the CAD-linked electron beam test system is proposed. In the method, in order to shorten the long acquisition time of the stroboscopic voltage contrast images over the whole area of the chip during the entire test cycle, only the area and phase (time) required for fault tracing are selected by utilizing the CAD data. Furthermore, image processing techniques are combined with the method to improve the efficiency of the DFI. In particular, the signal averaging technique is used in order to improve the signal-to-noise ratio in the stroboscopic images where all voltage information data on the equipotential electrode recognized by the CAD layout data are averaged. This enables us to reduce the acquisition time of images. Moreover, the experimental system is set up so that the image processing can be performed in parallel with the acquisition of the stroboscopic images. The proposed method is applied to part of a 2k-transistor block of a nonpassivated CMOS LSI where a marginal fault is detected. The result shows that the method is an efficient approach to the fully automatic fault diagnosis in the CAD-linked electron beam test system. The proposed method could improve the efficiency of the conventional DFI by a factor of more than 1000.
A new image-based diagnostic method is proposed for use with an E-beam tester. The method features a static fault imaging technique and a navigation map for fault tracing. Static fault imaging with a dc E-beam enables the fast acquisition of images without any additional hardware. Then, guided by the navigation map derived from CAD data, marginal timing faults can be easily pinpointed. A statistical estimation of the average count of static fault images for various LSI circuits shows that the proposed method can diagnose marginal faults by observing fewer than thirty faulty images and that a faulty area can be localized with up to five times fewer observations than with the guided-probe method. The proposed method was applied to a 19k-gate CMOS-logic LSI circuit and a marginal timing fault was successfully located.
A method of range image segmentation using four Markov random fields (MRFs) is described in this paper. MRFs are used in the depth smoothing, gradient smoothing, edge detection and surface type labeling stages. First, the range image and its gradient images are smoothed one after another, preserving jump and roof edges respectively, using the line process concept. Then jump and roof edges are extracted, combined and refined by penalizing undesirable edge patterns. Finally, curvatures are computed and the surface types are labeled according to the signs of the principal curvatures. The surface type labels are refined using winner-takes-all layers in this stage. The final output is a set of regions, each with its exact surface type. An energy function is used to represent the constraints of each stage, and the minimum energy state is found using an iterative method. Several experimental results show the generality of our approach, and the execution speed of the proposed method is faster than that of a typical region merging method. This promises practical applications of our method.
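The labeling rule mentioned above, assigning a surface type from the signs of the two principal curvatures, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `surface_type`, the zero tolerance `eps`, the label names, and the sign convention (negative curvature taken as convex toward the viewer) are all assumptions, and the MRF-based refinement is not shown.

```python
def surface_type(k1, k2, eps=1e-3):
    """Label a surface point from the signs of its principal curvatures.

    Curvatures with magnitude below eps are treated as zero (assumed tolerance).
    """
    def sign(k):
        if abs(k) < eps:
            return 0
        return 1 if k > 0 else -1

    # Sort the signs so the pair is independent of argument order.
    key = tuple(sorted((sign(k1), sign(k2))))
    labels = {
        (-1, -1): "peak",    # both negative: elliptic, convex (assumed convention)
        (-1, 0): "ridge",    # one negative, one flat: cylindrical, convex
        (-1, 1): "saddle",   # opposite signs: hyperbolic
        (0, 0): "plane",     # both flat
        (0, 1): "valley",    # one positive, one flat: cylindrical, concave
        (1, 1): "pit",       # both positive: elliptic, concave
    }
    return labels[key]
```

For example, `surface_type(0.5, -0.2)` returns `"saddle"` and `surface_type(0.0, 0.0)` returns `"plane"`.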
Hironori OKII Noriaki KANEKI Hiroshi HARA Koichi ONO
This paper describes a color segmentation method which is essential for automatic diagnosis of stained images. The method accommodates variation in input images by using a three-layered neural network model. In this network, a back-propagation algorithm was used for learning, and the training data sets of RGB values were selected between the dark and bright images of normal mammary glands. Features of both normal mammary glands and breast cancer tissues stained with hematoxylin-eosin (HE) staining were segmented into three colors. Segmented results indicate that this network model can successfully extract features at various brightness levels and magnifications as long as HE staining is used. Thus, this color segmentation method can accommodate changes in brightness levels as well as hue values of input images. Moreover, this method is robust to variations in the scale and rotation of the targets to be extracted.
Masaya OITA Yoshikazu NITTA Shuichi TAI Kazuo KYUMA
This paper presents a novel model of optical associative memory using an optoelectronic neurochip, which detects and processes a two-dimensional input image at the same time. The original point of this model is that the optoelectronic neurochip allows direct image processing in terms of a parallel input/output interface and parallel neural processing. The operation principle is based on the nonlinear transformation of the input image to the corresponding point attractor of a fully connected neural network. The learning algorithm is simulated annealing, and the energy of the network state is used as its cost function. Computer simulations show its usefulness and that the maximum number of stored images is 150 in the network with 64 neurons. Moreover, we experimentally demonstrate an optical implementation of the model using the optoelectronic neurochip. The chip consists of a two-dimensional array of variable sensitivity photodetectors with 8 × 16 elements. The experimental results show that 3 images of size 8 × 8 were successfully stored in the system. In the case of an input image of size 64 × 64, the estimated processing speed is 100 times higher than that of conventional optoelectronic neurochips.
Shanjun ZHANG Toshio KAWASIMA Yoshinao AOKI
A two-stage cascaded image processing approach to enhance the subtle differences in X-ray CT images is proposed. In the method, an asymmetrical non-linear subfilter is introduced to reduce the noise inherent in the image while preserving local edges and directional structural information. Then, a subfilter is used to compress the global dynamic range of the image and emphasize the details in the homogeneous regions by performing a modular transformation on local image densities. The modular transformation is based on a dynamically defined contrast factor and the histogram distributions of the image. The local contrast factor is described in accordance with Weber's fraction by a two-layer neighborhood system where the relative variances of the medians for eight directions are computed. This method is suitable for low contrast images with wide dynamic ranges. Experiments on X-ray CT images of the head show the validity of the method.
Farhad Fuad ISLAM Keikichi TAMARU
Multiplication-accumulation is the basic computation required for image filtering operations. For real-time image filtering, very high throughput computation is essential. This work proposes a hardware algorithm for an application-specific VLSI architecture which realizes an area-efficient, high-throughput multiplier-accumulator. The proposed algorithm utilizes a priori knowledge of the filter mask coefficients and optimizes the number of basic hardware components (e.g., full adders, pipeline latches, etc.). This results in the minimum-area VLSI architecture under certain input/output constraints.
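To illustrate why a priori knowledge of the mask coefficients can save hardware, the sketch below replaces each multiplication by a fixed set of shifted additions determined by the set bits of the known coefficient. It is only a software analogy under stated assumptions, not the proposed hardware algorithm: the function names, the restriction to non-negative integer coefficients, and the adder estimate in `adder_count` are all hypothetical.

```python
def shift_add_terms(coeff):
    """Bit positions that are set in a non-negative integer coefficient."""
    return [b for b in range(coeff.bit_length()) if (coeff >> b) & 1]

def mac_fixed_mask(pixels, mask):
    """Multiply-accumulate using only shifts and adds.

    Because the mask is known in advance, the shift amounts are constants
    and no general-purpose multiplier is needed.
    """
    acc = 0
    for p, c in zip(pixels, mask):
        for shift in shift_add_terms(c):
            acc += p << shift
    return acc

def adder_count(mask):
    """Rough estimate (assumed): one adder per partial product after the first."""
    terms = sum(len(shift_add_terms(c)) for c in mask)
    return max(terms - 1, 0)
```

For the mask `[1, 2, 1]`, each coefficient has a single set bit, so `adder_count` estimates two adders, whereas a mask such as `[3, 5, 7]` needs more partial products.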
Masaharu AKEI Masato NIWA Mituyoshi SHINONAGA Hiroshi MIYAUCHI Masanori MATUMURA
In the ISAR (Inverse Synthetic Aperture Radar), when a target is to be recognized by use of the radar image produced from the radar echoes, it is important first to estimate the scale of the target. To estimate the scale, the rotating motion of the target must be estimated. This paper describes a method for estimating the scale of the target from the information on the radar image by converting the target figure into a simple model and estimating the rotating motion of the target.
Yasuharu JIN Yuichiro GOTO Yoshiro NISHIMOTO Hiroyuki NAITO Akio IWAKE
As in other fields, the automation of railway maintenance work is a firm requirement. The authors have developed a system for detecting obstacles around a railway for practical railway inspection. The system is based on an original laser-sectioning method and is characterized by high accuracy with a wide view and in-motion operation. It was confirmed that static calibration was performed to an accuracy of within 5 mm. Furthermore, a theoretical estimation predicted that dynamic errors can be eliminated within a resolution of 4 mm by means of rail movement detection. In field tests on the Chuo Line, facilities were successfully inspected at speeds up to 40 km/h.
Yasuko TAKAHASHI Akio SHIO Kenichiro ISHII
The character binarization method MTC is developed for enhancing the recognition of characters in general outdoor images. Such recognition is traditionally difficult because of the influence of illumination changes, especially strong shadow, and also changes in characters, such as apparent character sizes. One way to overcome such difficulties is to restrict the objects to be processed by using strong hypotheses, such as type of object, object orientation and distance. Several systems for automatic license plate reading are being developed using such strong hypotheses. However, their strong assumptions limit their applications and complicate the extension of the systems. The MTC method assumes the most reasonable hypotheses possible for characters: they occupy plane areas, consist of narrow lines, and external shadow is considerably larger than character lines. The first step is to eliminate the effect of local brightness changes by enhancing features including characters. This is achieved by applying mathematical morphology with a logarithmic function. The enhanced gray-scale image is then binarized. Accurate binarization is achieved because local thresholds are determined from the edges detected in the image. The MTC method yields stable binary results under illumination changes and, consequently, ensures high character reading rates. This is confirmed with a large number of images collected under a wide variety of weather conditions. It is also shown experimentally that MTC permits a stable recognition rate even if the characters vary in size.
This paper proposes a new method for automatic improvement in image quality through adjusting the image sharpness. This method does not need prior knowledge about image blur. To improve image quality, the sharpness must be adjusted to an optimal value. This paper shows a new method to evaluate sharpness without MTF. It is considered that the human visual system judges image sharpness mainly based upon edge area features. Therefore, attention is paid to the high spatial frequency components in the edge area. The value is defined by the average intensity of the high spatial frequency components in the edge area. This is called the "image edge sharpness" value. Using several images, edge sharpness values are compared with experimental results for subjective sharpness. According to the experiments, the calculated edge sharpness values show a good linear relation with subjective sharpness. Subjective image sharpness does not have a monotonic relation with subjective image quality. If the edge sharpness value is in a particular range, the image quality is judged to be good. According to the subjective experiments, an optimal edge sharpness value for image quality was obtained. This paper also shows an algorithm to alter an image into one which has another edge sharpness value. By using this algorithm to alter the image so that it attains the optimal edge sharpness value, image sharpness can be optimally adjusted automatically. This new image improving method was applied to several images obtained by scanning photographs. The experimental results were quite good.
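The definition above, the average intensity of high spatial frequency components within the edge area, can be made concrete with a small sketch. The abstract does not specify the filters, so the choices here are assumptions: the edge area is taken as pixels whose central-difference gradient magnitude is at least `edge_thresh`, and the high-frequency component is taken as the absolute response of a 4-neighbour Laplacian. The function name `edge_sharpness` is hypothetical.

```python
def edge_sharpness(img, edge_thresh=10.0):
    """Mean absolute Laplacian over pixels classified as edge area.

    img is a list of rows of gray levels; border pixels are skipped.
    """
    h, w = len(img), len(img[0])
    total, count = 0.0, 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central-difference gradient decides whether this is an edge pixel.
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            if (gx * gx + gy * gy) ** 0.5 >= edge_thresh:
                # 4-neighbour Laplacian as a simple high-pass response.
                lap = (img[y - 1][x] + img[y + 1][x]
                       + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x])
                total += abs(lap)
                count += 1
    return total / count if count else 0.0
```

With this measure, an abrupt step edge scores higher than the same edge spread over several pixels, which matches the intended behaviour of a sharpness value. Any mapping from this value to subjective quality would have to come from experiments such as those described above.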
Masataka AJIRO Hiroyuki MIYATA Takashi KAN Masakazu SOGA Makoto ONO
Since its successful launch in February of 1992, the Japan Earth Resources Satellite-1 (JERS-1) has been sending back high resolution images of the earth for various studies, including the investigation of earth resources, the preservation of environments and the observation of coastal lines. Currently, received images are processed using the Earth Resources Satellite Data Information System (ERSDIS). The ERSDIS is a high speed image processing system utilizing an extended cellular array processor as its main processing module. The extended cellular array processor (CAP), consisting of 4096 processing elements configured into a two-dimensional array, is designed with many capabilities for optimizing parallel processing, targeting large-scale image processing at high speed. This paper describes a typical image processing flow, the structure of the ERSDIS, and the details of the CAP design.
As a new method to generate a homogeneous, random, binary image with a rational power spectrum, this paper proposes a discrete-valued auto-regressive equation, of which random coefficients and white noise excitation are all discrete-valued. The average and spectrum of the binary image are explicitly obtained in terms of the random coefficients. Some computer results are illustrated in figures.
Shigeru AKAMATSU Tsutomu SASAKI Hideo FUKAMACHI Yasuhito SUENAGA
This paper proposes a scheme that offers robust extraction of target images in standard view from input facial images, in order to realize accurate and automatic identification of human faces. A standard view for target images is defined using internal facial features, i.e., the two eyes and the mouth, as steady reference points of the human face. Because reliable detection of such facial features is not an easy task in practice, the proposed scheme is characterized by a combination of two steps: first, all possible regions of facial features are extracted using a color image segmentation algorithm, then the target image is selected from among the candidates defined by tentative combination of the three reference points, through applying the classification framework using the sub-space method. Preliminary experiments on the scheme's flexibility based on subjective assessment indicate a stability of nearly 100% in consistent extraction of target images in the standard view, not only for familiar faces but also for unfamiliar faces, when the input face image roughly matches the front view. By combining this scheme for normalizing images into the standard view with an image matching technique for identification, an experimental system for identifying faces among a limited number of subjects was implemented on a commercial engineering workstation. High success rates achieved in the identification of front view face images obtained under uncontrolled conditions have objectively confirmed the potential of the scheme for accurate extraction of target images.
Toshio WAKAYAMA Toru SATO Iwane KIMURA
Radar imaging is one of the most powerful tools for underground detection. However, the performance of conventional methods is not sufficiently high when the observational direction or the aperture size is restricted. In the present paper, an image reconstruction method based on model fitting with nonlinear least-squares has been developed, which is applicable to arbitrarily arranged arrays. Reconstruction is executed on the assumption that targets consist of discrete point scatterers embedded in a homogeneous medium. Model fitting is iterated as the number of point targets in the assumed model is increased, until the fitting residual stops changing or becomes small enough. A penalty function is used in the nonlinear least-squares to make the algorithm stable. Fundamental characteristics of the method revealed by computer simulation are described. This method produces a much sharper image than that obtained by the conventional aperture synthesis technique.