Kazuharu YAMATO Toshihide ASADA Yutaka HATA
In this letter, we propose an interpolation technique for low-quality fingerprint images that enables highly reliable feature extraction. To improve the feature extraction rate, we extract fingerprint features by referring to both the interpolated image obtained with a directional Laplacian filter and the high-contrast image obtained with histogram equalization. Experimental results show the applicability of our method.
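As a rough illustration of the two reference images combined above, the following Python/NumPy sketch computes a globally histogram-equalized image and a directional second-derivative (Laplacian-like) response along a chosen orientation; the bilinear resampling and the one-dimensional second difference are our assumptions for illustration, not the authors' exact filter.

import numpy as np

def histogram_equalize(img):
    """Global histogram equalization of an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())      # normalize to [0, 1]
    lut = np.round(255 * cdf).astype(np.uint8)
    return lut[img]

def directional_laplacian(img, theta):
    """Second derivative of the image along direction theta (radians),
    approximated by a central difference on a bilinear resampling."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w].astype(np.float64)
    dx, dy = np.cos(theta), np.sin(theta)
    f = img.astype(np.float64)
    def sample(xs, ys):
        xs = np.clip(xs, 0, w - 1); ys = np.clip(ys, 0, h - 1)
        x0 = np.floor(xs).astype(int); y0 = np.floor(ys).astype(int)
        x1 = np.clip(x0 + 1, 0, w - 1); y1 = np.clip(y0 + 1, 0, h - 1)
        fx, fy = xs - x0, ys - y0
        return ((1 - fx) * (1 - fy) * f[y0, x0] + fx * (1 - fy) * f[y0, x1]
                + (1 - fx) * fy * f[y1, x0] + fx * fy * f[y1, x1])
    # f(p + d) - 2 f(p) + f(p - d): one-dimensional Laplacian along direction theta
    return sample(x + dx, y + dy) - 2.0 * f + sample(x - dx, y - dy)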
Hiroshi OHNO Kiyoharu AIZAWA Mitsutoshi HATORI
Fractal image coding using iterated transformations compresses image data by exploiting the self-similarity of an image. Its compression performance has already been discussed in [2] and several other papers. However, the relation between this performance and the self-similarity itself remains unclear. In this paper, we evaluate fractal coding from the perspective of this relationship.
In the design of 3-D filters for detecting a Linear Trajectory Signal (LTS), little attention has been paid to noise-rejection characteristics. In this paper, we treat the noise-rejection ability of a filter that detects an LTS with margins in both its velocity and its direction.
Mitsuru TAKEUCHI Takayoshi KUBONO
This paper describes a simple system for measuring the spatial distributions of the spectral intensities of the AgI-421 nm and AgI-546 nm lines among the many optical spectra emitted from an arc discharge between separating Ag contacts. In order to detect the intensities of the two spectral lines, the prototype equipment has two sets, each assembled from a CCD color linear image sensor, a lens, and optical filters, arranged at right angles to each other. The intensities of the two lines can be recorded in digital memory with 2 ms time resolution over a long arc duration. The recorded digital signals are processed by a personal computer to reconstruct the two spatial distributions of spectral intensity in a cross section of the arc column with the Algebraic Reconstruction Technique.
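For readers unfamiliar with the Algebraic Reconstruction Technique, a minimal Kaczmarz-style ART iteration is sketched below in Python/NumPy; the system matrix A (ray path lengths through pixels), the relaxation factor, and the non-negativity clipping are illustrative assumptions, not details taken from the measurement system described above.

import numpy as np

def art_reconstruct(A, b, n_iter=20, relax=0.5):
    """Algebraic Reconstruction Technique (Kaczmarz iteration).
    A : (n_rays, n_pixels) path lengths of each ray through each pixel.
    b : (n_rays,) measured line integrals of the spectral intensity.
    Returns the flattened cross-sectional intensity distribution."""
    x = np.zeros(A.shape[1])
    row_norm = np.sum(A * A, axis=1)
    for _ in range(n_iter):
        for i in range(A.shape[0]):
            if row_norm[i] == 0.0:
                continue
            # project the current estimate onto the hyperplane of ray i
            x += relax * (b[i] - A[i] @ x) / row_norm[i] * A[i]
        np.clip(x, 0.0, None, out=x)   # intensities are non-negative
    return x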
Toshiaki FUJII Hiroshi HARASHIMA
This paper is concerned with the data compression and interpolation of a multi-view image set. We propose a novel disparity compensation scheme based on geometric relationships. We first investigate the geometric relationship between a point in object space and its projections onto the view images. We then propose a disparity compensation scheme that utilizes the geometric constraints between view images. This scheme compresses the multi-view image set into a structure of triangular patches and the texture data on the patch surfaces. The scheme not only compresses the multi-view image set but also synthesizes view images from arbitrary viewpoints in the viewing zone. It is also fast and compatible with 2-D interframe coding. Finally, we report experiments in which two multi-view image sets were used as original images; the amount of data was reduced to 1/19 and 1/20 with SNRs of 34 dB and 20 dB, respectively.
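The geometric relationship exploited above reduces, for calibrated pinhole cameras, to the fact that all projections of a single object-space point are determined by that point and the cameras' projection matrices. A minimal Python/NumPy sketch of this constraint (assuming known 3x4 projection matrices, which is our simplification) is given below.

import numpy as np

def project_point(P_list, X):
    """Project a 3-D point X into every view of a multi-view set.
    P_list : list of 3x4 camera projection matrices (assumed known).
    The resulting image positions are all constrained by the single point X,
    which is the relationship a geometry-based disparity compensation exploits."""
    Xh = np.append(np.asarray(X, dtype=np.float64), 1.0)
    uv = []
    for P in P_list:
        x = P @ Xh
        uv.append(x[:2] / x[2])
    return np.array(uv)      # (n_views, 2) projections of the same point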
Jong-Il PARK Nobuyuki YAGI Kazumasa ENAMI
This paper describes an image synthesis method based on an estimation of camera parameters. In order to acquire high-quality images by image synthesis, we take into account several constraints, including the angle of view and the synchronization of changes of scale with changes of viewing direction. The proposed method is based on the observation that any camera operation consisting of a change of scale and a pure 3D rotation can be represented by a 2D geometric transformation. This transformation describes the entire synthesis procedure, which consists of locating, synchronizing, and operating on images. The procedure is described in terms of a virtual camera constituted by a virtual viewing point and a virtual image plane. The method can be implemented efficiently in such a way that each image to be synthesized undergoes the transformation only once. The parameters of the image transformation are estimated from the image sequence: correspondence is first established, and the parameters are then estimated by fitting the correspondence data to the transformation model. We present experimental results and show the validity of the proposed method.
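A change of scale combined with a pure rotation about the camera centre can be modelled, for example, by a plane projective (homographic) 2D transformation. The Python/NumPy sketch below fits such a transformation to correspondence data by the standard direct linear transformation; it illustrates only the generic fitting step, not the authors' specific parameterization.

import numpy as np

def fit_homography(src, dst):
    """Least-squares (DLT) fit of a 3x3 projective transformation H with
    dst ~ H @ src, from corresponding points src, dst of shape (N, 2), N >= 4.
    A homography is one possible model for a zoom plus pure rotation."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    A = np.asarray(rows, dtype=np.float64)
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def warp_points(H, pts):
    """Apply H to (N, 2) points and return the transformed (N, 2) points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]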
In this paper, we first discuss a framework for a 3D image display system that combines passive sensing and active display technologies. Passive sensing makes it possible to capture real scenes under natural conditions. Active display makes it possible to present arbitrary views with proper motion parallax following the observer's motion. The requirements that 3D image displays place on passive sensing technology are discussed in comparison with those of robot vision. Then, a new stereo algorithm, called SEA (Stereo by Eye Array), which satisfies these requirements, is described in detail. The SEA uses nine images captured by a 3 × 3 camera array. It has the following features for depth estimation: 1) pixel-based correspondence search yields a dense, high-spatial-resolution depth map; 2) correspondence ambiguity for linear edges oriented parallel to a particular baseline is eliminated by using multiple baselines with different orientations; 3) occlusion can be detected easily, and an occlusion-free depth map with sharp object boundaries is generated. The feasibility of the SEA is demonstrated by experiments using real image data.
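A simplified Python/NumPy sketch of the multi-baseline idea in feature 2) is given below: matching costs from several baselines of different orientations are accumulated per pixel before the minimum is taken, so an edge parallel to one baseline is still disambiguated by the others. The integer shifts, wrap-around rolling, and box-window SSD are our simplifications, not the SEA implementation itself.

import numpy as np

def sea_like_depth(ref, others, baselines, d_max=32, win=5):
    """Pixel-based multi-baseline matching (a simplified sketch, not SEA itself).
    ref       : (H, W) image from the centre camera.
    others    : list of (H, W) images from the surrounding cameras.
    baselines : list of (bx, by) baseline vectors in pixels per unit disparity.
    Returns an (H, W) map of the disparity index minimizing the sum of SSDs."""
    h, w = ref.shape
    def box(img):                       # windowed sum via an integral image
        pad = win // 2
        p = np.pad(img, pad, mode="edge")
        c = np.cumsum(np.cumsum(p, axis=0), axis=1)
        c = np.pad(c, ((1, 0), (1, 0)))
        return c[win:, win:] - c[:-win, win:] - c[win:, :-win] + c[:-win, :-win]
    cost = np.empty((d_max, h, w))
    for d in range(d_max):
        total = np.zeros((h, w))
        for img, (bx, by) in zip(others, baselines):
            # shift each surrounding image by the disparity expected at this depth
            # (np.roll wraps around the borders; acceptable only for a sketch)
            shifted = np.roll(img, (int(round(d * by)), int(round(d * bx))), axis=(0, 1))
            total += box((ref.astype(np.float64) - shifted) ** 2)
        cost[d] = total
    return np.argmin(cost, axis=0)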
Takashi WATANABE Hitoshi SUZUKI Sumio TANBA Ryuzo YOKOYAMA
Contextual classification of multispectral image data in remote sensing is discussed, and two improved contextual classifiers are proposed. The first is an extended adaptive classifier, which successively partitions an image into homogeneously distributed square regions and applies a collective classification decision to each region. The second is an accelerated probabilistic relaxation, which updates the classification result quickly by adopting a pixelwise stopping rule. An evaluation experiment with a pseudo-LANDSAT multispectral image shows that the proposed methods give higher classification accuracies than the compound decision method, a standard contextual classifier.
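A compact Python/NumPy sketch of the second classifier's key idea, probabilistic relaxation with a pixelwise stopping rule, is shown below; the 4-neighbourhood support and the particular update formula are generic relaxation-labelling choices assumed for illustration, not the authors' exact update.

import numpy as np

def accelerated_relaxation(prob, compat, n_iter=10, stop_thresh=0.99):
    """Probabilistic relaxation with a pixelwise stopping rule (a sketch).
    prob   : (H, W, C) initial per-pixel class probabilities.
    compat : (C, C) compatibility matrix between neighbouring classes.
    A pixel stops being updated once its largest probability exceeds
    stop_thresh, which is what accelerates the iteration."""
    p = prob.copy()
    active = np.ones(p.shape[:2], dtype=bool)
    for _ in range(n_iter):
        # support from the 4-neighbourhood (wrap-around borders for brevity)
        nb = np.zeros_like(p)
        for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)]:
            nb += np.roll(p, shift, axis=axis)
        support = nb @ compat.T                 # how much each class is supported
        upd = p * (1.0 + support)
        upd /= upd.sum(axis=2, keepdims=True)
        p = np.where(active[..., None], upd, p) # freeze pixels that have stopped
        active &= p.max(axis=2) < stop_thresh
    return p.argmax(axis=2)                     # final classification map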
An unsupervised segmentation technique is presented that is based on a layered statistical model for both region shapes and the region-internal texture signals. While the image partition is modelled as a sample of a Gibbs/Markov random field, the texture inside each image segment is described using functional approximation. The segmentation and the unknown parameters are estimated through iterative optimization of an MAP objective function. The obtained results are subjectively agreeable and well suited to the requirements of region-oriented transform image coding.
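As a much-simplified stand-in for the MAP estimation described above, the following Python/NumPy sketch runs iterated conditional modes with a Potts-type Gibbs prior and a constant (mean grey-level) approximation of each region's texture instead of the paper's functional approximation; it only illustrates the alternation between label updates and parameter re-estimation.

import numpy as np

def icm_segmentation(img, n_labels=4, beta=1.0, n_iter=5):
    """Iterated conditional modes with a Potts-type MRF prior: a simplified
    stand-in for the MAP objective described in the paper (each region's
    texture is approximated here by its mean grey level only)."""
    f = img.astype(np.float64)
    labels = (f / 256.0 * n_labels).astype(int)      # initial labels by quantization
    var = max(f.var(), 1e-6)
    h, w = f.shape
    for _ in range(n_iter):
        # re-estimate the per-region texture parameter (the mean grey level)
        means = np.array([f[labels == k].mean() if np.any(labels == k) else 0.0
                          for k in range(n_labels)])
        energy = np.empty((n_labels, h, w))
        for k in range(n_labels):
            data = (f - means[k]) ** 2 / (2.0 * var)
            smooth = np.zeros((h, w))
            for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)]:
                smooth += (np.roll(labels, shift, axis=axis) != k)
            energy[k] = data + beta * smooth          # data term + Gibbs prior
        labels = np.argmin(energy, axis=0)            # pixelwise MAP label update
    return labels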
Seiichiro DAN Toshiyasu NAKAO Tadahiro KITAHASHI
We can understand and recover a scene even from a single picture or line drawing. A number of methods have been developed for this problem, but although they can recognize the three-dimensional shape of each object, they have scarcely addressed scenes containing multiple objects. In this paper, addressing this problem, we describe a method for determining the configuration of multiple objects. The method employs a coplanarity assumption and an occlusion constraint: the coplanarity assumption generates candidate configurations of the objects, and the occlusion constraint prunes impossible configurations. By combining this method with a shape-recovery method for individual objects, we have implemented a system that acquires three-dimensional information about a scene containing multiple objects from a monocular image.
This paper presents a new approach to the recovery of 3-D structure from multiple pairs of images taken from different viewpoints. Searching for corresponding points between images, which is common in stereopsis, is avoided. Edges extracted from the input images are projected back into 3-D space, and their intersections are calculated directly. Many false intersections may appear, but if many image pairs are available, the true intersections can be extracted by appropriate thresholding. An octree representation of the intersections makes this approach practical. We also consider a way to treat adjacent edge pixels as a line segment rather than as individual points, which differs from previous works and leads to a new algorithm. Experimental results using both synthetic and real images are described.
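The back-projection and thresholding step can be pictured with the following Python/NumPy sketch, which votes edge rays into a regular voxel grid instead of an octree (a coarse simplification of the representation used in the paper); the grid size, sampling density, and world extent are assumptions for illustration.

import numpy as np

def vote_edge_rays(centers, ray_dirs_per_view, grid=64, extent=1.0, n_samples=200):
    """Vote back-projected edge rays into a voxel grid covering [-extent, extent]^3.
    centers           : list of (3,) camera centres.
    ray_dirs_per_view : list of (N_i, 3) world-frame ray directions, one array per
                        view, one ray per edge pixel.
    Voxels crossed by rays from many views accumulate high counts; the true
    3-D edge points are extracted by thresholding the returned counts."""
    votes = np.zeros((grid, grid, grid), dtype=np.int32)
    t = np.linspace(0.0, 4.0 * extent, n_samples)
    for c, dirs in zip(centers, ray_dirs_per_view):
        c = np.asarray(c, dtype=np.float64)
        for d in dirs:
            d = np.asarray(d, dtype=np.float64)
            d /= np.linalg.norm(d)
            pts = c + t[:, None] * d                        # samples along the ray
            idx = np.floor((pts + extent) / (2 * extent) * grid).astype(int)
            ok = np.all((idx >= 0) & (idx < grid), axis=1)
            idx = np.unique(idx[ok], axis=0)                # one vote per voxel per ray
            if idx.size:
                votes[idx[:, 0], idx[:, 1], idx[:, 2]] += 1
    return votes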
Noriko SUZUKI Taroh SASAKI Ryuji KOHNO Hideki IMAI
This paper proposes and investigates an intelligent error-controlling scheme that protects segments of information according to their importance. In particular, the scheme is designed for facial images encoded by model-based coding, a kind of intelligent compression coding. Intelligent communication systems take the content of the transmitted information into account in order to achieve extremely high compression and reliability. After highly efficient compression by model-based coding, errors in the compressed information lead to severe semantic errors. The proposed scheme reduces the semantic errors seen by the receiver. In this paper, we treat an Action Unit (AU) as a segment of a model-based coded facial image and define the importance of each AU. According to this importance, an AU is encoded by an appropriate code chosen from codes with different error-correcting capabilities. For encoding with different error-controlling codes, we use three constructions of unequal error protection (UEP) codes: the direct sum construction and two proposed constructions based on joint and double coding. These UEP codes can have a higher code rate than other UEP codes when the minimum Hamming distance is small. By using these UEP codes, the proposed intelligent error-controlling scheme can protect information segment by segment and thereby reduce semantic errors compared with a conventional error-controlling scheme in which information is uniformly protected by a single error-correcting code.
An annoying problem encountered in automatic seal imprint verification is that seal imprints may vary considerably, even if they are all produced from a single seal. This paper proposes a new automatic seal imprint verification system that adds an imprint quality assessment function to our previous system in order to solve this problem, and examines the verification performance of this system experimentally. The system consists of an imprint quality assessment process and a verification process. In the quality assessment process, an examined imprint is first divided into partial regions. Each partial region is classified into one of three quality classes (good quality region, poor quality region, and background) on the basis of the characteristics of its gray-level histogram. In the verification process, only the good quality partial regions of the examined imprint are verified against the registered imprint. Finally, the examined imprint is classified as either genuine or a forgery. However, if the quality assessment finds too many partial regions of poor quality, the examined imprint is classified as "ambiguous" without verification processing. A major advantage of this verification system is that it can verify seal imprints of various qualities efficiently and accurately. Computer experiments with real seal imprints were performed using this system, the previous system (without the image quality assessment function), and document examiners of a bank. The results show that this system is superior in verification performance to our previous system and performs similarly to the document examiners, demonstrating the effectiveness of adding an image quality assessment function to a seal imprint verification system.
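A minimal Python/NumPy sketch of the quality assessment step is given below: the imprint is divided into blocks and each block is labelled background, good, or poor from its grey-level histogram. The particular thresholds and the simple contrast measure are illustrative assumptions, not the classification rules of the paper.

import numpy as np

def assess_region_quality(block, bg_thresh=0.85, contrast_thresh=60):
    """Classify one partial region of a seal imprint by its grey-level
    histogram (the class boundaries here are illustrative assumptions).
    Returns 'background', 'good', or 'poor'."""
    hist = np.bincount(block.ravel(), minlength=256) / block.size
    ink = block[block < 128]                        # darker pixels assumed to be ink
    if 1.0 - hist[:128].sum() > bg_thresh:          # almost no ink pixels
        return "background"
    # a high-contrast region is taken to be a cleanly stamped (good) region
    if ink.size and (int(block.max()) - int(ink.mean())) > contrast_thresh:
        return "good"
    return "poor"

def assess_imprint(img, block=32):
    """Divide the imprint into blocks and label each block's quality."""
    h, w = img.shape
    return [[assess_region_quality(img[i:i + block, j:j + block])
             for j in range(0, w, block)] for i in range(0, h, block)]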
Mamoru TANAKA Kenneth R. CROUNSE Tamás ROSKA
This paper describes highly parallel analog image coding and decoding by cellular neural networks (CNNs). The communication system in which the coder (C-) and decoder (D-) CNNs are embedded consists of a differential transmitter with an internal receiver model in the feedback loop. The C-CNN encodes the image through two cascaded techniques: structural compression and halftoning. The D-CNN decodes the received data through a reconstruction process, which includes a dynamic current distribution, so that the original input to the C-CNN can be recognized. The halftoning serves as a dynamic quantization that converts each pixel to a binary value depending on the neighboring values. We approach halftoning as the minimization of the error energy between the original gray image and the reconstructed halftone image, and structural compression from the viewpoints of topological and regularization theories. All dynamics are described by CNN state equations. Both the proposed coding and decoding algorithms use only local image information in a space-invariant manner; therefore, errors are distributed evenly and do not introduce the blocking effects found in DCT-based coding methods. In the future, the use of parallel inputs from on-chip photodetectors would allow direct dynamic quantization and compression of image sequences without multiple-bit analog-to-digital converters. To validate our theory, a simulation was performed using the relaxation method on a 150-frame image sequence. Each input image was 256 × 256 pixels with 8 bits per pixel. The simulated fixed compression rate, not including the Huffman coding, was about 1/16 with a PSNR of 31 dB to 35 dB.
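For readers unfamiliar with halftoning by error minimization, the short Python/NumPy sketch below uses conventional Floyd-Steinberg error diffusion as a stand-in: like the CNN-based dynamic quantization described above, it binarizes each pixel and spreads the resulting quantization error to neighbouring pixels so that the local average of the halftone follows the grey image. It is only an analogy, not the CNN formulation.

import numpy as np

def error_diffusion_halftone(gray):
    """Floyd-Steinberg error diffusion: a conventional stand-in for the
    CNN-based dynamic quantization. Each pixel is binarized and its
    quantization error is pushed onto not-yet-visited neighbours."""
    f = gray.astype(np.float64) / 255.0
    h, w = f.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            old = f[y, x]
            new = 1.0 if old >= 0.5 else 0.0
            out[y, x] = int(new)
            err = old - new
            if x + 1 < w:               f[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     f[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               f[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: f[y + 1, x + 1] += err * 1 / 16
    return out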
Naohiko SHIMIZU Gui-Xin CHENG Munemitsu IKEGAMI Yoshinori NAKAMURA Mamoru TANAKA
This paper describes a pipelining universal system of discrete-time cellular neural networks (DTCNNs). A new relaxation-based algorithm, called the Pipelining Gauss-Seidel (PGS) method, is used to solve the CNN state equations in a pipelined fashion. In the systolic system of N processor elements {PEi}, each PEi performs the convolutional computation (CC) of all cells, and the preceding PEi-1 performs the CC of all cells that precede it by the precedence interval number p. The expected maximum number of PEs for the speed-up is given by n/p, where n is the number of cells. As an application, the encoding and decoding of moving images is simulated.
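A minimal Python/NumPy sketch of the Gauss-Seidel relaxation underlying the PGS method is shown below for a 1-D DTCNN with a hard-limiter output: within a sweep, each cell immediately uses the freshly updated outputs of the cells that precede it, which is the ordering property that lets the computation be split over pipelined processor elements. The templates, bias, and 1-D arrangement are illustrative assumptions, not the authors' system.

import numpy as np

def dtcnn_gauss_seidel(u, A, B, bias, n_sweeps=10):
    """Gauss-Seidel style update of a 1-D discrete-time CNN (a sketch of the
    relaxation idea behind PGS). Cells are swept in index order; each cell
    uses the already-updated outputs of preceding cells.
    u    : (n,) input values.
    A, B : (2r+1,) feedback and feedforward templates."""
    n, r = len(u), len(A) // 2
    y = np.where(np.asarray(u, dtype=np.float64) >= 0, 1.0, -1.0)  # initial outputs
    for _ in range(n_sweeps):
        for i in range(n):
            lo, hi = max(0, i - r), min(n, i + r + 1)
            ta = A[lo - i + r: hi - i + r]       # template slice over the valid window
            tb = B[lo - i + r: hi - i + r]
            x = ta @ y[lo:hi] + tb @ u[lo:hi] + bias
            y[i] = 1.0 if x >= 0 else -1.0       # hard-limiter output
    return y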
Kazuki NAKASHIMA Masashi KOGA Katsumi MARUKAWA Yoshihiro SHIMA Yasuaki NAKANO
This paper proposes a new, high-speed method of filling in the contours of alphanumeric characters to produce correct binary image patterns. We call this method the improved edge-fill method because it improves on a previously developed edge-fill method. Ambiguities of the conventional edge-fill method on binary images are eliminated by selecting fill pixels from combinations of Freeman's chain code, which expresses contour lines. Consequently, the areas inside the contour lines are filled in rapidly and correctly. With the new method, the processing time for character image generation is reduced by ten to twenty percent compared with the conventional method. The effectiveness of the new method is examined in experiments using both Arabic numerals and letters of the Roman alphabet. The results show that this fill method produces correct image patterns and that it can be applied to filling the contours of alphanumeric characters.
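To make the goal of the fill step concrete, the Python sketch below fills the interiors of closed contours by flood-filling the exterior background and inverting; this is only a simple stand-in that illustrates the desired output, not the chain-code-based improved edge-fill method itself.

from collections import deque
import numpy as np

def fill_contours(contour):
    """Fill the interiors of closed contours in a binary image (True = contour
    pixel). Flood-fill the exterior from the image border, then mark everything
    that is not exterior as filled."""
    h, w = contour.shape
    outside = np.zeros((h, w), dtype=bool)
    q = deque((y, x) for y in range(h) for x in range(w)
              if (y in (0, h - 1) or x in (0, w - 1)) and not contour[y, x])
    for y, x in q:
        outside[y, x] = True
    while q:
        y, x = q.popleft()
        for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if 0 <= ny < h and 0 <= nx < w and not contour[ny, nx] and not outside[ny, nx]:
                outside[ny, nx] = True
                q.append((ny, nx))
    return contour | ~outside      # contour pixels plus the enclosed interior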
This paper discusses the role of knowledge in document image understanding from the viewpoints of representation, utilization and acquisition. For the representation of knowledge, we propose two models, a layout model and a content model, which represent knowledge about the layout structure and content of a document, respectively. For the utilization of knowledge, we implement layout analysis and content analysis which utilize a layout model and a content model, respectively. The strategy of hypothesis generation and verification is introduced in order to integrate these two kinds of analysis. For the acquisition of knowledge, we propose a method of incremental acquisition of a layout model from a stream of example documents. From the experimental results of document image understanding and knowledge acquisition using 50 samples of visiting cards, we verified the effectiveness of the proposed method.
Toshihiro MATSUI Ikuo YAMASHITA Toru WAKAHARA
The Institute for Posts and Telecommunications Policy (IPTP) held its first character recognition competition in 1992 to ascertain the present status of ongoing research in character recognition and to find promising algorithms for handwritten numerals. In this paper, we report and analyze the results of this competition. For the competition, we adopted 3-digit handwritten postal code images gathered from live mail as recognition objects. Prior to the competition, 2,500 samples (7,500 characters) were distributed to the participants as training data. Using about 10,000 different samples (29,883 characters), we tested 13 recognition programs submitted by five universities and eight manufacturing companies. According to four evaluation criteria (recognition accuracy, recognition speed, robustness against degradation, and theoretical originality), we selected the best three recognition algorithms for the Prize of Highest Excellence. Interestingly, the best three algorithms showed considerable diversity in their methodologies and had very few commonly substituted or rejected patterns. We analyzed the causes of these commonly substituted or rejected patterns and, moreover, examined the human ability to discriminate between these patterns. Next, considering the complementary characteristics of each recognition algorithm, we studied a multi-expert recognition strategy using the best three algorithms. Three combination rules were examined: voting on the first candidate, minimal sum of candidate order, and minimal sum of dissimilarities; the latter two rules decreased the substitution rate to one third of that obtained by a single expert in the competition. Furthermore, we proposed a candidate appearance likelihood method, which utilizes the conditional probability of each of the ten digits given the candidate combination obtained by each algorithm. In our experiments, this method achieved surprisingly low substitution and rejection rates. Taking its learning ability into account, the candidate appearance likelihood method is considered one of the most promising multi-expert systems.
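The three combination rules examined above can be stated compactly in Python; the sketch below assumes each expert returns a ranked candidate list (and, for the third rule, matching dissimilarity scores) and is meant only to make the rules precise, not to reproduce the competition software.

def combine_experts(candidate_lists, dissimilarity_lists=None, rule="vote"):
    """Combine the ranked candidate lists of several digit recognizers.
    candidate_lists     : one ranked list of candidate digits per expert.
    dissimilarity_lists : matching per-candidate dissimilarity scores
                          (required only for rule == "dissim").
    rule : "vote"   - majority vote on each expert's first candidate,
           "order"  - minimal sum of candidate ranks,
           "dissim" - minimal sum of dissimilarities."""
    digits = range(10)
    if rule == "vote":
        firsts = [c[0] for c in candidate_lists]
        return max(digits, key=firsts.count)
    if rule == "order":
        def rank_sum(d):
            # a digit missing from an expert's list gets the worst possible rank
            return sum(c.index(d) if d in c else len(c) for c in candidate_lists)
        return min(digits, key=rank_sum)
    if rule == "dissim":
        def dissim_sum(d):
            total = 0.0
            for cands, scores in zip(candidate_lists, dissimilarity_lists):
                total += scores[cands.index(d)] if d in cands else max(scores)
            return total
        return min(digits, key=dissim_sum)
    raise ValueError("unknown rule")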
Takashi SAITOH Toshifumi YAMAAI Michiyoshi TACHIKAWA
A system for segmenting document images and ordering text areas is described and applied to complex printed page layouts in both Japanese and English. There is no need to make any assumptions about the shape of blocks, so the segmentation technique can handle not only skewed images without skew correction but also documents whose columns are not rectangular. In this technique, which is based on a bottom-up strategy, connected components are extracted from a reduced image and classified according to their local information. The connected components classified as characters are then merged into lines, and the lines are merged into areas. Extracted text areas are classified as body, caption, header, or footer. A tree graph of the layout of the body text is built, and the texts are ordered by a preorder traversal of the graph. We introduce the concept of an influence range for each node and a procedure for handling titles, thus obtaining good results on various documents. The total system is fast and compact.
Masaji KATAGIRI Masakazu NAGURA
We apply neural networks to implement a line shape recognition/classification system. The purpose of employing neural networks is to eliminate target-specific algorithms from the system and to simplify it; the system only needs to be trained with samples. Shapes are captured by the following operations. The lines to be processed are segmented at inflection points, and each segment is extended from both ends by a certain percentage. The shape of each extended segment is captured as an approximate curvature, and the curvature sequence is normalized by size to obtain a scale-invariant measure. Feeding this normalized curvature data to a neural network yields position-, rotation-, and scale-invariant line shape recognition. In our experiments, recognition rates of almost 100% were achieved under 5% random modification and 50%-200% scaling. The experimental results show that our method is effective. In addition, since this method captures shape locally, partial lines (caused by overlapping, etc.) can also be recognized.
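A small Python/NumPy sketch of the shape feature described above is given below: a segment is resampled to a fixed number of points by arc length (the size normalization) and its turning angles serve as the approximate curvature sequence fed to the network. The resampling length and the turning-angle approximation of curvature are our assumptions.

import numpy as np

def curvature_sequence(points, n_samples=64):
    """Turn a polyline segment into a fixed-length, scale-invariant curvature
    sequence (a sketch of the feature described in the paper).
    points : (N, 2) ordered coordinates along the (extended) segment."""
    pts = np.asarray(points, dtype=np.float64)
    # resample to a fixed number of points by arc length -> size normalization
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])
    t = np.linspace(0.0, s[-1], n_samples)
    resampled = np.column_stack([np.interp(t, s, pts[:, 0]),
                                 np.interp(t, s, pts[:, 1])])
    # approximate curvature as the turning angle between successive chords,
    # which also makes the feature rotation-invariant
    d = np.diff(resampled, axis=0)
    ang = np.arctan2(d[:, 1], d[:, 0])
    curv = np.diff(np.unwrap(ang))
    return curv            # feed this fixed-length vector to the neural network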