Masahiro TSUKADA Yuya UTSUMI Hirokazu MADOKORO Kazuhito SATO
This paper presents an unsupervised learning-based method for selecting feature points and classifying object categories without setting the number of categories in advance. Our method consists of the following procedures: 1) detection of feature points and description of features using the Scale-Invariant Feature Transform (SIFT), 2) selection of target feature points using One-Class Support Vector Machines (OC-SVMs), 3) generation of visual words from all SIFT descriptors and of per-image histograms of the selected feature points using Self-Organizing Maps (SOMs), 4) formation of labels using Adaptive Resonance Theory-2 (ART-2), and 5) creation and classification of categories on a category map of Counter Propagation Networks (CPNs) for visualizing spatial relations between categories. Classification results for static images from the Caltech-256 object category dataset and for dynamic images (time-series images obtained by a moving robot) demonstrate that our method can visualize spatial relations between categories while maintaining time-series characteristics. Moreover, the results show the effectiveness of our method for classifying categories of objects whose appearance changes.
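As a rough illustration of the early stages of such a pipeline, the Python sketch below detects SIFT features, filters them with an OC-SVM, and builds visual-word histograms. KMeans stands in for the SOM codebook, the ART-2 and CPN stages are omitted, and the function names and parameter values are illustrative assumptions rather than the paper's implementation.

```python
# Sketch of the early pipeline stages: SIFT features, OC-SVM point selection,
# and bag-of-visual-words histograms. KMeans stands in here for the SOM
# codebook; the ART-2 / CPN stages of the paper are not reproduced.
import cv2
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.cluster import KMeans

def sift_descriptors(image_path):
    """Detect SIFT keypoints and return their descriptors (step 1)."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    _, desc = sift.detectAndCompute(gray, None)
    return desc if desc is not None else np.empty((0, 128))

def select_target_points(desc, nu=0.2):
    """Keep descriptors that the OC-SVM regards as inliers (step 2)."""
    ocsvm = OneClassSVM(kernel="rbf", nu=nu, gamma="scale").fit(desc)
    return desc[ocsvm.predict(desc) == 1]

def build_histograms(image_paths, n_words=64):
    """Quantize selected descriptors into visual words and histogram them (step 3)."""
    selected = [select_target_points(sift_descriptors(p)) for p in image_paths]
    codebook = KMeans(n_clusters=n_words, n_init=10).fit(np.vstack(selected))
    return [np.bincount(codebook.predict(d), minlength=n_words) for d in selected]
```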
Makoto KIMURA Hideo SAITO Takeo KANADE
In the fields of computer vision and computer graphics, Image-Based Rendering (IBR) methods are often used to synthesize images of a real scene. Image synthesis by IBR requires dense, correct matching points in the images, but it does not require 3D geometry reconstruction or camera calibration in Euclidean geometry. On the other hand, a reconstructed 3D model can easily reveal occlusions in the images. In this paper, we propose an approach to reconstructing 3D shape in a voxel space named Projective Voxel Space (PVS). Since PVS is defined by projective geometry, it requires only weak calibration. PVS is determined by rectification of the epipolar lines in three images. The three rectified images are orthogonal projections of the scene in PVS, so projection-related processing in PVS is simple. In both PVS and Euclidean geometry, a point in an image lies on the projection of a point on the surface of an object in the scene; the other images then either contain a correct matching point (no occlusion) or no matching point (occlusion). This serves as a constraint when searching for matching points or for the object surface. Taking advantage of the simplicity of projection in PVS, correlation values of points across the images are computed and iteratively refined using this constraint. Finally, the shapes of the objects in the scene are acquired in PVS. The reconstructed shape in PVS is not geometrically similar to the 3D shape in Euclidean geometry; however, it gives consistent matching points in the three images and also indicates the existence of occluded points. Therefore, the reconstructed shape in PVS is sufficient for image synthesis by IBR.
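The sketch below illustrates the kind of per-voxel photo-consistency test this approach relies on, assuming three rectified images in which a voxel (u, v, w) is seen orthogonally at pixel coordinates (v, w), (u, w), and (u, v). That projection rule, the patch size, and the function names are assumptions for illustration, not the paper's exact PVS construction or iterative refinement procedure.

```python
# Illustrative photo-consistency test for a voxel in a rectified voxel space.
# The projection rule (voxel (u, v, w) seen at (v, w), (u, w), (u, v) in the
# three rectified views) is an assumption for this sketch.
import numpy as np

def patch(img, r, c, half=2):
    """Extract a small patch around (r, c), clipped to the image borders."""
    return img[max(r - half, 0):r + half + 1, max(c - half, 0):c + half + 1]

def ncc(a, b):
    """Normalized cross-correlation of two equally shaped patches."""
    a = a.astype(float).ravel() - a.mean()
    b = b.astype(float).ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def voxel_consistency(img0, img1, img2, u, v, w):
    """Average pairwise NCC of the three patches a voxel projects to."""
    p0, p1, p2 = patch(img0, v, w), patch(img1, u, w), patch(img2, u, v)
    if not (p0.shape == p1.shape == p2.shape):
        return 0.0  # clipped at an image border; skip
    return (ncc(p0, p1) + ncc(p1, p2) + ncc(p0, p2)) / 3.0
```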
Kouichirou NISHIMURA Masao IZUMI Kunio FUKUNAGA
In object recognition using 3-D configuration data, the scale and pose of the object are important factors: if they are unknown, the object cannot be compared with the models in the database. We therefore propose a strategy for object recognition that is independent of scale and pose, based on a Hopfield neural network. We also propose a strategy for estimating the camera motion needed to reconstruct the 3-D configuration of the object. In this strategy, the camera motion is estimated solely from sequential images taken by a moving camera; consequently, the 3-D configuration of the object is reconstructed from the sequential images alone. We adopt multiple regression analysis to estimate the camera motion parameters and thereby reduce their errors.
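As a hedged illustration of how motion parameters can be fit by multiple regression, the sketch below estimates a simple affine image-motion model from tracked point correspondences by least squares; the affine parameterization and function names are stand-ins, since the abstract does not specify the paper's exact motion model.

```python
# Least-squares (multiple regression) fit of an affine image-motion model to
# tracked point displacements. The affine parameterization is a stand-in for
# illustration only.
import numpy as np

def fit_affine_motion(pts_prev, pts_next):
    """Fit [[a, b, tx], [c, d, ty]] minimizing ||A @ p_prev + t - p_next||^2."""
    n = len(pts_prev)
    A = np.zeros((2 * n, 6))
    b = np.zeros(2 * n)
    for i, ((x, y), (xn, yn)) in enumerate(zip(pts_prev, pts_next)):
        A[2 * i] = [x, y, 1, 0, 0, 0]
        A[2 * i + 1] = [0, 0, 0, x, y, 1]
        b[2 * i], b[2 * i + 1] = xn, yn
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params.reshape(2, 3)
```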
Junghyun HWANG Yoshiteru OOI Shinji OZAWA
An approach to estimating information about moving objects is described in terms of their kinetic and static properties, such as 2D velocity, acceleration, position, and size, which serve as motion and shape features. To obtain the motion and shape information of multiple objects, an advanced contour matching scheme is developed that includes the synthesis of edge images and the analysis of object shape with high matching confidence and low computation cost. The scheme is composed of three algorithms: motion estimation by an iterative triple cross-correlation, image synthesis by shifting and masking the object, and shape analysis for determining the object size. By applying fuzzy membership functions to the object shape, the scheme improves its accuracy in capturing the motion and shape of multiple moving objects. Experimental results show that the proposed method is valid for several walking people in a real scene.
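The sketch below gives a simplified, single-pass version of the projection-correlation idea behind such motion estimation: 1-D projections of object masks in successive frames are cross-correlated to recover the displacement. The mask-based input and function names are illustrative assumptions; the paper's iterative triple cross-correlation is not reproduced.

```python
# Motion estimation by correlating 1-D projections of object regions in
# successive frames. A single-pass simplification of the contour-matching idea.
import numpy as np

def projections(mask):
    """Column and row sums of a binary object mask (x-profile, y-profile)."""
    return mask.sum(axis=0).astype(float), mask.sum(axis=1).astype(float)

def best_shift(p_prev, p_next):
    """Shift of p_next relative to p_prev that maximizes cross-correlation."""
    corr = np.correlate(p_next - p_next.mean(), p_prev - p_prev.mean(), mode="full")
    return int(np.argmax(corr)) - (len(p_prev) - 1)

def estimate_velocity(mask_prev, mask_next):
    """Per-frame (dx, dy) displacement of the object region."""
    (px_prev, py_prev), (px_next, py_next) = projections(mask_prev), projections(mask_next)
    return best_shift(px_prev, px_next), best_shift(py_prev, py_next)
```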
Junghyun HWANG Yoshiteru OOI Shinji OZAWA
This paper describes an adaptive sensing system that tracks and zooms in on a moving object in a stable environment. A close contour matching technique and effective determination of the zoom ratio by fuzzy control are both proposed to realize the sensing system. First, the estimation of the object feature parameters, 2-dimensional velocity and size, is based on close contour matching. The correspondence problem is solved with cross-correlation of projections extracted from object contours in specialized difference images. In a stable environment, this contour matching, which can eliminate occluded contours, random noise, and background, works well without costly optical-flow calculation. Next, fuzzy control is introduced so that the tracked object can be zoomed according to the state of its shape and movement. Three input membership functions (the confidence of the object shape, the variance of the object velocity, and the object size) are evaluated with a simplified implementation. The optimal focal length, providing not only the desired object size but also safe tracking, is obtained by combining these membership functions with a fuzzy rule matrix. Experimental results show that the proposed system is robust and valid for various kinds of moving objects in real scenes, with a system period of 1.85 s.
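As a toy illustration of fuzzy zoom-ratio control, the sketch below defines triangular membership functions over two of the three inputs mentioned above (shape confidence and velocity variance) and combines a small rule table by a weighted average; the breakpoints, rules, and output values are assumptions for illustration only.

```python
# Toy fuzzy zoom-ratio controller: triangular memberships over normalized
# shape confidence and velocity variance, combined by a small rule table.
import numpy as np

def tri(x, a, b, c):
    """Triangular membership with peak at b and support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def zoom_ratio(shape_conf, vel_var):
    """Return a zoom-ratio factor (>1 zooms in, <1 zooms out)."""
    conf = {"low": tri(shape_conf, -0.5, 0.0, 0.5), "high": tri(shape_conf, 0.5, 1.0, 1.5)}
    var = {"small": tri(vel_var, -0.5, 0.0, 0.5), "large": tri(vel_var, 0.5, 1.0, 1.5)}
    # Rule table: zoom in only when the shape is reliable and motion is steady.
    rules = [(min(conf["high"], var["small"]), 1.3),   # zoom in
             (min(conf["high"], var["large"]), 1.0),   # hold
             (min(conf["low"], var["small"]), 1.0),    # hold
             (min(conf["low"], var["large"]), 0.8)]    # zoom out
    weights = np.array([w for w, _ in rules])
    outputs = np.array([z for _, z in rules])
    return float(weights @ outputs / weights.sum()) if weights.sum() > 0 else 1.0
```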
Yasushi YAGI Yoshimitsu NISHIZAWA Masahiko YACHIDA
We have proposed a new omnidirectional image sensor, COPIS (COnic Projection Image Sensor), for guiding the navigation of a mobile robot. Its key feature is passive sensing of an omnidirectional image of the environment in real time (at the frame rate of a TV camera) using a conic mirror. COPIS is a suitable sensor for visual navigation in real-world environments with moving objects. This paper describes a method for estimating the location and motion of the robot by detecting the azimuth of each object in the omnidirectional image and matching it against a given environmental map. The robot can always estimate its own location and motion precisely, because COPIS observes a 360-degree view around the robot, even if not all edges are extracted correctly from the omnidirectional image. We also present a method to avoid collisions with unknown obstacles and to estimate their locations by detecting changes in their azimuths while the robot moves through the environment. Using the COPIS system, we performed several experiments in the real world.
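The sketch below shows one simple way such azimuth measurements can be turned into a position estimate, assuming a known landmark map and a heading-compensated bearing to each landmark: every bearing constrains the robot to a line through the landmark, and the position is recovered by linear least squares. The function names and example values are illustrative, not COPIS code.

```python
# Azimuth-based localization against a known landmark map, assuming each
# azimuth is already a world-frame bearing (heading compensated).
import numpy as np

def locate_from_azimuths(landmarks, azimuths):
    """landmarks: (N, 2) world positions; azimuths: N world-frame bearings (rad)."""
    landmarks = np.asarray(landmarks, dtype=float)
    s, c = np.sin(azimuths), np.cos(azimuths)
    # For each landmark i: sin(t_i) * (x_i - x) - cos(t_i) * (y_i - y) = 0
    A = np.column_stack([s, -c])
    b = s * landmarks[:, 0] - c * landmarks[:, 1]
    position, *_ = np.linalg.lstsq(A, b, rcond=None)
    return position  # estimated (x, y)

# Example: three landmarks and noiseless bearings from the true pose (1.0, 2.0).
lm = [(5.0, 2.0), (1.0, 6.0), (-3.0, -1.0)]
true = np.array([1.0, 2.0])
bearings = [np.arctan2(y - true[1], x - true[0]) for x, y in lm]
print(locate_from_azimuths(lm, bearings))  # approximately [1.0, 2.0]
```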
Makoto HONDA Michitaka KAMEYAMA Tatsuo HIGUCHI
The demand for high-speed image processing is obvious in many real-world computations such as robot vision. Not only high throughput but also small latency becomes an important performance factor because of the requirement for frequent visual feedback. In this paper, a high-performance VLSI image processor based on multiple-valued residue arithmetic circuits is proposed for such applications. Parallelism is used hierarchically to realize the high-performance VLSI image processor. First, a spatially parallel architecture, as distinct from a pipeline architecture, is adopted to reduce latency. Secondly, residue number arithmetic is introduced. In residue number arithmetic, no data communication is necessary between the mod m_i arithmetic units, so multiple mod m_i arithmetic units can be completely separated onto different chips. Therefore, a number of mod m_i multiply-adders can be implemented on a single VLSI chip based on the modulus-slice concept. Finally, each mod m_i arithmetic unit can be effectively implemented with a parallel structure using the concept of a pseudo-primitive root and multiple-valued current-mode circuit technology. Thus, the thorough use of parallelism reduces the latency to one-third of that of an ordinary binary implementation.
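As a small illustration of why residue channels need no inter-channel communication, the sketch below performs a multiply-accumulate independently per modulus and recovers the result with the Chinese Remainder Theorem; the moduli and function names are example choices, unrelated to the paper's hardware moduli.

```python
# Minimal residue-number-system (RNS) sketch: numbers are held as residues
# modulo pairwise-coprime moduli, multiply-accumulate runs independently per
# modulus channel, and the result is recovered with the CRT.
from functools import reduce

MODULI = (13, 11, 7, 5)  # pairwise coprime; dynamic range = 5005

def to_rns(x):
    """Encode an integer as its residues modulo each modulus."""
    return tuple(x % m for m in MODULI)

def mac_rns(acc, a, b):
    """Residue-channel multiply-accumulate: acc + a * b, channel by channel."""
    return tuple((r + p * q) % m for r, p, q, m in zip(acc, a, b, MODULI))

def from_rns(residues):
    """Chinese Remainder Theorem reconstruction of the integer result."""
    M = reduce(lambda x, y: x * y, MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse of Mi mod m
    return total % M

# Example: 23 * 45 + 17 = 1052, computed per residue channel.
acc = mac_rns(to_rns(17), to_rns(23), to_rns(45))
print(from_rns(acc))  # 1052
```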