Feature detection and matching account for most of the processing time in image matching, and this time increases dramatically with the number of feature points. The number of features therefore needs to be controlled for applications with tight processing-time budgets. This paper proposes a feature detection method based on the significance of local features. The feature significance is computed for all pixels, and the most significant features are chosen while taking their spatial distribution into account. The method reduces the number of features needed to match two images while maintaining high matching accuracy. Experiments on natural scene images showed that this approach was on average about two times faster than the FAST detector.
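As an illustration only, a minimal Python sketch of selecting a limited number of high-significance features while enforcing spatial spread; the significance map here is a placeholder (the paper's actual significance measure is not reproduced), and the greedy minimum-distance rule is an assumption:

```python
import numpy as np

def select_spread_features(scores, k, min_dist):
    """Greedily pick up to k pixels with the highest significance scores,
    skipping candidates closer than min_dist to an already chosen feature."""
    order = np.argsort(scores, axis=None)[::-1]          # flat indices, best first
    ys, xs = np.unravel_index(order, scores.shape)
    chosen = []
    for y, x in zip(ys, xs):
        if all((y - cy) ** 2 + (x - cx) ** 2 >= min_dist ** 2 for cy, cx in chosen):
            chosen.append((int(y), int(x)))
            if len(chosen) == k:
                break
    return chosen

# Toy usage: a random "significance" map stands in for the real measure.
rng = np.random.default_rng(0)
features = select_spread_features(rng.random((240, 320)), k=50, min_dist=10)
```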
Yoichi SASAKI Tetsuo SHIBUYA Kimihito ITO Hiroki ARIMURA
In this paper, we study the approximate point set matching (APSM) problem with minimum RMSD score under translation, rotation, and one-to-one correspondence in d dimensions. Since most previous work on APSM uses similarity scores, such as the Hausdorff distance, that do not enforce a one-to-one correspondence between points, previously proposed methods cannot easily be applied to our APSM problem. We therefore focus on speeding up exhaustive search algorithms that can find all approximate matches. First, we present an efficient branch-and-bound algorithm using a novel lower bound function of the minimum RMSD score for the enumeration version of the APSM problem. Then, we modify this algorithm for the optimization version. Next, we present another algorithm that runs fast with high probability when a set of parameters is fixed. Experimental results on both synthetic datasets and real 3-D molecular datasets showed that our branch-and-bound algorithm achieved significant speed-up over the naive algorithm while still generating all answers.
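The paper's lower-bound function is not reproduced here; as a hedged sketch of the quantity being optimized, the following computes the minimum RMSD for a fixed one-to-one correspondence over all translations and rotations via the standard Kabsch/SVD alignment, which an exhaustive or branch-and-bound search would evaluate for candidate correspondences:

```python
import numpy as np

def min_rmsd(P, Q):
    """Minimum RMSD between point sets P and Q (n x d arrays, rows already in
    one-to-one correspondence) over all translations and proper rotations,
    computed with the standard Kabsch/SVD alignment."""
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    D = np.eye(P.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))   # avoid reflections
    R = U @ D @ Vt                               # rotation applied to row vectors
    return float(np.sqrt(np.mean(np.sum((Pc @ R - Qc) ** 2, axis=1))))
```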
Kaimin CHEN Wei LI Zhaohuan ZHAN Binbin LIANG Songchen HAN
Since camera networks for surveillance are becoming extremely dense, finding the most informative and desirable views from different cameras is of increasing importance. In this paper, we propose a camera selection method for far-field surveillance that provides the clearest possible visibility and selects the cameras that actually capture the targets. We design a benefit function that takes into account image visibility and the degree of target matching between different cameras. Here, visibility is defined using the entropy of the intensity histogram, and target correspondence is based on activity features rather than photometric features. The proposed solution is tested in both artificial and real environments. A performance evaluation shows that our target correspondence method is well suited to far-field surveillance, and that our selection method identifies the cameras that actually capture the surveillance target more effectively than existing methods.
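A minimal sketch of the visibility term as described (entropy of the intensity histogram); the benefit function below is a hypothetical weighted sum, since the paper's exact combination is not reproduced:

```python
import numpy as np

def visibility_entropy(gray, bins=256):
    """Visibility of a camera view: entropy (in bits) of its 8-bit intensity
    histogram; higher entropy is taken as richer contrast."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def benefit(gray, match_degree, alpha=0.5):
    """Hypothetical benefit function: weighted sum of normalized visibility
    and the degree of target matching (both assumed to lie in [0, 1])."""
    return alpha * visibility_entropy(gray) / 8.0 + (1 - alpha) * match_degree
```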
Accurate visual correspondence is the foundation of many computer-vision applications. Since existing image matching algorithms inevitably produce mismatches, a reliable mismatch-removal algorithm is highly desirable to remove mismatches while preserving true matches. This paper proposes a hierarchical progressive trust (HPT) model to solve this problem. The HPT model first adopts a “trust the most trustworthy ones” strategy to select anchor inliers in its bottom layer, and then progressively propagates trust from the bottom layer to the other layers in a bottom-up way: 1) the bottom layer verifies anchor inliers with the guidance of local features; 2) the middle layers progressively estimate local transformations and perform local verifications; 3) the top layer estimates a global transformation with an anchor-inlier-guided expectation maximization (EM) algorithm and performs global verifications. Experimental results show that the proposed HPT model outperforms state-of-the-art mismatch-removal methods under both rigid transformations and non-rigid deformations.
We propose a new visual tracking method in which the target appearance is represented by combining a color distribution and keypoints. First, the object is localized via a keypoint-based tracking and matching strategy, where a new clustering method is presented to remove outliers. Second, the tracking confidence is evaluated against the color template. According to the tracking confidence, local and global keypoint matching can be performed adaptively. Finally, we propose a target appearance update method in which the new appearance can be learned and added to the target model. The proposed tracker is compared with five state-of-the-art tracking methods on a recent benchmark dataset. Both qualitative and quantitative evaluations show that our method performs favorably.
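The abstract does not specify how the color-template confidence is computed; one common choice, shown here as an assumption, is the Bhattacharyya coefficient between color histograms:

```python
import numpy as np

def color_hist(patch, bins=16):
    """Normalized joint RGB histogram of an image patch (H x W x 3, uint8)."""
    hist, _ = np.histogramdd(patch.reshape(-1, 3).astype(float),
                             bins=(bins,) * 3, range=((0, 256),) * 3)
    return hist / max(hist.sum(), 1)

def tracking_confidence(template_hist, candidate_patch):
    """Bhattacharyya coefficient between the stored color template histogram
    and the current candidate region; values near 1 mean high confidence."""
    return float(np.sum(np.sqrt(template_hist * color_hist(candidate_patch))))
```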
Huiyun JING Xin HE Qi HAN Xiamu NIU
Research on detecting co-saliency over multiple images is just beginning. Existing methods multiply the saliency on a single image by the correspondence over multiple images to estimate co-saliency, and they have difficulty highlighting a co-salient object that is not salient in any single image. This is caused by two problems: (1) the correspondence computation lacks precision, and (2) the multiplicative formulation does not fully account for the effect of correspondence on co-saliency. In this paper, we propose a novel co-saliency detection scheme that linearly combines foreground correspondence and single-view saliency. A progressive graph-matching-based foreground correspondence method is proposed to improve the precision of the correspondence computation. The foreground correspondence is then linearly combined with single-view saliency to compute co-saliency. Under this linear formulation, strong correspondence can yield high co-saliency even when single-view saliency is low. Experiments show that our method outperforms previous state-of-the-art co-saliency methods.
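A minimal sketch of the linear combination described above; the weight alpha is an assumption, not a value taken from the paper:

```python
import numpy as np

def co_saliency(single_view_saliency, foreground_correspondence, alpha=0.5):
    """Linearly combine a single-view saliency map and a foreground
    correspondence map (both normalized to [0, 1], same shape). Unlike a
    multiplicative formulation, strong correspondence can produce high
    co-saliency even where single-view saliency is low."""
    S = np.clip(single_view_saliency, 0.0, 1.0)
    C = np.clip(foreground_correspondence, 0.0, 1.0)
    return alpha * S + (1 - alpha) * C
```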
In this paper we aim to group visual correspondences in order to detect objects, or parts of objects, that commonly appear in a pair of images. We first extract visual keypoints from the images and establish initial point correspondences between the two images by comparing their descriptors. Our method is based on two types of graphs, named relational graphs and correspondence graphs. A relational graph of a point is constructed by thresholding geometric and topological distances between the point and its neighboring points. The threshold on the geometric distance is determined according to the scale of each keypoint, and the topological distance is defined as the shortest path on a Delaunay triangulation built from the keypoints. We also construct a correspondence graph whose nodes represent pairs of matched points, i.e., correspondences, and whose edges connect consistent correspondences. Two correspondences are consistent with each other if they meet the local consistency induced by their relational graphs. Such consistent neighborhoods should represent an object or a part of an object contained in the pair of images. Enumerating the maximal cliques of the correspondence graph yields groups of keypoint pairs that therefore correspond to common objects or parts of objects. We apply our method to common visual pattern detection, object detection, and object recognition. Quantitative experimental results demonstrate that our method is comparable to or better than other methods.
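A hedged sketch of the clique-enumeration step using networkx; the pairwise consistency test below is a simple displacement check standing in for the relational-graph consistency defined in the paper:

```python
import networkx as nx

def consistent(c1, c2, tol=5.0):
    """Stand-in consistency test: two correspondences ((x, y) -> (u, v)) are
    consistent if their displacement vectors are similar."""
    (x1, y1), (u1, v1) = c1
    (x2, y2), (u2, v2) = c2
    return abs((u1 - x1) - (u2 - x2)) < tol and abs((v1 - y1) - (v2 - y2)) < tol

def correspondence_groups(correspondences):
    """Build a correspondence graph whose edges link consistent correspondences
    and enumerate its maximal cliques; each clique is a candidate common
    object or object part."""
    G = nx.Graph()
    G.add_nodes_from(range(len(correspondences)))
    for i in range(len(correspondences)):
        for j in range(i + 1, len(correspondences)):
            if consistent(correspondences[i], correspondences[j]):
                G.add_edge(i, j)
    return list(nx.find_cliques(G))
```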
Deying FENG Jie YANG Cheng YANG Congxin LIU
We propose a retrieval method using scale-invariant visual phrases (SIVPs). Our method encodes spatial information into the SIVPs, which capture translation, rotation, and scale invariance, and employs the SIVPs to determine the spatial correspondences between a query image and each database image. To compute the spatial correspondences efficiently, the SIVPs are introduced into the inverted index, and SIVP verification is investigated to refine the candidate images returned from the inverted index. Experimental results demonstrate that our method improves retrieval accuracy while increasing retrieval efficiency.
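A minimal sketch, under assumed data structures, of an inverted index keyed by quantized visual phrases, so that spatial verification only needs to be run on images sharing phrases with the query:

```python
from collections import defaultdict

class PhraseIndex:
    """Toy inverted index: each phrase id maps to (image_id, geometry) postings.
    The geometry payload is a placeholder for whatever the SIVP verification
    step needs (e.g., scale and orientation of the phrase)."""
    def __init__(self):
        self.postings = defaultdict(list)

    def add(self, image_id, phrases):
        for phrase_id, geometry in phrases:
            self.postings[phrase_id].append((image_id, geometry))

    def candidates(self, query_phrases):
        """Rank database images by the number of shared phrases with the query."""
        votes = defaultdict(int)
        for phrase_id, _ in query_phrases:
            for image_id, _ in self.postings.get(phrase_id, []):
                votes[image_id] += 1
        return sorted(votes, key=votes.get, reverse=True)
```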
Ching-Chi CHEN Wei-Yen HSU Shih-Hsuan CHIU Yung-Nien SUN
Image registration is an important topic in medical image analysis. It is usually used in 2D mosaicing to construct the whole image of a biological specimen, or in 3D reconstruction to build up the structure of an examined specimen from a series of microscopic images. Nevertheless, owing to a variety of factors, including microscopic optics, mechanisms, sensors, and manipulation, there may be great differences between the acquired image slices even if they are adjacent. Common differences include chromatic aberration as well as geometric discrepancies caused by cuts, tears, folds, and deformation, which usually make registration a difficult problem. In this paper, we propose an efficient registration method, consisting of a feature-based registration approach based on analytic robust point matching (ARPM) and a refinement procedure using the feature-based Levenberg-Marquardt algorithm (FLM), to automatically reconstruct the 3D vessels of rat brains from a series of microscopic images. The registration algorithm can quickly evaluate the spatial correspondence and geometric transformation between two point sets of different sizes. In addition, to achieve subpixel accuracy, the FLM method is used to refine the registered results. Owing to the nonlinear characteristic of the FLM method, it converges much faster than most other methods. We evaluate the performance of the proposed method by comparing it with the well-known thin-plate spline robust point matching (TPS-RPM) algorithm. The results indicate that the ARPM algorithm combined with the FLM method is not only robust but also efficient in image registration.
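As a hedged sketch of the Levenberg-Marquardt refinement idea (the paper's FLM formulation and feature residuals are not reproduced), the following refines a 2-D rigid transform with scipy's LM solver; the angle-plus-translation parameterization is an assumption:

```python
import numpy as np
from scipy.optimize import least_squares

def refine_rigid_2d(src, dst, theta0=0.0, t0=(0.0, 0.0)):
    """Refine a 2-D rigid transform (rotation angle and translation) mapping
    src points (n x 2) onto dst points (n x 2) with Levenberg-Marquardt,
    starting from an initial coarse estimate."""
    def residuals(p):
        theta, tx, ty = p
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])
        return (src @ R.T + np.array([tx, ty]) - dst).ravel()

    sol = least_squares(residuals, x0=[theta0, *t0], method="lm")
    return sol.x  # refined (theta, tx, ty)
```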
This paper introduces a fast image mosaicing technique that requires neither a costly search on the image domain (e.g., pixel-to-pixel correspondence search) nor iterative optimization (e.g., gradient-based or random optimization) of the geometric transformation parameters. The proposed technique is organized in two steps, and at both steps histograms are fully exploited for high computational efficiency. In the first step, a histogram of pixel feature values is used to detect pairs of pixels with the same rare feature values as candidate corresponding pixel pairs. In the second step, a histogram of transformation parameter values is used to determine the most reliable transformation parameter value. Experimental results showed that the proposed technique provides reasonable mosaicing results in most cases with very modest computation.
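A hedged, translation-only sketch in the spirit of the two-step procedure (the paper's feature values and transformation model may differ); it assumes 8-bit grayscale inputs:

```python
import numpy as np

def translation_by_histogram_voting(img1, img2, rare_max=40):
    """Step 1: use intensity histograms to keep only pixels whose values are
    rare in both images. Step 2: let every pair of rare pixels with equal
    value vote for a translation and return the histogram peak."""
    h1 = np.bincount(img1.ravel(), minlength=256)
    h2 = np.bincount(img2.ravel(), minlength=256)
    rare = np.where((h1 > 0) & (h1 <= rare_max) & (h2 > 0) & (h2 <= rare_max))[0]
    votes = {}
    for v in rare:
        ys1, xs1 = np.nonzero(img1 == v)
        ys2, xs2 = np.nonzero(img2 == v)
        for y1, x1 in zip(ys1, xs1):
            for y2, x2 in zip(ys2, xs2):
                key = (int(y2 - y1), int(x2 - x1))
                votes[key] = votes.get(key, 0) + 1
    return max(votes, key=votes.get) if votes else (0, 0)
```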
In this paper we propose an efficient line-feature-based 2D object recognition algorithm using a novel entropy correspondence measure (ECM) that encodes the probabilistic similarity between two line feature sets. Since the proposed ECM-based method uses the whole structural information of objects simultaneously for matching, it overcomes the common drawbacks of conventional techniques based on feature-to-feature correspondence. Moreover, since the ECM has a probabilistic nature, it shows robust performance in noisy environments. To improve recognition performance and speed, the line features are pre-clustered into several groups according to their inclination by an eigen-analysis, and the ECM is then applied to each corresponding group individually. Experimental results on real images demonstrate that the proposed algorithm outperforms conventional algorithms in both accuracy and computational efficiency in noisy environments.
This paper presents an approach that uses the Viterbi algorithm for the stereo correspondence problem. We propose a matching process, visualized as a trellis diagram, that finds the maximum a posteriori result. The matching process is divided into two parts: matching the left scene to the right scene and matching the right scene to the left scene. The final stereo result is selected, based on the minimum error under the uniqueness constraint, by comparing the results of the two matching passes. This makes stereo matching possible without explicitly detecting occlusions. Moreover, this stereo matching algorithm improves the accuracy of the disparity image, and it has an acceptable running time for practical applications since it uses the trellis diagram iteratively and bi-directionally. The complexity of the proposed method is approximately O(N²P), where N is the number of disparities and P is the length of the epipolar line in both the left and right images. The proposed method proved robust when applied to well-known stereo image samples such as random-dot, Pentagon, and Tsukuba images. It achieves 95.7 percent accuracy within a disparity error of 1 for the Tsukuba images.
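For illustration only, a single-pass scanline dynamic program over a disparity trellis; the paper's bidirectional formulation and occlusion handling are not reproduced, and the cost terms here are assumptions:

```python
import numpy as np

def scanline_dp_disparity(left_row, right_row, max_disp=16, smooth=4.0):
    """Dynamic programming along one epipolar line: states are disparities,
    the data term is the absolute intensity difference, and a smoothness
    penalty discourages disparity jumps between neighboring pixels."""
    P = len(left_row)
    cost = np.full((P, max_disp), np.inf)
    back = np.zeros((P, max_disp), dtype=int)
    cost[0, 0] = abs(int(left_row[0]) - int(right_row[0]))
    for x in range(1, P):
        for d in range(max_disp):
            data = abs(int(left_row[x]) - int(right_row[x - d])) if x - d >= 0 else np.inf
            prev = cost[x - 1] + smooth * np.abs(np.arange(max_disp) - d)
            back[x, d] = int(np.argmin(prev))
            cost[x, d] = data + prev[back[x, d]]
    disp = np.zeros(P, dtype=int)
    disp[-1] = int(np.argmin(cost[-1]))
    for x in range(P - 2, -1, -1):       # backtrack the optimal disparity path
        disp[x] = back[x + 1, disp[x + 1]]
    return disp
```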
Alireza BEHRAD Seyed AHMAD MOTAMEDI
A new algorithm for fast detection and tracking of moving targets using a mobile video camera is presented. Our algorithm is based on image feature detection and matching. To detect features, we use edge points and their accumulated curvature. Once the features are detected, they are matched with their corresponding points using a new method called fuzzy-edge-based feature matching. The proposed algorithm has two modes: detection and tracking. In the detection mode, background motion is estimated and compensated using an affine transformation. The resulting motion-rectified image is used to detect the target location with a split-and-merge algorithm. Other features are also checked for precise detection of the target. When the target is identified, the algorithm switches to the tracking mode, which also has two phases. In the first phase, the algorithm tracks the target in order to recover the target bounding box more precisely; once the bounding box is determined precisely, the second phase starts to track the specified target more accurately. The algorithm performs well in environments with noise and illumination changes.
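A minimal sketch of the background-motion compensation step: a least-squares affine transform fitted to matched background feature points (the paper's feature matching itself is not reproduced):

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares affine transform mapping src (n x 2) background points to
    dst (n x 2); requires n >= 3 non-collinear points. Returns the 2 x 3
    matrix [[a, b, tx], [c, d, ty]]."""
    n = src.shape[0]
    A = np.zeros((2 * n, 6))
    A[0::2, 0:2] = src
    A[0::2, 2] = 1.0
    A[1::2, 3:5] = src
    A[1::2, 5] = 1.0
    b = dst.reshape(-1)
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params.reshape(2, 3)
```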
A new dynamic programming (DP) based algorithm for monotonic and continuous two-dimensional warping (2DW) is presented. This algorithm searches for the optimal pixel-to-pixel mapping between a pair of images subject to monotonicity and continuity constraints, with far lower time complexity than the algorithm previously reported by the authors. This complexity reduction results from a refinement of the multi-stage decision process representing the 2DW problem. As an implementation technique, a polynomial-order approximation algorithm incorporating beam search is also presented. Theoretical and experimental comparisons show that the present approximation algorithm yields better performance than the previous one.
Iris FERMIN Atsushi IMIYA Akira ICHIKAWA
We introduce two probabilistic algorithms to determine the motion parameters of a planar shape without knowing the point-to-point correspondences a priori. If the target is restricted to rigid objects, a Euclidean transformation can be expressed as a linear equation with six parameters, i.e., two translational parameters and four rotational parameters (the axis of rotation and the rotational speed about the axis). These parameters can be determined by applying the randomized Hough transform. A remarkable feature of our algorithms is that the translation and rotation parameters are calculated using points randomly selected from two image frames acquired at different times. The rotation parameters are estimated using one of two approaches, which we call the triangle search and the polygon search algorithms. Both methods focus on the intersection points of the boundary of the 2D shape with circles whose centers are located at the shape's centroid and whose radii are generated randomly. The triangle search algorithm randomly selects three different intersection points in each image, such that they form congruent triangles, and then estimates the rotation parameter from these two triangles. The polygon search algorithm, in contrast, employs all the intersection points in each image, i.e., all the intersection points in the two image frames form two polygons, and then estimates the rotation parameter with the aid of the vertices of these two polygons.
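A hedged sketch of the rotation estimate from two congruent triangles, assuming the three vertices are already matched by the randomized search (a 2-D Kabsch/SVD fit; the paper's full randomized Hough voting is not reproduced):

```python
import numpy as np

def rotation_from_triangles(tri1, tri2):
    """Rotation angle (radians) best mapping the vertices of one triangle onto
    a congruent, vertex-matched triangle after removing the centroids."""
    A = np.asarray(tri1, float) - np.mean(tri1, axis=0)
    B = np.asarray(tri2, float) - np.mean(tri2, axis=0)
    U, _, Vt = np.linalg.svd(A.T @ B)
    D = np.diag([1.0, np.sign(np.linalg.det(U @ Vt))])   # forbid reflection
    R = U @ D @ Vt                                        # acts on row vectors
    return float(np.arctan2(R[0, 1], R[0, 0]))
```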
The correspondence problem in road image sequences is discussed, and a method to establish road correspondence from a perspective image sequence is proposed. The proposed method is mainly based on the turn angles of road edge points; the turn angle of each edge point at a given time can be computed from that frame once the matching points within the frame have been determined. The turn angles change from frame to frame according to the panning rotation of the camera and, for each stationary edge point, the difference of turn angles between two frames equals the panning angle of the camera. We therefore develop an algorithm to estimate the panning angle of the camera, by which correspondence in the road image sequence can be established.
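As a rough sketch of the stated relation (turn angles per edge point, with the camera panning angle taken as their frame-to-frame difference); the median over edge points is an assumption made for robustness:

```python
import numpy as np

def turn_angles(edge_points):
    """Turn angle at each interior point of an ordered road-edge polyline:
    the signed change of heading between consecutive segments, wrapped to
    [-pi, pi)."""
    pts = np.asarray(edge_points, dtype=float)
    headings = np.arctan2(np.diff(pts[:, 1]), np.diff(pts[:, 0]))
    turns = np.diff(headings)
    return (turns + np.pi) % (2 * np.pi) - np.pi

def estimate_panning(turns_frame1, turns_frame2):
    """Estimate the camera panning angle as the median difference of turn
    angles of matched stationary edge points between two frames."""
    return float(np.median(np.asarray(turns_frame2) - np.asarray(turns_frame1)))
```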
Toru WAKAHARA Akira SUZUKI Naoki NAKAJIMA Sueharu MIYAHARA Kazumi ODAKA
This paper describes an on-line Kanji character recognition method that solves the one-to-one stroke correspondence problem under both the stroke-number and stroke-order variations common in cursive Japanese handwriting. We propose two kinds of complementary algorithms: one dissolves excessive mapping and the other dissolves deficient mapping. Their joint use realizes stable optimal stroke correspondence without combinatorial explosion. Also, three kinds of inter-stroke distances are devised to deal with stroke concatenation or splitting and heavy shape distortion. These new ideas greatly improve the stroke matching ability of the selective stroke linkage method reported earlier by the authors. In the experiments, only a single reference pattern for each of 2,980 Kanji character categories is generated using training data composed of 120 patterns written carefully with the correct stroke number and stroke order. Recognition tests are performed using the training data and two kinds of test data, in the square style and in the cursive style, written by 36 different people; recognition rates of 99.5%, 97.6%, and 94.1% are obtained, respectively. Moreover, comparative results obtained by a current OCR technique applied to bitmap patterns of the on-line character data are presented. Finally, future work for enhancing the stroke matching approach to cursive Kanji character recognition is discussed.
Nobuaki MINEMATSU Keikichi HIROSE
A new clustering method is proposed to increase the effect of duration modeling on HMM-based phoneme recognition. A close examination of the temporal correspondences between a phoneme HMM with single-Gaussian output probabilities and its training data revealed two extreme cases: one in which a phoneme class contains several types of correspondences that are completely different from each other, and one in which there is only one type of correspondence. Although duration modeling is commonly used to incorporate temporal information into HMMs, a good model cannot be obtained for the former case. A further examination of phoneme HMMs with Gaussian-mixture output probabilities showed that some HMMs still had multiple temporal correspondences, though the number of such phonemes was smaller than in the single-Gaussian case. An appropriate duration model cannot be obtained for these phoneme HMMs by conventional methods, in which the duration distribution of each HMM state is represented by a single distribution function. To cope with this problem, a new method is proposed that clusters phoneme classes with plural types of temporal correspondences into sub-classes. The clustering is conducted so as to reduce the variation of the temporal correspondences within each sub-class, and an HMM is then constructed for each sub-class. Using the proposed method, speaker-dependent recognition experiments were performed on phonemes segmented from isolated words. The recognition rate increased by a few percent, an improvement that was not obtained by another method based on duration modeling with a Gaussian mixture.
Shin-Chung WANG Chung-Lin HUANG
This paper presents a modified disparity measurement to recover depth and a robust method to estimate motion parameters. First, phase correspondence is used to compute disparity; it requires less computation than previous methods that compute disparity from correspondence or from correlation. The modified disparity measurement uses a Gabor filter to analyze the local phase and an exponential filter to analyze the global phase. These two phases are added to form quasi-linear phases of the stereo image channels, which are used for stereo disparity estimation and recovery of scene structure. Then, feature-based correspondence is used to find corresponding feature points in the temporal image pair. Finally, we combine the depth map and use disparity motion stereo to estimate 3-D motion parameters.
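A hedged sketch of phase-based disparity along one scanline using only a 1-D complex Gabor filter (the paper's additional exponential filter and quasi-linear phase construction are not reproduced); the sign convention depends on which image is taken as reference:

```python
import numpy as np

def phase_disparity_row(left_row, right_row, wavelength=8.0, sigma=4.0):
    """Convolve both scanlines with a complex 1-D Gabor kernel and convert the
    local phase difference into a shift estimate (disparity ~ phase
    difference / center frequency), assuming locally linear phase."""
    x = np.arange(-3 * sigma, 3 * sigma + 1)
    w0 = 2 * np.pi / wavelength                      # center frequency
    gabor = np.exp(-x ** 2 / (2 * sigma ** 2)) * np.exp(1j * w0 * x)
    L = np.convolve(np.asarray(left_row, float), gabor, mode="same")
    R = np.convolve(np.asarray(right_row, float), gabor, mode="same")
    dphi = np.angle(L * np.conj(R))                  # wrapped phase difference
    return dphi / w0                                 # shift estimate in pixels
```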
Hiroshi SAKO Hadar Itzhak AVI-ITZHAK
A problem that often arises in computer vision is matching corresponding points of images. In object recognition, for example, the computer compares new images to templates from a library of known objects. A common way to perform this comparison is to extract feature points from the images and compare these points with the template points. Another common example is motion detection, where feature points of a video image are compared to those of the previous frame. In both of these examples, the point correspondence is complicated by the fact that the point sets are not only randomly ordered but have also been distorted by an unknown transformation and have quite different coordinates. In object recognition, there is a transformation from the object being viewed to its projection onto the camera's imaging plane, while in motion detection this transformation represents the motion (translation and rotation) of the object. If the parameters of the transformation are completely unknown, then all n! permutations must be compared (n: number of feature points). For each permutation, the ensuing transformation is computed using the least-squares projection method. The exponentially large computation required for this is prohibitive. A neural computational method is proposed to solve these combinatorial problems; it finds the best correspondence matching and the associated transformation parameters. The method was applied to two-dimensional point correspondence and three-to-two-dimensional correspondence. Finally, this connectionist approach extends readily to a Boltzmann machine implementation, which is desirable when the transformation is unknown, as it is less sensitive to local minima regardless of initial conditions.
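For scale, a sketch of the exhaustive baseline the abstract argues against: try every permutation, fit the best rigid transform for each by least squares (Kabsch/SVD), and keep the smallest residual; feasible only for very small n:

```python
import itertools
import numpy as np

def brute_force_correspondence(P, Q):
    """Exhaustive point correspondence: for each of the n! orderings of Q
    (P and Q are n x d numpy arrays), fit the optimal rotation/translation
    and keep the permutation with the smallest squared residual."""
    best_resid, best_perm = np.inf, None
    for perm in itertools.permutations(range(len(Q))):
        Qp = Q[list(perm)]
        Pc, Qc = P - P.mean(axis=0), Qp - Qp.mean(axis=0)
        U, _, Vt = np.linalg.svd(Pc.T @ Qc)
        D = np.eye(P.shape[1])
        D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
        resid = float(np.sum((Pc @ (U @ D @ Vt) - Qc) ** 2))
        if resid < best_resid:
            best_resid, best_perm = resid, perm
    return best_perm, best_resid
```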