Chengcheng JIANG Xinyu ZHU Chao LI Gengsheng CHEN
CNNs pre-trained on ImageNet have been widely used for feature extraction in object tracking. However, because of the domain mismatch between image classification and object tracking, target-specific features are submerged in noise, which substantially weakens the representational power of the convolutional features and makes tracking inefficient. In this paper, we propose a robust tracking algorithm with low-dimensional target-specific feature extraction. First, a novel cascaded PCA module explicitly extracts low-dimensional target-specific features, which makes the new appearance model more effective and efficient. Next, a fast particle filter process further accelerates the whole tracking pipeline by sharing convolutional computation through a ROI-Align layer. Moreover, a classification-score guided scheme updates the appearance model so that it adapts to target variations while avoiding the model drift caused by object occlusion. Experimental results on OTB100 and Temple Color128 show that the proposed algorithm achieves superior performance among real-time trackers. In addition, our algorithm is competitive with state-of-the-art trackers in precision while running at real-time speed.
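As a rough illustration of the classification-score guided update mentioned in the abstract above, the following minimal sketch refreshes an appearance template only when the classifier is confident about the current frame. The function name, the `threshold` and `rate` parameters, and the linear blending rule are all assumptions made for illustration; the abstract states only that the classification score guides the update.

```python
def update_appearance_model(model, new_features, score, threshold=0.5, rate=0.1):
    """Blend new features into the template only when the score is high enough.

    model, new_features: equal-length lists of floats (hypothetical feature vectors).
    score: classification score of the current target estimate.
    threshold, rate: assumed hyperparameters, not taken from the paper.
    """
    if score < threshold:
        # Low score suggests occlusion or a poor estimate: keep the old
        # template unchanged to avoid drifting toward the occluder.
        return list(model)
    # Confident estimate: move the template slightly toward the new features.
    return [(1.0 - rate) * m + rate * f for m, f in zip(model, new_features)]
```

With a low score the template is returned unchanged, while a high score shifts it by a fraction `rate` toward the new observation.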
Xinyu ZHU Jun ZHANG Gengsheng CHEN
Recent top-performing object detectors usually rely on a two-stage approach, which benefits from region proposal and refinement but suffers from low detection speed. By contrast, one-stage approaches are highly efficient but sacrifice some accuracy. In this paper, we propose a novel single-shot object detection network that inherits the merits of both. Motivated by the idea of semantically enriching the convolutional features within a typical deep detector, we propose two novel modules: 1) by modeling the semantic interactions between channels and the long-range dependencies between spatial positions, the self-attending module generates both channel and position attention and enhances the original convolutional features in a self-guided manner; 2) leveraging the class-discriminative localization ability of classification-trained CNNs, the semantic activating module learns a semantically meaningful convolutional response that augments low-level convolutional features with strong class-specific semantic information. The resulting self-attending and semantic activating network (ASAN) achieves better accuracy than two-stage methods and is capable of real-time processing. Comprehensive experiments on PASCAL VOC indicate that ASAN achieves state-of-the-art detection performance with high efficiency.
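To make the idea of channel attention in the abstract above more concrete, here is a minimal sketch of one common formulation: channel-to-channel affinities are computed from inner products, normalized with a softmax, and used to re-weight the channels before a residual addition. This is not necessarily the formulation used in ASAN; the function names, the `gamma` scale, and the plain-list data layout are assumptions for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(features, gamma=1.0):
    """features: list of C channels, each a flat list of N spatial responses.

    Returns features + gamma * (A @ features), where A is the C x C
    row-softmax of the channel inner-product matrix (an assumed design).
    """
    n = len(features[0])
    # Affinity between every pair of channels, normalized per row.
    affinity = [
        softmax([sum(a * b for a, b in zip(ci, cj)) for cj in features])
        for ci in features
    ]
    # Each output channel is an attention-weighted mix of all channels.
    attended = [
        [sum(w * ch[k] for w, ch in zip(weights, features)) for k in range(n)]
        for weights in affinity
    ]
    # Residual connection keeps the original response.
    return [
        [x + gamma * a for x, a in zip(orig, att)]
        for orig, att in zip(features, attended)
    ]
```

Setting `gamma` to zero returns the input unchanged, which is a convenient sanity check for the residual form.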
Gengsheng CHEN Chenxi QIAN Jun TAO
In this paper, a complete SSTA scheme is proposed to calculate, with very high accuracy and acceptable CPU cost, either the output waveform of a logic cell at any randomly selected point in the process variation space or the mean and variance of the output signal. First, Miller capacitances between the input nodes and internal nodes of a logic cell are introduced to construct an improved MCSM model with better modeling accuracy. Second, the stochastic collocation method combined with the Modified Nested Sparse Grid technique is adopted for the SSTA procedure to avoid the exponential growth in the number of collocation points caused by the tensor product. Third, a Nominal waveform based Fast Simulation Method is developed to speed up the simulation at each collocation point. Finally, an Automatic Waveform Construction Technique is developed to construct the output waveform with as few approximation points as possible, reducing the computational cost while guaranteeing high accuracy. Numerical results are given to demonstrate the efficiency of the proposed algorithm.
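The stochastic collocation idea in the abstract above can be sketched in one dimension: a response is evaluated at a few fixed quadrature nodes, and its mean and variance are recovered as weighted sums. The sketch below assumes a single Gaussian process parameter and uses a three-point Gauss-Hermite rule; it does not implement the Modified Nested Sparse Grid technique, and `collocation_moments` and `response` are hypothetical names.

```python
import math

# Three-point Gauss-Hermite rule rescaled for a standard normal variable.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def collocation_moments(response, nominal=0.0, sigma=1.0):
    """Estimate mean and variance of response(p) for p ~ N(nominal, sigma^2).

    response: hypothetical stand-in for one deterministic simulation,
    for example a cell delay as a function of one process parameter.
    """
    # One deterministic evaluation per collocation point.
    samples = [response(nominal + sigma * z) for z in NODES]
    mean = sum(w * s for w, s in zip(WEIGHTS, samples))
    second_moment = sum(w * s * s for w, s in zip(WEIGHTS, samples))
    return mean, second_moment - mean * mean
```

Only three evaluations of `response` are needed here, and the estimates are exact when the response is a polynomial of degree two or lower in the parameter. In several dimensions a full tensor product of such rules grows exponentially, which is the cost the abstract says sparse grids are meant to avoid.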
In this paper, an analysis of the basic process of a class of interactive graph-cut-based image segmentation algorithms indicates that it is unnecessary to construct n-links for all adjacent pixel nodes of an image before calculating the maximum flow and the minimum cut. For many pixel nodes, n-links need not be constructed at all. Based on this, we propose a new algorithm that dynamically constructs only the necessary n-links, namely those connecting the pixel nodes explored by the maximum flow algorithm. These n-links are constructed on the fly and without redundancy while the maximum flow is being calculated. Experiments on the Berkeley segmentation dataset benchmark show that this method reduces the average running time of segmentation algorithms while preserving correct segmentation results. This improvement can also be applied to any segmentation algorithm based on graph cuts.
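The lazy construction described in the abstract above can be illustrated with a small sketch in which the n-links of a pixel are created only when a search first expands that pixel. The maximum flow computation itself is omitted and replaced by a plain breadth-first search, so this shows only the on-demand construction pattern. The `LazyGrid` class, the `capacity` callback, and the 4-connected neighborhood are assumptions for illustration, not the authors' implementation.

```python
from collections import deque


class LazyGrid:
    """4-connected pixel grid whose n-links are built on first exploration."""

    def __init__(self, width, height, capacity):
        self.width = width
        self.height = height
        self.capacity = capacity  # hypothetical function (p, q) -> capacity
        self.links = {}           # pixel -> {neighbor: capacity}, filled lazily

    def neighbors(self, p):
        if p not in self.links:
            # First visit: construct this pixel's n-links now, exactly once.
            x, y = p
            candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
            self.links[p] = {
                q: self.capacity(p, q)
                for q in candidates
                if 0 <= q[0] < self.width and 0 <= q[1] < self.height
            }
        return self.links[p]


def explore(grid, start, goal):
    """Breadth-first search that stops as soon as goal is dequeued."""
    seen = {start}
    queue = deque([start])
    while queue:
        p = queue.popleft()
        if p == goal:
            return True
        for q, cap in grid.neighbors(p).items():
            if cap > 0 and q not in seen:
                seen.add(q)
                queue.append(q)
    return False
```

After a short search on a large grid, `len(grid.links)` counts only the pixels that were actually expanded, which is far fewer than the total number of pixels.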