Qingbo WU Jian XIONG Bing LUO Chao HUANG Linfeng XU
In this paper, we propose a novel joint rate distortion optimization (JRDO) model for intra prediction coding. The spatial prediction dependency is exploited by modeling the distortion propagation with a linear fitting function. A novel JRDO-based Lagrange multiplier (LM) is derived from this model. To adapt to the distortion propagation characteristics of different blocks, we also introduce a generalized multiple Lagrange multiplier (MLM) framework in which several candidate LMs are used in the RDO process. Experimental results show that our proposed JRDO-MLM scheme is superior to the H.264/AVC encoder.
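As a rough illustration of the Lagrangian mode decision that the abstract above builds on, the sketch below picks the intra mode minimizing the standard cost J = D + λR for a few candidate multipliers. It is not the JRDO-MLM scheme itself: the function name `best_mode`, the mode names, and the (distortion, rate) numbers are all hypothetical, and the abstract does not say how the candidate LMs are chosen or combined.

```python
from typing import Dict, Tuple


def best_mode(candidates: Dict[str, Tuple[float, float]], lam: float) -> str:
    """Return the mode that minimizes the Lagrangian cost J = D + lam * R.

    `candidates` maps a mode name to a (distortion, rate_in_bits) pair.
    """
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])


# Hypothetical (distortion, rate) measurements for three intra modes.
modes = {
    "DC": (120.0, 10.0),
    "vertical": (90.0, 22.0),
    "horizontal": (60.0, 50.0),
}

# A larger multiplier penalizes rate more, so cheaper modes win.
for lam in (0.5, 2.0, 5.0):
    print(lam, best_mode(modes, lam))
```

With these made-up numbers the decision moves from the low-distortion mode to the low-rate mode as the multiplier grows, which is why the choice of multiplier matters for each block.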
Yinan LIU Qingbo WU Linfeng XU Bo WU
Traditional action recognition approaches use pre-defined rigid areas to process the space-time information, e.g. spatial pyramids, cuboids. However, most action categories happen in an unconstrained manner, that is, the same action in different videos can happen at different places. Thus we need a better video representation to deal with the space-time variations. In this paper, we introduce the idea of mining spatial temporal saliency. To better handle the uniqueness of each video, we use a space-time over-segmentation approach, e.g. supervoxel. We choose three different saliency measures that take not only the appearance cues, but also the motion cues into consideration. Furthermore, we design a category-specific mining process to find the discriminative power in each action category. Experiments on action recognition datasets such as UCF11 and HMDB51 show that the proposed spatial temporal saliency video representation can match or surpass some of the state-of-the-art alternatives in the task of action recognition.
In this letter, we propose a new semantic parts learning approach to address the object detection problem with only the bounding boxes of object category labels. Our main observation is that even though the appearance and arrangement of object parts might vary across the instances of different object categories, the constituent parts still maintain geometric consistency. Specifically, we propose a discriminative clustering method with sparse representation refinement to discover the mid-level semantic part set automatically. Then each semantic part detector is learned by a linear SVM in a one-vs-all manner. Finally, we use the learned part detectors to score the test image and integrate the response maps of all part detectors to obtain the detection result. The learned class-generic part detectors are able to capture objects across different categories. Experimental results show that our approach outperforms some recent competing methods.
Wenzhu WANG Kun JIANG Yusong TAN Qingbo WU
Hierarchical scheduling for multiple resources is partially responsible for the performance achievements in large scale datacenters. However, the latest scheduling technique, Hierarchy Dominant Resource Fairness (H-DRF) [1], has some shortcomings in heterogeneous environments, such as starving certain jobs or allocating resources unfairly. This is because a heterogeneous environment brings new challenges. In this paper, we propose a novel scheduling algorithm called Dominant Fairness Fairness (DFF). DFF tries to keep resource allocation fair, avoid job starvation, and improve system resource utilization. We implement DFF in the YARN system, one of the most commonly used schedulers for large scale clusters. The experimental results show that our proposed algorithm leads to higher resource utilization and better throughput than H-DRF.
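For readers unfamiliar with the dominant-share idea underlying H-DRF, here is a minimal sketch of flat (non-hierarchical) Dominant Resource Fairness. It is neither H-DRF nor the DFF algorithm from the abstract above. The function `drf_allocate`, the user names, and the capacities and demands are illustrative assumptions, and the loop uses one simple variant that offers the next task to the user with the lowest dominant share among those whose task still fits.

```python
from typing import Dict


def drf_allocate(capacity: Dict[str, int],
                 demands: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Allocate whole tasks by repeatedly serving the lowest dominant share.

    `capacity` maps resource name to total amount; `demands` maps each user
    to the per-task demand for every resource. Returns tasks per user.
    """
    used = {r: 0 for r in capacity}
    tasks = {u: 0 for u in demands}

    def dominant_share(user: str) -> float:
        # Largest fraction of any single resource currently held by the user.
        return max(tasks[user] * demands[user][r] / capacity[r] for r in capacity)

    while True:
        # Users whose next task still fits in the remaining capacity.
        fitting = [u for u in demands
                   if all(used[r] + demands[u][r] <= capacity[r] for r in capacity)]
        if not fitting:
            return tasks
        user = min(fitting, key=dominant_share)
        for r in capacity:
            used[r] += demands[user][r]
        tasks[user] += 1


# Hypothetical cluster: 9 CPUs and 18 GB of memory shared by two users.
capacity = {"cpu": 9, "mem": 18}
demands = {"A": {"cpu": 1, "mem": 4}, "B": {"cpu": 3, "mem": 1}}
allocation = drf_allocate(capacity, demands)
print(allocation)
```

In this example user A is memory-dominant and user B is CPU-dominant, and the allocation ends with both holding the same dominant share of two thirds.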
Rongzhen LI Qingbo WU Yusong TAN Junyang ZHANG
Software-defined networking (SDN) has emerged as a promising approach to enable network innovation. It can provide network virtualization through a hypervisor plane so that multiple virtual networks share the same cloud datacenter network. However, this attractive approach introduces a new problem: because the control and forwarding planes are separated, the network becomes more susceptible to failures of network components. Centralized control and virtual networks that share the same physical network become fragile and prone to failure if the virtual network topology and the control paths are not properly designed. Thus, how to map a virtual network onto the physical datacenter network in virtualized SDN while guaranteeing survivability against physical component failures is extremely important, and the mapping should fully consider the factors that influence the survivability of the virtual network. In this paper, combining virtual networks (VNs) with SDN, we propose a topology-aware survivable virtual network embedding approach that improves the survivability of the virtual network through an enhanced virtual controller embedding strategy, which optimizes the placement of the virtual network without using any backup resources. The strategy explicitly takes into account the network delay and the number of disjoint paths between the virtual controller and the virtual switches to minimize the expected percentage of control path loss with a survivable factor. Extensive experimental evaluations have been conducted, and the results verify that the proposed approach improves survivability and network delay while keeping the other metrics within reasonable bounds.
Kai TAN Qingbo WU Fanman MENG Linfeng XU
Saliency quality assessment aims at estimating the objective quality of a saliency map without access to the ground truth. Existing works typically evaluate saliency quality by using information from the saliency map to assess its compactness and closedness, while ignoring the information from the image content that can be used to assess the consistency and completeness of the foreground. In this letter, we propose a novel multi-information fusion network to capture the information from both the saliency map and the image content. The key idea is to introduce a siamese module to collect information from the foreground and background, aiming to assess the consistency and completeness of the foreground and the difference between foreground and background. Experiments demonstrate that incorporating image content information significantly boosts the performance of the proposed method. Furthermore, we validate our method on two applications: saliency detection and segmentation. Our method is used to choose the optimal saliency map from a set of candidate saliency maps, and the selected saliency map is fed into a segmentation algorithm to generate a segmentation map. Experimental results verify the effectiveness of our method.
Qingbo WU Linfeng XU Zhengning WANG
In this letter, we propose a novel intra prediction coding scheme for H.264/AVC. Based on our proposed minimum distance prediction (MDP) scheme, the optimal reference samples for predicting the current pixel can be adaptively updated corresponding to different video contents. The experimental results show that up to 2 dB and 1 dB coding gains can be achieved with the proposed method for QCIF and CIF sequences respectively.
Yurui XIE Qingbo WU Bing LUO Chao HUANG Liangzhi TANG
In this letter, we explore a new framework for detecting non-specific objects by combining top-down and bottom-up cues. Specifically, a novel supervised discriminative dictionary learning method is proposed to learn the coupled dictionaries for the object and non-object feature spaces in terms of the top-down cue. Different from previous dictionary learning methods, the new data reconstruction residual terms of the coupled feature spaces, the sparsity penalty measures on the representations, and an inconsistent regularizer for the learned dictionaries are all incorporated in a unified objective function. Then we derive an iterative algorithm to alternately optimize all the variables efficiently. Considering the bottom-up cue, the proposed discriminative dictionary learning is then integrated with an unsupervised dictionary learning to capture the objectness windows in an image. Experimental results show that the proposed dictionary learning framework effectively solves the non-specific object detection problem and outperforms some established methods.
Bing LUO Chao HUANG Lei MA Wei LI Qingbo WU
This paper proposes a novel method to segment the object of a specific class based on a rough detection window (obtained with the Deformable Part Model (DPM) in this paper), which is robust to the positions of the bounding boxes. In our method, the DPM is first used to generate the root and part windows of the object. Then a set of object part candidates is generated by randomly sampling windows around the root window. Furthermore, an undirected graph (the minimum spanning tree) is constructed to describe the spatial relationships between the part windows. Finally, the object is segmented by grouping the part proposals on the undirected graph, which is formulated as an energy function minimization problem. A novel energy function consisting of a data term and a smoothness term is designed to characterize the combination of the part proposals, and it is globally minimized by dynamic programming on a tree. Our experimental results on a challenging dataset demonstrate the effectiveness of the proposed method.
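The minimum spanning tree step mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which edge weights are used, so the sketch assumes Euclidean distance between hypothetical part-window centers, and the function name `minimum_spanning_tree` and the sample coordinates are invented for the example. It uses a simple quadratic version of Prim's algorithm.

```python
import math
from typing import List, Tuple


def minimum_spanning_tree(points: List[Tuple[float, float]]) -> List[Tuple[int, int]]:
    """Return MST edges (i, j) over `points` using Prim's algorithm.

    Edge weights are Euclidean distances between points (an assumption made
    for this sketch). The tree is grown from point 0.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = {0}
    edges: List[Tuple[int, int]] = []
    while len(in_tree) < n:
        # Cheapest edge that connects a tree node to a node outside the tree.
        _, i, j = min(
            (math.dist(points[i], points[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges


# Hypothetical centers of four part windows.
centers = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (1.0, 1.0)]
tree_edges = minimum_spanning_tree(centers)
print(tree_edges)
```

Because the result is a tree, an energy defined over its nodes and edges can be minimized exactly with dynamic programming, which is the property the abstract relies on.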
Nii L. SOWAH Qingbo WU Fanman MENG Liangzhi TANG Yinan LIU Linfeng XU
In this paper, we improve upon the accuracy of existing tracklet generation methods by repairing tracklets based on their quality evaluation and detection propagation. Starting from object detections, we generate tracklets using three existing methods. Then we perform co-tracklet quality evaluation to score each tracklet and filter out the good tracklets based on their scores. A detection propagation method is designed to transfer the detections in the good tracklets to the bad ones so as to repair the bad tracklets. The tracklet quality evaluation in our method is implemented by intra-tracklet detection consistency and inter-tracklet detection completeness. Two propagation methods, global propagation and local propagation, are defined to achieve more accurate tracklet propagation. We demonstrate the effectiveness of the proposed method on the MOT 15 dataset.
Yinan LIU Qingbo WU Liangzhi TANG Linfeng XU
In this paper, we propose a novel self-supervised method for learning a video representation that is capable of anticipating the video category from only a short clip. The key idea is to employ a Siamese convolutional network to model the self-supervised feature learning as two different image matching problems. By using frame encoding, the proposed video representation can be extracted at different temporal scales. We refine the training process via a motion-based temporal segmentation strategy. The learned video representations can be applied not only to action anticipation, but also to action recognition. We verify the effectiveness of the proposed approach on both action anticipation and action recognition using two datasets, namely UCF101 and HMDB51. The experiments show that we achieve results comparable with the state-of-the-art self-supervised learning methods on both tasks.