Duc NGUYEN Tran THUY HIEN Huyen T. T. TRAN Truong THU HUONG Pham NGOC NAM
Distance-aware quality adaptation is a potential approach to reducing the resource requirements for the transmission and rendering of textured 3D meshes. In this paper, we carry out a subjective experiment to investigate the effects of the distance from the camera on the perceptual quality of textured 3D meshes. We also evaluate the effectiveness of eight image-based objective quality metrics in representing the user's perceptual quality. Our study found that the perceptual quality in terms of mean opinion score increases as the distance from the camera increases. In addition, it is shown that normalized mutual information (NMI), a full-reference objective quality metric, is highly correlated with subjective scores.
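As a rough illustration of the NMI metric named in this abstract, the sketch below computes one common normalization, NMI = 2 I(X;Y) / (H(X) + H(Y)), between a reference and a distorted rendering. The function names, the nested-list image format, and the choice of normalization are assumptions made for this example; the paper may define NMI differently.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy in bits of a histogram stored as a Counter."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalized_mutual_information(ref, dist):
    """NMI = 2*I(X;Y) / (H(X)+H(Y)) for two equal-size grayscale images.

    Images are nested lists of integer intensities (assumed format).
    """
    x = [p for row in ref for p in row]
    y = [p for row in dist for p in row]
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    hx = entropy(Counter(x), n)
    hy = entropy(Counter(y), n)
    hxy = entropy(Counter(zip(x, y)), n)  # joint entropy H(X, Y)
    if hx + hy == 0:
        return 1.0  # both images are constant; treat them as identical
    return 2.0 * (hx + hy - hxy) / (hx + hy)
```

With this normalization the score is 1 when the distorted image is a one-to-one remapping of the reference and 0 when the two intensity distributions are statistically independent.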
Yu WANG Tao LU Feng YAO Yuntao WU Yanduo ZHANG
In recent years, single face image super-resolution (SR) using deep neural networks has been well developed. However, most face images captured by a camera in a real scene show the same person from different views, and existing traditional multi-frame image SR requires alignment between images. Multi-view face images contain texture information from different views that can serve as effective prior information, but how to use this multi-view prior to reconstruct frontal face images remains challenging. To address these problems, we propose a novel face SR network based on multi-view face images, which focuses on obtaining more texture information from multi-view face images to help reconstruct frontal face images. In this network, we also propose a texture attention mechanism that transfers high-precision texture compensation information to the frontal face image to obtain better visual effects. We conduct subjective and objective evaluations, and the experimental results show the great potential of multi-view face image SR. A comparison with other state-of-the-art deep learning SR methods shows that the proposed method achieves excellent performance.
Zhi LIU Yifan SU Shuzhong YANG Mengmeng ZHANG
Cross-component linear model (CCLM) chromaticity prediction is a new technique introduced in Versatile Video Coding (VVC), which utilizes the reconstructed luminance component to predict the chromaticity components and can improve the coding performance. However, it increases the coding complexity. In this paper, we study how to accelerate the chroma intra-prediction process based on texture characteristics. First, two observations are drawn from experimental statistics on the process. One is that the choice of the chroma intra-prediction candidate modes is closely related to the texture complexity of the coding unit (CU); the other is that whether the direct mode (DM) is selected is closely related to the texture similarity between the current chromaticity CU and the corresponding luminance CU. Second, a fast chroma intra-prediction mode decision algorithm is proposed based on these observations. A modified metric named sum modulus difference (SMD) is introduced to measure the texture complexity of the CU and guide the filtering of irrelevant candidate modes. Meanwhile, the structural similarity index measurement (SSIM) is adopted to help decide whether the DM is selected. The experimental results show that, compared with the reference model VTM8.0, the proposed algorithm reduces the coding time by 12.92% on average while increasing the BD-rate of the Y, U, and V components by only 0.05%, 0.32%, and 0.29%, respectively.
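The abstract does not specify how SMD is modified, so the sketch below shows only the conventional sum modulus difference as a texture-complexity measure: the sum of absolute differences between each sample and its right and lower neighbours. The function name and the nested-list block format are assumptions for illustration, and no mode-filtering thresholds are implied.

```python
def sum_modulus_difference(block):
    """Conventional SMD of a 2D block of samples (nested lists).

    Sums |f(y, x) - f(y, x+1)| and |f(y, x) - f(y+1, x)| wherever the
    neighbour exists. This is NOT the modified metric from the paper.
    """
    h = len(block)
    w = len(block[0])
    total = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # horizontal neighbour
                total += abs(block[y][x] - block[y][x + 1])
            if y + 1 < h:  # vertical neighbour
                total += abs(block[y][x] - block[y + 1][x])
    return total
```

A flat block yields 0, while a block with strong local variation yields a large value, which is the property a complexity-driven mode filter would rely on.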
Hiroshi HAGA Takuya ASAI Shin TAKEUCHI Harue SASAKI Hirotsugu YAMAMOTO Koji SHIGEMURA
We developed an 8.4-inch electrostatic-tactile touch display using a segmented-electrode array (30×20) as both tactile pixels and touch sensors. Each pixel can be excited independently, so the electrostatic-tactile touch display can present truly localized tactile textures of any shape. We also proposed and demonstrated a driving scheme that employs two-phased actuation so that the tactile strength is independent of the grounding state of the human body. Furthermore, tactile crosstalk was investigated and found to be caused by voltage fluctuation in the human body; it was diminished by applying the aforementioned driving scheme.
Zhaolin LU Ziyan ZHANG Yi WANG Liang DONG Song LIANG
This letter presents an image quality assessment (IQA) metric for scanning electron microscopy (SEM) images based on texture inpainting. Inspired by the observation that the texture information of SEM images is quite sensitive to distortions, a texture inpainting network is first trained to extract texture features. Then the weights of the trained texture inpainting network are transferred to the IQA network to help it learn an effective texture representation of the distorted image. Finally, supervised fine-tuning is conducted on the IQA network to predict the image quality score. Experimental results on the SEM image quality dataset demonstrate the advantages of the presented method.
Luis Rafael MARVAL-PÉREZ Koichi ITO Takafumi AOKI
Access control and surveillance applications such as walk-through security gates and immigration control points have a great demand for convenient and accurate biometric recognition in unconstrained scenarios with low user cooperation. The periocular region, which is a relatively new biometric trait, has been attracting much attention for recognition of an individual in such scenarios. This paper proposes a periocular recognition method that combines Phase-Based Correspondence Matching (PB-CM) with a texture enhancement technique. PB-CM has demonstrated high recognition performance for other biometric traits, e.g., face, palmprint and finger-knuckle-print. However, a major limitation for the periocular region is that the performance of PB-CM degrades when the periocular skin has poor texture. We address this problem by applying texture enhancement and find that variance normalization of texture significantly improves the performance of periocular recognition using PB-CM. Experimental evaluation using three public databases demonstrates the advantage of the proposed method compared with conventional methods.
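To make the idea of variance normalization concrete, here is a minimal sketch that rescales a patch to zero mean and unit variance. Whether the paper normalizes globally or within local windows is not stated in the abstract; this example assumes a single patch, and the function name and `eps` parameter are hypothetical.

```python
import math


def variance_normalize(patch, eps=1e-8):
    """Return the patch rescaled to zero mean and (about) unit variance.

    `patch` is a nested list of intensities; `eps` guards against
    division by zero on flat patches (an assumption of this sketch).
    """
    flat = [p for row in patch for p in row]
    n = len(flat)
    mean = sum(flat) / n
    var = sum((p - mean) ** 2 for p in flat) / n
    std = math.sqrt(var) + eps
    return [[(p - mean) / std for p in row] for row in patch]
```

Applying the same function to sliding windows instead of the whole image would give a local variant, which amplifies weak texture in low-contrast skin regions.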
As display resolution increases, an effective image upscaling technique is required for recent displays such as ultra-high-definition displays. Even though various image super-resolution algorithms have been developed for image upscaling, they still do not perform well on ultra-high-definition displays. This is because the texture creation capability of these algorithms is not sufficient. Hence, this paper proposes an efficient texture creation algorithm for enhancing texture super-resolution performance. For texture creation, we build a database of random patches in off-line processing and then synthesize fine textures from the database by employing a guided filter in on-line real-time processing. Experimental results show that the proposed texture creation algorithm provides sharper and finer textures than existing state-of-the-art algorithms.
Jingjing SI Jing XIANG Yinbo CHENG Kai LIU
Generalized approximate message passing (GAMP) can be applied to compressive phase retrieval (CPR) with excellent phase-transition behavior. In this paper, we introduce the cartoon-texture model into denoising-based phase retrieval GAMP (D-prGAMP) and propose a cartoon-texture model based D-prGAMP (C-T D-prGAMP) algorithm. Then, based on experiments and analyses of how the performance of D-prGAMP algorithms varies with iterations, we propose a 2-stage D-prGAMP algorithm, which makes tradeoffs between the C-T D-prGAMP algorithm and general D-prGAMP algorithms. Finally, to address the non-convergence issues of D-prGAMP, we incorporate adaptive damping into 2-stage D-prGAMP and propose the adaptively damped 2-stage D-prGAMP (2-stage ADD-prGAMP) algorithm. Simulation results show that the runtime of 2-stage D-prGAMP is roughly equivalent to that of BM3D-prGAMP, while 2-stage D-prGAMP achieves higher image reconstruction quality than BM3D-prGAMP. 2-stage ADD-prGAMP requires more reconstruction time than 2-stage D-prGAMP and BM3D-prGAMP, but it achieves PSNRs 0.2 to 3 dB higher than those of 2-stage D-prGAMP and 0.3 to 3.1 dB higher than those of BM3D-prGAMP.
Xiaoyuan REN Libing JIANG Xiaoan TANG Junda ZHANG
Extracting 3D information from a single image is an interesting but ill-posed problem. For artificial objects with little texture, such as smooth metal devices, the lack of object detail makes the problem even more challenging. Targeting texture-less objects with symmetric structure, this paper proposes a novel method for 3D pose estimation from a single image by introducing implicit structural symmetry and context constraints as prior knowledge. First, through a parameterized representation, the texture-less object is decomposed into a series of sub-objects with regular geometric primitives. Accordingly, the problem of 3D pose estimation is converted into a parameter estimation problem, which is solved by a primitive fitting algorithm. Then, the context prior among sub-objects is introduced for parameter refinement via augmented Lagrange optimization. The effectiveness of the proposed method is verified by experiments on simulated and measured data.
Dynamic textures are sequences of images of moving scenes that exhibit certain stationarity properties in time. The hidden Markov model (HMM) is a statistical model that has been used to model dynamic textures. However, texture is a region property. The traditional HMM models the behavior of a single pixel over time and does not consider the dependence among spatially adjacent pixels of the dynamic texture. In this paper, the multivariate hidden Markov model (MHMM) is proposed to characterize and classify dynamic textures. Specifically, the spatially adjacent pixels are modeled with a multivariate hidden Markov model, in which the hidden states of those pixels are modeled with a multivariate Markov chain and the intensity values of those pixels are modeled as the observation variables. The model parameters are then used to describe the dynamic texture, and classification is based on the maximum likelihood criterion. Experiments on two benchmark datasets demonstrate the effectiveness of the introduced method.
Kangru WANG Lei QU Lili CHEN Jiamao LI Yuzhang GU Dongchen ZHU Xiaolin ZHANG
In this paper, a novel approach is proposed for stereo vision-based ground plane detection at the superpixel level, which is implemented by employing a Disparity Texture Map in a convolutional neural network architecture. In particular, the Disparity Texture Map is calculated with a new Local Disparity Texture Descriptor (LDTD). The experimental results demonstrate the superior performance of the proposed approach on the KITTI dataset.
Chao LIANG Wenming YANG Fei ZHOU Qingmin LIAO
In this letter, we propose a novel framework to estimate the joint distribution of multiple Local Binary Patterns (LBPs). Multiple LBPs extracted from the same central pixel are first encoded using handcrafted encoding schemes to achieve rotation invariance, and the outputs are further encoded through a pre-trained Restricted Boltzmann Machine (RBM) to reduce the dimension of features. RBM has been successfully used as binary feature detectors and the binary-valued units of RBM seamlessly adapt to LBP. The proposed feature is called RBM-LBP. Experiments on the CUReT and Outex databases show that RBM-LBP is superior to conventional handcrafted encodings and more powerful in estimating the joint distribution of multiple LBPs.
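For readers unfamiliar with LBP, the following sketch shows a basic 8-neighbour LBP code and the standard rotation-invariant uniform (riu2) mapping, which is one well-known handcrafted encoding that achieves rotation invariance. It is background only: the abstract does not say which handcrafted schemes are used, and the RBM stage is not shown. Function names and the nested-list image format are assumptions.

```python
def lbp_code(img, y, x):
    """8-neighbour LBP code of the interior pixel (y, x).

    Bit i is set when the i-th neighbour, taken clockwise starting at
    the top-left, is greater than or equal to the centre value.
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = img[y][x]
    code = 0
    for i, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= centre:
            code |= 1 << i
    return code


def riu2(code, bits=8):
    """Standard riu2 mapping: number of set bits for uniform patterns
    (at most two circular 0/1 transitions), otherwise bits + 1."""
    pattern = [(code >> i) & 1 for i in range(bits)]
    transitions = sum(pattern[i] != pattern[(i + 1) % bits]
                      for i in range(bits))
    return sum(pattern) if transitions <= 2 else bits + 1
```

Rotating the neighbourhood circularly shifts the bit pattern, which leaves both the transition count and the number of set bits unchanged, so the riu2 label is the same.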
A novel method for illumination-invariant face representation is presented based on the orthogonal decomposition of the local image structure. One important advantage of the proposed method is that image gradients and corresponding intensity values are simultaneously used with our decomposition procedure to preserve the original texture while yielding the illumination-invariant feature space. Experimental results demonstrate that the proposed method is effective for face recognition and verification even with diverse lighting conditions.
Wenming YANG Wenyang JI Fei ZHOU Qingmin LIAO
Automated biometric identification using finger vein images has generated increasing interest among researchers, with emerging applications in human biometrics. The traditional feature-level fusion strategy is limited and expensive. To solve this problem, this paper investigates the possible use of infrared hybrid finger patterns on the back side of a finger, which include information on both the finger vein and the finger dorsal textures in the original image, and a database using the proposed hybrid pattern is established. Accordingly, an Intersection enhanced Gabor based Direction Coding (IGDC) method is proposed. The experiment achieves a recognition ratio of 98.4127% and an equal error rate of 0.00819 on our newly established database, which is fairly competitive.
Esmaeil POURJAM Daisuke DEGUCHI Ichiro IDE Hiroshi MURASE
Human body segmentation has many applications in a wide variety of image processing tasks, from intelligent vehicles to entertainment. A substantial amount of research has been done in the field of segmentation, and it remains an active research area, resulting in the introduction of many innovative methods in the literature. Still, a method that can overcome the problems of human segmentation and adapt itself to different kinds of situations has not yet been introduced. Many current methods use the graph-cut framework to solve the segmentation problem. Although powerful, these methods rely on a distance penalty term (intensity difference or RGB color distance). This term does not always lead to a good separation between two regions. For example, if two regions are close in color, they will be grouped together even if they belong to two different objects, which is not acceptable. Also, if one object has multiple parts with different colors, e.g., humans wearing various clothes with different colors and patterns, each part will be segmented separately. Although this can be overcome by multiple inputs from the user, the inherent problem would not be solved. In this paper, we address the problem by making use of a human probability map, super-pixels, and the Grab-cut framework. Using this map relieves us of the need to match the model to the actual body and thus helps to improve the segmentation accuracy. As a result, not only has the accuracy improved, but it has also become comparable to that of state-of-the-art interactive methods.
Chao LIANG Wenming YANG Fei ZHOU Qingmin LIAO
In this letter, we propose a novel texture descriptor that takes advantage of an anisotropic neighborhood. A brand new encoding scheme called Reflection and Rotation Invariant Uniform Patterns (rriu2) is proposed to explore local structures of textures. The proposed descriptor is called Oriented Local Binary Patterns (OLBP). OLBP may be incorporated into other varieties of Local Binary Patterns (LBP) to obtain more powerful texture descriptors. Experimental results on CUReT and Outex databases show that OLBP not only significantly outperforms LBP, but also demonstrates great robustness to rotation and illuminant changes.
Tetsuya MANABE Takaaki HASEGAWA Takashi SERIZAWA Nobuhiro MACHIDA Yuichi YOSHIDA Takayuki FUJIWARA
This paper presents two new types of markers for M-CubITS (M-sequence Multimodal Markers for ITS; M-Cubed for ITS), a ground-based positioning system, in order to advance WYSIWYAS (What You See Is What You Are Suggested) navigation environments that provide intuitive guidance. One of the new markers uses warning blocks of textured paving blocks, which are often placed at points important for pedestrian navigation, such as the top and bottom of stairs and branch points. The other uses interlocking blocks, which are often found in wide spaces such as the pavements of plazas, parks, and sidewalks. Furthermore, we construct an integrated pedestrian navigation system equipped with an automatic marker-type identification function for the three types of markers (the warning blocks, the interlocking blocks, and the conventional marker using guidance blocks of textured paving blocks) in order to enhance the spatial availability of the whole M-CubITS and the navigation system. Through the performance evaluation and operation confirmation of the integrated system, we show the possibility of advancing WYSIWYAS navigation environments.
Jie SUN Lijian ZHOU Zhe-Ming LU Tingyuan NIE
In this Letter, a new iris recognition approach based on local Gabor orientation features is proposed. On one hand, iris feature extraction using traditional Gabor filters is time-consuming and yields a high feature dimension. On the other hand, by observing iris images, we find that the changes of the original iris texture in the angular and radial directions are more obvious than those in other directions. These changes in the preprocessed iris images are mainly reflected in the vertical and horizontal directions. Therefore, local directional Gabor filters are constructed to extract the horizontal and vertical texture characteristics of the iris. First, the iris images are preprocessed by iris and eyelash location, iris segmentation, normalization and zooming. After analyzing the variety of iris texture and 2D-Gabor filters, we construct the local directional Gabor filters to extract the local Gabor features of the iris. Then, the Gabor & Fisher features are obtained by Linear Discriminant Analysis (LDA). Finally, the nearest neighbor method is used to recognize the iris. Experimental results show that the proposed method achieves better iris recognition performance with a lower feature dimension and less calculation time.
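The directional filtering idea can be sketched with a generic real-valued 2D Gabor kernel whose carrier is oriented at 0 or 90 degrees. This is a textbook Gabor form with an isotropic Gaussian envelope, not the specific local directional filters of the Letter; the function name and all parameter values are assumptions.

```python
import math


def gabor_kernel(size, sigma, wavelength, theta):
    """Real part of a 2D Gabor kernel as a size x size nested list.

    The cosine carrier varies along direction `theta` (radians);
    `sigma` is the width of the isotropic Gaussian envelope.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a centre")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


# Two orientations only: carrier along x (theta = 0) and along y (pi/2).
horizontal = gabor_kernel(5, 2.0, 4.0, 0.0)
vertical = gabor_kernel(5, 2.0, 4.0, math.pi / 2)
```

Restricting the bank to two orientations, rather than the many orientations of a conventional Gabor bank, is what would reduce both the feature dimension and the filtering time.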
Perceptually optimized missing texture reconstruction via neighboring embedding (NE) is presented in this paper. The proposed method adopts the structural similarity (SSIM) index as a measure for representing texture reconstruction performance of missing areas. This provides a solution to the problem of previously reported methods not being able to perform perceptually optimized reconstruction. Furthermore, in the proposed method, a new scheme for selection of the known nearest neighbor patches for reconstruction of target patches including missing areas is introduced. Specifically, by monitoring the SSIM index observed by the proposed NE-based reconstruction algorithm, selection of known patches optimal for the reconstruction becomes feasible even if target patches include missing pixels. The above novel approaches enable successful reconstruction of missing areas. Experimental results show improvement of the proposed method over previously reported methods.
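As a reminder of the measure being optimized, the sketch below evaluates the usual SSIM formula over a single pair of patches, using global patch statistics and the customary constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2. The function name and nested-list patch format are assumptions, and how the paper handles missing pixels inside a patch is not reproduced here.

```python
def ssim(x, y, dynamic_range=255.0):
    """SSIM index between two equal-size patches (nested lists),
    computed from global means, variances and covariance."""
    a = [p for row in x for p in row]
    b = [p for row in y for p in row]
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and the same size")
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((p - mu_a) ** 2 for p in a) / n
    var_b = sum((q - mu_b) ** 2 for q in b) / n
    cov = sum((p - mu_a) * (q - mu_b) for p, q in zip(a, b)) / n
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast term
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))
```

Selecting the known patches that maximize this score, rather than minimizing squared error, is what makes a reconstruction perceptually driven.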
The paper proposes an algorithm to expose spliced photographs. First, a graph-based segmentation, which defines a predictor to measure boundary evidence between two neighboring regions, is used to make greedy decisions. Then the algorithm obtains a prediction error image using non-negative linear least-squares prediction. For each pair of segmented neighboring regions, the proposed algorithm gathers their statistical features and calculates features of the gray level co-occurrence matrix. K-means clustering is applied to create a dictionary, and the vector quantization histogram is taken as the fixed-length result vector. For a tampered image, its noise satisfies a Gaussian distribution with zero mean. The proposed method checks the similarity between the noise distribution and a zero-mean Gaussian distribution, followed by local flatness and texture measurement. Finally, all features are fed to a support vector machine classifier. The algorithm has a low computational cost. Experiments show its effectiveness in exposing forgery.
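The gray level co-occurrence matrix step can be illustrated as follows. The sketch builds a normalized GLCM for one pixel offset and derives two classic features, contrast and energy; which GLCM features and offsets the paper uses is not stated, so these choices, along with the function names, are assumptions.

```python
def glcm(img, levels, dy=0, dx=1):
    """Normalized gray level co-occurrence matrix for offset (dy, dx).

    `img` holds integer gray levels in range(levels); entry [i][j] is
    the fraction of pixel pairs with levels i and j at that offset.
    """
    h, w = len(img), len(img[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                counts[img[y][x]][img[ny][nx]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[c / total for c in row] for row in counts]


def glcm_contrast(p):
    """Contrast: sum of p[i][j] * (i - j)^2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def glcm_energy(p):
    """Energy (angular second moment): sum of squared entries."""
    return sum(v * v for row in p for v in row)
```

Computed per segmented region, features like these give each region a compact texture signature that a classifier can compare across region boundaries.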