Qi QI Zi TENG Hongmei HUO Ming XU Bing BAI
To super-resolve low-resolution (LR) face images suffering from strong noise and blur interference, we present a novel approach for noisy face super-resolution (SR) based on three-level information representation constraints. First, we develop a feature distillation network that focuses on extracting pertinent face information, incorporating both statistical anti-interference models and latent contrast algorithms. We then incorporate a face identity embedding model and a discrete wavelet transform model, which serve as additional supervision mechanisms for the reconstruction process. The face identity embedding model ensures the reconstruction of identity information in a hypersphere identity metric space, while the discrete wavelet transform model operates in the wavelet domain to supervise the restoration of spatial structures. The experimental results clearly demonstrate the efficacy of the proposed method, as evidenced by lower Learned Perceptual Image Patch Similarity (LPIPS) scores and Fréchet Inception Distances (FID), and by the overall practicability of the reconstructed images.
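The wavelet-domain supervision can be sketched with a single-level Haar DWT: the reconstruction is penalized so that its wavelet subbands match those of the ground truth. This is a minimal illustration assuming a one-level Haar transform and a plain L1 subband loss; the function names are ours, and the paper's transform depth and loss weighting may differ.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar DWT: returns (LL, LH, HL, HH) subbands."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # vertical average
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # vertical difference
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def wavelet_supervision_loss(sr, hr):
    """L1 loss between the wavelet subbands of the SR and HR images."""
    return sum(np.abs(s - h).mean()
               for s, h in zip(haar_dwt2(sr), haar_dwt2(hr)))
```

Penalizing the LH/HL/HH subbands directly targets the edge and texture structure that plain pixel losses tend to blur.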
Yuhao LIU Zhenzhong CHU Lifei WEI
In the realm of Single Image Super-Resolution (SISR), the meticulously crafted Nonlocal Sparse Attention-based block demonstrates its efficacy in noise reduction and computational cost reduction for nonlocal (global) features. However, it neglects the traditional convolution-based block, which is proficient in handling local features. Thus, merging the Nonlocal Sparse Attention-based block and the convolution-based block to concurrently manage local and nonlocal features poses a significant challenge. To tackle these issues, this paper introduces the Channel Contrastive Attention-based Local-Nonlocal Mutual block (CCLN) for Super-Resolution (SR). (1) We introduce the CCLN block, encompassing a Local Sparse Convolutional-based block for local features and a Nonlocal Sparse Attention-based block for nonlocal features. (2) We introduce Channel Contrastive Attention (CCA) blocks, incorporating Sparse Aggregation into the convolution-based blocks. Additionally, we introduce a robust framework to fuse the two branches, ensuring that each operates according to its respective strengths. (3) The CCLN block can be seamlessly integrated into established network backbones such as the Enhanced Deep Super-Resolution network (EDSR), resulting in the Channel Contrastive Attention-based Local-Nonlocal Mutual Network (CCLNN). Experimental results show that our CCLNN effectively leverages both local and nonlocal features, outperforming other state-of-the-art algorithms.
Qi QI Liuyi MENG Ming XU Bing BAI
In face super-resolution reconstruction, interference from the texture and color of the hair region with the details and contours of the face region can negatively affect the reconstruction results. This paper proposes a semantic-based, dual-branch face super-resolution algorithm to address the varying reconstruction complexities and the mutual interference among different pixel semantics in face images. The algorithm clusters pixel semantic data to create a hierarchical representation, distinguishing between facial pixel regions and hair pixel regions. Independent image enhancement is then applied to these distinct pixel regions to mitigate their interference, resulting in a vivid super-resolution face image.
Tania SULTANA Sho KUROSAKI Yutaka JITSUMATSU Shigehide KUHARA Jun'ichi TAKEUCHI
We assess how well the recently developed MRI reconstruction technique, the Multi-Resolution Convolutional Neural Network (MRCNN), performs in a core medical vision task (classification). The primary goal of MRCNN is to identify the best k-space undersampling patterns to accelerate MRI acquisition. In this study, we use the Figshare brain tumor dataset for MRI classification, comprising 3064 T1-weighted contrast-enhanced MRI (CE-MRI) images over three categories: meningioma, glioma, and pituitary tumors. We apply MRCNN to the dataset to reconstruct high-quality images from under-sampled k-space signals. Next, we employ the pre-trained VGG16 model, a Deep Neural Network (DNN)-based image classifier, on the MRCNN-restored MRIs to classify the brain tumors. Our experiments showed that with MRCNN-restored data, the proposed brain tumor classifier achieved 92.79% classification accuracy at a 10% sampling rate, slightly higher than the 91.89%, 91.89%, and 90.98% achieved with the SRCNN, MoDL, and zero-filling methods, respectively. Note that our classifier was trained using the dataset consisting of fully sampled images and their labels, which can be regarded as a model of the usual human diagnostician; hence, our results suggest that MRCNN is useful for supporting human diagnosis. In conclusion, MRCNN significantly enhances the accuracy of a brain tumor classification system based on tumor location using under-sampled k-space signals.
Masahiro MURAYAMA Toyohiro HIGASHIYAMA Yuki HARAZONO Hirotake ISHII Hiroshi SHIMODA Shinobu OKIDO Yasuyoshi TARUTA
High-quality depth images are required for stable and accurate computer vision. Depth images captured by depth cameras tend to be noisy, incomplete, and of low resolution. Therefore, increasing the accuracy and resolution of depth images is desirable. We propose a method for reducing noise and holes in depth images pixel by pixel and for increasing their resolution. For each pixel in the target image, the linear space from the focal point of the camera through the pixel to the existing object is divided into equally spaced grids. For each grid, the distance from the grid to the object surface is obtained from multiple tracked depth images, each of which carries noisy depth values for the corresponding image pixels. The coordinates of the correct object surface are then obtained by reducing the random depth noise, and the missing values are completed. The resolution can also be increased by creating new pixels between existing pixels and applying the same process as that used for noise reduction. Evaluation results have demonstrated that the proposed method requires less GPU memory, reduces noise more accurately, especially around edges, and preserves more object detail than the conventional method. The super-resolution of the proposed method also produced high-resolution depth images with smoother and more accurate edges than the conventional methods.
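A heavily simplified, per-pixel version of the multi-frame fusion idea can be sketched as follows: depth samples for each pixel are gathered from several tracked frames, hole pixels are discarded, and a robust statistic suppresses the random noise. This sketch ignores the paper's ray-space grid construction and uses a plain temporal median as a stand-in.

```python
import numpy as np

def fuse_depth(frames, hole_value=0.0):
    """Fuse a stack of noisy depth maps (T, H, W) pixel by pixel.

    Hole pixels (== hole_value) are ignored; the remaining samples are
    reduced with a median, which suppresses random depth noise.
    """
    frames = np.asarray(frames, dtype=float)
    masked = np.where(frames == hole_value, np.nan, frames)
    fused = np.nanmedian(masked, axis=0)          # robust per-pixel estimate
    return np.nan_to_num(fused, nan=hole_value)   # all-hole pixels stay holes
```

The median is chosen here because isolated outlier depths (flying pixels) would skew a plain mean.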
Hiroya YAMAMOTO Daichi KITAHARA Hiroki KURODA Akira HIRABAYASHI
This paper addresses single image super-resolution (SR) based on convolutional neural networks (CNNs). It is known that recovery of high-frequency components in the output SR images of CNNs trained with least-squares or least-absolute-error losses is insufficient. To generate realistic high-frequency components, SR methods using generative adversarial networks (GANs), composed of one generator and one discriminator, have been developed. However, when the generator tries to induce the discriminator's misjudgment, not only realistic high-frequency components but also artifacts are generated, and objective indices such as the PSNR decrease. To reduce the artifacts in GAN-based SR methods, we consider the set of all SR images whose squared errors between their downscaling results and the input image are within a certain range, and propose applying the metric projection onto this consistent set in the output layers of the generators. The proposed technique guarantees consistency between the output SR images and the input images, and generators with the proposed projection can generate high-frequency components with few artifacts while keeping low-frequency ones appropriate for the known noise level. Numerical experiments show that the proposed technique reduces the artifacts included in the original SR images of a GAN-based SR method while generating realistic high-frequency components with better PSNR values in both noise-free and noisy situations. Since the proposed technique can be integrated into various generators whenever the downscaling process is known, it can impose consistency with the input images on existing methods without degrading other aspects of SR performance.
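For the simple case of an s x s average-pooling downscaler D, the metric projection onto the consistent set {x : ||Dx - y|| <= eps} has a closed form because D Dᵀ = (1/s²)I. The sketch below uses our own notation and this particular downscaler, not the paper's code.

```python
import numpy as np

def downscale(x, s):
    """s x s average-pooling downscaler D."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def project_consistent(x, y, s, eps):
    """Metric projection of an SR image x onto {x : ||D x - y|| <= eps}.

    For average pooling, D D^T = (1/s^2) I, so the projection is
    x - s^2 D^T (r - P_ball(r)) with r = D x - y, where s^2 D^T v is
    nearest-neighbor upsampling of v and P_ball clips r to norm eps.
    """
    r = downscale(x, s) - y
    norm = np.linalg.norm(r)
    if norm <= eps:
        return x                          # already consistent
    excess = r * (1.0 - eps / norm)       # r minus its projection onto the ball
    return x - np.kron(excess, np.ones((s, s)))
```

After the projection, downscaling the output lands exactly on the boundary of the allowed residual ball, which is what guarantees consistency with the LR input.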
Kanghui ZHAO Tao LU Yanduo ZHANG Yu WANG Yuanzhi WANG
In recent years, face super-resolution (SR) based on deep neural networks has shown strong performance compared with traditional face SR algorithms. Among these methods, the attention mechanism has been widely used in face SR because of its strong feature expression ability. However, existing attention-based face SR methods cannot fully mine the missing pixel information of low-resolution (LR) face images (the structural prior), and they consider only a single attention mechanism to exploit the structure of the face. Using multiple forms of attention could help to enhance feature representation. To solve this problem, we first propose a new pixel attention mechanism that can recover the structural details of lost pixels. We then design an attention fusion module to better integrate the different characteristics of the triple attention. Experimental results on the FFHQ dataset show that the proposed method is superior to existing deep-neural-network-based face SR methods.
Yu WANG Tao LU Zhihao WU Yuntao WU Yanduo ZHANG
Exploring structural information as a prior for facial images is a key issue of face super-resolution (SR). Although deep convolutional neural networks (CNNs) have powerful representation ability, accurately exploiting facial structural information remains challenging. In this paper, we propose a new residual fusion network that utilizes multi-scale structural information for face SR. Unlike existing methods that simply increase network depth, the bottleneck attention module is introduced to extract fine facial structural features by exploring correlations among feature maps. Finally, hierarchical scales of structural information are fused to generate a high-resolution (HR) facial image. Experimental results show that the proposed network outperforms some existing state-of-the-art CNN-based face SR algorithms.
Yu WANG Tao LU Feng YAO Yuntao WU Yanduo ZHANG
In recent years, single face image super-resolution (SR) using deep neural networks has been well developed. However, most face images captured by cameras in real scenes are taken from different views of the same person, and traditional multi-frame image SR requires alignment between images. Because multi-view face images contain texture information from different views, which can serve as effective prior information, how to use this prior to reconstruct frontal face images is challenging. To solve these problems effectively, we propose a novel face SR network based on multi-view face images, which focuses on obtaining more texture information from multi-view face images to help the reconstruction of frontal face images. Within this network, we also propose a texture attention mechanism that transfers high-precision texture compensation information to the frontal face image to obtain better visual effects. We conduct subjective and objective evaluations, and the experimental results show the great potential of multi-view face image SR. Comparison with other state-of-the-art deep learning SR methods shows that the proposed method achieves excellent performance.
Kazuya URAZOE Nobutaka KUROKI Yu KATO Shinya OHTANI Tetsuya HIROSE Masahiro NUMA
This paper presents an image super-resolution technique using a convolutional neural network (CNN) and multi-task learning for multiple image categories. The image categories include natural, manga, and text images, whose features differ from each other. However, most CNNs for super-resolution are trained with a single category, and if the input image category differs from that of the training images, super-resolution performance is degraded. There are two possible ways to manage multiple categories with conventional CNNs. The first is to prepare a CNN for every category; this solution, however, requires a category classifier to select the appropriate CNN. The second is to learn all categories with a single CNN; in this solution, the CNN cannot optimize its internal behavior for each category. Therefore, this paper presents a super-resolution CNN architecture for multiple image categories. The proposed CNN has two parallel outputs: a high-resolution image and a category label. The main CNN for the high-resolution image has a standard three-convolutional-layer architecture, and the sub-network for the category label branches out from its middle layer and consists of two fully connected layers. This architecture can simultaneously learn the high-resolution image and its category using multi-task learning, and the category information is used to optimize the super-resolution. In an applied setting, the proposed CNN can automatically estimate the input image category and change its internal behavior accordingly. Experimental results for 2× image magnification show that the average peak signal-to-noise ratio of the proposed method is approximately 0.22 dB higher than that of conventional super-resolution, with no difference in processing time or parameter count. We have confirmed that the proposed method is useful when the input image category varies.
Jianfei CHEN Xiaowei ZHU Yuehua LI
The synthetic aperture interferometric radiometer (SAIR) is a powerful sensor for high-resolution imaging. However, because of observation errors and the small number of visibility sampling points, the accuracy of the reconstructed images is usually low. To overcome this deficiency, a novel super-resolution imaging (SrI) method based on the super-resolution reconstruction idea is proposed in this paper. In the SrI method, sparse visibility functions are first measured at different observation locations. The sparse visibility functions are then used to simultaneously construct the fusion visibility function and the fusion imaging model. Finally, the high-resolution image is reconstructed by solving the sparse optimization problem of the fusion imaging model. Simulation results demonstrate that the proposed SrI method achieves higher reconstruction accuracy and can effectively improve the imaging quality of SAIR.
Kazuya URAZOE Nobutaka KUROKI Yu KATO Shinya OHTANI Tetsuya HIROSE Masahiro NUMA
Convolutional neural network (CNN)-based image super-resolution is widely used as a high-quality image-enhancement technique. In general, however, it shows little to no luminance isotropy. We therefore propose two methods, “Luminance Inversion Training (LIT)” and “Luminance Inversion Averaging (LIA),” to improve the luminance isotropy of CNN-based image super-resolution. Experimental results of 2× image magnification show that the average peak signal-to-noise ratio (PSNR) using Luminance Inversion Averaging is about 0.15-0.20 dB higher than that of conventional super-resolution.
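The LIA idea at inference time can be sketched in a few lines; here a nearest-neighbor 2x upscaler stands in for the trained CNN, and max_val is the assumed luminance range.

```python
import numpy as np

def sr_model(img):
    """Placeholder for a CNN super-resolver (nearest-neighbor 2x here)."""
    return np.kron(img, np.ones((2, 2)))

def sr_with_lia(img, max_val=255.0):
    """Luminance Inversion Averaging: average the SR of the image and
    the re-inverted SR of its luminance-inverted copy."""
    normal = sr_model(img)
    inverted = max_val - sr_model(max_val - img)
    return (normal + inverted) / 2.0
```

With a perfectly luminance-isotropic (here: linear) model the two branches coincide; for a real CNN the averaging cancels the anisotropic part of the response.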
Toyotaro TOKIMOTO Shintaro TOKIMOTO Kengo FUJII Shogo MORITA Hirotsugu YAMAMOTO
We propose a method to realize subjective super-resolution on a high-speed LED display, which dynamically shows a set of four neighboring pixels on every LED pixel. We have experimentally confirmed the subjective super-resolution effect. This paper proposes a subjective super-resolution hypothesis for the human visual system and reports simulation results with pseudo fixation eye movements.
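The four-pixel time-multiplexing scheme can be sketched as a polyphase decomposition: each LED pixel cycles through the four HR pixels it represents, one per subframe. The helper names below are illustrative.

```python
import numpy as np

def to_subframes(hr):
    """Split a (2H, 2W) frame into four (H, W) subframes: each LED pixel
    shows its four neighboring HR pixels over four time slots."""
    return [hr[i::2, j::2] for i in (0, 1) for j in (0, 1)]

def reassemble(subframes):
    """Inverse interleave (what temporal integration ideally recovers)."""
    h, w = subframes[0].shape
    hr = np.empty((2 * h, 2 * w), dtype=subframes[0].dtype)
    for sub, (i, j) in zip(subframes, [(0, 0), (0, 1), (1, 0), (1, 1)]):
        hr[i::2, j::2] = sub
    return hr
```

The decomposition is lossless; the hypothesis is that fixation eye movements let the visual system integrate the rapidly cycled subframes into the higher-resolution percept.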
Taito MANABE Yuichiro SHIBATA Kiyoshi OGURI
Super-resolution technology is one of the solutions for filling the gap between high-resolution displays and lower-resolution images. Various algorithms exist to interpolate the lost information, one of which uses a convolutional neural network (CNN). This paper presents an FPGA implementation and performance evaluation of a novel CNN-based super-resolution system that can process moving images in real time. We apply horizontal and vertical flips to input images instead of enlargement. This flip method prevents information loss and enables the network to make the best use of its patch size. In addition, we adopt the residue number system (RNS) in the network to reduce FPGA resource utilization. Efficient multiplication and addition with LUTs increased the network scale that can be implemented on the same FPGA by approximately 54% compared with an implementation using fixed-point operations. The proposed system performs super-resolution from 960×540 to 1920×1080 at 60 fps with a latency of less than 1 ms. Despite the resource restrictions of the FPGA, the system generates clear super-resolution images with smooth edges. The evaluation results also revealed superior quality in terms of the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) index compared with systems using other methods.
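The RNS idea used to save FPGA resources can be illustrated in software: a number is represented by its residues with respect to pairwise-coprime moduli, additions and multiplications act channel-wise without carries between channels, and the result is recovered with the Chinese Remainder Theorem. The moduli below are illustrative, not those of the paper's hardware.

```python
MODULI = (7, 11, 13)  # pairwise coprime; dynamic range = 7 * 11 * 13 = 1001

def to_rns(x):
    """Represent x by its residues modulo each channel's modulus."""
    return tuple(x % m for m in MODULI)

def rns_add(a, b):
    """Channel-wise addition: no carries propagate between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))

def rns_mul(a, b):
    """Channel-wise multiplication on small residues (cheap LUTs in HW)."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))

def from_rns(r):
    """Chinese Remainder Theorem reconstruction."""
    M = 1
    for m in MODULI:
        M *= m
    x = 0
    for ri, m in zip(r, MODULI):
        Mi = M // m
        x += ri * Mi * pow(Mi, -1, m)   # modular inverse of Mi mod m
    return x % M
```

Because every channel works on small residues, each multiplier fits in a small lookup table, which is the source of the resource savings reported for the FPGA implementation.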
Reo AOKI Kousuke IMAMURA Akihiro HIRANO Yoshio MATSUDA
Recently, the super-resolution convolutional neural network (SRCNN) has become widely known as a state-of-the-art method for achieving single-image super-resolution. However, performance problems such as jaggy and ringing artifacts exist in SRCNN. Moreover, to realize a real-time upconverting system for high-resolution video streams such as 4K/8K 60 fps, problems such as processing delay and implementation cost remain. In the present paper, we propose high-performance super-resolution via a patch-based deep neural network (SR-PDNN) rather than a convolutional neural network (CNN). Despite its very simple end-to-end learning system, the SR-PDNN achieves higher performance than the conventional CNN-based approach. In addition, this system is suitable for ultra-low-delay video processing by hardware implementation using an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA).
As display resolution increases, an effective image upscaling technique is required for recent displays such as ultra-high-definition displays. Although various image super-resolution algorithms have been developed for image upscaling, they still do not provide excellent performance on ultra-high-definition displays because their texture creation capability is insufficient. Hence, this paper proposes an efficient texture creation algorithm for enhancing texture super-resolution performance. For texture creation, we build a database of random patches in off-line processing and then synthesize fine textures in on-line real-time processing by employing a guided filter based on the database. Experimental results show that the proposed texture creation algorithm provides sharper and finer textures than existing state-of-the-art algorithms.
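A minimal guided-filter sketch follows He et al.'s formulation, which the abstract names as the texture-synthesis tool; the patch-database lookup is omitted, and the box-filter implementation via integral images is ours.

```python
import numpy as np

def box(x, r):
    """Mean filter over a (2r+1)^2 window via integral images (edge-padded)."""
    pad = np.pad(x, r, mode='edge')
    c = pad.cumsum(0).cumsum(1)
    c = np.pad(c, ((1, 0), (1, 0)))           # zero row/col for window sums
    k = 2 * r + 1
    s = c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]
    return s / (k * k)

def guided_filter(I, p, r=2, eps=1e-2):
    """Guided filter: the output q is locally a linear transform of the
    guide I (q = a*I + b per window), so edges of I are preserved while
    p is smoothed -- the property used here for fine-texture transfer."""
    mI, mp = box(I, r), box(p, r)
    cov = box(I * p, r) - mI * mp             # per-window cov(I, p)
    var = box(I * I, r) - mI * mI             # per-window var(I)
    a = cov / (var + eps)
    b = mp - a * mI
    return box(a, r) * I + box(b, r)          # average coefficients, apply
```

When the guide equals the input and eps is small, the filter is nearly the identity; increasing eps shifts it toward an edge-preserving smoother.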
Di BAI Zhenghai WANG Mao TIAN Xiaoli CHEN
A triangular decomposition-based multipath super-resolution method is proposed to improve the range resolution of small unmanned aerial vehicle (UAV) radar altimeters that use a single channel with a continuous direct-spread waveform. In engineering applications of small UAV radar altimeters, multipath scenarios are quite common. When conventional matched filtering is used in these environments, it is difficult to identify multiple targets in the same range cell because of the overlap between echoes. To improve the performance, we decompose the overlapped peaks yielded by matched filtering into a series of basic triangular waveforms and identify the various targets through differently time-shifted correlations of the pseudo-noise (PN) sequence. Shifting the time scale enables targets in the same range resolution unit to be identified. Both theoretical analysis and experiments show that the range resolution can be improved significantly, outperforming traditional matched filtering.
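The triangular decomposition can be sketched as a linear fit: since the autocorrelation of a PN sequence is ideally triangular, the matched-filter output is modeled as a sum of shifted triangles, and solving for the coefficients separates overlapped echoes. The plain least-squares fit below is our simplification; the pulse width and delay grid are illustrative.

```python
import numpy as np

def triangle(n, center, half_width):
    """Unit-height triangular pulse (PN-code autocorrelation shape)."""
    t = np.arange(n)
    return np.clip(1.0 - np.abs(t - center) / half_width, 0.0, None)

def decompose(mf_output, half_width):
    """Fit the matched-filter output as a sum of shifted triangles and
    return the candidate delays with their fitted amplitudes."""
    n = len(mf_output)
    centers = np.arange(half_width, n - half_width)  # keep pulses off the edges
    basis = np.stack([triangle(n, c, half_width) for c in centers], axis=1)
    amp, *_ = np.linalg.lstsq(basis, mf_output, rcond=None)
    return centers, amp
```

Two echoes closer than the triangle width overlap into a single broad peak after matched filtering, yet the fit still attributes the correct amplitude to each delay.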
Masanari NOTO Fang SHANG Shouhei KIDERA Tetsuo KIRIMOTO
There is a strong demand for super-resolution time-of-arrival (TOA) estimation techniques for radar applications that can exceed the theoretical limit on range resolution set by the frequency bandwidth. One of the most promising solutions is the use of compressed sensing (CS) algorithms, which assume only the sparseness of the target distribution but can achieve super-resolution. To preserve the reconstruction accuracy of CS under highly correlated and noisy conditions, we introduce a random resampling approach to process the received signal and thus reduce the coherence index, where a frequency-domain CS algorithm is used as noise-reduction preprocessing. Numerical simulations demonstrate that the proposed method achieves super-resolution TOA estimation performance not possible with conventional CS methods.
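As a point of reference for the conventional CS baseline the paper improves upon, sparse TOA estimation over a dictionary of delayed pulses can be sketched with Orthogonal Matching Pursuit; the Gaussian pulse shape and dictionary are illustrative, and the paper's random resampling and frequency-domain processing are omitted.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily pick k dictionary atoms."""
    resid, support = y.astype(float), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ resid))))
        sub = A[:, support]
        coef, *_ = np.linalg.lstsq(sub, y, rcond=None)  # refit on support
        resid = y - sub @ coef
    return support, coef

# Dictionary of unit-norm pulses at every candidate integer delay.
n = 64
t = np.arange(n)
A = np.stack([np.exp(-0.5 * ((t - d) / 1.5) ** 2) for d in range(n)], axis=1)
A /= np.linalg.norm(A, axis=0)

# Two-path echo: delays 10 and 25 with amplitudes 1.0 and 0.7.
y = 1.0 * A[:, 10] + 0.7 * A[:, 25]
```

When atoms at nearby delays become highly correlated (closely spaced paths), greedy recovery degrades, which is exactly the coherence problem the paper's random resampling targets.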
Hui Jung LEE Dong-Yoon CHOI Kyoung Won LIM Byung Cheol SONG
This paper presents a single-image super-resolution (SR) algorithm based on self-similarity using the non-local-mean (NLM) metric. To accurately find the best self-example even in noisy environments, the NLM weight is employed as the self-similarity metric. A pixel-wise soft-switching scheme is also presented to overcome an inherent drawback of conventional self-example-based SR, namely that it seldom works for texture areas. For the pixel-wise soft-switching, an edge-oriented saliency map is generated for each input image; we derived a saliency map that is robust against noise through specific training. The proposed algorithm works as follows. First, auxiliary images for the input low-resolution (LR) image are generated. Second, self-examples for each LR patch are found from the auxiliary images on a block basis, and the best match in terms of self-similarity is selected as the best self-example. Third, a preliminary high-resolution (HR) image is synthesized using all the self-examples. Next, an edge map and a saliency map are generated from the input LR image, and pixel-wise weights for the soft-switching of the next step are computed from these maps. Finally, a super-resolved HR image is produced by soft-switching between the preliminary HR image for edges and a linearly interpolated image for non-edges. Experimental results show that the proposed algorithm outperforms state-of-the-art SR algorithms qualitatively and quantitatively.
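The NLM weight used as the self-similarity metric can be sketched directly: patches with a small mean squared difference receive weights near one, and the best self-example is the candidate with the largest weight. The smoothing parameter h below is illustrative.

```python
import numpy as np

def nlm_weight(patch_a, patch_b, h=10.0):
    """Non-local-mean similarity weight between two patches;
    identical patches get weight 1, dissimilar patches decay to 0."""
    d2 = np.mean((patch_a - patch_b) ** 2)
    return np.exp(-d2 / (h * h))

def best_self_example(patch, candidates, h=10.0):
    """Pick the candidate patch with the largest NLM weight."""
    weights = [nlm_weight(patch, c, h) for c in candidates]
    return int(np.argmax(weights)), max(weights)
```

Unlike a raw SSD match, the exponential weighting is tolerant of small noise-induced differences, which is why it stays reliable in noisy images.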
Huan HAO Huali WANG Wanghan LV Liang CHEN
This paper proposes an effective continuous super-resolution (CSR) algorithm for multipath channel estimation. By designing a preamble comprising up-chirp and down-chirp symbols, the Doppler shift and multipath delays are estimated jointly using convex programming. Simulation results show that the proposed CSR achieves a better detection probability for the number of multipaths than eigenvalue-based methods. Moreover, compared with conventional super-resolution techniques such as the MUSIC and ESPRIT methods, the proposed CSR algorithm demonstrates its advantage in the root mean square error of the Doppler shift and multipath delay, especially for closely located paths at low SNR.
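The role of the up-chirp/down-chirp preamble can be illustrated for a single path: under the common sign convention that the up-chirp beat frequency is f_d + k*tau and the down-chirp beat frequency is f_d - k*tau (an assumption; the paper's convention may differ), the Doppler shift and delay decouple by sum and difference.

```python
def solve_delay_doppler(f_up, f_down, chirp_rate):
    """Recover the Doppler shift and delay of one path from the beat
    frequencies measured on the up-chirp and down-chirp symbols,
    assuming f_up = fd + k*tau and f_down = fd - k*tau."""
    fd = (f_up + f_down) / 2.0
    tau = (f_up - f_down) / (2.0 * chirp_rate)
    return fd, tau
```

With multiple overlapping paths this pairing becomes ambiguous, which is why the paper estimates all parameters jointly via convex programming instead.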