Limin CHEN Jing XU Peter Xiaoping LIU Hui YU
Compressive spectral imaging (CSI) systems capture 3D spatiospectral data by measuring 2D compressed, coded projections on a focal plane array (FPA) and recovering the scene with reconstruction algorithms that exploit signal sparsity. However, the mismatch between the high dimensionality of scenes and the limited dimensionality of sensors has limited improvements in recovery performance. To solve this problem, a novel CSI system based on a coded aperture snapshot spectral imager (CASSI), called RGB-CASSI, is proposed; it has two branches, one for CASSI and another for RGB images. In addition, considering that conventional reconstruction algorithms lead to oversmoothing, an RGB-guided low-rank (RGBLR) method for compressive hyperspectral image reconstruction is presented. The additional RGB information is used to guide the reconstruction, and low-rank regularization is applied with a non-convex surrogate of the rank in place of the nuclear norm to seek a better solution. Experiments show that the proposed algorithm performs better in both PSNR and subjective quality than other state-of-the-art methods.
Akira YAMAWAKI Seiichi SERIKAWA
This paper presents a method of describing image processing software in C for high-level synthesis (HLS) that takes function chaining into account to realize efficient hardware. Sophisticated image processing is typically built as a sequence of primitives represented as sub-functions, such as gray scaling, filtering, binarization, and thinning. Conventionally, generic description methods have been shown for each sub-function so that HLS can generate an efficient hardware module. However, few studies have focused on a systematic method of describing a single top function that consists of chained sub-functions. With the proposed method, any number of sub-functions can be chained while maintaining the pipeline structure. Thus, the image processing can achieve near-ideal performance of 1 pixel per clock even when the processing chain is long. In addition, deadlock due to a mismatch between the number of pushes and pops on the FIFOs connecting the functions is implicitly eliminated, and interpolation of the border pixels is performed. A case study on Canny edge detection, which includes a chain of several sub-functions, demonstrates that our proposal can easily realize the expected hardware described above. Experimental results on a ZYNQ FPGA show that our proposal can be converted to pipelined hardware of moderate size and achieves a performance gain of more than 70 times compared with software execution. Moreover, the C software restructured according to the proposed method shows a small performance degradation of 8% compared with the pure C software in a comparative evaluation performed on the Cortex A9 embedded processor in the ZYNQ FPGA. This indicates that a unified image processing library of HLS software, executable on either a CPU or a hardware module for HW/SW co-design, can be established using the proposed description method.
Hejiu ZHANG Ningmei YU Nan LYU Keren LI
This letter presents a 12-bit column-parallel hybrid two-step successive approximation register/single-slope analog-to-digital converter (SAR/SS ADC) for CMOS image sensors (CIS). To achieve a high conversion speed, a simple SAR ADC is used for the upper 6-bit conversion and a conventional SS ADC is used for the lower 6-bit conversion. To reduce the power consumption, a comparator is shared in each column, and a 6-bit ramp generator is shared by all columns. The ADC is designed in the SMIC 0.18µm CMOS process. At a clock frequency of 22.7MHz, the conversion time is 3.2µs. The ADC has a DNL of -0.31/+0.38LSB and an INL of -0.86/+0.8LSB. The power consumption of each column ADC is 89µW and that of the ramp generator is 763µW.
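The two-step idea can be illustrated with a purely behavioral model. The sketch below is not the circuit from the letter: the function name `hybrid_sar_ss_adc`, the unit reference voltage, and the ideal comparator are all assumptions made for illustration. It resolves the upper 6 bits by binary search (SAR) and the lower 6 bits by counting ramp steps over the residue (single slope).

```python
def hybrid_sar_ss_adc(vin, vref=1.0, coarse_bits=6, fine_bits=6):
    """Ideal behavioral model of a two-step SAR/single-slope conversion.

    All parameters are illustrative assumptions; no circuit non-idealities
    (noise, offset, DNL/INL) are modeled.
    """
    coarse_lsb = vref / (1 << coarse_bits)

    # Coarse step: successive approximation, one comparison per bit (MSB first).
    coarse = 0
    for bit in reversed(range(coarse_bits)):
        trial = coarse | (1 << bit)
        if vin >= trial * coarse_lsb:
            coarse = trial

    # Fine step: a ramp rises in fine-LSB steps until it passes the residue.
    residue = vin - coarse * coarse_lsb
    fine_lsb = coarse_lsb / (1 << fine_bits)
    fine = 0
    while fine < (1 << fine_bits) - 1 and (fine + 1) * fine_lsb <= residue:
        fine += 1

    # Concatenate the coarse and fine results into one 12-bit code.
    return (coarse << fine_bits) | fine


code_mid = hybrid_sar_ss_adc(0.5)  # mid-scale input
```

In this model the coarse step needs 6 comparisons and the fine step at most 64 ramp steps, whereas a pure 12-bit single-slope conversion would need up to 4096 steps. This is the speed argument behind the hybrid structure; the actual timing budget of the reported design is not stated here.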
Masayuki KINOSHITA Takaya YAMAZATO Hiraku OKADA Toshiaki FUJII Shintaro ARAI Tomohiro YENDO Koji KAMAKURA
Image sensor communication (ISC), derived from visible light communication (VLC), is an attractive solution for outdoor mobile environments, particularly for intelligent transport systems (ITS). In ITS-ISC, tracking a transmitter in the image plane is a critical issue, since vehicle vibrations make it difficult to select the correct pixels for data reception. Our goal in this study is to develop a precise tracking method. To accomplish this, it is necessary to model vehicle vibration and estimate its parameters, i.e., the representative frequencies and amplitudes of the inherent vehicle vibration and the variance of the Gaussian random process representing road surface irregularity. In this paper, we measured actual vehicle vibration in a driving situation and determined the parameters based on the frequency characteristics. We then demonstrate that the vehicle vibration that induces transmitter displacement in the image plane can be modeled using only the Gaussian random processes that represent road surface irregularity when a high-frame-rate (e.g., 1000fps) image sensor is used as the ISC receiver. The simplified vehicle vibration model and its parameters are evaluated by numerical analysis and experimental measurement, and the results show that the proposed model reproduces the characteristics of the transmitter displacement sufficiently well.
Taichi YOSHIDA Masahiro IWAHASHI Hitoshi KIYA
In this paper, we propose a 2-layer lossless coding method for high dynamic range (HDR) images based on range compression and adaptive inverse tone-mapping. Recently, HDR images, which have a wider range of luminance than conventional low dynamic range (LDR) ones, have been frequently used in various fields. Since commonly used devices cannot yet display HDR images, 2-layer coding methods that decode not only HDR images but also their LDR versions have been proposed. We have previously proposed a state-of-the-art 2-layer lossless coding method for HDR images, which unfortunately produces very large HDR files. Hence, we introduce two ideas to reduce the HDR file size below that of the previous method. The proposed method achieves a high compression ratio, and experiments show that it outperforms the previous method and other conventional methods.
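The general 2-layer structure can be sketched as follows. This is a generic illustration and not the proposed method: the square-root tone-mapping operator, its fixed inverse, and the function names `tone_map`, `inverse_tone_map`, `encode`, and `decode` are assumptions, and the range compression and adaptive inverse tone-mapping of the paper are not modeled. The base layer holds the LDR image, and the enhancement layer holds the residual between the HDR pixels and their prediction from the LDR layer.

```python
HDR_MAX = (1 << 16) - 1  # assumed 16-bit integer HDR samples
LDR_MAX = (1 << 8) - 1   # assumed 8-bit LDR samples


def tone_map(value):
    """Assumed global tone-mapping operator (square-root curve)."""
    return round(LDR_MAX * (value / HDR_MAX) ** 0.5)


def inverse_tone_map(ldr_value):
    """Fixed inverse curve used to predict the HDR sample from the LDR one."""
    return round(HDR_MAX * (ldr_value / LDR_MAX) ** 2)


def encode(hdr_pixels):
    """Split HDR samples into a base layer (LDR) and an integer residual."""
    ldr = [tone_map(v) for v in hdr_pixels]
    residual = [v - inverse_tone_map(l) for v, l in zip(hdr_pixels, ldr)]
    return ldr, residual


def decode(ldr, residual):
    """Rebuild the HDR samples exactly from the two layers."""
    return [inverse_tone_map(l) + r for l, r in zip(ldr, residual)]


hdr = [0, 1, 255, 1000, 40000, 65535]
ldr_layer, residual_layer = encode(hdr)
restored = decode(ldr_layer, residual_layer)
```

Because the decoder repeats the same deterministic prediction as the encoder, the reconstruction is exact. A better inverse tone-mapping gives smaller residuals, which is why the prediction step affects the HDR file size.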
Object detection has been a hot topic in image processing, computer vision, and pattern recognition. In recent years, training a model from labeled images using machine learning techniques has become popular. However, the relationship between training samples is usually ignored by existing approaches. To address this problem, a novel approach is proposed that trains a Siamese convolutional neural network on feature pairs and fine-tunes the network with a small number of training samples. Since the proposed method considers not only the discriminative information between objects and background but also the relationship between intraclass features, it outperforms state-of-the-art methods on real images.
Yuqiang CAO Weiguo GONG Bo ZHANG Fanxin ZENG Sen BAI
Block compressed sensing (CS) with optimal permutation is a promising method to improve sampling efficiency in CS-based image compression. However, the existing optimal permutation scheme brings a large amount of extra data to encode the permutation information because it needs to know the permutation information to accomplish signal reconstruction. When the extra data is taken into consideration, the improvement in sampling efficiency of this method is limited. In order to solve this problem, a new optimal permutation strategy for block CS (BCS) is proposed. Based on the proposed permutation strategy, an improved optimal permutation based BCS method called BCS-NOP (BCS with new optimal permutation) is proposed in this paper. Simulation results show that the proposed approach reduces the amount of extra data to encode the permutation information significantly and thereby improves the sampling efficiency compared with the existing optimal permutation based BCS approach.
This paper proposes a fast computation technique based on the method of image Green's function for p-characteristic calculations, where a plane wave with transverse wavenumber p is incident on a perfectly conducting periodic rough surface. In the computation of p-characteristics, owing to the spectral-domain periodicity of the periodic image Green's function, the image integral equation for a given incidence p keeps the same form for other particular incidences, except for the excitation term. By means of a quadrature method, such image integral equations lead to matrix equations. Once the first matrix equation has been solved, which involves computing its matrix elements and its inverse matrix, the matrix equations for the other particular incidences no longer need such a solution procedure. Thus, the total CPU time for the computation of p-characteristics is greatly reduced for surfaces with complex shapes, large roughness, or large periods.
A robust identification scheme for JPEG images is proposed in this paper. The aim is to robustly identify JPEG images that are generated from the same original image under various compression conditions, such as differences in compression ratios and initial quantization matrices. The proposed scheme does not produce any false negative matches in principle. In addition, secure features, which do not contain any visual information, are used to make the identification scheme not only robust but also secure. Conventional schemes cannot avoid producing false negative matches under some compression conditions and are required to manage a secret key for secure identification. The proposed scheme is applicable to the process of uploading images to social networks such as Twitter, for image retrieval and forensics. A number of experiments are carried out to demonstrate the effectiveness of the proposed method. The proposed method outperforms conventional ones in terms of query performance while keeping a reasonable security level.
Ming XU Xiaosheng YU Chengdong WU Dongyue CHEN
A robust pedestrian detection approach for thermal infrared imagery in all-day surveillance is proposed. First, candidate regions that are likely to contain pedestrians are extracted using a saliency detection method. Then, a deep convolutional network with a multi-task loss is constructed to recognize the pedestrians. The experimental results show the superiority of the proposed approach in pedestrian detection.
A Strategic Dual Image (SDI) method for three-dimensional magnetic field problems is proposed. The basic idea of the SDI method is that the open boundary solution lies between the Dirichlet and Neumann solutions. The relationship between specific boundary topologies (e.g., sphere and ellipsoid) and the averaging weight has been discussed in the previous literature; however, arbitrary topologies have not been discussed. In this paper, by combining the method with “the perturbation approach using equivalence theorem”, a methodology for deriving the averaging weight of the Dirichlet and Neumann solutions for an arbitrary topology is proposed. Some numerical examples are also presented.
Ryo FUJIMOTO Takanori FUJISAWA Masaaki IKEHARA
This paper proposes a novel method for estimating the non-integer shift of images based on least squares approximation in the phase region. Conventional methods based on Phase Only Correlation (POC) compute the correlation between an image and its shifted version and then estimate the non-integer shift by fitting a model equation. The problem with POC-based estimation is that the peak of the fitted model equation may not match the true peak of the POC function, which causes errors in the non-integer shift estimate. By calculating the phase difference directly in the phase region, the proposed method estimates the sub-pixel shift through least squares approximation. In addition, by utilizing the characteristics of natural images, the proposed method limits the range of frequencies used for the least squares approximation. With these improvements, the proposed method achieves high accuracy, which we validate through several examples.
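The principle of fitting the phase difference can be shown on a 1D signal. The following is a minimal sketch under stated assumptions, not the authors' algorithm: it uses a synthetic band-limited signal, an unweighted line fit through the origin, and a hypothetical `max_freq` parameter standing in for the restricted fitting range. By the Fourier shift theorem, a shift of `d` samples makes the phase difference at frequency index `k` equal to `2*pi*k*d/N`, so `d` follows from a least squares fit over the low frequencies.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform; adequate for a short demo."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def estimate_shift(f, g, max_freq):
    """Least squares fit of the phase difference theta_k = 2*pi*k*d/N.

    Only frequency indices 1..max_freq are used (an assumed cut-off), and the
    phase is assumed not to wrap within that range.
    """
    n = len(f)
    spec_f, spec_g = dft(f), dft(g)
    num = den = 0.0
    for k in range(1, max_freq + 1):
        theta = cmath.phase(spec_f[k] * spec_g[k].conjugate())
        num += k * theta
        den += k * k
    return n * num / (2.0 * math.pi * den)


def make_signal(n, shift, max_freq):
    """Band-limited test signal evaluated at t - shift (exact sub-sample shift)."""
    return [sum(math.cos(2 * math.pi * m * (t - shift) / n + 0.7 * m) / m
                for m in range(1, max_freq + 1))
            for t in range(n)]


N, K, TRUE_SHIFT = 64, 8, 0.3
reference = make_signal(N, 0.0, K)
shifted = make_signal(N, TRUE_SHIFT, K)
estimated = estimate_shift(reference, shifted, K)
```

For real images the high-frequency phases are noisy and may wrap, which is one plausible reason to restrict the fit to a limited frequency range as the abstract describes.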
Viet-Hang DUONG Manh-Quan BUI Jian-Jiun DING Yuan-Shan LEE Bach-Tung PHAM Pham The BAO Jia-Ching WANG
This work presents a new approach that derives a learned data representation method through matrix factorization on the complex domain. In particular, we introduce an encoding matrix (a new representation of data) that satisfies the simplicial constraint of the projective basis matrix on the field of complex numbers. A complex optimization framework is provided. It employs the gradient descent method and computes the derivative of the cost function based on Wirtinger's calculus.
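Gradient descent with Wirtinger's calculus can be illustrated on a much simpler problem than the factorization above. The sketch below is an assumption-laden toy, not the proposed framework: it minimizes the real-valued cost ||Aw - b||^2 over a complex vector w, with no simplicial constraint, and the names `matvec`, `conj_transpose_matvec`, and `wirtinger_descent` are hypothetical. For this cost the Wirtinger derivative with respect to the conjugate of w is A^H (Aw - b), and stepping against it gives the steepest descent direction.

```python
def matvec(a, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def conj_transpose_matvec(a, r):
    """Compute A^H r, the conjugate transpose of A applied to r."""
    rows, cols = len(a), len(a[0])
    return [sum(a[i][j].conjugate() * r[i] for i in range(rows))
            for j in range(cols)]


def wirtinger_descent(a, b, step=0.3, iterations=200):
    """Minimize ||A w - b||^2 over complex w by descending dC/d(conj w)."""
    w = [0j] * len(a[0])
    for _ in range(iterations):
        residual = [p - q for p, q in zip(matvec(a, w), b)]
        grad = conj_transpose_matvec(a, residual)  # Wirtinger gradient
        w = [w_j - step * g_j for w_j, g_j in zip(w, grad)]
    return w


# Small consistent system with a known complex solution (illustrative values).
A = [[1 + 1j, 0.5 + 0j],
     [0.2j, 1 - 0.5j],
     [0.3 + 0j, 0.1 + 0.4j]]
w_true = [0.5 - 1j, -0.25 + 0.75j]
b = matvec(A, w_true)
w_est = wirtinger_descent(A, b)
```

The step size is chosen small enough for this particular matrix to converge; in general it has to be below 2 divided by the largest eigenvalue of A^H A.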
Tsubasa MIYAUCHI Ayato ONO Hiroki YOSHIMURA Masashi NISHIYAMA Yoshio IWAI
We propose a method for embedding the awareness state and response state in an image-based avatar to smoothly and automatically start an interaction with a user. When both states are not embedded, the image-based avatar can become non-responsive or slow to respond. To consider the beginning of an interaction, we observed the behaviors between a user and receptionist in an information center. Our method replayed the behaviors of the receptionist at appropriate times in each state of the image-based avatar. Experimental results demonstrate that, at the beginning of the interaction, our method for embedding the awareness state and response state increased subjective scores more than not embedding the states.
Natsuki TAKAYAMA Hiroki TAKAHASHI
Partial blur segmentation is one of the most interesting topics in computer vision, and it has practical value. The generation of blur maps is a crucial part of partial blur segmentation because partial blur segmentation involves producing a blur map and applying a segmentation algorithm to the blur map. In this study, we address two important issues in order to improve the discrimination of blur maps: (1) estimating a robust local blur feature to consider variations in the intensity amplitude and (2) a scheme for generating blur maps. We propose the ANGHS (Amplitude-Normalized Gradient Histogram Span) as a local blur feature. ANGHS represents the heavy-tailedness of a gradient distribution, where it is calculated from an image gradient normalized using the intensity amplitude. ANGHS is robust to variations in the intensity amplitude, and it can handle local regions in a more appropriate manner than previously proposed local blur features. Blur maps are affected by local blur features but also by the contents and sizes of local regions, and the assignment of blur feature values to pixels. Thus, multiple-sized grids and the EAI (Edge-Aware Interpolation) are employed in each task to improve the discrimination of blur maps. The discrimination of the generated blur maps is evaluated visually and statistically using numerous partial blur images. Comparisons with the results obtained by state-of-the-art methods demonstrate the high discrimination of the blur maps generated using the proposed method.
Yang LI Zhuang MIAO Jiabao WANG Yafei ZHANG Hang LI
The latest deep hashing methods perform hash code learning and image feature learning simultaneously by using pairwise or triplet labels. However, generating all possible pairwise or triplet labels from the training dataset can quickly become intractable, and the majority of those samples may produce small costs, resulting in slow convergence. In this letter, we propose a novel deep discriminative supervised hashing method, called DDSH, which directly learns hash codes based on a new combined loss function. Compared with previous methods, our method can take full advantage of the annotated data in terms of pairwise similarity and image identities. Extensive experiments on standard benchmarks demonstrate that our method preserves the instance-level similarity and outperforms state-of-the-art deep hashing methods in the image retrieval application. Remarkably, our 16-bit binary representation can surpass the performance of existing 48-bit binary representations, which demonstrates that our method can effectively improve the speed and precision of large-scale image retrieval systems.
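Once hash codes are learned, retrieval reduces to comparing short binary codes. The sketch below shows only that retrieval step, with made-up 16-bit codes; it does not model DDSH or its loss function, and the names `hamming_distance` and `rank_by_hamming` are hypothetical. Database items are ranked by the number of bits in which their code differs from the query code.

```python
def hamming_distance(code_a, code_b):
    """Number of differing bits between two integer-encoded hash codes."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted from nearest to farthest code."""
    return sorted(range(len(database_codes)),
                  key=lambda i: hamming_distance(query_code, database_codes[i]))


# Illustrative 16-bit codes; real codes would come from the trained network.
database = [0b1111000011110000, 0b1111000011110001, 0b0000111100001111]
query = 0b1111000011110011
ranking = rank_by_hamming(query, database)
```

Each comparison is one XOR and a bit count, so shorter codes reduce both storage and comparison cost, which is the practical benefit of matching 48-bit accuracy with 16 bits.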
Mingye JU Zhenfei GU Dengyin ZHANG Jian LIU
In this letter, we propose a novel technique to increase the visibility of the hazy image. Benefiting from the atmospheric scattering model and the invariance principle for scene structure, we formulate structure constraint equations that derive from two simulated inputs by performing gamma correction on the input image. Relying on the inherent boundary constraint of the scattering function, the expected scene albedo can be well restored via these constraint equations. Extensive experimental results verify the power of the proposed dehazing technique.
Preserving hue is an important issue in color image processing. To preserve hue, color image processing is often carried out in the HSI or HSV color space, which is obtained by transforming the RGB color space. However, transforming from RGB to another color space and processing in that space usually causes a gamut problem. In this paper, we propose image enhancement methods that preserve hue and keep the R, G, and B channels within their range (gamut). First, we present an intensity processing method that preserves hue and saturation. In this method, arbitrary gray-scale transformation functions can be applied to the intensity component. Next, a saturation processing method that preserves hue and intensity is proposed. Arbitrary gray-scale transformation functions can also be applied to the saturation component. The two processing methods are completely independent, so they are easily combined by applying them in succession. The combined method realizes hue-preserving color image processing with a high degree of freedom and without the gamut problem. Furthermore, a concrete enhancement algorithm based on the proposed processing methods is presented. Numerical results confirm our theoretical results and show that our processing algorithm performs much better than conventional hue-preserving methods.
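Why direct RGB processing can preserve hue is easy to check numerically. The sketch below is a generic illustration and not the proposed method: it scales all three channels by a common factor, which leaves hue and HSV saturation unchanged, and simply caps the factor so that no channel leaves [0, 1]. The function name `scale_intensity_preserving_hue` and the capping rule are assumptions; the paper's own handling of the gamut is not described in the abstract.

```python
import colorsys


def scale_intensity_preserving_hue(rgb, transform):
    """Apply a gray-scale transform to the intensity via uniform RGB scaling.

    rgb: tuple of floats in [0, 1]; transform: function mapping intensity in
    [0, 1] to a target intensity. The scale factor is capped (an assumed,
    simplistic rule) so that the largest channel does not exceed 1.
    """
    r, g, b = rgb
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        return rgb  # black has no hue; nothing to scale
    alpha = transform(intensity) / intensity
    alpha = min(alpha, 1.0 / max(rgb))  # stay inside the RGB gamut
    return (alpha * r, alpha * g, alpha * b)


pixel = (0.2, 0.4, 0.1)
brightened = scale_intensity_preserving_hue(pixel, lambda i: i ** 0.5)
hue_before = colorsys.rgb_to_hsv(*pixel)[0]
hue_after = colorsys.rgb_to_hsv(*brightened)[0]
```

Capping the factor means the requested intensity is not always reached for strongly saturated bright pixels, so this sketch trades intensity accuracy for gamut safety.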
In photoacoustic imaging, laser power variation is one of the major factors in the degradation of the quality of reproduced images. A simple, but efficient method of compensating for the variations in laser pulse energy is proposed here where the characteristics of the adopted optical sensor and acoustic sensor were estimated in order to minimize the average local variation in optically homogeneous regions. Phantom experiments were carried out to validate the effectiveness of the proposed method.
Ryo OYAMA Shouhei KIDERA Tetsuo KIRIMOTO
Microwave imaging techniques, in particular synthetic aperture radar (SAR), are promising tools for terrain surface measurement, irrespective of weather conditions. The coherent change detection (CCD) method is widely applied to detect surface changes by comparing multiple complex SAR images captured from the same scanning orbit. However, for general damage assessment after a natural disaster such as an earthquake or mudslide, additional information about the surface change, such as the change in surface height, is strongly required. Given this background, the current study proposes a novel height change estimation method using a CCD model based on the Pauli decomposition of fully polarimetric SAR images. The notable feature of this method is that, by introducing a frequency band-divided approach, it can provide accurate height change estimates beyond the assumed wavelength, and so is significantly better than InSAR-based approaches. Experiments in an anechoic chamber on a 1/100 scale model of an X-band SAR system show that our proposed method outputs more accurate height change estimates than a similar method that uses single polarimetric data, even when the height change exceeds the assumed wavelength.