Toru SUMI Yuta INAMURA Yusuke KAMEDA Tomokazu ISHIKAWA Ichiro MATSUDA Susumu ITOH
We previously proposed a lossless image coding scheme using example-based probability modeling, wherein the probability density function of image signals was dynamically modeled pel by pel. To appropriately estimate the peak positions of the probability model, several examples, i.e., sets of pels whose neighborhoods are similar to the local texture of the target pel to be encoded, were collected from the already encoded causal area via template matching. This scheme primarily exploits non-local information in image signals. In this study, we introduce a prediction technique into the probability modeling to achieve a better trade-off between local and non-local information in image signals.
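As a minimal illustration of the example-collection step (not the authors' implementation; the template shape, search range, and number of examples below are hypothetical), the causal template matching could be sketched as:

```python
import numpy as np

def collect_examples(img, y, x, t=2, search=12, k=6):
    """Collect k example pel values whose causal neighborhoods are most
    similar to that of the target pel (y, x); their values would serve as
    peak positions of the probability model. Assumes (y, x) lies at least
    t pels away from the image border."""
    def template(cy, cx):
        # causal template: t rows above the pel plus t pels to its left
        rows = img[cy - t:cy, cx - t:cx + t + 1].ravel()
        left = img[cy, cx - t:cx].ravel()
        return np.concatenate([rows, left]).astype(np.float64)

    tgt = template(y, x)
    scored = []
    for cy in range(max(t, y - search), y + 1):
        for cx in range(max(t, x - search), min(img.shape[1] - t, x + search)):
            if cy == y and cx >= x:   # stay inside the already encoded causal area
                break
            d = np.sum((template(cy, cx) - tgt) ** 2)   # SSD similarity
            scored.append((d, img[cy, cx]))
    scored.sort(key=lambda s: s[0])
    return [v for _, v in scored[:k]]
```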
Ryo OYAMA Shouhei KIDERA Tetsuo KIRIMOTO
Microwave imaging techniques, in particular synthetic aperture radar (SAR), are promising tools for terrain surface measurement, irrespective of weather conditions. The coherent change detection (CCD) method is widely applied to detect surface changes by comparing multiple complex SAR images captured from the same scanning orbit. However, for a general damage assessment after a natural disaster such as an earthquake or mudslide, additional information about surface changes, such as the change in surface height, is strongly required. Against this background, the current study proposes a novel height change estimation method using a CCD model based on the Pauli decomposition of fully polarimetric SAR images. The notable feature of this method is that it offers accurate height change estimates beyond the assumed wavelength by introducing a frequency band-divided approach, and it is therefore significantly better suited to this task than InSAR-based approaches. Experiments in an anechoic chamber on a 1/100 scale model of an X-band SAR system show that our proposed method outputs more accurate height change estimates than a similar method that uses single-polarimetric data, even when the height change exceeds the assumed wavelength.
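For reference, the Pauli decomposition referred to above is the standard change of basis on the polarimetric scattering matrix (stated here under the usual reciprocity assumption $S_{\mathrm{HV}} = S_{\mathrm{VH}}$), whose components are commonly associated with surface, double-bounce, and volume scattering:

$$\mathbf{k}_{\mathrm{Pauli}} = \frac{1}{\sqrt{2}} \begin{bmatrix} S_{\mathrm{HH}} + S_{\mathrm{VV}} \\ S_{\mathrm{HH}} - S_{\mathrm{VV}} \\ 2 S_{\mathrm{HV}} \end{bmatrix}$$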
Takuro YAMAGUCHI Aiko SUZUKI Masaaki IKEHARA
Mixed noise removal is a major problem in image processing. Different types of noise have different properties, and each requires an appropriate removal method; removing mixed noise therefore calls for a combination of removal algorithms, one for each type of noise it contains. We aim at the removal of mixed noise composed of Additive White Gaussian Noise (AWGN) and Random-Valued Impulse Noise (RVIN). Many conventional methods cannot remove this mixed noise effectively and may lose image details. In this paper, we propose a new mixed noise removal method utilizing the Direction Weighted Median filter (DWM filter) and the Block Matching and 3D filtering method (BM3D). Although the combination of the DWM filter for RVIN and BM3D for AWGN removes almost all of the mixed noise, it still loses some image details. We trace the cause to the misdetection of image details as RVIN and solve the problem by re-detection using the difference between the input noisy image and the output of the combination. The re-detection process removes only salient noise that BM3D cannot remove and therefore preserves image details. These processes lead to high-performance removal of the mixed noise while preserving image details. Experimental results show that our method obtains denoised images with clearer edges and textures than conventional methods.
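As a minimal sketch of the re-detection idea only (the DWM filter and BM3D are assumed to be available elsewhere and are not reimplemented; the threshold below is a hypothetical tuning value), the residual-based re-detection might look like:

```python
import numpy as np

def redetect_impulses(noisy, first_pass, thresh=40.0):
    """Re-detection via the difference between the input noisy image and the
    first-pass (DWM filter + BM3D) output: only pels with a large residual
    are flagged as salient impulse noise for a second removal pass, while
    pels with smaller differences are treated as image detail that the
    first pass wrongly altered. `thresh` is a hypothetical 8-bit value."""
    residual = np.abs(noisy.astype(float) - first_pass.astype(float))
    return residual > thresh   # True = re-detected impulse pel
```

The flagged pels would then be handled by the impulse-removal stage again, while the remaining pels keep their detail from the input.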
Yuma KINOSHITA Sayaka SHIOTA Hitoshi KIYA
This paper proposes a new inverse tone mapping operator (TMO) with estimated parameters. The proposed inverse TMO is based on Reinhard's global operator, a well-known TMO. Inverse TM operations have two applications: generating an HDR image from an existing LDR one, and reconstructing the original HDR image from a mapped LDR image. The proposed operator can be applied to both. For the latter application, two parameters used in Reinhard's TMO, i.e., the key value α, which controls the brightness of the mapped LDR image, and the geometric mean $\overline{L}_w$ of the original HDR image, are generally required to carry out the Reinhard-based inverse TMO. In this paper, we show that it is possible to estimate $\overline{L}_w$ from α under some conditions, while α can also be estimated from $\overline{L}_w$, so that a new inverse TMO with estimated parameters is proposed. Experimental results show that the proposed method outperforms conventional ones in both applications, in terms of high structural similarity and low computational cost.
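For reference, the simple form of Reinhard's global operator and its exact inverse can be sketched as below (given both parameters; the paper's contribution, estimating one parameter from the other, is not reproduced here):

```python
import numpy as np

def reinhard_tmo(Lw, alpha, eps=1e-6):
    """Reinhard's global operator in its simple form (without the
    white-point term): scale by alpha over the geometric mean, then
    compress with L / (1 + L)."""
    Lw_bar = np.exp(np.mean(np.log(Lw + eps)))   # geometric mean of HDR luminance
    L = alpha / Lw_bar * Lw
    return L / (1.0 + L), Lw_bar

def inverse_reinhard(Ld, alpha, Lw_bar):
    """Exact inverse of the simple operator, which needs both alpha and
    the geometric mean Lw_bar of the original HDR image."""
    Ld = np.clip(Ld, 0.0, 1.0 - 1e-6)            # avoid division by zero at Ld = 1
    return Lw_bar / alpha * Ld / (1.0 - Ld)
```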
Image steganalysis determines whether an image contains secret messages. In practice, the number of cover images is far greater than that of stego images, so it is very important to solve the detection problem on imbalanced image sets. Currently, SMOTE, Borderline-SMOTE, and ADASYN are three important synthetic oversampling algorithms used to address the imbalance problem. In these methods, new sample points are synthesized from the minority class samples alone; such research is seldom seen in image steganalysis, however. In this paper, based on the distribution of image features in steganalysis, we find that the features of majority class samples are similar to those of minority class samples, so both the majority and minority class samples are used to synthesize the new sample points. In experiments, compared with SMOTE, Borderline-SMOTE, and ADASYN, this approach improves detection accuracy using the FLD ensemble classifier.
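As a minimal sketch of the synthesis step (hypothetical function and parameter names; classic SMOTE draws partners from the minority class, and the idea above can be read as also admitting nearby majority-class samples as partners):

```python
import numpy as np

def synthesize(minority, partners, n_new, seed=0):
    """SMOTE-style synthesis: each new point is a random convex combination
    of a minority-class sample and a partner sample. Pass minority samples
    as `partners` for classic SMOTE, or a mixture of nearby majority- and
    minority-class samples for the variant described above."""
    rng = np.random.default_rng(seed)
    new = np.empty((n_new, minority.shape[1]))
    for i in range(n_new):
        x = minority[rng.integers(len(minority))]
        p = partners[rng.integers(len(partners))]
        lam = rng.random()                    # interpolation weight in [0, 1]
        new[i] = x + lam * (p - x)
    return new
```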
In photoacoustic imaging, laser power variation is one of the major factors degrading the quality of reproduced images. A simple but efficient method of compensating for variations in laser pulse energy is proposed here, in which the characteristics of the adopted optical sensor and acoustic sensor are estimated so as to minimize the average local variation in optically homogeneous regions. Phantom experiments were carried out to validate the effectiveness of the proposed method.
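A crude stand-in for the compensation idea (not the paper's sensor-characteristic estimation) is a per-shot gain that equalizes the mean signal over a user-chosen optically homogeneous region, thereby reducing the local variation across shots:

```python
import numpy as np

def compensate_pulse_energy(frames, region):
    """Scale each photoacoustic frame so the mean signal over a reference
    region (a boolean mask over an optically homogeneous area) is constant
    across laser shots; a simplistic sketch of pulse-energy compensation."""
    means = np.array([frame[region].mean() for frame in frames])
    gains = means.mean() / (means + 1e-12)     # per-shot multiplicative gain
    return [g * frame for g, frame in zip(gains, frames)]
```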
The problem of reproducing high dynamic range (HDR) images on devices with a restricted dynamic range has gained a lot of interest in the computer graphics community. Various approaches to this issue exist, spanning several research areas, including computer graphics, image processing, color vision, and physiology. However, most of them suffer from several serious, well-known color distortion problems. Accordingly, this article presents a tone-mapping method comprising a tone-mapping operator and a chromatic adaptation transform. The tone mapping combines linear and non-linear mapping using a visual gamma based on the contrast sensitivity function (CSF) and the key value of the scene; the visual gamma is adopted to control the dynamic range automatically, without free parameters, as well as to avoid both luminance shift and hue shift in the displayed images. Furthermore, the key value is used to represent whether the scene is subjectively light, normal, or dark. The resulting image is then processed through a chromatic adaptation transform whose emphasis lies in human visual perception (HVP). Experimental results show that the proposed method outperforms conventional methods in subjective and quantitative quality and in color reproduction.
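As one hedged reading of the key value computation (the normalization follows the common log-luminance form; the class boundaries below are hypothetical):

```python
import numpy as np

def scene_key(lum, eps=1e-6):
    """Key value of a scene from normalized log-average luminance:
    near 0 for subjectively dark scenes, near 1 for light ones.
    The dark/normal/light boundaries are illustrative only."""
    log_l = np.log(lum + eps)
    key = (log_l.mean() - log_l.min()) / (log_l.max() - log_l.min() + eps)
    label = 'dark' if key < 0.4 else ('light' if key > 0.6 else 'normal')
    return key, label
```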
Takashi SHIBATA Kazunori SATO Ryohei IKEJIRI
We conducted experimental classes in an elementary school to examine how the advantages of using stereoscopic 3D images could be applied in education. More specifically, we selected a unit on the Tumulus period in Japan for sixth-graders as the source of our 3D educational materials. This unit represents part of the coursework on the topic of Japanese history. The educational materials used in our study included stereoscopic 3D images for examining the stone chambers and Haniwa (i.e., terracotta clay figures) of the Tumulus period. The results of our experimental class showed that the 3D educational materials helped students focus on specific parts of images, such as objects attached to the Haniwa, and also understand 3D spaces and concavo-convex shapes. The experimental class revealed that the 3D educational materials also helped students come up with novel questions regarding the attached objects of the Haniwa and the Haniwa's spatial balance and alignment. The results suggest that the educational use of stereoscopic 3D images is worthwhile in that it leads to question and hypothesis generation and an inquiry-based approach to learning history.
Establishing local visual correspondences between images taken under different conditions is an important and challenging task in computer vision. A common solution is to detect keypoints in the images and then match the keypoints using a feature descriptor. This paper proposes a robust, low-dimensional local feature descriptor named the Adaptively Integrated Gradient and Intensity Feature (AIGIF). The proposed AIGIF descriptor partitions the support region surrounding each keypoint into sub-regions and classifies the sub-regions into two categories: edge-dominated and smoothness-dominated. For edge-dominated sub-regions, gradient magnitude and orientation features are extracted; for smoothness-dominated sub-regions, an intensity feature is extracted. The gradient and intensity features are integrated to generate the descriptor. Experiments on image matching were conducted to evaluate the performance of the proposed AIGIF. Compared with SIFT, AIGIF achieves a 75% reduction in feature dimension (from 128 bytes to 32 bytes); compared with SURF, it achieves an 87.5% reduction (from 256 bytes to 32 bytes); and compared with the state-of-the-art ORB descriptor, which has the same feature dimension as AIGIF, it achieves higher accuracy and robustness. In summary, AIGIF combines the advantages of gradient and intensity features, achieving relatively high accuracy and robustness with a low feature dimension.
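The following sketch conveys the sub-region classification idea only; the grid size, threshold, histogram bins, and intensity encoding are hypothetical and do not reproduce the published AIGIF layout:

```python
import numpy as np

def aigif_like(patch, grid=4, edge_thresh=20.0):
    """Split the support region into grid x grid sub-regions; describe
    edge-dominated sub-regions (mean gradient magnitude above a threshold)
    by a gradient-orientation histogram, and smoothness-dominated ones by
    their mean intensity, then concatenate everything into one vector."""
    gy, gx = np.gradient(patch.astype(float))
    mag, ori = np.hypot(gy, gx), np.arctan2(gy, gx)
    h, w = patch.shape[0] // grid, patch.shape[1] // grid
    feat = []
    for i in range(grid):
        for j in range(grid):
            sl = (slice(i * h, (i + 1) * h), slice(j * w, (j + 1) * w))
            if mag[sl].mean() > edge_thresh:             # edge-dominated
                hist, _ = np.histogram(ori[sl], bins=4,
                                       range=(-np.pi, np.pi), weights=mag[sl])
                feat.append(hist / (hist.sum() + 1e-9))
            else:                                        # smoothness-dominated
                feat.append(np.full(4, patch[sl].mean() / 255.0))
    return np.concatenate(feat)
```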
Zhixian MA Jie ZHU Weitian LI Haiguang XU
Detection of cavities in X-ray astronomical images has become a field of interest owing to the flourishing studies on black holes and active galactic nuclei (AGN). In this paper, an approach is proposed to detect cavities in X-ray astronomical images using our newly designed Granular Convolutional Neural Network (GCNN) based classifiers. The raw data are first preprocessed to obtain images of the observed objects, i.e., galaxies or galaxy clusters. In each image, pixels are classified into three categories: (1) the faint background (BKG), (2) the cavity regions (CAV), and (3) the bright central gas regions (CNT). Sample sets are then generated by dividing large images into subimages with a window size chosen according to the scale of the cavities. Since the number of BKG samples is far larger than that of the other types, samples from the majority class are split into subsets, i.e., granules, to achieve balanced training sets. A group of three-convolutional-layer granular CNNs without subsampling layers is then designed as the classifiers and trained with the labeled granular sample sets. Finally, the trained GCNN classifiers are applied to new observations so as to estimate the cavity regions with a voting strategy and locate them with elliptical profiles on the raw observation images. Experiments and applications of our approach are demonstrated on 40 X-ray astronomical observations retrieved from the Chandra Data Archive (CDA). Comparisons among our approach, β-model fitting, and the Unsharp Masking (UM) method were also performed, showing that our approach is more accurate and robust.
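As a minimal sketch of the granule construction that balances the training sets (hypothetical function and variable names; the three-convolutional-layer CNNs themselves are not shown):

```python
import numpy as np

def make_granules(bkg_samples, cav_samples, cnt_samples, seed=0):
    """Split the oversized BKG class into granules whose size matches the
    smaller CAV/CNT classes; each granule, combined with all CAV and CNT
    samples, gives one balanced training set for one CNN of the group,
    and the trained group classifies new subimages by voting."""
    rng = np.random.default_rng(seed)
    size = min(len(cav_samples), len(cnt_samples))
    idx = rng.permutation(len(bkg_samples))
    n_granules = len(bkg_samples) // size
    return [bkg_samples[idx[g * size:(g + 1) * size]] for g in range(n_granules)]
```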
Kengo TSUDA Takanori FUJISAWA Masaaki IKEHARA
In this paper, we introduce a new method to remove random-valued impulse noise in an image. Random-valued impulse noise replaces the pixel value at a random position with a random value. Owing to the randomness of the noisy pixel values, it is difficult to detect them by comparison with neighboring pixels, the strategy used in many conventional methods. We therefore improve a recent noise detector that uses a non-local search for similar structures. We then propose a new noise removal algorithm based on sparse representation over a DCT basis; the sparse representation removes impulse noise by exploiting neighboring similar image patches. This method achieves much better noise removal performance than conventional methods. We confirm the effectiveness of the proposed method quantitatively and qualitatively.
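A minimal sketch of the sparse-representation step, assuming the impulse pels have already been detected (the patch size, sparsity level, and use of a plain orthonormal 2-D DCT basis with a greedy pursuit are all simplifying assumptions):

```python
import numpy as np
from scipy.fft import dct

def inpaint_patch(patch, mask, n_atoms=8):
    """Restore the impulse-flagged pels (mask == True) of a square patch by
    sparse coding over the 2-D DCT basis using only the clean pels, via a
    simple orthogonal matching pursuit."""
    n = patch.shape[0]
    D1 = dct(np.eye(n), norm='ortho', axis=0)
    D = np.kron(D1, D1)                        # separable 2-D DCT, pels x atoms
    keep = ~mask.ravel()
    A, y = D[keep], patch.ravel().astype(float)[keep]
    r, support = y.copy(), []
    for _ in range(n_atoms):
        support.append(int(np.argmax(np.abs(A.T @ r))))   # best-matching atom
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ coef                      # update residual
    out = patch.ravel().astype(float)
    out[mask.ravel()] = (D[:, support] @ coef)[mask.ravel()]
    return out.reshape(patch.shape)
```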
Yuichi YOSHIDA Tsuyoshi TOYOFUKU
Descriptor aggregation techniques such as the Fisher vector and the vector of locally aggregated descriptors (VLAD) are used in most image retrieval frameworks. Extracting local descriptors takes some time, and the geometric verification requires considerable storage if a real-valued descriptor such as SIFT is used. Moreover, if we apply binary descriptors to such a framework, the retrieval performance is worse than with a real-valued descriptor. Our approach tackles these issues by using a dual representation descriptor that has the advantages of both a real-valued and a binary descriptor. The real-valued part of the dual representation descriptor is aggregated into a VLAD to achieve high accuracy in image retrieval, and the binary part is used to find correspondences in the geometric verification stage to reduce the amount of storage needed. We implemented a dual representation descriptor that can be extracted in semi-real time by building on the CARD descriptor. We evaluated the accuracy of our image retrieval framework, including the geometric verification, on three datasets (Holidays, UKBench, and Stanford mobile visual search). The results indicate that our framework is as accurate as one that uses SIFT. In addition, the experiments show that the image retrieval speed and storage requirements of our framework are as efficient as those of a framework that uses ORB.
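For reference, the VLAD aggregation used in such frameworks is standard; a compact version (with the usual power and L2 normalization) is:

```python
import numpy as np

def vlad(descriptors, centroids):
    """Aggregate (N, d) local descriptors against a (k, d) visual vocabulary:
    sum the residuals to the nearest centroid, then apply power and L2
    normalization, yielding a k*d-dimensional global descriptor."""
    k, d = centroids.shape
    v = np.zeros((k, d))
    dists = ((descriptors[:, None, :] - centroids[None]) ** 2).sum(-1)
    for i, c in enumerate(np.argmin(dists, axis=1)):
        v[c] += descriptors[i] - centroids[c]
    v = np.sign(v) * np.sqrt(np.abs(v))        # power normalization
    return (v / (np.linalg.norm(v) + 1e-12)).ravel()
```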
Mohammad Nehal HASNINE Masatoshi ISHIKAWA Yuki HIRAI Haruko MIYAKODA Keiichi KANEKO
Vocabulary acquisition based on the traditional pen-and-paper approach is outdated and has been superseded by the multimedia-supported approach. In a multimedia-supported foreign language learning environment, a learning material comprising a still image, a text, and the corresponding sound data is considered the most effective way to memorize a noun. However, extracting an appropriate still image for a noun has always been a challenging and time-consuming process for learners. Learners' burden would be reduced if a system could extract an appropriate image to represent a noun. Therefore, the present study aimed to extract an appropriate image for each noun in order to assist foreign language learners in acquiring foreign vocabulary. This study presumed that a learning material created with the help of an appropriate image would be more effective for memory recall than one created with an inappropriate image. As the first step toward finding appropriate images for nouns, concrete nouns were considered as the subject of investigation. This study therefore first proposed a definition of an appropriate image for a concrete noun. An image re-ranking algorithm was then designed and implemented that extracts an appropriate image from a finite set of corresponding images for each concrete noun. Finally, the immediate, short-term, and long-term learning effects of those images on learners' memory retention rates were examined by conducting immediate, delayed, and extended delayed posttests. The experimental results revealed that participants in the experimental group significantly outperformed the control group in long-term memory retention, while no significant differences were observed immediately after learning or in short-term memory retention. This result indicates that our algorithm can extract images that have a higher learning effect. Furthermore, this paper briefly discusses an on-demand learning system that has been developed to assist foreign language learners in creating vocabulary learning materials.
The Retinex theory assumes that large intensity changes correspond to reflectance edges, while smoothly varying regions are due to shading. Some algorithms based on this theory adopt simple thresholding schemes and achieve adequate results for reflectance estimation. In this paper, we present a practical reflectance estimation technique for hyperspectral images. Our method is realized simply by thresholding the singular values of a matrix calculated from scaled pixel values. In the method, we estimate the reflectance image by measuring the spectral similarity between two adjacent pixels. We demonstrate that our thresholding scheme effectively estimates the reflectance and outperforms Retinex-based thresholding. In particular, our method can precisely distinguish edges caused by reflectance changes from those caused by shadows.
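The singular-value test can be sketched as follows (the threshold is hypothetical; the paper's exact matrix construction and scaling may differ):

```python
import numpy as np

def is_reflectance_edge(s1, s2, thresh=0.05):
    """Spectral similarity of two adjacent hyperspectral pixels via SVD:
    if the two (normalized) spectra are nearly scalar multiples, the
    stacked 2 x B matrix is nearly rank one and its second singular value
    is near zero (a shading change); a larger second singular value
    signals a reflectance change."""
    M = np.stack([s1 / (np.linalg.norm(s1) + 1e-12),
                  s2 / (np.linalg.norm(s2) + 1e-12)])
    sv = np.linalg.svd(M, compute_uv=False)
    return sv[1] > thresh
```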
Shu KONDO Yuto KOBAYASHI Keita TAKAHASHI Toshiaki FUJII
A layered light-field display based on light-field factorization is considered. In the original work, the factorization is formulated under the assumption that the light field is captured with orthographic cameras. In this paper, we introduce a generalized framework for light-field factorization that can handle both the orthographic and perspective camera projection models. With our framework, a light field captured with perspective cameras can be displayed accurately.
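The computational core of such layer decompositions is a nonnegative factorization of the (suitably rearranged) light field; a generic multiplicative-update sketch is given below, with the understanding that the paper's projection-model-aware formulation is not reproduced:

```python
import numpy as np

def nmf(L, rank, iters=200, seed=0):
    """Lee-Seung multiplicative updates for L ~ A @ B with A, B >= 0,
    where L is the light field rearranged as a nonnegative matrix and
    the factors correspond to layer transmittances."""
    rng = np.random.default_rng(seed)
    A = rng.random((L.shape[0], rank)) + 0.1
    B = rng.random((rank, L.shape[1])) + 0.1
    for _ in range(iters):
        B *= (A.T @ L) / (A.T @ A @ B + 1e-12)
        A *= (L @ B.T) / (A @ B @ B.T + 1e-12)
    return A, B
```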
Bin YAO Lifeng HE Shiying KANG Xiao ZHAO Yuyan CHAO
The Euler number of a binary image is an important topological property for pattern recognition, image analysis, and computer vision. A famous method for computing the Euler number of a binary image counts certain patterns of bit-quads in the image; it has been improved by scanning three rows at once to process two bit-quads simultaneously. This paper studies the bit-quad-based Euler number computing problem. We show that, for a bit-quad-based Euler number computing algorithm, as the number of bit-quads processed simultaneously increases, the average number of pixels to be checked for processing a bit-quad decreases in theory, but the length of the code implementing the algorithm increases, which makes the algorithm less efficient in practice. Experimental results on various types of images demonstrate that scanning five rows at once and processing four bit-quads simultaneously is the optimal trade-off, and that the resulting bit-quad-based Euler number computing algorithm is more efficient than other Euler number computing algorithms.
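For concreteness, the classic bit-quad method (Gray's formula for 4-connectivity) that the studied algorithms accelerate can be written as:

```python
import numpy as np

def euler_number_4(img):
    """Euler number of a binary image by bit-quad counting:
    E = (Q1 - Q3 + 2 * QD) / 4, where Q1/Q3 count 2x2 quads with exactly
    one/three foreground pels and QD counts the two diagonal patterns."""
    b = np.pad(img.astype(np.uint8), 1)                       # count border quads too
    q = b[:-1, :-1] + b[:-1, 1:] + b[1:, :-1] + b[1:, 1:]     # foreground pels per quad
    q1 = np.count_nonzero(q == 1)
    q3 = np.count_nonzero(q == 3)
    qd = np.count_nonzero((b[:-1, :-1] == b[1:, 1:]) &
                          (b[:-1, 1:] == b[1:, :-1]) &
                          (b[:-1, :-1] != b[:-1, 1:]))        # diagonal quads
    return (q1 - q3 + 2 * qd) // 4
```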
Takahiro SUZUKI Keita TAKAHASHI Toshiaki FUJII
Structure tensor analysis on epipolar plane images (EPIs) is a successful approach for estimating disparity from a light field, i.e., a dense set of multi-view images. However, the disparity range allowable for the light field is limited because the estimation becomes less accurate as the range of disparities becomes larger. To overcome this limitation, we developed a new method called sheared EPI analysis, in which EPIs are sheared before the structure tensor analysis. The results of the analyses obtained with different shear values are integrated into a final disparity map through a smoothing process, which is the key idea of our method. In this paper, we closely investigate the performance of sheared EPI analysis and demonstrate the effectiveness of the smoothing process by extensively evaluating the proposed method on 15 datasets with large disparity ranges.
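A minimal sketch of the shearing step and the standard structure tensor computation follows (the smoothing-based integration across shear values, which is the paper's key contribution, is omitted; axis and sign conventions are assumptions):

```python
import numpy as np
from scipy.ndimage import shift, gaussian_filter

def shear_epi(epi, d):
    """Shear an EPI (views x spatial positions) so that lines of slope d,
    i.e. disparity d, become vertical: row v is shifted by -d * (v - vc)."""
    vc = (epi.shape[0] - 1) / 2.0
    return np.stack([shift(row, -d * (v - vc), order=1, mode='nearest')
                     for v, row in enumerate(epi.astype(float))])

def structure_tensor_orientation(epi, sigma=1.5):
    """Per-pel orientation of the dominant gradient of an EPI from the
    smoothed structure tensor; the local EPI line slope (and hence the
    residual disparity, added back to the shear value) follows from this
    angle up to the layout convention."""
    gv, gx = np.gradient(epi.astype(float))    # view-axis and spatial gradients
    Jxx = gaussian_filter(gx * gx, sigma)
    Jvv = gaussian_filter(gv * gv, sigma)
    Jxv = gaussian_filter(gx * gv, sigma)
    return 0.5 * np.arctan2(2.0 * Jxv, Jxx - Jvv)
```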
Saori TAKEYAMA Shunsuke ONO Itsuo KUMAZAWA
Existing image deblurring methods that use a blurred/noisy image pair take a two-step approach: blur kernel estimation and image restoration. They can achieve better and much more stable blur kernel estimation than single-image deblurring methods. In the image restoration step, however, they either do not exploit the information in the noisy image or require ad hoc tuning of interdependent parameters. This paper focuses on the image restoration step and proposes a new restoration method that uses a blurred/noisy image pair. In our method, the image restoration problem is formulated as a constrained convex optimization problem, where the data fidelity to the blurred image and that to the noisy image are properly taken into account as multiple hard constraints. This offers (i) high-quality restoration when the blurred image also contains noise; (ii) robustness to estimation error in the blur kernel; and (iii) easy parameter setting. We also provide an efficient algorithm for solving our optimization problem based on the alternating direction method of multipliers (ADMM). Experimental results support our claims.
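One plausible instantiation of such a formulation, consistent with the abstract though not necessarily the paper's exact choice of regularizer, is total-variation minimization under two fidelity balls:

$$\min_{\mathbf{u}}\ \|\mathbf{u}\|_{\mathrm{TV}}\quad\text{s.t.}\quad\|\mathbf{K}\mathbf{u}-\mathbf{b}\|_2\le\varepsilon_b,\qquad\|\mathbf{u}-\mathbf{n}\|_2\le\varepsilon_n,$$

where $\mathbf{K}$ is the estimated blur kernel acting as a convolution operator, $\mathbf{b}$ and $\mathbf{n}$ are the blurred and noisy observations, and the radii $\varepsilon_b$, $\varepsilon_n$ are set directly from the respective noise levels, which is what makes the parameter setting easy; ADMM then splits the problem into proximal steps, one per constraint.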
Somchai PHATTHANACHUANCHOM Rawesak TANAWONGSUWAN
Color transfer is a simple process that changes the color tone of one image (the source) to look like that of another image (the target). In transferring colors between images, several issues need to be considered, including partial color transfer, trial-and-error, and multiple-target color transfer. Our approach enables users to transfer colors partially and locally by letting them select regions of interest obtained from image segmentation. Since there are many ways to transfer colors from a set of target regions to a set of source regions, we introduce a region exploration and navigation approach in which users choose their preferred color tones, transfer them one region at a time, and gradually work towards their desired results. The preferred color tones can sometimes come from more than one image; our method is therefore extended to allow users to select preferred color tones from multiple images. Our experimental results show the flexibility of our approach in generating reasonable segmented regions of interest and in enabling users to explore the possible results more conveniently.
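The per-region transfer underlying such a tool is typically Reinhard-style statistics matching; a minimal sketch (assuming pixels already converted to a decorrelated color space such as Lab) is:

```python
import numpy as np

def transfer_region(src_pixels, tgt_pixels):
    """Match per-channel mean and standard deviation of one source region
    to one target region; both inputs are (N, 3) arrays of pixels in a
    decorrelated color space."""
    s_mu, s_sd = src_pixels.mean(0), src_pixels.std(0) + 1e-6
    t_mu, t_sd = tgt_pixels.mean(0), tgt_pixels.std(0) + 1e-6
    return (src_pixels - s_mu) / s_sd * t_sd + t_mu
```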
Miki HASEYAMA Takahiro OGAWA Sho TAKAHASHI Shuhei NOMURA Masatsugu SHIMOMURA
Biomimetics is a new research field that creates innovation through the collaboration of different existing research fields. However, such collaboration, i.e., the exchange of deep knowledge between different research fields, is difficult for several reasons, such as differences in the technical terms used in different fields. To overcome this problem, we have developed a new retrieval platform, the “Biomimetics image retrieval platform,” using a visualization-based image retrieval technique. A biological database contains a large volume of image data, and by taking advantage of these image data, we are able to overcome the limitations of text-only information retrieval. By realizing a retrieval platform that does not depend on technical terms, individual biological databases of various species can be integrated. This will allow not only the use of the data for the study of various species by researchers in different biological fields but also access by a wide range of researchers in fields ranging from materials science and mechanical engineering to manufacturing. Our platform thus provides a new path bridging different fields and will contribute to the development of biomimetics, since it overcomes the limitations of traditional retrieval platforms.
Biomimetics is a new research field that creates innovation through the collaboration of different existing research fields. However, the collaboration, i.e., the exchange of deep knowledge between different research fields, is difficult for several reasons such as differences in technical terms used in different fields. In order to overcome this problem, we have developed a new retrieval platform, “Biomimetics image retrieval platform,” using a visualization-based image retrieval technique. A biological database contains a large volume of image data, and by taking advantage of these image data, we are able to overcome limitations of text-only information retrieval. By realizing such a retrieval platform that does not depend on technical terms, individual biological databases of various species can be integrated. This will allow not only the use of data for the study of various species by researchers in different biological fields but also access for a wide range of researchers in fields ranging from materials science, mechanical engineering and manufacturing. Therefore, our platform provides a new path bridging different fields and will contribute to the development of biomimetics since it can overcome the limitation of the traditional retrieval platform.