Keyword Search Result

[Keyword] image(1441hit)

81-100hit(1441hit)

  • Experimental Study of Fault Injection Attack on Image Sensor Interface for Triggering Backdoored DNN Models Open Access

    Tatsuya OYAMA  Shunsuke OKURA  Kota YOSHIDA  Takeshi FUJINO  

     
    PAPER

      Publicized:
    2021/10/26
      Vol:
    E105-A No:3
      Page(s):
    336-343

    A backdoor attack is a type of attack method inducing deep neural network (DNN) misclassification. An adversary mixes poison data, which consist of images tampered with adversarial marks at specific locations and of adversarial target classes, into a training dataset. The backdoor model classifies only images with adversarial marks into an adversarial target class and other images into the correct classes. However, the attack performance degrades sharply when the location of the adversarial marks is slightly shifted. An adversarial mark that induces the misclassification of a DNN is usually applied when a picture is taken, so the backdoor attack will have difficulty succeeding in the physical world because the adversarial mark position fluctuates. This paper proposes a new approach in which an adversarial mark is applied using fault injection on the mobile industry processor interface (MIPI) between an image sensor and the image recognition processor. Two independent attack drivers are electrically connected to the MIPI data lane in our attack system. While almost all image signals are transferred from the sensor to the processor without tampering by canceling the attack signal between the two drivers, the adversarial mark is injected into a given location of the image signal by activating the attack signal generated by the two attack drivers. In an experiment, the DNN was implemented on a Raspberry Pi 4 to classify MNIST handwritten images transferred from the image sensor over the MIPI. The adversarial mark successfully appeared in a specific small part of the MNIST images using our attack system. The success rate of the backdoor attack using this adversarial mark was 91%, which is much higher than the 18% rate achieved using conventional input image tampering.

  • Adaptive Binarization for Vehicle State Images Based on Contrast Preserving Decolorization and Major Cluster Estimation

    Ye TIAN  Mei HAN  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2021/12/07
      Vol:
    E105-D No:3
      Page(s):
    679-688

    A new adaptive binarization method is proposed for the vehicle state images obtained from the intelligent operation and maintenance system of rail transit. The method can check the corresponding vehicle status information in the intelligent operation and maintenance system of rail transit more quickly and effectively, track and monitor the vehicle operation status in real time, and improve the emergency response ability of the system. The proposed method has two main advantages. For decolorization, we use contrast preserving decolorization[1] to obtain the appropriate ratio of R, G, and B for the grayscale conversion of the RGB image, which retains as much of the color information of the vehicle state image background as possible and maintains the contrast between the foreground and the background. For threshold selection, the mean and standard deviation of the gray values corresponding to the multi-color background of the vehicle state images are obtained by major cluster estimation[2], and the adaptive threshold for binarization is determined by the 2 sigma principle, which can effectively extract text, identifiers, and other target information. The experimental results show that, for vehicle state images with rich background color information, this method is better than traditional binarization methods, such as the threshold-based global Otsu algorithm[3] and local Sauvola algorithm[4],[5], and the statistical-learning-based Mean-Shift[6], K-Means[7], and Fuzzy C-Means[8] algorithms. As an image preprocessing scheme for intelligent rail transit data verification, the method can effectively improve the accuracy of text and identifier recognition, as verified by optical character recognition on a data set containing images of different vehicle statuses.
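
    The threshold-selection step described above can be illustrated with a minimal sketch. It assumes the gray values of the major (background) cluster are already available as a list; the major cluster estimation itself is not reproduced. The function names `two_sigma_bounds` and `binarize` are hypothetical and not taken from the paper, and treating pixels within two standard deviations of the background mean as background is one plausible reading of the 2 sigma principle.

```python
from statistics import mean, pstdev

def two_sigma_bounds(background_grays):
    """Return (low, high) = mean -/+ 2 * std of the background gray values."""
    mu = mean(background_grays)
    sigma = pstdev(background_grays)
    return mu - 2 * sigma, mu + 2 * sigma

def binarize(gray_image, background_grays):
    """Mark pixels outside the 2-sigma background band as foreground (1)."""
    low, high = two_sigma_bounds(background_grays)
    return [[0 if low <= p <= high else 1 for p in row] for row in gray_image]

# Assumed toy data: background samples cluster around gray level 200.
background = [200, 202, 198, 201, 199]
image = [[200, 50], [199, 203]]
mask = binarize(image, background)
```

    In this toy case the band is roughly 197.2 to 202.8, so the pixels with values 50 and 203 fall outside it and are marked as foreground.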

  • Hierarchical Gaussian Markov Random Field for Image Denoising

    Yuki MONMA  Kan ARO  Muneki YASUDA  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2021/12/16
      Vol:
    E105-D No:3
      Page(s):
    689-699

    In this study, Bayesian image denoising, in which the prior distribution is assumed to be a Gaussian Markov random field (GMRF), is considered. Recently, an effective algorithm for Bayesian image denoising with a standard GMRF prior has been proposed, which can help implement the overall procedure and optimize its parameters in O(n)-time, where n is the size of the image. A new GMRF-type prior, referred to as a hierarchical GMRF (HGMRF) prior, is proposed, which is obtained by applying a hierarchical Bayesian approach to the standard GMRF prior; in addition, an effective denoising algorithm based on the HGMRF prior is proposed. The proposed HGMRF method can help implement the overall procedure and optimize its parameters in O(n)-time, as well as the previous GMRF method. The restoration quality of the proposed method is found to be significantly higher than that of the previous GMRF method as well as that of a non-local means filter in several cases. Furthermore, numerical evidence implies that the proposed HGMRF prior is more suitable for the image prior than the standard GMRF prior.

  • Latent Space Virtual Adversarial Training for Supervised and Semi-Supervised Learning

    Genki OSADA  Budrul AHSAN  Revoti PRASAD BORA  Takashi NISHIDE  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2021/12/09
      Vol:
    E105-D No:3
      Page(s):
    667-678

    Virtual Adversarial Training (VAT) has shown impressive results among recently developed regularization methods called consistency regularization. VAT utilizes adversarial samples, generated by injecting perturbation in the input space, for training and thereby enhances the generalization ability of a classifier. However, such adversarial samples can be generated only within a very small area around the input data point, which limits the adversarial effectiveness of such samples. To address this problem, we propose LVAT (Latent space VAT), which injects perturbation in the latent space instead of the input space. LVAT can generate adversarial samples flexibly, resulting in more adverse effect and thus more effective regularization. The latent space is built by a generative model, and in this paper we examine two different types of models: variational auto-encoder and normalizing flow, specifically Glow. We evaluated the performance of our method in both supervised and semi-supervised learning scenarios for an image classification task using SVHN and CIFAR-10 datasets. In our evaluation, we found that our method outperforms VAT and other state-of-the-art methods.

  • Recursive Multi-Scale Channel-Spatial Attention for Fine-Grained Image Classification

    Dichao LIU  Yu WANG  Kenji MASE  Jien KATO  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/12/22
      Vol:
    E105-D No:3
      Page(s):
    713-726

    Fine-grained image classification is a difficult problem, and previous studies mainly overcome this problem by locating multiple discriminative regions in different scales and then aggregating complementary information explored from the located regions. However, locating discriminative regions introduces heavy overhead and is not suitable for real-world application. In this paper, we propose the recursive multi-scale channel-spatial attention module (RMCSAM) for addressing this problem. Following the experience of previous research on fine-grained image classification, RMCSAM explores multi-scale attentional information. However, the attentional information is explored by recursively refining the deep feature maps of a convolutional neural network (CNN) to better correspond to multi-scale channel-wise and spatial-wise attention, instead of localizing attention regions. In this way, RMCSAM provides a lightweight module that can be inserted into standard CNNs. Experimental results show that RMCSAM can improve the classification accuracy and attention capturing ability over baselines. Also, RMCSAM performs better than other state-of-the-art attention modules in fine-grained image classification, and is complementary to some state-of-the-art approaches for fine-grained image classification. Code is available at https://github.com/Dichao-Liu/Recursive-Multi-Scale-Channel-Spatial-Attention-Module.

  • SimpleZSL: Extremely Simple and Fast Zero-Shot Learning with Nearest Neighbor Classifiers

    Masayuki HIROMOTO  Hisanao AKIMA  Teruo ISHIHARA  Takuji YAMAMOTO  

     
    PAPER-Pattern Recognition

      Publicized:
    2021/10/29
      Vol:
    E105-D No:2
      Page(s):
    396-405

    Zero-shot learning (ZSL) aims to classify images of unseen classes by learning the relationship between visual and semantic features. Existing works have been improving recognition accuracy from various approaches, but they employ computationally intensive algorithms that require iterative optimization. In this work, we revisit the primary approach of pattern recognition, i.e., nearest neighbor classifiers, to solve the ZSL task in an extremely simple and fast way, called SimpleZSL. Our algorithm consists of the following three simple techniques: (1) just averaging feature vectors to obtain visual prototypes of seen classes, (2) calculating a pseudo-inverse matrix via singular value decomposition to generate visual features of unseen classes, and (3) inferring unseen classes by a nearest neighbor classifier in which cosine similarity is used to measure distance between feature vectors. Through the experiments on common datasets, the proposed method achieves good recognition accuracy with drastically small computational costs. The execution time of the proposed method on a single CPU is more than 100 times faster than those of the GPU implementations of the existing methods with comparable accuracies.
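
    Techniques (1) and (3) above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the helper names and the toy two-dimensional features are hypothetical, and technique (2), the pseudo-inverse mapping that generates prototypes for unseen classes, is omitted, so the prototypes below merely stand in for whatever that step would produce.

```python
import math

def class_prototypes(features, labels):
    """Technique (1): average the feature vectors of each class."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest_class(query, prototypes):
    """Technique (3): pick the class whose prototype is most similar to the query."""
    return max(prototypes, key=lambda c: cosine_similarity(query, prototypes[c]))

# Assumed toy data; real visual features would be high-dimensional CNN outputs.
features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
labels = ["cat", "cat", "dog", "dog"]
prototypes = class_prototypes(features, labels)
predicted = nearest_class([0.9, 0.3], prototypes)
```

    Because both steps are closed-form (an average and a similarity lookup), no iterative optimization is involved, which is consistent with the low computational cost the abstract reports.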

  • Consistency Regularization on Clean Samples for Learning with Noisy Labels

    Yuichiro NOMURA  Takio KURITA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2021/10/28
      Vol:
    E105-D No:2
      Page(s):
    387-395

    In recent years, deep learning has achieved significant results in various areas of machine learning. Deep learning requires a huge amount of data to train a model, and data collection techniques such as web crawling have been developed. However, there is a risk that these data collection techniques may generate incorrect labels. If a deep learning model for image classification is trained on a dataset with noisy labels, the generalization performance significantly decreases. This problem is called Learning with Noisy Labels (LNL). A recent LNL method called DivideMix [1] successfully divides the dataset into samples with clean labels and samples with noisy labels by modeling the loss distribution of all training samples with a two-component Gaussian mixture model (GMM). Then it treats the divided dataset as labeled and unlabeled samples and trains the classification model in a semi-supervised manner. Since the selected samples have lower loss values and are easy to classify, the model is at risk of overfitting to simple patterns during training. To train the classification model without overfitting to the simple patterns, we propose introducing consistency regularization on the samples selected by the GMM. The consistency regularization perturbs input images and encourages the model to output the same values for the perturbed images and the original images. The classification model simultaneously receives the samples selected as clean and their perturbed versions, and it achieves higher generalization performance with less overfitting to the selected samples. We evaluated our method with synthetically generated noisy labels on CIFAR-10 and CIFAR-100 and obtained results that are comparable to or better than those of the state-of-the-art method.
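
    The idea of consistency regularization can be shown with a minimal sketch. Everything here is an assumption made for illustration: `toy_model` is a hypothetical stand-in classifier, the perturbation is simple uniform noise, and the penalty is a mean squared difference between the two predictions. The abstract does not state which perturbation or loss form the authors use.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_model(x):
    """Hypothetical two-class classifier: fixed linear scores plus softmax."""
    return softmax([sum(x), -sum(x)])

def perturb(x, scale, rng):
    """Add small uniform noise to each input value (assumed perturbation)."""
    return [v + rng.uniform(-scale, scale) for v in x]

def consistency_loss(model, x, x_perturbed):
    """Mean squared difference between predictions on x and its perturbed copy."""
    p, q = model(x), model(x_perturbed)
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)

rng = random.Random(0)
x = [0.2, -0.1, 0.4]
loss_same = consistency_loss(toy_model, x, x)
loss_noisy = consistency_loss(toy_model, x, perturb(x, 0.1, rng))
```

    The loss is zero when both inputs yield identical predictions and grows as the predictions diverge, so adding it to the training objective penalizes a model whose output changes under small input perturbations.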

  • Nonuniformity Measurement of Image Resolution under Effect of Color Speckle for Raster-Scan RGB Laser Mobile Projector

    Junichi KINOSHITA  Akira TAKAMORI  Kazuhisa YAMAMOTO  Kazuo KURODA  Koji SUZUKI  Keisuke HIEDA  

     
    PAPER

      Publicized:
    2021/08/17
      Vol:
    E105-C No:2
      Page(s):
    86-94

    Image resolution under the effect of color speckle was successfully measured for a raster-scan mobile projector, using the modified contrast modulation method. This method was based on the eye-diagram analysis for distinguishing the binary image signals, black-and-white line pairs. The image resolution and the related metrics, illuminance, chromaticity, and speckle contrast were measured at the nine regions on the full-frame area projected on a standard diffusive reflectance screen. The nonuniformity data over the nine regions were discussed and analyzed.

  • Effects of Image Processing Operations on Adversarial Noise and Their Use in Detecting and Correcting Adversarial Images Open Access

    Huy H. NGUYEN  Minoru KURIBAYASHI  Junichi YAMAGISHI  Isao ECHIZEN  

     
    PAPER

      Publicized:
    2021/10/05
      Vol:
    E105-D No:1
      Page(s):
    65-77

    Deep neural networks (DNNs) have achieved excellent performance on several tasks and have been widely applied in both academia and industry. However, DNNs are vulnerable to adversarial machine learning attacks in which noise is added to the input to change the networks' output. Consequently, DNN-based mission-critical applications such as those used in self-driving vehicles have reduced reliability and could cause severe accidents and damage. Moreover, adversarial examples could be used to poison DNN training data, resulting in corruptions of trained models. Besides the need for detecting adversarial examples, correcting them is important for restoring data and system functionality to normal. We have developed methods for detecting and correcting adversarial images that use multiple image processing operations with multiple parameter values. For detection, we devised a statistical-based method that outperforms the feature squeezing method. For correction, we devised a method that uses for the first time two levels of correction. The first level is label correction, with the focus on restoring the adversarial images' original predicted labels (for use in the current task). The second level is image correction, with the focus on both the correctness and quality of the corrected images (for use in the current and other tasks). Our experiments demonstrated that the correction method could correct nearly 90% of the adversarial images created by classical adversarial attacks and affected only about 2% of the normal images.

  • JPEG Image Steganalysis Using Weight Allocation from Block Evaluation

    Weiwei LUO  Wenpeng ZHOU  Jinglong FANG  Lingyan FAN  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2021/10/18
      Vol:
    E105-D No:1
      Page(s):
    180-183

    Recently, channel-aware steganography has been presented for high security. The corresponding selection-channel-aware (SCA) detection algorithms have also been proposed for improving the detection performance. In this paper, we propose a novel detection algorithm for JPEG steganography, where the embedding probability and block evaluation are integrated into a new probability. This probability can embody the change due to data embedding. We choose the same high-pass filters as maximum diversity cascade filter residual (MD-CFR) to obtain different image residuals, and a weighted histogram method is used to extract detection features. Experimental results on detecting two typical steganographic methods show that the proposed method can improve the performance compared with the state-of-the-art methods.

  • Feasibility Study for Computer-Aided Diagnosis System with Navigation Function of Clear Region for Real-Time Endoscopic Video Image on Customizable Embedded DSP Cores

    Masayuki ODAGAWA  Tetsushi KOIDE  Toru TAMAKI  Shigeto YOSHIDA  Hiroshi MIENO  Shinji TANAKA  

     
    LETTER-VLSI Design Technology and CAD

      Publicized:
    2021/07/08
      Vol:
    E105-A No:1
      Page(s):
    58-62

    This paper presents the results of examining the feasibility of automatic unclear region detection in a CAD system for colorectal tumors with real-time endoscopic video images. We confirmed that it is possible to realize a CAD system with a clear-region navigation function, consisting of unclear region detection by YOLO2 and classification by AlexNet and SVMs, on customizable embedded DSP cores. Moreover, we confirmed that the real-time CAD system can be constructed as a low-power ASIC using customizable embedded DSP cores.

  • CMOS Image Sensor with Pixel-Parallel ADC and HDR Reconstruction from Intermediate Exposure Images Open Access

    Shinnosuke KURATA  Toshinori OTAKA  Yusuke KAMEDA  Takayuki HAMAMOTO  

     
    LETTER-Image

      Publicized:
    2021/07/26
      Vol:
    E105-A No:1
      Page(s):
    82-86

    We propose an HDR (high dynamic range) reconstruction method in an image sensor with a pixel-parallel ADC (analog-to-digital converter) for non-destructively reading out the intermediate exposure image. We report the circuit design for such an image sensor and the evaluation of the basic HDR reconstruction method.

  • Classification with CNN features and SVM on Embedded DSP Core for Colorectal Magnified NBI Endoscopic Video Image

    Masayuki ODAGAWA  Takumi OKAMOTO  Tetsushi KOIDE  Toru TAMAKI  Shigeto YOSHIDA  Hiroshi MIENO  Shinji TANAKA  

     
    PAPER-VLSI Design Technology and CAD

      Publicized:
    2021/07/21
      Vol:
    E105-A No:1
      Page(s):
    25-34

    In this paper, we present a classification method for a Computer-Aided Diagnosis (CAD) system for colorectal magnified Narrow Band Imaging (NBI) endoscopy. In an endoscopic video image, color shift, blurring, or reflection of light occurs in a lesion area, which affects the discrimination result by a computer. Therefore, in order to identify lesions with high robustness and stable classification for these video-frame-specific images, we implement a CAD system for colorectal endoscopic images with Convolutional Neural Network (CNN) features and Support Vector Machine (SVM) classification on the embedded DSP core. To improve the robustness of the CAD system, we construct an SVM trained on data sets of multiple image sizes so as to adapt to the noise peculiar to the video image. We confirmed that the proposed method achieves higher robustness and stable, high classification accuracy on endoscopic video images. The proposed method can also cope with differences in resolution between old and new endoscopes and performs stably with respect to the input endoscopic video image.

  • Searching and Learning Discriminative Regions for Fine-Grained Image Retrieval and Classification

    Kangbo SUN  Jie ZHU  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/10/18
      Vol:
    E105-D No:1
      Page(s):
    141-149

    Local discriminative regions play important roles in fine-grained image analysis tasks. How to locate local discriminative regions with only category labels and learn discriminative representations from these regions has been a hot topic. In our work, we propose the Searching Discriminative Regions (SDR) and Learning Discriminative Regions (LDR) methods to search and learn local discriminative regions in images. The SDR method adopts an attention mechanism to iteratively search for high-response regions in images, and uses this as a clue to locate local discriminative regions. Moreover, the LDR method is proposed to learn representations that are compact within a category and sparse between categories from the raw image and local images. Experimental results show that our proposed approach achieves excellent performance in both fine-grained image retrieval and classification tasks, which demonstrates its effectiveness.

  • Movie Map for Virtual Exploration in a City

    Kiyoharu AIZAWA  

     
    INVITED PAPER

      Publicized:
    2021/10/12
      Vol:
    E105-D No:1
      Page(s):
    38-45

    This paper introduces our work on a Movie Map, which will enable users to explore a given city area using 360° videos. Visual exploration of a city is always needed. Nowadays, we are familiar with Google Street View (GSV), which is an interactive visual map. Despite the wide use of GSV, it provides sparse images of streets, which often confuses users and lowers user satisfaction. Forty years ago, a video-based interactive map was created; it is well known as the Aspen Movie Map. Movie Map uses videos instead of sparse images and seems to improve the user experience dramatically. However, the Aspen Movie Map was based on analog technology, required a huge effort, and was never built again. Thus, we renovate the Movie Map using state-of-the-art technology. We build a new Movie Map system with an interface for exploring cities. The system consists of four stages: acquisition, analysis, management, and interaction. After acquiring 360° videos along streets in target areas, the analysis of videos is almost automatic. Frames of the video are localized on the map, intersections are detected, and videos are segmented. Turning views at intersections are synthesized. By connecting the video segments following the specified movement in an area, we can watch a walking view along a street. The interface allows for easy exploration of a target area. It can also show virtual billboards in the view.

  • Image Adjustment for Multi-Exposure Images Based on Convolutional Neural Networks

    Isana FUNAHASHI  Taichi YOSHIDA  Xi ZHANG  Masahiro IWAHASHI  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2021/10/21
      Vol:
    E105-D No:1
      Page(s):
    123-133

    In this paper, we propose an image adjustment method for multi-exposure images based on convolutional neural networks (CNNs). We refer to image regions in multi-exposure images that lack information due to saturation or object motion as lacking areas. Lacking areas cause ghosting artifacts in images fused from sets of multi-exposure images by conventional fusion methods, which tackle the artifact. To avoid this problem, the proposed method estimates the information of lacking areas via adaptive inpainting. The proposed CNN consists of three networks: warp and refinement, detection, and inpainting networks. The second and third networks detect lacking areas and estimate their pixel values, respectively. In the experiments, it is observed that a simple fusion method with the proposed method outperforms state-of-the-art fusion methods in the peak signal-to-noise ratio. Moreover, the proposed method is applied as pre-processing for various fusion methods, and the results show a clear reduction in artifacts.

  • Near Hue-Preserving Reversible Contrast and Saturation Enhancement Using Histogram Shifting

    Rio KUROKAWA  Kazuki YAMATO  Madoka HASEGAWA  

     
    PAPER

      Publicized:
    2021/10/05
      Vol:
    E105-D No:1
      Page(s):
    54-64

    In recent years, several reversible contrast-enhancement methods for color images using digital watermarking have been proposed. These methods can restore an original image from a contrast-enhanced image, in which the information required to recover the original image is embedded with other payloads. In these methods, the hue component after enhancement is similar to that of the original image. However, the saturation of the image after enhancement is significantly lower than that of the original image, and the obtained image exhibits a pale color tone. Herein, we propose a method for enhancing the contrast and saturation of color images and nearly preserving the hue component in a reversible manner. Our method integrates red, green, and blue histograms and preserves the median value of the integrated components. Consequently, both the contrast and saturation improved, and the subjective image quality improved as well. In addition, we confirmed that the hue component of the enhanced image is similar to that of the original image. We also confirmed that the original image was perfectly restored from the enhanced image. Our method can contribute to the field of digital photography as legal evidence. The required storage space for color images and issues pertaining to evidence management can be reduced, because our method enables both the pre-enhancement and post-enhancement color images to be obtained from a single image.

  • CLAHE Implementation and Evaluation on a Low-End FPGA Board by High-Level Synthesis

    Koki HONDA  Kaijie WEI  Masatoshi ARAI  Hideharu AMANO  

     
    PAPER

      Publicized:
    2021/07/12
      Vol:
    E104-D No:12
      Page(s):
    2048-2056

    Automobile companies have been trying to replace side mirrors of cars with small cameras for reducing air resistance. This enables us to apply image processing to improve the quality of the image. Contrast Limited Adaptive Histogram Equalization (CLAHE) is one such technique for improving image quality for the side mirror camera, but it requires high computational performance. Here, an implementation method of CLAHE on a low-end FPGA board by high-level synthesis is proposed. CLAHE has two main processing parts: cumulative distribution function (CDF) generation, and bilinear interpolation. During the CDF generation, the effect of increasing loop initiation interval can be greatly reduced by placing multiple Processing Elements (PEs), and during the interpolation, latency and BRAM usage were reduced by revising how the CDF is held and how the calculation is performed. Finally, by connecting each module with streaming interfaces, using data flow pragmas, overlapping processing, and hiding data transfer, our HLS implementation achieved a comparable result to that of HDL. We parameterized the components of the algorithm so that the number of tiles and the size of the image can be easily changed. The source code for this research can be downloaded from https://github.com/kokihonda/fpga_clahe.
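
    The CDF generation part of CLAHE can be sketched in software as follows. This is a plain reference-style illustration for a single tile, not the HLS design from the paper: the function name `clipped_cdf`, the toy 4-level tile, and the simple one-pass redistribution of clipped counts are all assumptions. Bilinear interpolation between neighboring tiles is not shown.

```python
def clipped_cdf(tile_pixels, levels, clip_limit):
    """Build a clipped histogram for one tile and return its normalized CDF."""
    hist = [0] * levels
    for p in tile_pixels:
        hist[p] += 1

    # Clip each bin at the limit and collect the excess counts.
    excess = 0
    for i, count in enumerate(hist):
        if count > clip_limit:
            excess += count - clip_limit
            hist[i] = clip_limit

    # Redistribute the excess evenly; leftover counts go to the first bins.
    share, remainder = divmod(excess, levels)
    hist = [h + share + (1 if i < remainder else 0) for i, h in enumerate(hist)]

    # Cumulative sum, normalized to the range [0, 1].
    total = sum(hist)
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running / total)
    return cdf

# Assumed toy tile with 4 gray levels, dominated by level 0.
tile = [0, 0, 0, 0, 0, 0, 1, 2]
cdf = clipped_cdf(tile, levels=4, clip_limit=3)
```

    Clipping limits how steep the CDF can become for a dominant gray level, which is what keeps CLAHE from over-amplifying contrast and noise in nearly uniform regions.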

  • A Low-Latency Inference of Randomly Wired Convolutional Neural Networks on an FPGA

    Ryosuke KURAMOCHI  Hiroki NAKAHARA  

     
    PAPER

      Publicized:
    2021/06/24
      Vol:
    E104-D No:12
      Page(s):
    2068-2077

    Convolutional neural networks (CNNs) are widely used for image processing tasks in both embedded systems and data centers. In data centers, high accuracy and low latency are desired for various tasks such as image processing of streaming videos. We propose an FPGA-based low-latency CNN inference for randomly wired convolutional neural networks (RWCNNs), whose layer structures are based on random graph models. Because RWCNNs have several convolution layers that have no direct dependencies between them, our architecture can process them efficiently using a pipeline method. At each layer, we need to use the calculation results of multiple layers as the input. We use an FPGA with HBM2 to enable parallel access to the input data with multiple HBM2 channels. We schedule the order of execution of the layers to improve the pipeline efficiency. We build a conflict graph using the scheduling results. Then, we allocate the calculation results of each layer to the HBM2 channels by coloring the graph. Because the pipeline execution needs to be properly controlled, we developed an automatic generation tool for hardware functions. We implemented the proposed architecture on the Alveo U50 FPGA. We investigated a trade-off between latency and recognition accuracy for the ImageNet classification task by comparing the inference performances for different input image sizes. We compared our accelerator with a conventional accelerator for ResNet-50. The results show that our accelerator reduces the latency by 2.21 times. We also obtained 12.6 and 4.93 times better efficiency than CPU and GPU, respectively. Thus, our accelerator for RWCNNs is suitable for low-latency inference.
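
    The allocation of layer outputs to memory channels by graph coloring can be illustrated with a small sketch. It uses a generic greedy coloring with a highest-degree-first ordering, which is an assumption; the abstract does not say which coloring algorithm the authors use. The conflict graph and layer names below are hypothetical.

```python
def greedy_coloring(conflicts):
    """Assign each node the smallest color not used by an already-colored neighbor."""
    colors = {}
    # Color high-degree nodes first (a common heuristic, assumed here).
    for node in sorted(conflicts, key=lambda n: (-len(conflicts[n]), n)):
        used = {colors[nb] for nb in conflicts[node] if nb in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors

# Hypothetical conflict graph: an edge means two layer outputs are read
# at the same time and so should not share a memory channel.
conflicts = {
    "L1": {"L2", "L3"},
    "L2": {"L1", "L3"},
    "L3": {"L1", "L2", "L4"},
    "L4": {"L3"},
}
channels = greedy_coloring(conflicts)
```

    Each color corresponds to one channel, so two layer outputs that conflict never land on the same channel and can be read in parallel. In this toy graph three channels are enough.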

  • An Improved U-Net Architecture for Image Dehazing

    Wenyi GE  Yi LIN  Zhitao WANG  Guigui WANG  Shihan TAN  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2021/09/14
      Vol:
    E104-D No:12
      Page(s):
    2218-2225

    In this paper, we present a simple yet powerful deep neural network for natural image dehazing. The proposed method is based on the U-Net architecture, with some design changes to improve it. We first use Group Normalization to replace Batch Normalization to solve the problem of insufficient batch size due to hardware limitations. Second, we introduce FReLU activation into the U-Net block, which can capture complicated visual layouts with regular convolutions. Experimental results on public benchmarks demonstrate the effectiveness of the modified components. On the SOTS Indoor and Outdoor datasets, it obtains PSNR of 32.23 and 31.64 respectively, which are comparable to those of state-of-the-art methods. The code will be publicly available online soon.
