Keyword Search Result

[Keyword] image(1441hit)

21-40hit(1441hit)

  • Skin Diagnostic Method Using Fontana-Masson Stained Images of Stratum Corneum Cells Open Access

    Shuto HASEGAWA  Koichiro ENOMOTO  Taeko MIZUTANI  Yuri OKANO  Takenori TANAKA  Osamu SAKAI  

     
    PAPER-Biological Engineering

    Publicized: 2024/04/19  Vol: E107-D No:8  Page(s): 1070-1078

    Melanin, which is responsible for the appearance of spots and freckles, is an important indicator in evaluating skin condition. To assess the efficacy of cosmetics, skin condition scoring is performed by analyzing the distribution and amount of melanin from microscopic images of stratum corneum cells. However, the current practice of diagnosing skin condition using stratum corneum cell images relies heavily on visual evaluation by experts. The goal of this study is to develop a quantitative evaluation system for skin condition based on melanin within unstained stratum corneum cell images. The proposed system utilizes principal component regression to perform five-level scoring, which is then compared with visual evaluation scores to assess the system’s usefulness. Additionally, we evaluated the impact of melanin-related indicators obtained from images on the scores, and verified which indicators are effective for evaluation. In conclusion, we confirmed that scoring is possible with an accuracy of more than 60% using a combination of several indicators, which is comparable to the accuracy of visual assessment.

  • Prohibited Item Detection Within X-Ray Security Inspection Images Based on an Improved Cascade Network Open Access

    Qingqi ZHANG  Xiaoan BAO  Ren WU  Mitsuru NAKATA  Qi-Wei GE  

     
    PAPER

    Publicized: 2024/01/16  Vol: E107-A No:5  Page(s): 813-824

    Automatic detection of prohibited items is vital in helping security staff be more efficient while improving the public safety index. However, prohibited item detection within X-ray security inspection images is limited by various factors, including the imbalanced distribution of categories, the diversity of prohibited item scales, and overlap between items. In this paper, we propose to leverage the Poisson blending algorithm with the Canny edge operator to alleviate the imbalanced distribution of categories in the X-ray image dataset as much as possible. Based on this, we improve the cascade network to deal with the other two difficulties. To address the scale diversity of prohibited items, we propose the Re-BiFPN feature fusion method, which includes a coordinate attention atrous spatial pyramid pooling (CA-ASPP) module and a recursive connection. The CA-ASPP module can implicitly extract direction-aware and position-aware information from the feature map. The recursive connection feeds the multi-scale feature map processed by the CA-ASPP module to the bottom-up backbone layer for further multi-scale feature extraction. In addition, a Rep-CIoU loss function is designed to address the overlapping problem in X-ray images. Extensive experimental results demonstrate that our method can successfully identify ten types of prohibited items, such as Knives, Scissors, and Pressure, and achieves an mAP of 83.4%, which is 3.8% higher than the original cascade network. Moreover, our method outperforms other mainstream methods by a significant margin.

  • Effect of Perceptually Uniform Color Space and Diversity of Chromaticity Components on Digital Signage and Image Sensor-Based Visible Light Communication Open Access

    Kazuya SHIMEI  Kentaro KOBAYASHI  Wataru CHUJO  

     
    PAPER-Communication Theory and Signals

    Publicized: 2023/08/07  Vol: E107-A No:4  Page(s): 638-653

    We study a visible light communication (VLC) system that modulates data signals by changing the color components of image contents on a digital signage display, captures them with an image sensor, and demodulates them using image processing. This system requires that the modulated data signals should not be perceived by the human eye. Previous studies have proposed modulation methods with a chromaticity component that is difficult for the human eye to perceive, and we have also proposed a modulation method with perceptually uniform color space based on human perception characteristics. However, which chromaticity component performs better depends on the image contents, and the evaluation only for some specific image contents was not sufficient. In this paper, we evaluate the communication and visual quality of the modulation methods with chromaticity components for various standard images to clarify the superiority of the method with perceptually uniform color space. In addition, we propose a novel modulation and demodulation method using diversity combining to eliminate the dependency of performance on the image contents. Experimental results show that the proposed method can improve the communication and visual quality for almost all the standard images.

  • Multi-Style Shape Matching GAN for Text Images Open Access

    Honghui YUAN  Keiji YANAI  

     
    PAPER

    Publicized: 2023/12/27  Vol: E107-D No:4  Page(s): 505-514

    Deep learning techniques are used to transform the style of images and produce diverse images. In the text style transformation field, many previous studies attempted to generate stylized text using deep learning networks. However, to achieve multiple style transformations for text images, the methods proposed in previous studies require learning multiple networks or cannot be guided by style images. Thus, in this study we focused on multistyle transformation of text images using style images to guide the generation of results. We propose a multiple-style transformation network for text style transfer, which we refer to as the Multi-Style Shape Matching GAN (Multi-Style SMGAN). The proposed method generates multiple styles of text images using a single model by training the model only once, and allows users to control the text style according to style images. The proposed method implements conditions to the network such that all styles can be distinguished effectively in the network, and the generation of each styled text can be controlled according to these conditions. The proposed network is optimized such that the conditional information can be transmitted effectively throughout the network. The proposed method was evaluated experimentally on a large number of text images, and the results show that the trained model can generate multiple-style text in realtime according to the style image. In addition, the results of a user survey study indicate that the proposed method produces higher quality results compared to existing methods.

  • Grid Sample Based Temporal Iteration for Fully Pipelined 1-ms SLIC Superpixel Segmentation System Open Access

    Yuan LI  Tingting HU  Ryuji FUCHIKAMI  Takeshi IKENAGA  

     
    PAPER-Computer System

    Publicized: 2023/12/19  Vol: E107-D No:4  Page(s): 515-524

    A 1 millisecond (1-ms) vision system, which processes videos at 1000 frames per second (FPS) within 1 ms/frame delay, plays an increasingly important role in fields such as robotics and factory automation. Superpixel as one of the most extensively employed image oversegmentation methods is a crucial pre-processing step for reducing computations in various computer vision applications. Among the different superpixel methods, simple linear iterative clustering (SLIC) has gained widespread adoption due to its simplicity, effectiveness, and computational efficiency. However, the iterative assignment and update steps in SLIC make it challenging to achieve high processing speed. To address this limitation and develop a SLIC superpixel segmentation system with a 1 ms delay, this paper proposes grid sample based temporal iteration. By leveraging the high frame rate of the input video, the proposed method distributes the iterations into the temporal domain, ensuring that the system's delay keeps within one frame. Additionally, grid sample information is added as initialization information to the obtained superpixel centers for enhancing the stability of superpixels. Furthermore, a selective label propagation based pipeline architecture is proposed for parallel computation of all the possibilities of label propagation. This eliminates data dependency between adjacent pixels and enables a fully pipelined system. The evaluation results demonstrate that the proposed superpixel segmentation system achieves boundary recall and under-segmentation error comparable to the original SLIC algorithm. When considering label consistency, the proposed system surpasses the performance of state-of-the-art superpixel segmentation methods. Moreover, in terms of hardware performance, the proposed system processes 1000 FPS images with 0.985 ms/frame delay.
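    The idea of distributing clustering iterations over the temporal domain can be illustrated with a minimal sketch. It is not the authors' system: it clusters scalar intensities only (no spatial term, no grid sample initialization, no label propagation), and the names `one_iteration` and `process_stream` are hypothetical. Each incoming frame receives exactly one assignment-and-update step, warm-started from the centers of the previous frame.

```python
def one_iteration(pixels, centers):
    """One assignment + update step of k-means on scalar intensities."""
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for p in pixels:
        # Assign the pixel to its nearest center.
        j = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
        sums[j] += p
        counts[j] += 1
    # Update each center to the mean of its pixels; keep empty centers as-is.
    return [sums[i] / counts[i] if counts[i] else centers[i]
            for i in range(len(centers))]


def process_stream(frames, init_centers):
    """Run a single iteration per frame, carrying centers across frames."""
    centers = list(init_centers)
    for frame in frames:
        centers = one_iteration(frame, centers)
        yield list(centers)


# Five nearly static frames: the centers settle without iterating within a frame.
frames = [[10, 12, 11, 200, 205, 198]] * 5
history = list(process_stream(frames, [0, 255]))
```

    Because consecutive frames of a high-frame-rate video change little, the centers carried over from the previous frame are already close to a converged solution, which is why one step per frame can suffice in this toy setting.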

  • Infrared and Visible Image Fusion via Hybrid Variational Model Open Access

    Zhengwei XIA  Yun LIU  Xiaoyun WANG  Feiyun ZHANG  Rui CHEN  Weiwei JIANG  

     
    LETTER-Image Processing and Video Processing

    Publicized: 2023/12/11  Vol: E107-D No:4  Page(s): 569-573

    Infrared and visible image fusion can combine thermal radiation information and textures to provide a high-quality fused image. In this letter, we propose a hybrid variational fusion model to achieve this end. Specifically, an ℓ0 term is adopted to preserve the highlighted targets with salient gradient variation in the infrared image, an ℓ1 term is used to suppress the noise in the fused image, and an ℓ2 term is employed to keep the textures of the visible image. Experimental results demonstrate the superiority of the proposed variational model, and our results have sharper textures with less noise.

  • Pipelined ADPCM Compression for HDR Synthesis on an FPGA

    Masahiro NISHIMURA  Taito MANABE  Yuichiro SHIBATA  

     
    PAPER-VLSI Design Technology and CAD

    Publicized: 2023/08/31  Vol: E107-A No:3  Page(s): 531-539

    This paper presents an FPGA implementation of real-time high dynamic range (HDR) synthesis, which expresses a wide dynamic range by combining multiple images with different exposures using image pyramids. We have implemented a pipeline that performs streaming processing on images without using external memory. However, implementation for high-resolution images has been difficult due to large memory usage for line buffers. Therefore, we propose an image compression algorithm based on adaptive differential pulse code modulation (ADPCM). Compression modules based on the algorithm can be easily integrated into the pipeline. When the image resolution is 4K and the pyramid depth is 7, memory usage can be halved from 168.48% to 84.32% by introducing the compression modules, resulting in better quality.
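    As a rough illustration of ADPCM in general (the abstract does not specify the paper's algorithm), the sketch below encodes a line of 8-bit pixels as 4-bit codes. The predictor is simply the previous reconstructed pixel, and the step-size rule in `_adapt` is an assumption chosen for brevity; all function names are hypothetical.

```python
def _adapt(step, code, levels):
    """Assumed adaptation rule: widen the step on large codes, narrow it on small ones."""
    if abs(code) >= levels - 1:
        return min(step * 2, 64)
    if abs(code) <= 1:
        return max(step // 2, 1)
    return step


def adpcm_encode(samples, bits=4):
    """Encode 8-bit samples as signed codes in [-2**(bits-1), 2**(bits-1) - 1]."""
    levels = 1 << (bits - 1)
    pred, step = 0, 4
    codes = []
    for s in samples:
        # Quantize the prediction error with the current step size.
        code = max(-levels, min(levels - 1, round((s - pred) / step)))
        codes.append(code)
        # Track the decoder's reconstruction so both sides stay in sync.
        pred = max(0, min(255, pred + code * step))
        step = _adapt(step, code, levels)
    return codes


def adpcm_decode(codes, bits=4):
    """Rebuild samples by mirroring the encoder's predictor and step updates."""
    levels = 1 << (bits - 1)
    pred, step = 0, 4
    out = []
    for code in codes:
        pred = max(0, min(255, pred + code * step))
        out.append(pred)
        step = _adapt(step, code, levels)
    return out


line = [100] * 20
codes = adpcm_encode(line)
decoded = adpcm_decode(codes)
```

    Storing 4-bit codes instead of 8-bit pixels halves the size of a buffered line, which is the kind of saving that matters for line buffers; the encoding is lossy, so the reconstruction only approximates the input.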

  • Adversarial Examples Created by Fault Injection Attack on Image Sensor Interface

    Tatsuya OYAMA  Kota YOSHIDA  Shunsuke OKURA  Takeshi FUJINO  

     
    PAPER

    Publicized: 2023/09/26  Vol: E107-A No:3  Page(s): 344-354

    Adversarial examples (AEs), which cause misclassification by adding subtle perturbations to input images, have been proposed as an attack method on image-classification systems using deep neural networks (DNNs). Physical AEs created by attaching stickers to traffic signs have been reported, which are a threat to traffic-sign-recognition DNNs used in advanced driver assistance systems. We previously proposed an attack method for generating a noise area on images by superimposing an electrical signal on the mobile industry processor interface and showed that it can generate a single adversarial mark that triggers a backdoor attack on the input image. Therefore, we propose a misclassification attack method on DNNs by creating AEs that include small perturbations to multiple places on the image by fault injection. The perturbation position for AEs is pre-calculated against the target traffic-sign image, which will be captured during future driving. With 5.2% to 5.5% of a specific image on the simulation, the perturbation that induces misclassification to the target label was calculated. In the experiments, we confirmed that the traffic-sign-recognition DNN on a Raspberry Pi was successfully misclassified when the target traffic sign was captured with. In addition, we created robust AEs that cause misclassification of images with varying positions and sizes by adding a common perturbation. We propose a method to reduce the amount of perturbation in robust AEs. Our results demonstrated successful misclassification of the captured image with a high attack success rate even if the position and size of the captured image are slightly changed.

  • Rotation-Invariant Convolution Networks with Hexagon-Based Kernels

    Yiping TANG  Kohei HATANO  Eiji TAKIMOTO  

     
    PAPER-Biocybernetics, Neurocomputing

    Publicized: 2023/11/15  Vol: E107-D No:2  Page(s): 220-228

    We introduce the Hexagonal Convolutional Neural Network (HCNN), a modified version of CNN that is robust against rotation. HCNN utilizes a hexagonal kernel and a multi-block structure that enjoys more degrees of rotation information sharing than standard convolution layers. Our structure is easy to use and does not affect the original tissue structure of the network. We achieve the complete rotational invariance on the recognition task of simple pattern images and demonstrate better performance on the recognition task of the rotated MNIST images, synthetic biomarker images and microscopic cell images than past methods, where the robustness to rotation matters.

  • Content-Adaptive Optimization Framework for Universal Deep Image Compression

    Koki TSUBOTA  Kiyoharu AIZAWA  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2023/10/24  Vol: E107-D No:2  Page(s): 201-211

    While deep image compression performs better than traditional codecs like JPEG on natural images, it faces a challenge as a learning-based approach: compression performance drastically decreases for out-of-domain images. To investigate this problem, we introduce a novel task that we call universal deep image compression, which involves compressing images in arbitrary domains, such as natural images, line drawings, and comics. Furthermore, we propose a content-adaptive optimization framework to tackle this task. This framework adapts a pre-trained compression model to each target image during testing to address the domain gap between pre-training and testing. For each input image, we insert adapters into the decoder of the model and optimize the latent representation extracted by the encoder and the adapter parameters in terms of rate-distortion, with the adapter parameters transmitted per image. To evaluate the proposed universal deep image compression, we constructed a benchmark dataset containing uncompressed images of four domains: natural images, line drawings, comics, and vector arts. We compare our proposed method with non-adaptive and existing adaptive compression methods, and the results show that our method outperforms them. Our code and dataset are publicly available at https://github.com/kktsubota/universal-dic.

  • Lightweight and Fast Low-Light Image Enhancement Method Based on PoolFormer

    Xin HU  Jinhua WANG  Sunhan XU  

     
    LETTER-Image Processing and Video Processing

    Publicized: 2023/10/05  Vol: E107-D No:1  Page(s): 157-160

    Images captured in low-light environments have low visibility and high noise, which will seriously affect subsequent visual tasks such as target detection and face recognition. Therefore, low-light image enhancement is of great significance in obtaining high-quality images and is a challenging problem in computer vision tasks. A low-light enhancement model, LLFormer, based on the Vision Transformer, uses axis-based multi-head self-attention and a cross-layer attention fusion mechanism to reduce the complexity and achieve feature extraction. This algorithm can enhance images well. However, the calculation of the attention mechanism is complex and the number of parameters is large, which limits the application of the model in practice. In response to this problem, a lightweight module, PoolFormer, is used to replace the attention module with spatial pooling, which can increase the parallelism of the network and greatly reduce the number of model parameters. To suppress image noise and improve visual effects, a new loss function is constructed for model optimization. The experiment results show that the proposed method not only reduces the number of parameters by 49%, but also performs better in terms of image detail restoration and noise suppression compared with the baseline model. On the LOL dataset, the PSNR and SSIM were 24.098dB and 0.8575 respectively. On the MIT-Adobe FiveK dataset, the PSNR and SSIM were 27.060dB and 0.9490. The evaluation results on the two datasets are better than the current mainstream low-light enhancement algorithms.

  • Hierarchical Detailed Intermediate Supervision for Image-to-Image Translation

    Jianbo WANG  Haozhi HUANG  Li SHEN  Xuan WANG  Toshihiko YAMASAKI  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2023/09/14  Vol: E106-D No:12  Page(s): 2085-2096

    The image-to-image translation aims to learn a mapping between the source and target domains. For improving visual quality, the majority of previous works adopt multi-stage techniques to refine coarse results in a progressive manner. In this work, we present a novel approach for generating plausible details by only introducing a group of intermediate supervisions without cascading multiple stages. Specifically, we propose a Laplacian Pyramid Transformation Generative Adversarial Network (LapTransGAN) to simultaneously transform components in different frequencies from the source domain to the target domain within only one stage. Hierarchical perceptual and gradient penalization are utilized for learning consistent semantic structures and details at each pyramid level. The proposed model is evaluated based on various metrics, including the similarity in feature maps, reconstruction quality, segmentation accuracy, similarity in details, and qualitative appearances. Our experiments show that LapTransGAN can achieve a much better quantitative performance than both the supervised pix2pix model and the unsupervised CycleGAN model. Comprehensive ablation experiments are conducted to study the contribution of each component.

  • Low-Light Image Enhancement Method Using a Modified Gamma Transform and Gamma Filtering-Based Histogram Specification for Convex Combination Coefficients

    Mashiho MUKAIDA  Yoshiaki UEDA  Noriaki SUETAKE  

     
    PAPER-Image

    Publicized: 2023/04/21  Vol: E106-A No:11  Page(s): 1385-1394

    Recently, many low-light image enhancement methods have been proposed. However, these methods have some problems, such as the loss of fine details in bright regions and/or unnatural color tones. In this paper, we propose a new low-light image enhancement method to cope with these problems. In the proposed method, a pixel is represented by a convex combination of white, black, and pure color. Then, an equi-hue plane in RGB color space is represented as a triangle whose vertices correspond to white, black, and pure color. The visibility of a low-light image is improved by applying a modified gamma transform to the combination coefficients on an equi-hue plane in RGB color space. The contrast of the image is enhanced by the histogram specification method using the histogram smoothed by a filter with a kernel determined based on a gamma distribution. In the experiments, the effectiveness of the proposed method is verified by comparison with state-of-the-art low-light image enhancement methods.
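    The convex combination of white, black, and pure color can be made concrete with a short sketch. It uses a standard decomposition based on the maximum and minimum RGB components and assumes, without confirmation from the abstract, that this matches the representation in the paper; the modified gamma transform itself is not reproduced. The names `decompose` and `recompose` are hypothetical.

```python
def decompose(rgb):
    """Split an RGB pixel (components in [0, 1]) into convex-combination weights.

    Returns (w, k, c, pure) with w + k + c == 1, where w weights white (1, 1, 1),
    k weights black (0, 0, 0), and c weights the fully saturated color `pure`.
    """
    hi, lo = max(rgb), min(rgb)
    w = lo          # white coefficient
    k = 1.0 - hi    # black coefficient
    c = hi - lo     # pure-color coefficient
    if c > 0:
        pure = tuple((v - lo) / c for v in rgb)
    else:
        pure = (1.0, 0.0, 0.0)  # hue is undefined for grays; its weight is 0
    return w, k, c, pure


def recompose(w, k, c, pure):
    """Rebuild the pixel: w * white + k * black + c * pure."""
    return tuple(w * 1.0 + k * 0.0 + c * p for p in pure)


w, k, c, pure = decompose((0.2, 0.5, 0.8))
pixel = recompose(w, k, c, pure)
```

    Changing the three coefficients while keeping `pure` fixed moves the pixel within its equi-hue triangle, so brightness can be adjusted without shifting the hue.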

  • A DFT and IWT-DCT Based Image Watermarking Scheme for Industry

    Lei LI  Hong-Jun ZHANG  Hang-Yu FAN  Zhe-Ming LU  

     
    LETTER-Information Network

    Publicized: 2023/08/22  Vol: E106-D No:11  Page(s): 1916-1921

    To date, digital image watermarking has not been used on a large scale in industry. The first reason is that the watermarking efficiency is low and real-time performance cannot be achieved. The second reason is that watermarking schemes cannot cope with various attacks. To solve the above problems, this paper presents a multi-domain digital image watermarking scheme, where a fast DFT (Discrete Fourier Transform) based watermarking method is proposed for synchronization correction and an IWT-DCT (Integer Wavelet Transform-Discrete Cosine Transform) based watermarking method is proposed for information embedding. The proposed scheme has high efficiency during embedding and extraction. Compared with five existing schemes, our scheme is highly robust and can cope with many common attacks and compound attacks, and thus can be used in a wide range of application scenarios.

  • No Reference Quality Assessment of Contrast-Distorted SEM Images Based on Global Features

    Fengchuan XU  Qiaoyue LI  Guilu ZHANG  Yasheng CHANG  Zixuan ZHENG  

     
    LETTER-Image Processing and Video Processing

    Publicized: 2023/07/28  Vol: E106-D No:11  Page(s): 1935-1938

    This letter presents a global feature-based method for evaluating the no reference quality of scanning electron microscopy (SEM) contrast-distorted images. Based on the characteristics of SEM images and the human visual system, the global features of SEM images are extracted as the score for evaluating image quality. In this letter, the texture information of SEM images is first extracted using a low-pass filter with orientation, and the amount of information in the texture part is calculated based on the entropy reflecting the complexity of the texture. The singular values with four scales of the original image are then calculated, and the amount of structural change between different scales is calculated and averaged. Finally, the amounts of texture information and structural change are pooled to generate the final quality score of the SEM image. Experimental results show that the method can effectively evaluate the quality of SEM contrast-distorted images.

  • Line Segment Detection Based on False Peak Suppression and Local Hough Transform and Application to Nuclear Emulsion

    Ye TIAN  Mei HAN  Jinyi ZHANG  

    This article has been retracted at the request of the authors.
     
    PAPER-Image Processing and Video Processing

    Publicized: 2023/08/09  Vol: E106-D No:11  Page(s): 1854-1867

    This paper mainly proposes a line segment detection method based on pseudo peak suppression and local Hough transform, which has good noise resistance and can solve the problems of missed detection of short line segments, false detection, and oversegmentation. In addition, in response to the phenomenon of uneven development in nuclear emulsion tomographic images, this paper proposes an image preprocessing process that uses the “Difference of Gaussian” method to reduce noise and then uses the standard deviation of the gray value of each pixel to bundle and unify the gray value of each pixel, which can robustly obtain the linear features in these images. The tests on the actual dataset of nuclear emulsion tomographic images and the public YorkUrban dataset show that the proposed method can effectively improve the accuracy of convolutional neural network or vision transformer-based event classification for alpha-decay events in nuclear emulsion. In particular, the line segment detection method in the proposed method achieves optimal results in both accuracy and processing speed, and also has strong generalization ability on high-quality natural images.

  • Facial Mask Completion Using StyleGAN2 Preserving Features of the Person

    Norihiko KAWAI  Hiroaki KOIKE  

     
    PAPER

    Publicized: 2023/05/30  Vol: E106-D No:10  Page(s): 1627-1637

    Due to the global outbreak of coronaviruses, people are increasingly wearing masks even when photographed. As a result, photos uploaded to web pages and social networking services with the lower half of the face hidden are less likely to convey the attractiveness of the photographed persons. In this study, we propose a method to complete facial mask regions using StyleGAN2, a type of Generative Adversarial Network (GAN). In the proposed method, a reference image of the same person without a mask is prepared separately from a target image of the person wearing a mask. After the mask region in the target image is temporarily inpainted, the face orientation and contour of the person in the reference image are changed to match those of the target image using StyleGAN2. The changed image is then composited into the mask region while correcting the color tone, producing a mask-free image that preserves the person's features.

  • Fusion-Based Edge and Color Recovery Using Weighted Near-Infrared Image and Color Transmission Maps for Robust Haze Removal

    Onhi KATO  Akira KUBOTA  

     
    PAPER

    Publicized: 2023/05/23  Vol: E106-D No:10  Page(s): 1661-1672

    Various haze removal methods based on the atmospheric scattering model have been presented in recent years. Most methods have targeted strong haze images where light is scattered equally in all color channels. This paper presents a haze removal method using near-infrared (NIR) images for relatively weak haze images. In order to recover the lost edges, the presented method first extracts edges from an appropriately weighted NIR image and fuses it with the color image. By introducing a wavelength-dependent scattering model, our method then estimates the transmission map for each color channel and recovers the color more naturally from the edge-recovered image. Finally, the edge-recovered and the color-recovered images are blended. In this blending process, the regions with high lightness, such as sky and clouds, where unnatural color shifts are likely to occur, are effectively estimated, and the optimal weighting map is obtained. Our qualitative and quantitative evaluations using 59 pairs of color and NIR images demonstrated that our method can recover edges and colors more naturally in weak haze images than conventional methods.

  • Multi-Scale Estimation for Omni-Directional Saliency Maps Using Learnable Equator Bias

    Takao YAMANAKA  Tatsuya SUZUKI  Taiki NOBUTSUNE  Chenjunlin WU  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2023/07/19  Vol: E106-D No:10  Page(s): 1723-1731

    Omni-directional images have been used in a wide range of applications including virtual/augmented realities, self-driving cars, robotics simulators, and surveillance systems. For these applications, it would be useful to estimate saliency maps representing probability distributions of gazing points with a head-mounted display, to detect important regions in the omni-directional images. This paper proposes a novel saliency-map estimation model for omni-directional images by extracting overlapping 2-dimensional (2D) plane images from omni-directional images at various directions and angles of view. While 2D saliency maps tend to have high probability at the center of images (center bias), the high-probability region appears at horizontal directions in omni-directional saliency maps when a head-mounted display is used (equator bias). Therefore, the 2D saliency model with a center-bias layer was fine-tuned with an omni-directional dataset by replacing the center-bias layer with an equator-bias layer conditioned on the elevation angle for the extraction of the 2D plane image. The limited availability of omni-directional images in saliency datasets can be compensated for by using the well-established 2D saliency model pretrained on a large number of training images with the ground truth of 2D saliency maps. In addition, this paper proposes a multi-scale estimation method by extracting 2D images in multiple angles of view to detect objects of various sizes with variable receptive fields. The saliency maps estimated from the multiple angles of view were integrated by using pixel-wise attention weights calculated in an integration layer for weighting the optimal scale to each object. The proposed method was evaluated using a publicly available dataset with evaluation metrics for omni-directional saliency maps. It was confirmed that the accuracy of the saliency maps was improved by the proposed method.

  • GAN-based Image Translation Model with Self-Attention for Nighttime Dashcam Data Augmentation

    Rebeka SULTANA  Gosuke OHASHI  

     
    PAPER-Intelligent Transport System

    Publicized: 2023/06/27  Vol: E106-A No:9  Page(s): 1202-1210

    High-performance deep learning-based object detection models can reduce traffic accidents using dashcam images during nighttime driving. Deep learning requires a large-scale dataset to obtain a high-performance model. However, existing object detection datasets contain mostly daytime scenes and only a few nighttime scenes. Increasing the nighttime dataset is laborious and time-consuming. In such a case, it is possible to convert daytime images to nighttime images using an image-to-image translation model to augment the nighttime dataset with less effort, so that the translated dataset can utilize the annotations of the daytime dataset. Therefore, in this study, a GAN-based image-to-image translation model is proposed by incorporating self-attention with cycle consistency and content/style separation for nighttime data augmentation that shows high fidelity to annotations of the daytime dataset. Experimental results highlight the effectiveness of the proposed model compared with other models in terms of translated images and FID scores. Moreover, the high fidelity of translated images to the annotations is verified by a small object detection model according to detection results and mAP. Ablation studies confirm the effectiveness of self-attention in the proposed model. As a contribution to GAN-based data augmentation, the source code of the proposed image translation model is publicly available at https://github.com/subecky/Image-Translation-With-Self-Attention

