Author Search Result

[Author] Ji XI(2hit)

1-2hit
  • A CNN-Based Feature Pyramid Segmentation Strategy for Acoustic Scene Classification Open Access

    Ji XI  Yue XIE  Pengxu JIANG  Wei JIANG  

     
    LETTER-Speech and Hearing

      Pubricized:
    2024/03/26
      Vol:
    E107-D No:8
      Page(s):
    1093-1096

    Currently, a significant portion of acoustic scene categorization (ASC) research is centered around utilizing Convolutional Neural Network (CNN) models. This preference is primarily due to CNN’s ability to effectively extract time-frequency information from audio recordings of scenes by employing spectrum data as input. The expression of many dimensions can be achieved by utilizing 2D spectrum characteristics. Nevertheless, the diverse interpretations of the same object’s existence in different positions on the spectrum map can be attributed to the discrepancies between spectrum properties and picture qualities. The lack of distinction between different aspects of input information in ASC-based CNN networks may result in a decline in system performance. Considering this, a feature pyramid segmentation (FPS) approach based on CNN is proposed. The proposed approach involves utilizing spectrum features as the input for the model. These features are split based on a preset scale, and each segment-level feature is then fed into the CNN network for learning. The SoftMax classifier will receive the output of all feature scales, and these high-level features will be fused and fed to it to categorize different scenarios. The experiment provides evidence to support the efficacy of the FPS strategy and its potential to enhance the performance of the ASC system.

  • CNN-Based Feature Integration Network for Speech Enhancement in Microphone Arrays Open Access

    Ji XI  Pengxu JIANG  Yue XIE  Wei JIANG  Hao DING  

     
    LETTER-Speech and Hearing

      Pubricized:
    2024/08/26
      Vol:
    E107-D No:12
      Page(s):
    1546-1549

    The relevant model based on convolutional neural networks (CNNs) has been proven to be an effective solution in speech enhancement algorithms. However, there needs to be more research on CNNs based on microphone arrays, especially in exploring the correlation between networks associated with different microphones. In this paper, we proposed a CNN-based feature integration network for speech enhancement in microphone arrays. The input of CNN is composed of short-time Fourier transform (STFT) from different microphones. CNN includes the encoding layer, decoding layer, and skip structure. In addition, the designed feature integration layer enables information exchange between different microphones, and the designed feature fusion layer integrates additional information. The experiment proved the superiority of the designed structure.

FlyerIEICE has prepared a flyer regarding multilingual services. Please use the one in your native language.