Author Search Result

[Author] Yuya HOSODA (2 hits)

  • An Efficient Image to Sound Mapping Method Preserving Speech Spectral Envelope

    Yuya HOSODA  Arata KAWAMURA  Youji IIGUNI  

     
    LETTER-Digital Signal Processing

    Vol: E103-A No:3
    Page(s): 629-630

    In this paper, we propose an image-to-sound mapping method. The technique treats an image as a spectrogram and maps it to a sound by taking the inverse FFT of that spectrogram. Amplitude spectra of a speech signal are embedded in the spectrogram to make the mapped sound intelligible as speech. Specifically, we retain the amplitude spectra of the speech signal where its power is strong and embed the image brightness in the other frequency bands. Retaining the strong-power amplitude spectra preserves the speech spectral envelope and improves the speech quality of the mapped sound. The weak-power amplitude spectra of the mapped sound represent the image brightness, so the image is successfully reconstructed from the mapped sound. Simulation results show that the proposed method achieves sufficient speech quality.
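
    The mapping described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the zero-phase spectrum, the fixed `threshold` for "strong power", the `gain` that keeps the brightness bins weak, and all function names are assumptions made for illustration only.

```python
import cmath
import math


def mix_spectra(speech_amp, brightness, threshold, gain):
    """Keep speech bins stronger than `threshold`; elsewhere embed scaled brightness.

    `threshold` and `gain` are hypothetical parameters, not values from the paper.
    """
    return [s if s > threshold else gain * b
            for s, b in zip(speech_amp, brightness)]


def idft_real(half_spectrum):
    """Inverse DFT of a one-sided amplitude spectrum, assuming zero phase.

    `half_spectrum` holds bins 0..N/2. The remaining bins are filled by
    mirroring, so the resulting time-domain frame is real-valued.
    """
    half = list(half_spectrum)
    full = half + half[-2:0:-1]  # mirror bins N/2-1 down to 1
    n = len(full)
    return [
        sum(full[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def dft_magnitude(frame):
    """One-sided amplitude spectrum of a frame (bins 0..N/2)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def image_to_sound(image_columns, speech_frames, threshold, gain):
    """Map each image column (one spectrogram frame) to a time-domain frame."""
    return [
        idft_real(mix_spectra(speech, column, threshold, gain))
        for column, speech in zip(image_columns, speech_frames)
    ]
```

    Taking `dft_magnitude` of each output frame recovers the mixed spectrum, so the bins that were not occupied by speech still carry the (scaled) image brightness. A plain DFT is used here instead of an FFT only to keep the sketch dependency-free.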

  • Artificial Bandwidth Extension for Lower Bandwidth Using Sinusoidal Synthesis based on First Formant Location

    Yuya HOSODA  Arata KAWAMURA  Youji IIGUNI  

     
    PAPER-Engineering Acoustics

    Publicized: 2021/10/12
    Vol: E105-A No:4
    Page(s): 664-672

    The narrow bandwidth of 300-3400 Hz on the public switched telephone network degrades speech quality. In this paper, we propose an artificial bandwidth extension approach that reconstructs the missing lower band of 50-300 Hz using sinusoidal synthesis based on the first formant location. Sinusoidal synthesis generates sinusoidal waves with a harmonic structure. The proposed method detects the fundamental frequency using an autocorrelation method based on the YIN algorithm, where threshold processing avoids false fundamental frequency detection on unvoiced sounds. The amplitude of the sinusoidal waves is calculated in the time domain from the weighted energy of the 300-600 Hz band. Since the first formant location corresponds to the first peak of the spectral envelope, we reconstruct the harmonic structure while avoiding both attenuation and overemphasis by increasing the weight when the first formant location is lower, and vice versa. The subjective and objective evaluations show that the proposed method reduces the speech quality difference between the original speech signal and the bandwidth-extended speech signal.
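
    The sinusoidal synthesis step can be sketched as follows. This is a simplified illustration, not the paper's method: it assumes the fundamental frequency `f0` and a single `amplitude` are already known, gives every harmonic the same amplitude, and treats the 50-300 Hz band as half-open. The function names and the 16 kHz default sampling rate are hypothetical.

```python
import math


def harmonics_in_band(f0, low=50.0, high=300.0):
    """Return the harmonic frequencies k*f0 that satisfy low <= k*f0 < high."""
    if f0 <= 0:
        return []  # treated as an unvoiced frame: nothing to synthesize
    freqs = []
    k = 1
    while k * f0 < high:
        if k * f0 >= low:
            freqs.append(k * f0)
        k += 1
    return freqs


def synthesize_low_band(f0, amplitude, n_samples, fs=16000):
    """Sum equal-amplitude sinusoids at the harmonics of f0 inside the low band."""
    freqs = harmonics_in_band(f0)
    return [
        amplitude * sum(math.sin(2 * math.pi * f * n / fs) for f in freqs)
        for n in range(n_samples)
    ]
```

    For a voiced frame with `f0 = 100.0`, only the 100 Hz and 200 Hz harmonics fall inside the band, and an `f0` of zero yields silence. In the method described in the abstract, the amplitude would instead come from the weighted 300-600 Hz energy, with the weight depending on the first formant location.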
