An Efficient Image to Sound Mapping Method Using Speech Spectral Phase and Multi-Column Image

Arata KAWAMURA; Hiro IGARASHI; Youji IIGUNI

doi:10.1587/transfun.E100.A.893

Summary:

Image-to-sound mapping is a technique that transforms an image to a sound signal, which is subsequently treated as a sound spectrogram. In general, the transformed sound differs from a human speech signal. Herein an efficient image-to-sound mapping method, which provides an understandable speech signal without any training, is proposed. To synthesize such a speech signal, the proposed method utilizes a multi-column image and a speech spectral phase that is obtained from a long-time observation of the speech. The original image can be retrieved from the sound spectrogram of the synthesized speech signal. The synthesized speech and the reconstructed image qualities are evaluated using objective tests.
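The core idea in the summary, treating an image as the magnitude of a spectrogram, attaching a phase taken from speech, and later reading the image back from the spectrogram of the sound, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes non-overlapping rectangular frames, omits the multi-column image arrangement and the long-time Fourier transform, and uses a random signal as a stand-in for recorded speech. The function names `image_to_sound` and `sound_to_image` and the parameter `phase_source` are hypothetical.

```python
import numpy as np


def image_to_sound(image, phase_source):
    """Map a non-negative image to a 1-D signal (illustrative sketch only).

    Each image column is used as the magnitude spectrum of one frame;
    the phase of each frame is borrowed from `phase_source`, a stand-in
    for a speech recording.
    """
    n_bins, n_frames = image.shape
    frame_len = 2 * (n_bins - 1)  # real FFT of this length yields n_bins bins
    needed = n_frames * frame_len
    if phase_source.size < needed:
        raise ValueError("phase_source is too short for this image")

    # Cut the reference signal into non-overlapping frames and keep the phase.
    ref_frames = phase_source[:needed].reshape(n_frames, frame_len)
    phase = np.angle(np.fft.rfft(ref_frames, axis=1))

    # Combine image magnitudes with the borrowed phase and invert per frame.
    spectrum = image.T * np.exp(1j * phase)
    frames = np.fft.irfft(spectrum, n=frame_len, axis=1)
    return frames.reshape(-1)


def sound_to_image(signal, n_bins):
    """Recover the image as the magnitude spectrogram of the signal."""
    frame_len = 2 * (n_bins - 1)
    frames = signal.reshape(-1, frame_len)
    return np.abs(np.fft.rfft(frames, axis=1)).T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((9, 5))              # 9 frequency bins, 5 frames
    reference = rng.standard_normal(5 * 16) # placeholder for a speech signal
    sound = image_to_sound(image, reference)
    restored = sound_to_image(sound, n_bins=9)
    print(np.allclose(restored, image))
```

Because the frames do not overlap and the phase comes from a real-valued signal, the magnitude spectrogram of the output reproduces the image up to floating-point error in this simplified setting. With overlapping windows, as in a typical short-time analysis, the round trip would generally be only approximate.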

Publication: IEICE TRANSACTIONS on Fundamentals Vol.E100-A No.3 pp.893-895

Publication Date: 2017/03/01

Online ISSN: 1745-1337

DOI: 10.1587/transfun.E100.A.893

Type of Manuscript: LETTER

Category: Digital Signal Processing

Authors

Arata KAWAMURA
  Osaka University
Hiro IGARASHI
  Osaka University
Youji IIGUNI
  Osaka University

Keywords

spectrogram, long-time Fourier transform, image-to-sound mapping, spectral phase

Cite this

Arata KAWAMURA, Hiro IGARASHI, Youji IIGUNI, "An Efficient Image to Sound Mapping Method Using Speech Spectral Phase and Multi-Column Image" in IEICE TRANSACTIONS on Fundamentals, vol. E100-A, no. 3, pp. 893-895, March 2017, doi: 10.1587/transfun.E100.A.893.
URL: https://globals.ieice.org/en_transactions/fundamentals/10.1587/transfun.E100.A.893/_p

@ARTICLE{e100-a_3_893,
author={Arata KAWAMURA and Hiro IGARASHI and Youji IIGUNI},
journal={IEICE TRANSACTIONS on Fundamentals},
title={An Efficient Image to Sound Mapping Method Using Speech Spectral Phase and Multi-Column Image},
year={2017},
volume={E100-A},
number={3},
pages={893--895},
abstract={Image-to-sound mapping is a technique that transforms an image to a sound signal, which is subsequently treated as a sound spectrogram. In general, the transformed sound differs from a human speech signal. Herein an efficient image-to-sound mapping method, which provides an understandable speech signal without any training, is proposed. To synthesize such a speech signal, the proposed method utilizes a multi-column image and a speech spectral phase that is obtained from a long-time observation of the speech. The original image can be retrieved from the sound spectrogram of the synthesized speech signal. The synthesized speech and the reconstructed image qualities are evaluated using objective tests.},
keywords={spectrogram, long-time Fourier transform, image-to-sound mapping, spectral phase},
doi={10.1587/transfun.E100.A.893},
ISSN={1745-1337},
month={March}
}

TY  - JOUR
TI  - An Efficient Image to Sound Mapping Method Using Speech Spectral Phase and Multi-Column Image
T2  - IEICE TRANSACTIONS on Fundamentals
SP  - 893
EP  - 895
AU  - Arata KAWAMURA
AU  - Hiro IGARASHI
AU  - Youji IIGUNI
PY  - 2017
DO  - 10.1587/transfun.E100.A.893
JO  - IEICE TRANSACTIONS on Fundamentals
SN  - 1745-1337
VL  - E100-A
IS  - 3
JA  - IEICE TRANSACTIONS on Fundamentals
Y1  - 2017/03/01/
AB  - Image-to-sound mapping is a technique that transforms an image to a sound signal, which is subsequently treated as a sound spectrogram. In general, the transformed sound differs from a human speech signal. Herein an efficient image-to-sound mapping method, which provides an understandable speech signal without any training, is proposed. To synthesize such a speech signal, the proposed method utilizes a multi-column image and a speech spectral phase that is obtained from a long-time observation of the speech. The original image can be retrieved from the sound spectrogram of the synthesized speech signal. The synthesized speech and the reconstructed image qualities are evaluated using objective tests.
ER  -