This paper presents a way of using a linear regression model to produce a single-valued criterion that indicates the perceived importance of each block in a stream of speech blocks. This method is superior to the conventional approach, voice activity detection (VAD), in that it provides a dynamically changing priority value for speech segments with finer granularity. The approach can be used in conjunction with scalable speech coding techniques in the context of IP QoS services to achieve a flexible form of quality control for speech transmission. A simple linear regression model is used to estimate a mean opinion score (MOS) of the various cases of missing speech segments. The estimated MOS is a continuous value that can be mapped to priority levels with arbitrary granularity. Through subjective evaluation, we show the validity of the calculated priority values.
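The pipeline the abstract describes (fit a linear regression, estimate a MOS for a missing block, then quantize that estimate into priority levels) can be sketched roughly as follows. This is a minimal illustration only, not the authors' model: the single predictor `block_feature`, the toy listener scores, the 1 to 5 MOS range, and the rule that a lower estimated MOS for a lost block means a higher priority are all assumptions introduced here.

```python
from statistics import mean


def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def estimate_mos(a, b, x):
    """Estimated MOS when a block with feature value x is lost."""
    return a + b * x


def mos_to_priority(est_mos, n_levels, mos_min=1.0, mos_max=5.0):
    """Quantize a continuous MOS estimate into n_levels priority classes.

    Assumption: the lower the MOS when the block is missing, the more
    important the block, so it gets a higher priority index.
    """
    clipped = min(max(est_mos, mos_min), mos_max)
    frac = (mos_max - clipped) / (mos_max - mos_min)
    return min(int(frac * n_levels), n_levels - 1)


# Hypothetical training data: one feature per block and the MOS that
# listeners gave when that block was dropped (values invented for the sketch).
block_feature = [0.0, 0.25, 0.5, 0.75, 1.0]
listener_mos = [4.5, 4.0, 3.5, 3.0, 2.5]

a, b = fit_simple_linear(block_feature, listener_mos)
priorities = [mos_to_priority(estimate_mos(a, b, x), 4) for x in block_feature]
print(a, b, priorities)
```

Because the estimate is continuous, changing `n_levels` alters only the quantization step, which is one way to read the abstract's "arbitrary granularity".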
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yusuke HIWASAKI, Toru MORINAGA, Jotaro IKEDO, Akitoshi KATAOKA, "Measuring the Perceived Importance of Speech Segments for Transmission over IP Networks" in IEICE TRANSACTIONS on Communications,
vol. E89-B, no. 2, pp. 326-333, February 2006, doi: 10.1093/ietcom/e89-b.2.326.
URL: https://globals.ieice.org/en_transactions/communications/10.1093/ietcom/e89-b.2.326/_p
@ARTICLE{e89-b_2_326,
author={Yusuke HIWASAKI and Toru MORINAGA and Jotaro IKEDO and Akitoshi KATAOKA},
journal={IEICE TRANSACTIONS on Communications},
title={Measuring the Perceived Importance of Speech Segments for Transmission over IP Networks},
year={2006},
volume={E89-B},
number={2},
pages={326-333},
abstract={This paper presents a way of using a linear regression model to produce a single-valued criterion that indicates the perceived importance of each block in a stream of speech blocks. This method is superior to the conventional approach, voice activity detection (VAD), in that it provides a dynamically changing priority value for speech segments with finer granularity. The approach can be used in conjunction with scalable speech coding techniques in the context of IP QoS services to achieve a flexible form of quality control for speech transmission. A simple linear regression model is used to estimate a mean opinion score (MOS) of the various cases of missing speech segments. The estimated MOS is a continuous value that can be mapped to priority levels with arbitrary granularity. Through subjective evaluation, we show the validity of the calculated priority values.},
doi={10.1093/ietcom/e89-b.2.326},
ISSN={1745-1345},
month={February},}
TY  - JOUR
TI  - Measuring the Perceived Importance of Speech Segments for Transmission over IP Networks
T2  - IEICE TRANSACTIONS on Communications
SP  - 326
EP  - 333
AU  - Yusuke HIWASAKI
AU  - Toru MORINAGA
AU  - Jotaro IKEDO
AU  - Akitoshi KATAOKA
PY  - 2006
DO  - 10.1093/ietcom/e89-b.2.326
JO  - IEICE TRANSACTIONS on Communications
SN  - 1745-1345
VL  - E89-B
IS  - 2
JA  - IEICE TRANSACTIONS on Communications
Y1  - February 2006
AB  - This paper presents a way of using a linear regression model to produce a single-valued criterion that indicates the perceived importance of each block in a stream of speech blocks. This method is superior to the conventional approach, voice activity detection (VAD), in that it provides a dynamically changing priority value for speech segments with finer granularity. The approach can be used in conjunction with scalable speech coding techniques in the context of IP QoS services to achieve a flexible form of quality control for speech transmission. A simple linear regression model is used to estimate a mean opinion score (MOS) of the various cases of missing speech segments. The estimated MOS is a continuous value that can be mapped to priority levels with arbitrary granularity. Through subjective evaluation, we show the validity of the calculated priority values.
ER  - 