This paper describes a new method for semi-supervised discriminative language modeling, designed to improve the robustness of a discriminative language model (LM) estimated from manually transcribed (labeled) data. The discriminative LM is implemented as a log-linear model that employs a set of linguistic features derived from word or phoneme sequences. The proposed semi-supervised discriminative modeling is formulated as a multi-objective optimization problem (MOP) consisting of two objective functions, one defined on labeled lattices and the other on automatic speech recognition (ASR) lattices used as unlabeled data. Both objectives are designed coherently around expected risks that reflect word-error information for the training data. The model is trained in a discriminative manner and obtained as a solution to the MOP. In transcribing Japanese broadcast programs, the proposed method achieved a relative word error rate reduction of 6.3% compared with a conventional trigram LM.
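As a reader aid, the following is a minimal sketch of the kind of formulation the abstract describes; the notation (parameters Λ, feature vector φ, word-error-based risk R, scalarization weight γ, labeled lattices 𝓛, unlabeled ASR lattices 𝓤) is assumed for illustration and is not taken from the paper. The log-linear LM scores a hypothesis w for an utterance x, and an expected risk can be defined over any set of lattices:

\[
P_{\Lambda}(w \mid x) = \frac{\exp\!\bigl(\Lambda^{\top}\phi(x,w)\bigr)}{\sum_{w'} \exp\!\bigl(\Lambda^{\top}\phi(x,w')\bigr)},
\qquad
\mathcal{E}(\Lambda;\mathcal{D}) = \sum_{(x,r)\in\mathcal{D}} \sum_{w} P_{\Lambda}(w \mid x)\, R(w,r)
\]

One common way to handle a two-objective problem of this kind is weighted scalarization, shown here purely as an illustration of combining the labeled and unlabeled objectives, not as the authors' actual procedure:

\[
\min_{\Lambda}\; \gamma\,\mathcal{E}(\Lambda;\mathcal{L}) + (1-\gamma)\,\mathcal{E}(\Lambda;\mathcal{U}),
\qquad 0 \le \gamma \le 1
\]

For unlabeled ASR lattices no manual reference r exists, so in practice the risk would have to be computed against an estimated reference (for example, the recognizer's own best hypothesis); this, too, is one plausible reading of the abstract rather than a statement of the paper's method.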
Akio KOBAYASHI, Takahiro OKU, Toru IMAI, Seiichi NAKAGAWA, "Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription" in IEICE TRANSACTIONS on Information,
vol. E95-D, no. 11, pp. 2674-2681, November 2012, doi: 10.1587/transinf.E95.D.2674.
URL: https://globals.ieice.org/en_transactions/information/10.1587/transinf.E95.D.2674/_p
@ARTICLE{e95-d_11_2674,
author={Akio KOBAYASHI and Takahiro OKU and Toru IMAI and Seiichi NAKAGAWA},
journal={IEICE TRANSACTIONS on Information},
title={Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription},
year={2012},
volume={E95-D},
number={11},
pages={2674-2681},
keywords={},
doi={10.1587/transinf.E95.D.2674},
ISSN={1745-1361},
month={November},
}
TY - JOUR
TI - Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription
T2 - IEICE TRANSACTIONS on Information
SP - 2674
EP - 2681
AU - Akio KOBAYASHI
AU - Takahiro OKU
AU - Toru IMAI
AU - Seiichi NAKAGAWA
PY - 2012
DO - 10.1587/transinf.E95.D.2674
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E95-D
IS - 11
JA - IEICE TRANSACTIONS on Information
Y1 - November 2012
ER -