Michael HENTSCHEL Marc DELCROIX Atsunori OGAWA Tomoharu IWATA Tomohiro NAKATANI
Language models are a key technology in various tasks, such as speech recognition and machine translation. They are usually applied to texts covering various domains, and as a result domain adaptation has been a long-standing challenge in language model research. With the rising popularity of neural network based language models, many adaptation methods have been proposed in recent years. These methods can be separated into two categories: model based and feature based adaptation methods. Compared to model based domain adaptation, feature based domain adaptation has the advantage that it does not require domain labels in the corpus. Most existing feature based adaptation methods are based on bias adaptation. We propose a novel feature based domain adaptation technique using hidden layer factorisation. This method is fundamentally different from existing methods because we use the domain features to calculate a linear combination of linear layers. These linear layers can capture both domain specific information and information common to different domains. In the experiments, we compare our proposed method with existing adaptation methods. The compared adaptation techniques are based on two different ideas, namely bias based adaptation and gating of hidden units. All language models in our comparison use state-of-the-art long short-term memory based recurrent neural networks. We demonstrate the effectiveness of the proposed method with perplexity results for the well-known Penn Treebank and speech recognition results for a corpus of TED talks.
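The idea of using domain features to form a linear combination of linear layers can be illustrated with a minimal sketch. It is not the authors' implementation. The names `factorised_layer`, `bases` and `V` are hypothetical, and the softmax normalisation of the mixing weights is an assumption made here for illustration; the abstract only states that the combination is computed from the domain features.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(z):
    """Numerically stable softmax; the weights sum to 1."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def factorised_layer(x, domain_feat, bases, V):
    """Combine K linear layers with weights derived from domain features.

    x           : input vector
    domain_feat : domain feature vector (e.g. topic posteriors)
    bases       : list of K weight matrices, one per linear layer
    V           : K x D matrix mapping domain features to mixing scores
                  (hypothetical parameterisation, assumed for this sketch)
    """
    alpha = softmax(matvec(V, domain_feat))   # assumption: softmax mixing
    outs = [matvec(W, x) for W in bases]      # output of each linear layer
    h = [sum(a * o[i] for a, o in zip(alpha, outs))
         for i in range(len(outs[0]))]
    return h, alpha

# Two toy layers: identity and coordinate swap.
bases = [[[1.0, 0.0], [0.0, 1.0]],
         [[0.0, 1.0], [1.0, 0.0]]]
V = [[5.0, 0.0],
     [0.0, 5.0]]
x = [1.0, 2.0]

h_a, alpha_a = factorised_layer(x, [1.0, 0.0], bases, V)  # favours layer 0
h_b, alpha_b = factorised_layer(x, [0.0, 1.0], bases, V)  # favours layer 1
h_n, alpha_n = factorised_layer(x, [0.0, 0.0], bases, V)  # equal mixture
```

With an uninformative domain feature, both layers contribute equally, which loosely corresponds to information shared across domains. A strongly peaked feature selects mostly one layer.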
Marc DELCROIX Takafumi HIKICHI Masato MIYOSHI
It is well known that speech captured in a room by distant microphones suffers from distortions caused by reverberation. These distortions may seriously damage both speech characteristics and intelligibility, and consequently be harmful to many speech applications. To solve this problem, we previously proposed a dereverberation algorithm based on multi-channel linear prediction. The method is as follows. First, we calculate prediction filters that cancel out the room reverberation but also degrade speech characteristics by causing excessive whitening of the speech. Then, we evaluate the degradation caused by the prediction filters in order to compensate for the excessive whitening. As the reverberation lengthens, the compensation performance becomes worse due to computational accuracy problems. In this paper, we propose a new computation that may improve compensation accuracy when dealing with long reverberation.
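The whitening side effect of linear prediction can be seen in a toy single-channel sketch. This is not the multi-channel algorithm described above. A synthetic first-order autoregressive signal stands in for speech, and the helper names `lag1_autocorr` and `one_tap_predictor` are hypothetical. The point is only that a least-squares prediction filter removes the signal's own correlation, leaving a nearly white residual.

```python
import random

def lag1_autocorr(x):
    """Normalised lag-1 autocorrelation of a sequence."""
    num = sum(x[n] * x[n - 1] for n in range(1, len(x)))
    den = sum(v * v for v in x)
    return num / den

def one_tap_predictor(x):
    """Least-squares c minimising sum over n of (x[n] - c * x[n-1])**2."""
    num = sum(x[n] * x[n - 1] for n in range(1, len(x)))
    den = sum(x[n - 1] ** 2 for n in range(1, len(x)))
    return num / den

# Synthetic "coloured" source: s[n] = a * s[n-1] + white noise.
rng = random.Random(0)   # fixed seed for repeatability
a = 0.9                  # assumed correlation of the toy source
signal = [0.0]
for _ in range(5000):
    signal.append(a * signal[-1] + rng.gauss(0.0, 1.0))

c = one_tap_predictor(signal)
# Prediction residual: what remains after subtracting the predicted part.
residual = [signal[n] - c * signal[n - 1] for n in range(1, len(signal))]

rho_signal = lag1_autocorr(signal)      # strongly correlated
rho_residual = lag1_autocorr(residual)  # close to zero, i.e. whitened
```

In this toy setting the prediction filter cannot distinguish correlation introduced by a room from correlation inherent in the source, which is why a compensation step is needed.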
Masato MIYOSHI Marc DELCROIX Keisuke KINOSHITA
Speech dereverberation is one of the most difficult tasks in acoustic signal processing. Of the various problems involved in this task, this paper highlights "over-whitening," which flattens the spectral characteristics of recovered speech. This distortion sometimes occurs when inverse filters are calculated directly from microphone signals. This paper reviews two studies related to this problem. The first study shows the possibility of compensating for such over-whitening to achieve precise speech dereverberation. The second study presents a new approach for approximating the original speech by removing the effect of late reflections from observed reverberant speech.