Atsushi NAKAMURA Masaki NAITO Hajime TSUKADA Rainer GRUHN Eiichiro SUMITA Hideki KASHIOKA Hideharu NAKAJIMA Tohru SHIMIZU Yoshinori SAGISAKA
This paper describes the application of a speech translation system to another task/domain in the real world using developmental data collected from real-world interactions. The total cost of this task alteration was calculated to be nine person-months. The newly applied system was also evaluated using speech data collected from real-world interactions. For real-world speech with a machine-friendly speaking style, the newly applied system recognized typical sentences with a word accuracy of 90% or better. We also found that, in terms of overall speech translation performance, the system could translate about 80% of the input Japanese speech into acceptable English sentences.
Naoya AZUMA Shunsuke SHIMAZAKI Noriyuki MIURA Makoto NAGATA Tomomitsu KITAMURA Satoru TAKAHASHI Motoki MURAKAMI Kazuaki HORI Atsushi NAKAMURA Kenta TSUKAMOTO Mizuki IWANAMI Eiji HANKUI Sho MUROGA Yasushi ENDO Satoshi TANAKA Masahiro YAMAGUCHI
Substrate noise coupling in RF receiver front-end circuitry for LTE wireless communication was examined by full-chip-level simulation and on-chip measurements, with a demonstrator built in a 65 nm CMOS technology. A CMOS digital noise emulator injects high-order harmonic noise into the silicon substrate and induces in-band spurious tones in an RF receiver on the same chip through substrate noise interference. A complete simulation flow for full-chip-level substrate noise coupling uses a decoupled modeling approach, in which substrate noise waveforms derived from a unified package-chip model of the noise source circuits are fed into a mixed-level simulation of the RF chains as the noise-sensitive circuits. The distribution of substrate noise across the chip and its attenuation with distance are simulated and compared with measurements. Substrate noise interference at the 17th harmonic of 124.8 MHz, the operating frequency of the CMOS noise emulator, creates spurious tones in the communication band at 2.1 GHz.
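The link between the emulator clock and the affected LTE band follows from simple harmonic arithmetic; the short check below is an illustrative sketch only, not part of the paper's simulation flow, and merely confirms that the 17th harmonic of a 124.8 MHz clock lands near 2.1 GHz.

```python
# Illustrative harmonic arithmetic only; not the paper's simulation flow.
f_clock_mhz = 124.8      # operating frequency of the CMOS noise emulator
harmonic_order = 17      # harmonic cited in the abstract

f_spur_mhz = harmonic_order * f_clock_mhz
print(f"{harmonic_order}th harmonic: {f_spur_mhz:.1f} MHz")  # 2121.6 MHz, i.e. ~2.1 GHz
```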
Tomokazu ODA Atsushi NAKAMURA Daisuke IIDA Hiroyuki OSHIDA
We propose a technique based on Brillouin optical time domain analysis for measuring loss and crosstalk in few-mode fibers (FMFs). The proposed technique extracts the loss and crosstalk of a specific mode in an FMF from the Brillouin gains and Brillouin gain coefficients measured under two different conditions of the frequency difference between the pump and probe lights. The technique yields the maximum loss and crosstalk at a splice point by changing the electric field injected into the FMF as the pump light. Experiments demonstrate that the proposed technique can measure the maximum loss and crosstalk of the LP11 mode at a splice point in a two-mode fiber.
Takanobu OBA Takaaki HORI Atsushi NAKAMURA
A dependency structure represents modification relationships between words or phrases and is recognized as an important element in semantic information analysis. Conventional approaches for extracting this dependency structure assume that the complete sentence is known before the analysis starts. For spontaneous speech data, however, this assumption does not necessarily hold, since sentence boundaries are not marked in the data. Although sentence boundaries can be detected before dependency analysis, such a cascaded implementation is not suitable for online processing since it delays the responses of the application. To solve these problems, we previously proposed a sequential dependency analysis (SDA) method for online spontaneous speech processing, which enabled us to analyze incomplete sentences sequentially and detect sentence boundaries simultaneously. In this paper, we propose an improved SDA that integrates a labeling-based sentence boundary detection (SntBD) technique based on Conditional Random Fields (CRFs). In the new method, we use a CRF for the soft decision of sentence boundaries and combine it with SDA so as to retain its online framework. Since CRF-based SntBD yields better estimates of sentence boundaries, SDA can provide better results in which the dependency structure and sentence boundaries are consistent. Experimental results on spontaneous lecture speech from the Corpus of Spontaneous Japanese show that the improved SDA outperforms the original SDA in SntBD accuracy and provides better dependency analysis results.
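As a rough illustration of the online setting described above, the sketch below uses toy stand-ins (not the paper's CRF-based SntBD or SDA parser) to show how a per-word boundary probability can drive segmentation of an unsegmented word stream so that each completed segment is handed to dependency analysis without waiting for the full transcript; note that the paper keeps the boundary decision soft inside the SDA search, whereas this sketch reduces it to a threshold for brevity.

```python
# Toy sketch of online sentence segmentation feeding a dependency analyzer.
# boundary_prob() and analyze_dependencies() are hypothetical stand-ins,
# not the CRF-based SntBD or the SDA parser described in the paper.

def boundary_prob(word: str) -> float:
    """Stand-in for a CRF posterior P(sentence boundary after this word)."""
    return 0.9 if word.endswith(("desu", "masu", ".")) else 0.05

def analyze_dependencies(segment: list[str]) -> list[tuple[int, int]]:
    """Stand-in parser: links every word to the last word of the segment."""
    head = len(segment) - 1
    return [(i, head) for i in range(head)]

def process_stream(words, threshold=0.5):
    segment, results = [], []
    for w in words:                       # words arrive one by one (online)
        segment.append(w)
        if boundary_prob(w) > threshold:  # hard decision here for brevity
            results.append(analyze_dependencies(segment))
            segment = []
    if segment:                           # flush the trailing partial segment
        results.append(analyze_dependencies(segment))
    return results

print(process_stream("kyou wa ii tenki desu tsugi no wadai ni utsurimasu".split()))
```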
Takanori SUZUKI Hideo ARIMOTO Takeshi KITATANI Aki TAKEI Takafumi TANIGUCHI Kazunori SHINODA Shigehisa TANAKA Shinji TSUJI Tatemi IDO Jun IGRASHI Atsushi NAKAMURA Kazuhiko NAOE Kenji UCHIDA
A dual-core spot-size converter (DC-SSC) is integrated with a lateral-grating-assisted lateral co-directional coupler (LGLC) tunable laser without any additional complicated fabrication processes. The excess loss due to the DC-SSC is only 0.5 dB, and the full widths at half maximum (FWHMs) of the vertical and horizontal far-field patterns (FFPs) produced by the laser are narrow, about 25° and 20°, respectively. This integration causes no degradation of the performance of the LGLC laser; in other words, the laser maintains good lasing characteristics, namely a wide tuning range of over 68 nm and a side-mode suppression ratio (SMSR) of over 35 dB in the C-band under a 50°C semi-cooled condition.
Shinji WATANABE Atsushi NAKAMURA
We introduce a robust classification method for speech recognition based on the Bayesian predictive distribution (Bayesian Predictive Classification, referred to as BPC). We and others have recently proposed a total Bayesian framework named Variational Bayesian Estimation and Clustering for speech recognition (VBEC). VBEC includes the practical computation, based on variational Bayes (VB), of the approximate posterior distributions that are essential for BPC. BPC using VB posterior distributions (VB-BPC) provides an analytical solution for the predictive distribution in the form of a Student's t-distribution, which mitigates over-training effects by marginalizing out the model parameters of the output distribution. We address the sparse data problem in speech recognition, and show experimentally that VB-BPC is robust against data sparseness.
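As a simplified illustration of the marginalization step, the sketch below computes the Student's t predictive density that arises when a univariate Gaussian observation model with a Normal-Gamma posterior over its mean and precision is marginalized; the hyperparameter names and values are generic assumptions, not VBEC's actual posterior quantities, which are defined over HMM output distributions.

```python
# Minimal sketch: Student's t predictive density obtained by marginalizing the
# mean and precision of a univariate Gaussian under a Normal-Gamma posterior
# NG(mu, kappa, a, b). Hyperparameters below are illustrative, not VBEC's.
import math

def student_t_logpdf(x, df, loc, scale):
    z = (x - loc) / scale
    return (math.lgamma((df + 1.0) / 2.0) - math.lgamma(df / 2.0)
            - 0.5 * math.log(df * math.pi) - math.log(scale)
            - (df + 1.0) / 2.0 * math.log1p(z * z / df))

def predictive_logpdf(x, mu, kappa, a, b):
    # Standard conjugate result: the posterior predictive is t with 2a degrees
    # of freedom, location mu, and squared scale b*(kappa+1)/(a*kappa).
    df = 2.0 * a
    scale = math.sqrt(b * (kappa + 1.0) / (a * kappa))
    return student_t_logpdf(x, df, mu, scale)

print(predictive_logpdf(0.3, mu=0.0, kappa=10.0, a=5.0, b=4.0))
```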
Erik MCDERMOTT Atsushi NAKAMURA
Acoustic modeling in speech recognition uses very little knowledge of the speech production process. At many levels, our models continue to treat speech as a surface phenomenon. Typically, hidden Markov model (HMM) parameters operate primarily in the acoustic space or in a linear transformation thereof; state-to-state evolution is modeled only crudely, with no explicit relationship between states, such as would be afforded by the phonetic features commonly used by linguists to describe speech phenomena, or by the continuity and smoothness of the production parameters governing speech. This survey article attempts to provide an overview of proposals by several researchers for improving acoustic modeling in these regards. The topics covered include the controversial Motor Theory of Speech Perception, work by Hogden that explicitly uses a continuity constraint in a pseudo-articulatory domain, the Kalman-filter-based Hidden Dynamic Model, and work by many groups showing the benefits of using articulatory features instead of phones as the underlying units of speech.
Shinji WATANABE Yasuhiro MINAMI Atsushi NAKAMURA Naonori UEDA
Shared-state hidden Markov models (SS-HMMs) have been widely used as acoustic models in speech recognition. In this paper, we propose a method for constructing SS-HMMs within a practical Bayesian framework. Our method derives a Bayesian model selection criterion for the SS-HMM based on the variational Bayesian approach. The appropriate phonetic decision tree structure of the SS-HMM is found by using this Bayesian criterion. Unlike conventional asymptotic criteria, this criterion is applicable even when the amount of training data is insufficient. Experimental results on isolated word recognition demonstrate that the proposed method requires no tuning parameter that must be adjusted according to the amount of training data, and that it is useful for selecting an appropriate SS-HMM structure for practical use.
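The sketch below conveys the flavor of such a Bayesian split criterion in a deliberately simplified setting: each leaf is modeled as a single univariate Gaussian with a conjugate Normal-Gamma prior, for which the log evidence has a closed form, and a split is accepted only if it raises the total log evidence. This is an assumption-laden stand-in for, not a reproduction of, the paper's variational lower bound for SS-HMMs.

```python
# Simplified stand-in for Bayesian split selection: accept a decision-tree
# split only if it increases the total log marginal likelihood (evidence).
# Each leaf is a univariate Gaussian with a Normal-Gamma prior; the paper
# instead uses a variational lower bound for shared-state HMMs.
import math

def log_evidence(data, mu0=0.0, kappa0=1.0, a0=1.0, b0=1.0):
    """Closed-form log marginal likelihood of data under a Gaussian with a
    Normal-Gamma(mu0, kappa0, a0, b0) prior on its mean and precision."""
    n = len(data)
    if n == 0:
        return 0.0
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)
    kappa_n = kappa0 + n
    a_n = a0 + n / 2.0
    b_n = b0 + 0.5 * ss + kappa0 * n * (mean - mu0) ** 2 / (2.0 * kappa_n)
    return (math.lgamma(a_n) - math.lgamma(a0)
            + a0 * math.log(b0) - a_n * math.log(b_n)
            + 0.5 * (math.log(kappa0) - math.log(kappa_n))
            - 0.5 * n * math.log(2.0 * math.pi))

def accept_split(data, left, right):
    """Split the leaf only if doing so raises the total log evidence."""
    return log_evidence(left) + log_evidence(right) > log_evidence(data)

data = [-1.2, -0.9, -1.1, 1.0, 1.3, 0.8]
left, right = [x for x in data if x < 0], [x for x in data if x >= 0]
print(accept_split(data, left, right))   # True: the two clusters are distinct
```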
Atsunori OGAWA Satoshi TAKAHASHI Atsushi NAKAMURA
This paper proposes an efficient combination of state likelihood recycling and batch state likelihood calculation for accelerating acoustic likelihood calculation in an HMM-based speech recognizer. Recycling and batch calculation are based on different technical approaches: the former is a purely algorithmic technique, while the latter fully exploits the computer architecture. To accelerate the recognition process further by combining them efficiently, we introduce conditional fast processing and acoustic backing-off. Conditional fast processing is based on two criteria. The first, a potential activity criterion, is used to control not only the recycling of state likelihoods at the current frame but also the precalculation of state likelihoods for several succeeding frames. The second, a reliability criterion, together with acoustic backing-off, is used to control the choice between recycled and batch-calculated state likelihoods when they contradict each other in the combination, and to prevent word accuracy from degrading. Large-vocabulary spontaneous speech recognition experiments using four different CPU machines under two environmental conditions showed that, compared with the baseline recognizer and with recycling or batch calculation alone, our combined acceleration technique further reduced both the acoustic likelihood calculation time and the total recognition time. We also performed detailed analyses to reveal each technique's acceleration mechanism and environmental dependency by classifying and counting the types of state likelihoods. The analysis results confirmed the effectiveness of the combined acceleration technique.
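The sketch below gives a rough, self-contained illustration of how the two acceleration ideas can coexist: likelihoods of states judged unlikely to matter are recycled from the previous frame, while the remaining states are computed in batches of several consecutive frames. The activity test and the Gaussian scorer are simplified placeholders, not the paper's potential activity and reliability criteria.

```python
# Toy illustration of combining state-likelihood recycling with batch
# calculation. The activity test and Gaussian scorer are simplified
# placeholders, not the paper's potential activity / reliability criteria.
import math

BATCH = 4  # number of succeeding frames whose likelihoods are precalculated

def gaussian_loglike(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def score_states(frames, states, active_threshold=-6.0):
    cache = {}   # (state_id, frame_idx) -> batch-calculated log-likelihood
    prev = {}    # state_id -> log-likelihood at the previous frame
    scores = []
    for t, x in enumerate(frames):
        frame_scores = {}
        for sid, (mean, var) in states.items():
            likely_active = prev.get(sid, 0.0) > active_threshold
            if (sid, t) in cache:                     # already batch-calculated
                ll = cache[(sid, t)]
            elif not likely_active and sid in prev:   # recycle the previous value
                ll = prev[sid]
            else:                                     # batch-calculate ahead
                for dt, xf in enumerate(frames[t:t + BATCH]):
                    cache[(sid, t + dt)] = gaussian_loglike(xf, mean, var)
                ll = cache[(sid, t)]
            frame_scores[sid] = ll
        prev = frame_scores
        scores.append(frame_scores)
    return scores

states = {"a": (0.0, 1.0), "b": (5.0, 2.0)}
frames = [0.1, 0.2, 4.8, 5.1, 0.0]
print(score_states(frames, states))
```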
Takanori UNO Kouji ICHIKAWA Yuichi MABUCHI Atsushi NAKAMURA Yuji OKAZAKI Hideki ASAI
In this paper, we studied the use of a common-mode noise reduction technique for in-vehicle electronic equipment in an actual instrument design. We improved the circuit model of the common-mode noise that flows into the wire harness by adding the effect of a bypass capacitor located near the LSI. We analyzed the improved circuit model with a circuit simulator and verified the effectiveness of the noise reduction condition derived from it. It was also confirmed that offsetting the impedance mismatch in the PCB section requires a larger circuit constant than offsetting the impedance mismatch in the LSI section. An evaluation circuit board comprising an automotive microcomputer was prototyped, and experiments confirmed its common-mode noise reduction effect. The experimental results also revealed that the degree of impedance mismatch in the LSI section can be estimated by using a PCB with a known impedance. We further investigated the optimization of the impedance parameters, which is currently difficult for actual products. To satisfy the noise reduction condition, which involves numerous parameters, we proposed a design method that uses an optimization algorithm and an electromagnetic field simulator, and confirmed its effectiveness.
Takanobu OBA Takaaki HORI Atsushi NAKAMURA Akinori ITO
This paper describes a technique for overcoming the model shrinkage problem in automatic speech recognition (ASR), allowing application developers and users to control the model size with little degradation of accuracy. Models for ASR systems have recently tended to become large, and this can be a bottleneck for developers and users without special knowledge of ASR when introducing the ASR function. In particular, discriminative language models (DLMs), which have gained increasing attention as an approach to improving recognition accuracy, are usually designed in a high-dimensional parameter space. Our proposed method can be applied to linear models, including DLMs, in which the score of an input sample is given by the inner product of its features and the model parameters; it shrinks a model with a simple computation based on simple statistics, namely the square sums of the feature values appearing in a data set. Our experimental results show that the proposed method can shrink a DLM with little degradation in accuracy, and that it performs properly whether or not the data used for obtaining the statistics are the same as the data used for training the model.
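The sketch below illustrates one plausible reading of this idea: for a linear model, each parameter is ranked by its squared weight multiplied by the square sum of its feature values over a data set (a proxy for its contribution to the score), and the lowest-ranked parameters are pruned until a target size is reached. The ranking statistic and pruning schedule are assumptions made for illustration, not the paper's exact procedure.

```python
# Illustrative model shrinkage for a linear (e.g. discriminative language)
# model: rank each parameter by weight_i^2 * sum_over_data(feature_i^2) and
# drop the lowest-ranked ones. The ranking statistic is an assumption made
# for this sketch, not the paper's exact criterion.
from collections import defaultdict

def feature_square_sums(dataset):
    """dataset: iterable of {feature_name: value} dicts."""
    sq = defaultdict(float)
    for feats in dataset:
        for name, value in feats.items():
            sq[name] += value * value
    return sq

def shrink(weights, square_sums, keep):
    """Keep only the `keep` parameters with the largest w^2 * square_sum."""
    ranked = sorted(weights,
                    key=lambda n: weights[n] ** 2 * square_sums.get(n, 0.0),
                    reverse=True)
    return {n: weights[n] for n in ranked[:keep]}

def score(weights, feats):
    """Linear model: inner product of features and parameters."""
    return sum(weights.get(n, 0.0) * v for n, v in feats.items())

weights = {"w_good": 1.2, "w_rare": 0.9, "w_tiny": 0.01, "w_common": -0.4}
data = [{"w_good": 1.0, "w_common": 2.0}, {"w_good": 1.0, "w_tiny": 1.0}]
small = shrink(weights, feature_square_sums(data), keep=2)
print(small, score(small, data[0]))
```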