Masatoshi SATO Hisashi AOMORI Mamoru TANAKA
With the spread of Internet-based network communication, sending data quickly and with little loss has become an important transportation problem. A generalized maximum flow algorithm gives the best solution to the problem of choosing which routes to use for exchanging data, so the importance of maximum flow algorithms continues to grow. In this paper, we propose a Maximum-Flow Neural Network (MF-NN) in which each branch nonlinearity has a saturation characteristic and which solves the maximum flow problem by high-speed analog parallel processing. That is, the proposed neural network for the maximum flow problem can be realized as a nonlinear resistive circuit in which each connection weight between nodal neurons is a sigmoidal or piecewise-linear function. Parallel hardware for the MF-NN is expected to be easy to implement.
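For readers unfamiliar with the maximum flow problem that the MF-NN targets, the following is a minimal reference sketch of a conventional sequential solver (the Edmonds-Karp augmenting-path method). It is not the analog circuit proposed in the abstract; the function name `max_flow`, the dict-of-dicts graph format, and the example capacities are assumptions made only for illustration. Each edge capacity plays the role of the saturation level of a branch: flow through an edge can never exceed it.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Return the maximum flow value from source to sink (Edmonds-Karp).

    capacity is a hypothetical dict-of-dicts: capacity[u][v] is the
    saturation level (upper bound on flow) of the edge u -> v.
    """
    if source == sink:
        return 0

    # Build residual capacities, adding zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, neighbors in capacity.items():
        for v, c in neighbors.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left: flow is maximal

        # Find the bottleneck (smallest residual capacity) on the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u

        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Small illustrative network (made-up capacities).
example = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
print(max_flow(example, "s", "t"))
```

A sequential solver like this visits one augmenting path at a time, which is the cost that an analog parallel realization such as the one described above aims to avoid.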
Satoshi SATO Kazutoyo TAKATA Kunio NOBORI
We present a method for classifying image pixels of real images into multiple photometric factors: specular reflection, diffuse reflection, attached shadows and cast shadows. Conventional photometric linearization methods cannot correctly classify pixels under near point light sources, since they assume parallel light. To satisfy this assumption, our method utilizes a photometric linearization method that divides images into small regions. It also propagates linearization coefficients from neighboring regions. Our experimental results show that the proposed method can correctly classify image pixels into photometric factors, even if images are obtained under near point light sources.
Kazunori KOMATANI Naoki HOTTA Satoshi SATO Mikio NAKANO
In spoken dialogue systems, appropriate turn-taking is as important as generating correct responses. Especially when the dialogue involves quick responses, voice activity detection (VAD) often segments a user utterance incorrectly because of short pauses within it. Incorrectly segmented utterances cause problems in both the automatic speech recognition (ASR) results and turn-taking: an incorrect VAD result leads to ASR errors and causes the system to start responding while the user is still speaking. We develop a method that performs a posteriori restoration of incorrectly segmented utterances and implement it as a plug-in for the MMDAgent open-source software. A crucial part of the method is deciding whether restoration is required. We cast this as a binary classification problem: detecting pairs of utterance fragments that were originally a single utterance. The classifier uses various features representing timing, prosody, and ASR result information. Experiments show that the proposed method outperformed a baseline with manually selected features by 4.8% and 3.9% in cross-domain evaluations with two domains. More detailed analysis revealed that the dominant, domain-independent features were utterance intervals and results from the Gaussian mixture model (GMM).