Shota FUJII Shohei KAKEI Masanori HIROTOMO Makoto TAKITA Yoshiaki SHIRAISHI Masami MOHRI Hiroki KUZUNO Masakatu MORII
Haoran LUO Tengfei SHAO Tomoji KISHI Shenglei LI
Chee Siang LEOW Tomoki KITAGAWA Hideaki YAJIMA Hiromitsu NISHIZAKI
Dengtian YANG Lan CHEN Xiaoran HAO
Rong HUANG Yue XIE
Toshiki ONISHI Asahi OGUSHI Ryo ISHII Akihiro MIYATA
Meihua XUE Kazuki SUGITA Koichi OTA Wen GU Shinobu HASEGAWA
Jinyong SUN Zhiwei DONG Zhigang SUN Guoyong CAI Xiang ZHAO
Yusuke HIROTA Yuta NAKASHIMA Noa GARCIA
Yusuke HIROTA Yuta NAKASHIMA Noa GARCIA
Kosetsu TSUKUDA Tomoyasu NAKANO Masahiro HAMASAKI Masataka GOTO
ZhengYu LU PengFei XU
Binggang ZHUO Ryota HONDA Masaki MURATA
Qingqing YU Rong JIN
Huawei TAO Ziyi HU Sixian LI Chunhua ZHU Peng LI Yue XIE
Qianhang DU Zhipeng LIU Yaotong SONG Ningning WANG Zeyuan JU Shangce GAO
Ryota TOMODA Hisashi KOGA
Reina SASAKI Atsuko TAKEFUSA Hidemoto NAKADA Masato OGUCHI
So KOIDE Yoshiaki TAKATA Hiroyuki SEKI
Huang Rong Qian Zewen Ma Hao Han Zhezhe Xie Yue
Huu-Long PHAM Ryota MIBAYASHI Takehiro YAMAMOTO Makoto P. KATO Yusuke YAMAMOTO Yoshiyuki SHOJI Hiroaki OHSHIMA
Taku WAKUI Fumio TERAOKA Takao KONDO
Shaobao Wu Zhihua Wu Meixuan Huang
Koji KAMMA Toshikazu WADA
Dingjie PENG Wataru KAMEYAMA
Zhizhong WANG Wen GU Zhaoxing LI Koichi OTA Shinobu HASEGAWA
Tomoaki YAMAZAKI Seiya ITO Kouzou OHARA
Daihei ISE Satoshi KOBAYASHI
Masanari ICHIKAWA Yugo TAKEUCHI
Shota SUZUKI Satoshi ONO
Reoma MATSUO Toru KOIZUMI Hidetsugu IRIE Shuichi SAKAI Ryota SHIOYA
Hirotaka HACHIYA Fumiya NISHIZAWA
Issa SUGIURA Shingo OKAMURA Naoto YANAI
Mudai KOBAYASHI Mohammad Mikal Bin Amrul Halim Gan Takahisa SEKI Takahiro HIROFUCHI Ryousei TAKANO Mitsuhiro KISHIMOTO
Chi ZHANG Luwei ZHANG Toshihiko YAMASAKI
Jung Min Lim Wonho Lee Jun-Hyeong Choi Jong Wook Kwak
Zhuo ZHANG Donghui LI Kun JIANG Ya LI Junhu WANG Xiankai MENG
Takayoshi SHIKANO Shuichi ICHIKAWA
Shotaro ISHIKURA Ryosuke MINAMI Miki YAMAMOTO
Pengfei ZHANG Jinke WANG Yuanzhi CHENG Shinichi TAMURA
Fengqi GUO Qicheng LIU
Runlong HAO Hui LUO Yang LI
Rongchun XIAO Yuansheng LIU Jun ZHANG Yanliang HUANG Xi HAN
Yong JIN Kazuya IGUCHI Nariyoshi YAMAI Rei NAKAGAWA Toshio MURAKAMI
Toru HASEGAWA Yuki KOIZUMI Junji TAKEMASA Jun KURIHARA Toshiaki TANAKA Timothy WOOD K. K. RAMAKRISHNAN
Rikima MITSUHASHI Yong JIN Katsuyoshi IIDA Yoshiaki TAKAI
Zezhong LI Jianjun MA Fuji REN
Lorenzo Mamelona TingHuai Ma Jia Li Bright Bediako-Kyeremeh Benjamin Kwapong Osibo
Wonho LEE Jong Wook KWAK
Xiaoxiao ZHOU Yukinori SATO
Kento WATANABE Masataka GOTO
Kazuyo ONISHI Hiroki TANAKA Satoshi NAKAMURA
Takashi YOKOTA Kanemitsu OOTSU
Chenbo SHI Wenxin SUN Jie ZHANG Junsheng ZHANG Chun ZHANG Changsheng ZHU
Masateru TSUNODA Ryoto SHIMA Amjed TAHIR Kwabena Ebo BENNIN Akito MONDEN Koji TODA Keitaro NAKASAI
Masateru TSUNODA Takuto KUDO Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI Kenichi MATSUMOTO
Hiroaki AKUTSU Ko ARAI
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA
Takashi YOKOTA Kanemitsu OOTSU
Xiaokang Jin Benben Huang Hao Sheng Yao Wu
Tomoki MIYAMOTO
Ken WATANABE Katsuhide FUJITA
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE
Chi ZHANG Li TAO Toshihiko YAMASAKI
Ann Jelyn TIEMPO Yong-Jin JEONG
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Akira ITO Yoshiaki TAKAHASHI
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI
Chuzo IWAMOTO Ryo TAKAISHI
Koichi FUJII Tomomi MATSUI
Kazuyuki AMANO
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA
Hitoshi MURAKAMI Yutaro YAMAGUCHI
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Ming PAN
Mitsuru ISHIZUKA Shigeo MORISHIMA
In this paper, we present a game of dice that combines multi-party communication with a tangible interface. The game has been used as a testbed to study typical conversational behavior patterns in interactions between human users and synthetic agents. In particular, we were interested in the question of to what extent the interaction with the agent can be considered natural. As an evaluation criterion, we propose to investigate whether the communicative behaviors of humans differ when conversing with an agent as opposed to conversing with other humans.
Helmut PRENDINGER Mitsuru ISHIZUKA
This paper highlights some of our recent research efforts in designing and evaluating life-like characters that are capable of entertaining affective and social communication with human users. The key novelty of our approach is the use of human physiological information: first, as a method to evaluate the effect of life-like character behavior on a moment-to-moment basis, and second, as an input modality for a new generation of interface agents that we call 'physiologically perceptive' life-like characters. By exploiting the stream of primarily involuntary human responses, such as autonomic nervous system activity or eye movements, those characters are expected to respond to users' affective and social needs in a truly sensitive, and hence effective, friendly, and beneficial way.
Kouichi KATSURADA Hiroaki ADACHI Kunitoshi SATO Hirobumi YAMADA Tsuneo NITTA
We have developed Interaction Builder (IB), a rapid prototyping tool for constructing web-based Multi-Modal Interaction (MMI) applications. The goal of IB is making it easy to develop MMI applications with speech recognition, life-like agents, speech synthesis, web browsing, etc. For this purpose, IB supports the following interface and functions: (1) GUI for implementing MMI systems without the details of MMI and MMI description language, (2) functionalities of handling synchronized multimodal inputs/outputs, (3) a test run mode for run-time testing. The results of evaluation tests showed that the application development cycle using IB was significantly shortened in comparison with the time using a text editor both for MMI description language experts and for beginners.
Masahiro ARAKI Akiko KOUZAWA Kenji TACHIBANA
In this paper, we propose a new multimodal interaction description language, MIML (Multimodal Interaction Markup Language), which defines dialogue patterns between humans and various types of interactive agents. The distinguishing feature of this language is its three-layered description of agent-based interactive systems. The high-level description is a task definition that can easily construct typical agent-based interactive task control information. The middle-level description is an interaction description that defines the agent's behavior and the user's input at the granularity of a dialogue segment. The low-level description is a platform-dependent description that can override the pre-defined functions in the interaction description. The connection between the task level and the interaction level is realized by generating interaction description templates from the task-level description. The connection between the interaction level and the platform level is realized by a binding mechanism of XML. A comparison with other languages shows that MIML has advantages in high-level interaction description, modality extensibility, and compatibility with standardized technologies.
Tatsuo YOTSUKURA Shigeo MORISHIMA Satoshi NAKAMURA
An accurate audio-visual speech corpus is indispensable for talking-heads research. This paper presents our audio-visual speech corpus collection and proposes a head-movement normalization method and a facial motion generation method. The audio-visual corpus contains speech data, movie data on faces, and positions and movements of facial organs. The corpus consists of Japanese phoneme-balanced sentences uttered by a female native speaker. Accurate facial capture is realized by using an optical motion-capture system. We captured high-resolution 3D data by arranging many markers on the speaker's face. In addition, we propose a method of acquiring the facial movements and removing head movements by using an affine transformation, so that the displacements of the facial organs alone can be computed. Finally, in order to easily create facial animation from this motion data, we propose a technique for assigning the captured data to the facial polygon model. Evaluation results demonstrate the effectiveness of the proposed facial motion generation method and show the relationship between the number of markers and errors.
Makoto TACHIBANA Junichi YAMAGISHI Takashi MASUKO Takao KOBAYASHI
This paper describes an approach to generating speech with emotional expressivity and speaking style variability. The approach is based on a speaking style and emotional expression modeling technique for HMM-based speech synthesis. We first model several representative styles, each of which is a speaking style and/or an emotional expression, in an HMM-based speech synthesis framework. Then, to generate synthetic speech with an intermediate style from representative ones, we synthesize speech from a model obtained by interpolating representative style models using a model interpolation technique. We assess the style interpolation technique with subjective evaluation tests using four representative styles, i.e., neutral, joyful, sad, and rough in read speech and synthesized speech from models obtained by interpolating models for all combinations of two styles. The results show that speech synthesized from the interpolated model has a style in between the two representative ones. Moreover, we can control the degree of expressivity for speaking styles or emotions in synthesized speech by changing the interpolation ratio in interpolation between neutral and other representative styles. We also show that we can achieve style morphing in speech synthesis, namely, changing style smoothly from one representative style to another by gradually changing the interpolation ratio.
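The idea of style interpolation and morphing can be illustrated with a minimal sketch. It assumes each style is reduced to a single diagonal-covariance Gaussian and that means and variances are combined linearly with normalized ratios; the abstract does not specify the interpolation rule, so the names `Gaussian`, `interpolate`, and `morph` and the linear rule itself are illustrative assumptions, not the authors' method.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Gaussian:
    """One output distribution of a style model (diagonal covariance)."""
    mean: List[float]
    var: List[float]


def interpolate(models: Sequence[Gaussian], ratios: Sequence[float]) -> Gaussian:
    """Blend style models with the given interpolation ratios.

    Assumption: means and variances are both combined linearly after the
    ratios are normalized to sum to one.
    """
    if len(models) != len(ratios):
        raise ValueError("one ratio per model is required")
    total = sum(ratios)
    if total == 0:
        raise ValueError("ratios must not sum to zero")
    weights = [r / total for r in ratios]
    dim = len(models[0].mean)
    mean = [sum(w * m.mean[d] for w, m in zip(weights, models)) for d in range(dim)]
    var = [sum(w * m.var[d] for w, m in zip(weights, models)) for d in range(dim)]
    return Gaussian(mean, var)


def morph(src: Gaussian, dst: Gaussian, steps: int) -> List[Gaussian]:
    """Style morphing: move the ratio gradually from src to dst (steps >= 2)."""
    return [
        interpolate([src, dst], [1 - t / (steps - 1), t / (steps - 1)])
        for t in range(steps)
    ]


neutral = Gaussian(mean=[0.0, 2.0], var=[1.0, 1.0])
joyful = Gaussian(mean=[4.0, 6.0], var=[3.0, 1.0])
blend = interpolate([neutral, joyful], [0.75, 0.25])
path = morph(neutral, joyful, steps=5)
```

In this sketch, changing the ratio between `neutral` and another style plays the role of the expressivity control described above, and `morph` corresponds to changing the ratio gradually over time.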
Naotake NIWASE Junichi YAMAGISHI Takao KOBAYASHI
This paper presents a new technique for automatically synthesizing human walking motion. In the technique, a set of fundamental motion units called motion primitives is defined, and each primitive is modeled statistically from motion capture data using a hidden semi-Markov model (HSMM), which is a hidden Markov model (HMM) with explicit state duration probability distributions. The mean parameter of the probability distribution function of the HSMM is assumed to be given by a function of factors that control the walking pace and stride length, and a training algorithm, called factor adaptive training, is derived based on the EM algorithm. A parameter generation algorithm from motion primitive HSMMs with given control factors is also described. Experimental results for generating walking motion are presented for cases in which the walking pace and stride length are changed. The results show that the proposed technique can generate smooth and realistic motions that are not included in the motion capture data, without the need for smoothing or interpolation.
Md. Altab HOSSAIN Rahmadi KURNIA Akio NAKAMURA Yoshinori KUNO
We are developing a helper robot that carries out tasks ordered by the user through speech. The robot needs a vision system to recognize the objects appearing in the orders. It is, however, difficult to realize vision systems that can work in various conditions. Thus, we have proposed to use the human user's assistance through speech. When the vision system cannot achieve a task, the robot speaks to the user so that the user's natural response can give helpful information to its vision system. Our previous system assumes that it can segment images without failure. However, if there are occluded objects and/or objects composed of multicolor parts, segmentation failures cannot be avoided. This paper presents an extended system that tries to recover from segmentation failures using photometric invariance. If the system is not sure about the segmentation results, it asks the user using appropriate expressions that depend on the invariant values. Experimental results show the usefulness of the system.
Dai MIYAUCHI Akio NAKAMURA Yoshinori KUNO
Eye contact is an effective means of controlling human communication, such as in starting communication. It seems that we can make eye contact if we simply look at each other. However, this alone does not establish eye contact. Both parties also need to be aware of being watched by the other. We propose a method of bidirectional eye contact satisfying these conditions for human-robot communication. When a human wants to start communication with a robot, he/she watches the robot. If it finds a human looking at it, the robot turns to him/her, changing its facial expressions to let him/her know its awareness of his/her gaze. When the robot wants to initiate communication with a particular person, it moves its body and face toward him/her and changes its facial expressions to make the person notice its gaze. We show several experimental results to prove the effectiveness of this method. Moreover, we present a robot that can recognize hand gestures after making eye contact with the human to show the usefulness of eye contact as a means of controlling communication.
An embodied interactive agent has a virtual body that is generally drawn by CG animation. We intuitively assume that the agent's body primarily expresses non-verbal messages, or symbolizes its social characteristics through its appearance. However, we have not objectively elucidated the expressive competence of an agent's body beyond the conclusions of our empirical and subjective intuition. Therefore, it is necessary to explore scientifically how users regard the functional competence of an agent's embodiment. Do users attribute the intelligence of an agent to its virtual body? We investigated how users physically interact with an agent which is merely a virtual entity drawn on the display by CG, through "showing" something to the eyes of the agent, "listening" to something from the mouth of the agent, and "speaking" something into the ears of the agent. However, such interaction does not necessarily attribute the intellectual processing function to the agent, and this issue is explored through two psychological experiments.
Masashi OKAMOTO Yukiko I. NAKANO Kazunori OKAMOTO Ken'ichi MATSUMURA Toyoaki NISHIDA
By virtue of great progress in computer graphics technologies, CG movies have become popular. However, cinematography techniques, which contribute to improving the comprehensibility of the contents, need to be learned from professional experience and are not easily acquired by non-professionals. This paper focuses on film cutting as one of the most important cinematography techniques in conversational scenes, and presents a system that automatically generates shot transitions to improve the comprehensibility of CG contents. First, we propose a cognitive model of User Involvement that serves as a set of constraints on selecting shot transitions. Then, to examine the validity of the model, we analyze shot transitions in TV programs, and based on the analysis, we implement a CG contents creation system. Results of our preliminary evaluation experiment show the effectiveness of the proposed method, specifically in enhancing the comprehensibility of the contents.
Jinseok KONG Pen-Chung YEW Gyungho LEE
Directory-based cache coherence schemes are commonly used in large-scale shared-memory multiprocessors, but most of them rely on heuristics to avoid large hardware requirements. We propose using physical address mapping on directories to significantly reduce the directory size needed. This approach allows the size of the directory to grow as O(cn log_2 n), as in optimal pointer-based directory schemes [11], where n is the number of nodes in the system and c is the number of cache lines in each cache memory. Performance aspects of the proposed scheme are studied in detail using simulation.
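To make the O(cn log2 n) growth concrete, the following sketch counts directory bits under a simple assumption: one pointer of ceil(log2 n) bits is kept for each of the c cache lines in each of the n nodes. The function name and the unit constant factor are illustrative assumptions; the abstract gives only the asymptotic bound.

```python
def pointer_directory_bits(n_nodes: int, lines_per_cache: int) -> int:
    """Directory size in bits, assuming one node pointer per cached line.

    Assumption: each of the c * n cache lines in the system needs a single
    pointer wide enough to name any of the n nodes (ceil(log2 n) bits).
    """
    if n_nodes < 1 or lines_per_cache < 0:
        raise ValueError("invalid system size")
    # (n - 1).bit_length() equals ceil(log2 n) for n >= 2, in exact integers.
    pointer_bits = max(1, (n_nodes - 1).bit_length())
    return lines_per_cache * n_nodes * pointer_bits


small = pointer_directory_bits(n_nodes=64, lines_per_cache=1024)
large = pointer_directory_bits(n_nodes=128, lines_per_cache=1024)
```

Under this assumption, doubling the node count from 64 to 128 slightly more than doubles the directory, because the pointer width grows from 6 to 7 bits.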
Seokjin HONG Bongki MOON Sukho LEE
A range top-k query returns the topmost k records in the order set by a measure attribute within a specified region of multi-dimensional data. The range top-k query is a powerful tool for analysis in spatial databases and data warehouse environments. In this paper, we propose an algorithm to answer the query by selectively traversing an aggregate R-tree having MAX as the aggregate values. The algorithm can execute the query by accessing only a small part of the leaf nodes within a query region. Therefore, it shows good query performance regardless of the size of the query region. We suggest an efficient pruning technique for the priority queue, which reduces the cost of handling the priority queue, and also propose an efficient technique for leaf node organization to reduce the number of node accesses to execute the range top-k queries.
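The selective traversal can be sketched as a best-first search over a tree whose nodes store the MAX of the measure values beneath them. This is a minimal illustration under stated assumptions: the tree is built by hand rather than by a real R-tree insertion algorithm, the names `Record`, `Node`, `make_node`, and `range_top_k` are hypothetical, and the paper's priority-queue pruning and leaf-node organization techniques are not reproduced.

```python
import heapq
from dataclasses import dataclass
from itertools import count
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Record:
    x: float
    y: float
    measure: float


@dataclass
class Node:
    mbr: Rect           # minimum bounding rectangle of everything below
    children: list      # Node or Record instances
    max_measure: float  # MAX aggregate of the measures below this node


def make_node(children: list) -> Node:
    """Build a node and compute its bounding box and MAX aggregate."""
    boxes = [c.mbr if isinstance(c, Node) else (c.x, c.y, c.x, c.y) for c in children]
    mbr = (min(b[0] for b in boxes), min(b[1] for b in boxes),
           max(b[2] for b in boxes), max(b[3] for b in boxes))
    top = max(c.max_measure if isinstance(c, Node) else c.measure for c in children)
    return Node(mbr, list(children), top)


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_top_k(root: Node, query: Rect, k: int) -> List[Record]:
    """Return the k records with the largest measure inside the query region."""
    tie = count()  # tie-breaker so the heap never compares Node/Record objects
    heap = [(-root.max_measure, next(tie), root)]
    result: List[Record] = []
    while heap and len(result) < k:
        _, _, item = heapq.heappop(heap)
        if isinstance(item, Record):
            # Every entry left in the heap is bounded above by this measure.
            result.append(item)
            continue
        for child in item.children:
            box = child.mbr if isinstance(child, Node) else (child.x, child.y, child.x, child.y)
            if intersects(box, query):
                key = child.max_measure if isinstance(child, Node) else child.measure
                heapq.heappush(heap, (-key, next(tie), child))
    return result


leaf1 = make_node([Record(1, 1, 5.0), Record(2, 2, 9.0)])
leaf2 = make_node([Record(6, 6, 7.0), Record(7, 7, 3.0)])
leaf3 = make_node([Record(20, 20, 100.0)])
root = make_node([leaf1, leaf2, leaf3])
top3 = range_top_k(root, query=(0, 0, 10, 10), k=3)
```

Because a node's MAX value is an upper bound on every record beneath it, the search can stop as soon as k records have been popped, without visiting the remaining leaves in the query region.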
Tomoya KITAI Tomohiro YONEDA Chris MYERS
This work proposes a technique to automatically obtain timing constraints for a given timed circuit to operate correctly. A designated set of delay parameters of a circuit are first set to sufficiently large bounds, and verification runs followed by failure analysis are repeated. Each verification run performs timed state space enumeration under the given delay bounds, and produces a failure trace if it exists. The failure trace is analyzed, and sufficient timing constraints to prevent the failure are obtained. Then, the delay bounds are tightened according to the timing constraints by using an ILP (Integer Linear Programming) solver. This process terminates when either some delay bounds under which no failure is detected are found or no new delay bounds to prevent the failures can be obtained. The experimental results using a naive implementation show that the proposed method can efficiently handle asynchronous benchmark circuits and nontrivial GasP circuits.
Tatsuya MIZUTANI Takehiko KAGOSHIMA
This paper proposes a novel speech synthesis method to generate human-like natural speech. The conventional unit-selection-based synthesis method selects speech units from a large database, and concatenates them with or without modifying the prosody to generate synthetic speech. This method features highly human-like voice quality. The method, however, has a problem that a suitable speech unit is not necessarily selected. Since the unsuitable speech unit selection causes discontinuity between the consecutive speech units, the synthesized speech quality deteriorates. It might be considered that the conventional method can attain higher speech quality if the database size increases. However, preparation of a larger database requires a longer recording time. The narrator's voice quality does not remain constant throughout the recording period. This fact deteriorates the database quality, and still leaves the problem of unsuitable selection. We propose the plural unit selection and fusion method which avoids this problem. This method integrates the unit fusion used in the unit-training-based method with the conventional unit-selection-based method. The proposed method selects plural speech units for each segment, fuses the selected speech units for each segment, modifies the prosody of the fused speech units, and concatenates them to generate synthetic speech. This unit fusion creates speech units which are connected to one another with much less voice discontinuity, and realizes high quality speech. A subjective evaluation test showed that the proposed method greatly improves the speech quality compared with the conventional method. Also, it showed that the speech quality of the proposed method is kept high regardless of the database size, from small (10 minutes) to large (40 minutes). The proposed method is a new framework in the sense that it is a hybrid method between the unit-selection-based method and the unit-training-based method. In the framework, the algorithms of the unit selection and the unit fusion are exchangeable for more efficient techniques. Thus, the framework is expected to lead to new synthesis methods.
Seung-In NOH Kwanghyuk BAE Kang Ryoung PARK Jaihie KIM
In a conventional method based on quadrature 2D Gabor wavelets to extract iris features, iris recognition is performed using a 256-byte iris code, which is computed by applying the Gabor wavelets to a given area of the iris. However, there is code redundancy because the iris code is generated by basis functions that do not take the characteristics of the iris texture into account. Therefore, the size of the iris code is increased unnecessarily. In this paper, we propose a new feature extraction algorithm based on independent component analysis (ICA) for a compact iris code. We applied ICA to generate optimal basis functions that can represent iris signals efficiently. In practice, the coefficients of the ICA expansions are used as feature vectors. The iris feature vectors are then encoded into the iris code for storing and comparing individuals' iris patterns. Additionally, we introduce a method to refine the ICA basis functions to improve the recognition performance. Experimental results show that our proposed method has an equal error rate similar to that of a conventional method based on the Gabor wavelets, and that the iris code of our proposed method is five times smaller than that of the Gabor wavelets.
Jorji NONAKA Nobuyuki KUKIMOTO Yasuo EBARA Masato OGATA Takeshi IWASHITA Masanori KANAZAWA Koji KOYAMADA
Volume Graphics Clusters (VG Clusters) have proven to be efficient in a wide range of visualization applications and have also shown promise in some other applications where the image composition device could be fully utilized. The main differentiating feature from other graphics clusters is a specialized image composition device, commercially available as the MPC Image Compositor, which enables the building of do-it-yourself VG Clusters. Although this device is highly scalable, the unidirectional composition flow limits the data subdivision to the number of physically available rendering nodes. In addition, the limited buffer memory restricts the maximum image composition size, thereby limiting its use in large-scale data visualization and high-resolution visualization. To overcome these limitations, we propose and evaluate an image composition mechanism in which additional hardware is used to assist the image composition process. Because of the synergistic use of two distinct image composition hardware devices, we named it "Hybrid Image Composition". Some encouraging results were obtained showing the effectiveness of this solution in improving the VG Cluster's potential. A low-cost parallel-port-based hardware barrier is also presented as an efficient method for further enhancing this kind of small-scale VG Cluster. Moreover, this solution has proven to be especially useful in clusters built using low-speed networks, such as Fast Ethernet, which are still in common use.
Hideya TAKEO Kazuo SHIMURA Takashi IMAMURA Akinobu SHIMIZU Hidefumi KOBATAKE
CR (Computed Radiography) is characterized by high sensitivity and wide dynamic range. Moreover, it has the advantage of being able to transfer exposed images directly to a computer-aided detection (CAD) system which is not possible using conventional film digitizer systems. This paper proposes a high-performance clustered microcalcification detection system for CR mammography. Before detecting and classifying candidate regions, the system preprocesses images with a normalization step to take into account various imaging conditions and to enhance microcalcifications with weak contrast. Large-scale experiments using images taken under various imaging conditions at seven hospitals were performed. According to analysis of the experimental results, the proposed system displays high performance. In particular, at a true positive detection rate of 97.1%, the false positive clusters average is only 0.4 per image. The introduction of geometrical features of each microcalcification for identifying true microcalcifications contributed to the performance improvement. One of the aims of this study was to develop a system for practical use. The results indicate that the proposed system is promising.
Myungseok KANG Jaeyun JUNG Younghoon WHANG Youngyong KIM Hagbae KIM
This paper presents a Fault-Tolerant Object Group (FTOG) model that provides a group management service and a fault-tolerance service for consistency maintenance and state transparency. Using the Intelligent Home Network Simulator, we verify that the FTOG model supports both the reliability and the stability of the distributed system.
Joon-Hyuk CHANG Sanjit K. MITRA
This paper describes a multiband vector quantization (VQ) technique based on inner product for wideband speech coding at 16 kb/s. Our approach consists of splitting the input speech into two separate bands and then applying an independent coding scheme for each band. A code excited linear prediction (CELP) coder is used in the lower band while a transform based coding strategy is applied in the higher band. The spectral components in the higher frequency band are represented by a set of modulated lapped transform (MLT) coefficients. The higher frequency band is divided into three subbands, and the MLT coefficients construct a vector for each subband. Specifically, for the VQ of these vectors, an inner product-based distance measure is proposed as a new strategy. The proposed 16 kb/s coder with the inner-product based distortion measure achieves better performance than the 48 kb/s ITU-T G.722 in subjective quality tests.
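A codebook search driven by an inner product can be sketched as follows. The abstract does not define the distance measure, so this example makes an explicit assumption: codevectors are unit-norm "shapes", the selected entry is the one with the largest inner product with the input vector, and that inner product is kept as a gain. For a unit-norm codevector c and gain g = <x, c>, the squared error ||x - g c||^2 equals ||x||^2 - <x, c>^2, which is why maximizing the inner product is a reasonable selection rule here. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def inner(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(inner(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def quantize(vector: Sequence[float],
             codebook: Sequence[Sequence[float]]) -> Tuple[int, float]:
    """Pick the codevector with the largest inner product with the input.

    Assumption: the codebook entries are unit-norm, so the inner product is
    also the gain that best scales the chosen entry toward the input.
    """
    best = max(range(len(codebook)), key=lambda i: inner(vector, codebook[i]))
    return best, inner(vector, codebook[best])


def reconstruct(index: int, gain: float,
                codebook: Sequence[Sequence[float]]) -> List[float]:
    return [gain * c for c in codebook[index]]


# Toy codebook standing in for the MLT-coefficient vectors of one subband.
codebook = [normalize(c) for c in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0])]
index, gain = quantize([3.0, 4.0, 0.0], codebook)
decoded = reconstruct(index, gain, codebook)
```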
Xi LI Zhengnan NING Liuwei XIANG
The problem of multi-body motion segmentation is important in many computer vision applications. In this paper, we propose a novel algorithm called fuzzy k-subspace clustering for robust segmentation. The proposed method exploits the property that under orthographic camera model the tracked feature points of moving objects reside in multiple subspaces. We compute a partition of feature points into corresponding subspace clusters. First, we find a "soft partition" of feature points based on fuzzy k-subspace algorithm. The proposed fuzzy k-subspace algorithm iteratively minimizes the objective function using Weighted Singular Value Decomposition. Then the points with high partition confidence are gathered to form the subspace bases and the remaining points are classified using their distance to the bases. The proposed method can handle the case of missing data naturally, meaning that the feature points do not have to be visible throughout the sequence. The method is robust to noise and insensitive to initialization. Extensive experiments on synthetic and real data show the effectiveness of the proposed fuzzy k-subspace clustering algorithm.