Dan MIKAMI Kazuhiro OTSUKA Shiro KUMANO Junji YAMATO
A novel enhancement to the memory-based particle filter is proposed for visual pose tracking under severe occlusions: the addition of a detection-based memory acquisition mechanism. The memory-based particle filter, called M-PF, is a particle filter that predicts prior distributions from the past history of the target state stored in memory. It is highly robust against abrupt changes in movement direction and recovers quickly from target loss caused by occlusions. Such performance, however, requires that sufficient history be stored in memory. Conventionally, M-PF acquires memory online under the assumption of simple target dynamics without occlusions, so as to guarantee high-quality target-track histories; this requirement narrows the practical coverage of M-PF. In this paper, we propose a new memory acquisition mechanism for M-PF that supports practical conditions involving complex dynamics and severe occlusions. The key idea is to use a target detector that produces an additional prior distribution of the target state. We call the result M-PFDMA, for M-PF with detection-based memory acquisition. The detection-based prior distribution predicts the likely target position/pose well even under the limited visibility caused by occlusions. These better prior distributions yield more stable estimates of the target state, which are then added to the memorized data. As a result, M-PFDMA can start with no memory entries yet soon achieve stable tracking even in severe conditions. Experiments confirm M-PFDMA's good performance under such conditions.
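The core idea, mixing a memory-based prior with a detection-based prior when proposing particles, can be sketched as follows. This is an illustrative one-dimensional toy, not the paper's formulation: the state, noise scales, and mixing weight `mix` are all hypothetical.

```python
import random

def propose_particles(memory, detection, n=100, mix=0.5):
    """Propose particles by mixing two priors (illustrative sketch).

    memory:    list of past target states (hypothetical 1-D pose values)
    detection: detector's state estimate, or None if nothing was detected
    """
    particles = []
    for _ in range(n):
        if detection is not None and random.random() < mix:
            # detection-based prior: sample around the detector output
            particles.append(random.gauss(detection, 0.1))
        elif memory:
            # memory-based prior: revisit a stored past state, plus noise
            particles.append(random.gauss(random.choice(memory), 0.2))
        else:
            # cold start: no memory yet, fall back to a broad prior
            particles.append(random.gauss(detection or 0.0, 1.0))
    return particles
```

Because the detector branch needs no stored history, the filter can begin with an empty memory and still produce usable proposals, which mirrors M-PFDMA's ability to start with no memory entries.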
Minoru MORI Minako SAWAKI Junji YAMATO
This paper describes an adaptive feature extraction method that exploits category-specific information to overcome both image degradation and deformation in character recognition. When recognizing multiple fonts, geometric features such as the directional information of strokes are often used, but they are not robust to the deformation and degradation that appear in videos or natural scenes. To tackle these problems, the proposed method estimates the degree of deformation and degradation of an input pattern by comparing it with the template of each category, used as category-specific information. This estimation lets us compensate for the shape-related aspect ratio and for the degradation in feature values, and so obtain higher recognition accuracy. Recognition experiments using characters extracted from videos show that the proposed method resists deformation and degradation better than conventional alternatives.
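The comparison of an input pattern against a category template might look like the sketch below: per-axis scale factors stand in for the aspect-ratio deformation, and a mean grey-level difference stands in for the degradation estimate. The function name and both measures are illustrative assumptions, not the paper's exact formulation.

```python
def estimate_distortion(input_img, template):
    """Estimate deformation and degradation of an input pattern
    relative to a category template (illustrative sketch).

    Both images are 2-D lists of grey-level floats.
    Returns ((scale_x, scale_y), mean_abs_difference).
    """
    ih, iw = len(input_img), len(input_img[0])
    th, tw = len(template), len(template[0])
    # deformation: per-axis scale needed to match the template's shape
    scale_x, scale_y = tw / iw, th / ih
    # nearest-neighbour resize of the input to the template size
    resized = [[input_img[int(y * ih / th)][int(x * iw / tw)]
                for x in range(tw)] for y in range(th)]
    # degradation: mean absolute grey-level difference vs. the template
    diff = sum(abs(resized[y][x] - template[y][x])
               for y in range(th) for x in range(tw)) / (th * tw)
    return (scale_x, scale_y), diff
```

In this reading, the scale factors would drive the aspect-ratio compensation and the difference score would drive the degradation compensation applied to the feature values.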
Shiro KUMANO Kazuhiro OTSUKA Masafumi MATSUDA Junji YAMATO
This study analyzes the emotions established between people interacting in face-to-face conversation. Focusing on empathy and antipathy, especially the process by which external observers perceive them, this paper aims to elucidate the tendencies of that perception and to develop from them a computational model that automatically infers perceived empathy/antipathy. The paper makes two main contributions. First, an experiment demonstrates that an observer's perception of an interacting pair is affected by the time lags between their actions and reactions in facial expressions, and by whether their expressions are congruent. For example, a congruent but delayed reaction is unlikely to be perceived as empathy. Second, based on these findings, we propose a probabilistic model that relates the empathy/antipathy perceived by external observers to the actions and reactions of the conversation participants. An experiment was conducted on ten conversations performed by 16 women, in which the perceptions of nine external observers were gathered. The results demonstrate that timing cues improve inference performance, especially for perceived antipathy.
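The shape of such a model, conditioning perceived empathy on congruence and reaction timing, can be sketched as a small conditional probability table. The probabilities below are made-up illustrative values chosen only to reflect the qualitative finding that a congruent but delayed reaction is unlikely to be read as empathy; they are not the paper's estimates.

```python
# Hypothetical P(perceived empathy | congruence, reaction lag).
# Values are illustrative, not measured.
P_EMPATHY = {
    ("congruent", "fast"): 0.8,
    ("congruent", "delayed"): 0.3,
    ("incongruent", "fast"): 0.2,
    ("incongruent", "delayed"): 0.1,
}

def infer_perception(congruence, lag):
    """Return the more probable observer label and its probability."""
    p = P_EMPATHY[(congruence, lag)]
    return ("empathy" if p >= 0.5 else "not-empathy"), p
```

A real model would estimate such conditionals from the observers' annotations of the recorded conversations rather than fix them by hand.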
Keiji HIRATA Yasunori HARADA Toshihiro TAKADA Naomi YAMASHITA Shigemi AOYAGI Yoshinari SHIRAI Katsuhiko KAJI Junji YAMATO Kenji NAKAZAWA
We propose a 2D display-and-camera arrangement for video communication systems that supports both spatial information between distant sites and user mobility. We call the implementation of this arrangement the "surrounding back screen method." The method lets users freely enter and leave other users' spaces and gives every user direct pointing capability, since, unlike conventional video communication systems, there is no apparent spatial barrier separating users. In this paper, we introduce two properties ("sharedness" and "exclusiveness") and three parameters (a distance and two angles) to represent the geometrical relationship between two users. These properties and parameters are used to classify the shared spaces created by a video communication system and to investigate the surrounding back screen method. Furthermore, to demonstrate and explore the method, we developed a prototype system called t-Room. Taking practical situations into account, we studied a case in which two t-Rooms with different layouts are connected.
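One plausible reading of the three geometric parameters, a distance and two angles, is the separation between the two users plus each user's facing angle relative to the line joining them. The sketch below computes that; the exact definitions in the paper may differ, so treat this as an assumption.

```python
import math

def relation_params(pos_a, dir_a, pos_b, dir_b):
    """Distance between users A and B, and each user's facing angle
    off the line joining them (a hypothetical reading of the paper's
    three parameters). Positions are (x, y); directions in radians.
    """
    dx, dy = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    dist = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx)          # direction from A toward B
    # wrap each offset into [0, pi]: 0 means facing the other user
    ang_a = abs(((bearing - dir_a + math.pi) % (2 * math.pi)) - math.pi)
    ang_b = abs(((bearing + math.pi - dir_b + math.pi) % (2 * math.pi))
                - math.pi)
    return dist, ang_a, ang_b
```

Parameters like these could then classify a configuration as shared or exclusive, e.g. two users both facing each other at short range versus standing back to back.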