IEICE globals.ieice.org Site

Author Search Result

[Author] Masahide KANEKO(10hit)

Toward the New Era of Visual Communication
Masahide KANEKO Fumio KISHINO Kazunori SHIMAMURA Hiroshi HARASHIMA

INVITED PAPER

Vol: E76-B No:6
Page(s): 577-591
Recently, studies aiming at the next generation of visual communication services which support better human communication have been carried out intensively in Japan. The principal motive of these studies is to develop new services which are not restricted to a conventional communication framework based on the transmission of waveform signals. This paper focuses on three important key words in these studies: "intelligent," "real," and "distributed and collaborative," and describes recent research activities. The first key word, "intelligent," relates to intelligent image coding. As a particular example, model-based coding of moving facial images is discussed in detail. In this method, shape change and motion of the human face are described by a small number of parameters. This feature leads to the development of new applications such as very low bit-rate transmission of moving facial images, analysis and synthesis of facial expression, human interfaces, and so on. The second key word, "real," relates to communication with realistic sensations and virtual space teleconferencing. Among various component technologies, real-time reproduction of 3-D human images and a cooperative work environment with virtual space are discussed in detail. The last key word, "distributed and collaborative," relates to collaborative work in a distributed work environment. The importance of visual media in collaborative work, a concept of CSCW, and requirements for realizing a distributed collaborative environment are discussed. Then, four examples of CSCW systems are briefly outlined.

Block-Based Bag of Words for Robust Face Recognition under Variant Conditions of Facial Expression, Illumination, and Partial Occlusion
Zisheng LI Jun-ichi IMAI Masahide KANEKO

PAPER-Processing

Vol: E94-A No:2
Page(s): 533-541
In many real-world face recognition applications, there might be only one training image per person available. Moreover, the test images may vary in facial expression and illumination, or may be partially occluded. However, most classical face recognition techniques assume that multiple images per person are available for training, and they have difficulty dealing with extreme expressions, illuminations and occlusions. This paper proposes a novel block-based bag of words (BBoW) method to solve those problems. In our approach, a face image is partitioned into multiple blocks; dense SIFT features are then calculated on each block and vector quantized into different visual words. Finally, histograms of codeword distribution on each local block are concatenated to represent the face image. Our method is able to capture local features on each block while maintaining holistic spatial information of different facial components. Without any illumination compensation or image alignment processing, the proposed method achieves excellent face recognition results on the AR and XM2VTS databases. Experimental results show that, using only one neutral expression frame per person for training, our method can obtain the best performance ever on face images of the AR database with extreme expressions, variant illuminations, and partial occlusions. We also test our method on the standard and darkened sets of the XM2VTS database, and achieve average rates of 100% and 96.10%, respectively.

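The block-wise pipeline described in this abstract (partition, quantize local descriptors, concatenate per-block histograms) can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_descriptors` is a hypothetical stand-in for dense SIFT (it uses simple pixel differences), the use of one codebook per block is an assumption, and all function names are invented for illustration.

```python
from math import dist


def split_into_blocks(image, rows, cols):
    """Partition a 2-D list (H x W) into rows*cols blocks, in row-major order."""
    h, w = len(image), len(image[0])
    bh, bw = h // rows, w // cols
    return [
        [line[c * bw:(c + 1) * bw] for line in image[r * bh:(r + 1) * bh]]
        for r in range(rows)
        for c in range(cols)
    ]


def toy_descriptors(block):
    """Hypothetical stand-in for dense SIFT: (horizontal, vertical) pixel differences."""
    return [
        (block[y][x + 1] - block[y][x], block[y + 1][x] - block[y][x])
        for y in range(len(block) - 1)
        for x in range(len(block[0]) - 1)
    ]


def block_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword and return a normalized histogram."""
    hist = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        hist[nearest] += 1
    total = sum(hist) or 1  # avoid division by zero for empty blocks
    return [count / total for count in hist]


def bbow_feature(image, rows, cols, codebooks):
    """Concatenate per-block histograms (assumes one codebook per block)."""
    feature = []
    for block, codebook in zip(split_into_blocks(image, rows, cols), codebooks):
        feature.extend(block_histogram(toy_descriptors(block), codebook))
    return feature


image = [
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [5, 5, 0, 0],
    [5, 5, 0, 1],
]
codebook = [(0, 0), (1, 0), (0, 1)]
feature = bbow_feature(image, rows=2, cols=2, codebooks=[codebook] * 4)
```

Because the histograms are concatenated in a fixed block order, the position of each histogram in `feature` encodes which part of the face it came from, which is how the representation keeps coarse spatial layout while each histogram stays local.
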
Processing of Face Images and Its Applications
Masahide KANEKO Osamu HASEGAWA

INVITED SURVEY PAPER

Vol: E82-D No:3
Page(s): 589-600
Human faces convey various kinds of information, including information that is specific to each individual and information that is part of mutual communication among people. Information exhibited by a "face" is what is called "non-verbal information," and verbal media usually cannot describe such information appropriately. Recently, detailed studies on the processing of face images by a computer have been carried out in the engineering field for applications to communication media and human computer interaction as well as automatic identification of human faces. Two main technical topics are the recognition of human faces and the synthesis of face images. The objective of the former is to enable a computer to detect and identify users and further to recognize their facial expressions, while that of the latter is to provide a natural and impressive user interface on a computer in the form of a "face." These studies have also been found to be useful in various non-engineering fields related to the face, such as psychology, anthropology, cosmetology and dentistry. Most of the studies in these different fields have been carried out independently up to now, although all of them deal with a "face." Now, by virtue of the progress in the above engineering technologies, common study tools and databases for facial information have become available. Against this background, this paper surveys recent research trends in the processing of face images by a computer and its typical applications. Firstly, the various characteristics of faces are considered. Secondly, recent research activities in the recognition and synthesis of face images are outlined. Thirdly, the applications of digital processing methods of facial information are discussed from several standpoints: intelligent image coding, media handling, human computer interaction, caricature, facial impression, and psychological and medical applications. The common tools and databases used in the studies of processing of facial information and some related topics are also described.

Compression and Representation of 3-D Images
Takeshi NAEMURA Masahide KANEKO Hiroshi HARASHIMA

INVITED SURVEY PAPER

Vol: E82-D No:3
Page(s): 558-567
This paper surveys the results of various studies on 3-D image coding. The themes are focused on efficient compression and display-independent representation of 3-D images. Most work on 3-D image coding has concentrated on compression methods tuned for each of the 3-D image formats (stereo pairs, multi-view images, volumetric images, holograms and so on). For the compression of stereo images, several techniques based on the concept of disparity compensation have been developed. For the compression of multi-view images, the concepts of disparity compensation and the epipolar plane image (EPI) are efficient ways of exploiting redundancies between multiple views. These techniques, however, depend heavily on limited camera configurations. In order to consider many other multi-view configurations and other types of 3-D images comprehensively, a more general platform for 3-D image representation is introduced, aiming to outgrow the framework of 3-D "image" communication and to open up a novel field of technology, which should be called "spatial" communication. In particular, the light-ray-based method has a wide range of applications, including efficient transmission of the physical world, as well as integration of the virtual and physical worlds.

Recent and Current Research on Very Low Bit-Rate Video Coding in Japan
Masahide KANEKO

INVITED PAPER

Vol: E79-B No:10
Page(s): 1415-1424
This paper presents an overview of research activities in Japan in the field of very low bit-rate video coding. Related research based on the concept of "intelligent image coding" started in the mid-1980s. Although this concept originated from the consideration of a new type of image coding, it can also be applied to other interesting applications such as human interfaces and psychology. On the other hand, since the beginning of the 1990s, research on the improvement of waveform coding has been actively performed to realize very low bit-rate video coding. Key techniques employed here are improvement of motion compensation and adoption of region segmentation. In addition to the above, we propose new concepts of image coding, which have the potential to open up new aspects of image coding, e.g., ideas of interactive image coding, integrated 3-D visual communication and coding of multimedia information considering the mutual relationships among various media.

Interactive Model-Based Coding of Facial Image Sequence with a New Motion Detection Algorithm
Kazuo OHZEKI Takahiro SAITO Masahide KANEKO Hiroshi HARASHIMA

PAPER

Vol: E79-B No:10
Page(s): 1474-1483
To make model-based coding a practical method, new signal processing techniques other than fully-automatic image recognition should be studied. Once model-based coding has been realized, further signal processing techniques to improve its performance should also be studied. Moreover, non-coding functions related to model-based coding can be embedded as additional features. The authors are studying interactive model-based coding in order to achieve its practical realization, improve its performance and extend related non-coding functions. We have already proposed the basic concept of interactive model-based coding and presented an eyeglasses-processing technique that removes the frame from a facial image with glasses to improve the model-based coding performance. In this paper, we focus on the 3-D motion detection algorithm in interactive model-based coding. Previous works were mainly based on iterative methods to solve non-linear equations. A new motion detection algorithm is developed for interactive model-based coding. It is linear because the interactive operation generates more information and the environment of the applications limits the range of parameters. The depth parameter is first obtained from the fact that a line segment is invariant under 3-D space transformation; the relation of the distance between two points is utilized. The number of conditions is larger than the number of unknown variables, which allows the least-squares method to be used to obtain stable solutions in the environment of the applications. Experiments are carried out using the proposed motion detection method, and input noise problems are removed. A synthesized wireframe modified by eight parameters provides smooth and natural motion.

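The abstract's point that having more conditions than unknowns permits a least-squares solution can be sketched generically. The abstract does not give the actual motion equations, so the linear system below is purely hypothetical; `least_squares` is an illustrative helper that solves the normal equations by Gauss-Jordan elimination, not the authors' algorithm.

```python
def least_squares(A, b):
    """Minimize ||A x - b||^2 by solving the normal equations (A^T A) x = A^T b."""
    n = len(A[0])
    # Augmented matrix [A^T A | A^T b].
    M = [
        [sum(row[i] * row[j] for row in A) for j in range(n)]
        + [sum(row[i] * bi for row, bi in zip(A, b))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("system is rank-deficient")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every other row (Gauss-Jordan).
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


# Hypothetical overdetermined system: four conditions, two unknowns.
A = [[1, 0], [0, 1], [1, 1], [1, -1]]
b = [2, 3, 5, -1]
x = least_squares(A, b)
```

With redundant conditions, noise in any single measurement is averaged over all equations rather than propagated directly into the solution, which is the usual reason an overdetermined linear formulation gives more stable estimates than an exactly determined one.
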
Visual Tracking in Occlusion Environments by Autonomous Switching of Targets
Jun-ichi IMAI Masahide KANEKO

PAPER-Image Recognition, Computer Vision

Vol: E91-D No:1
Page(s): 86-95
Visual tracking is required by many vision applications such as human-computer interfaces and human-robot interactions. However, in daily living spaces where such applications are assumed to be used, stable tracking is often difficult because there are many objects which can cause visual occlusion. While conventional tracking techniques can handle, to some extent, partial and short-term occlusion, they fail when presented with complete occlusion over long periods. They also cannot handle the case in which an occluder such as a box or a bag contains and carries the tracking target inside itself, that is, the case in which the target moves invisibly while being contained by the occluder. In this paper, to handle this occlusion problem, we propose a method for visual tracking by a particle filter, which switches tracking targets autonomously. In our method, if occlusion occurs during tracking, a model of the occluder is dynamically created and the tracking target is switched to this model. Thus, our method enables the tracker to indirectly track the "invisible target" by effectively switching its target to the occluder. Experimental results show the effectiveness of our method.

Face Alignment Based on Statistical Models Using SIFT Descriptors
Zisheng LI Jun-ichi IMAI Masahide KANEKO

PAPER-Processing

Vol: E92-A No:12
Page(s): 3336-3343
Active Shape Model (ASM) is a powerful statistical tool for image interpretation, especially in face alignment. In the standard ASM, local appearances are described by intensity profiles, and the model parameter estimation is based on the assumption that the profiles follow a Gaussian distribution. The standard ASM suffers from variations in pose, illumination, expression and obstacles. In this paper, an improved ASM framework, GentleBoost-based SIFT-ASM, is proposed. Local appearances of landmarks are originally represented by SIFT (Scale-Invariant Feature Transform) descriptors, which are representations of image neighborhoods based on gradient orientation histograms. They can provide more robust and accurate guidance for search than grey-level profiles. Moreover, GentleBoost classifiers are applied to model and search the SIFT features instead of relying on the unnecessary assumption of a Gaussian distribution. Experimental results show that SIFT-ASM significantly outperforms the original ASM in aligning and localizing facial features.

Facial Expression Recognition Based on Facial Region Segmentation and Modal Value Approach
Gibran BENITEZ-GARCIA Gabriel SANCHEZ-PEREZ Hector PEREZ-MEANA Keita TAKAHASHI Masahide KANEKO

PAPER-Image Recognition, Computer Vision

Vol: E97-D No:4
Page(s): 928-935
This paper presents a facial expression recognition algorithm based on segmentation of a face image into four facial regions (eyes-eyebrows, forehead, mouth and nose). In order to unify the different results obtained from facial region combinations, a modal value approach that employs the most frequent decision of the classifiers is proposed. The robustness of the algorithm is also evaluated under partial occlusion, using four different types of occlusion (half left/right, eyes and mouth occlusion). The proposed method employs the sub-block eigenphases algorithm, which uses the phase spectrum and principal component analysis (PCA) for feature vector estimation; the feature vector is then fed to a support vector machine (SVM) for classification. Experimental results show that using the modal value approach improves the average recognition rate, achieving more than 90%, and that the performance can be kept high even in the case of partial occlusion by excluding occluded parts in the feature extraction process.

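The modal value approach, taking the most frequent decision among classifiers trained on different region combinations, can be sketched as a simple vote. This is an illustrative sketch only: the `modal_value` function, the dictionary layout and the example labels are hypothetical, and dropping the votes of classifiers that use an occluded region is an assumption about how the exclusion of occluded parts might be handled at decision time.

```python
from collections import Counter


def modal_value(decisions, occluded=()):
    """Return the most frequent label among classifiers that use no occluded region.

    decisions: dict mapping a tuple of region names to that classifier's label.
    occluded:  region names to exclude (assumed known in advance).
    Ties are resolved in favor of the label encountered first.
    """
    blocked = set(occluded)
    usable = [
        label for regions, label in decisions.items()
        if not blocked.intersection(regions)
    ]
    if not usable:
        raise ValueError("no classifier is free of occluded regions")
    return Counter(usable).most_common(1)[0][0]


# Hypothetical per-classifier decisions for one face image.
decisions = {
    ("eyes-eyebrows",): "happy",
    ("nose",): "happy",
    ("eyes-eyebrows", "mouth"): "happy",
    ("mouth",): "surprise",
    ("forehead",): "surprise",
}

overall = modal_value(decisions)
with_occlusion = modal_value(decisions, occluded=("eyes-eyebrows",))
```

In this toy data the unoccluded vote is `"happy"` (three votes to two), while excluding every classifier that depends on the eyes-eyebrows region flips the result to `"surprise"`, which shows why the choice of which region combinations remain available matters under occlusion.
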
Multicultural Facial Expression Recognition Based on Differences of Western-Caucasian and East-Asian Facial Expressions of Emotions
Gibran BENITEZ-GARCIA Tomoaki NAKAMURA Masahide KANEKO

PAPER-Machine Vision and its Applications

Publicized: 2018/02/16
Vol: E101-D No:5
Page(s): 1317-1324
An increasing number of psychological studies have demonstrated that the six basic expressions of emotions are not culturally universal. However, automatic facial expression recognition (FER) systems disregard these findings and assume that facial expressions are universally expressed and recognized across different cultures. Therefore, this paper presents an analysis of Western-Caucasian and East-Asian facial expressions of emotions based on visual representations and cross-cultural FER. The visual analysis builds on the Eigenfaces method, and the cross-cultural FER combines appearance and geometric features by extracting Local Fourier Coefficients (LFC) and Facial Fourier Descriptors (FFD), respectively. Furthermore, two possible solutions for FER under multicultural environments are proposed. These are based on early race detection and on independent models for the culture-specific facial expressions found by the analysis evaluation. HSV color quantization combined with LFC and FFD composes the feature extraction for race detection, whereas culture-independent models of anger, disgust and fear are analyzed for the second solution. All tests were performed using Support Vector Machines (SVM) for classification and evaluated using five standard databases. Experimental results show that both solutions improve on the accuracy of FER systems under multicultural environments. However, the approach which individually considers the culture-specific facial expressions achieved the highest recognition rate.

