Shota FUJII Shohei KAKEI Masanori HIROTOMO Makoto TAKITA Yoshiaki SHIRAISHI Masami MOHRI Hiroki KUZUNO Masakatu MORII
Haoran LUO Tengfei SHAO Tomoji KISHI Shenglei LI
Chee Siang LEOW Tomoki KITAGAWA Hideaki YAJIMA Hiromitsu NISHIZAKI
Dengtian YANG Lan CHEN Xiaoran HAO
Rong HUANG Yue XIE
Toshiki ONISHI Asahi OGUSHI Ryo ISHII Akihiro MIYATA
Meihua XUE Kazuki SUGITA Koichi OTA Wen GU Shinobu HASEGAWA
Jinyong SUN Zhiwei DONG Zhigang SUN Guoyong CAI Xiang ZHAO
Yusuke HIROTA Yuta NAKASHIMA Noa GARCIA
Yusuke HIROTA Yuta NAKASHIMA Noa GARCIA
Kosetsu TSUKUDA Tomoyasu NAKANO Masahiro HAMASAKI Masataka GOTO
ZhengYu LU PengFei XU
Binggang ZHUO Ryota HONDA Masaki MURATA
Qingqing YU Rong JIN
Huawei TAO Ziyi HU Sixian LI Chunhua ZHU Peng LI Yue XIE
Qianhang DU Zhipeng LIU Yaotong SONG Ningning WANG Zeyuan JU Shangce GAO
Ryota TOMODA Hisashi KOGA
Reina SASAKI Atsuko TAKEFUSA Hidemoto NAKADA Masato OGUCHI
So KOIDE Yoshiaki TAKATA Hiroyuki SEKI
Huang Rong Qian Zewen Ma Hao Han Zhezhe Xie Yue
Huu-Long PHAM Ryota MIBAYASHI Takehiro YAMAMOTO Makoto P. KATO Yusuke YAMAMOTO Yoshiyuki SHOJI Hiroaki OHSHIMA
Taku WAKUI Fumio TERAOKA Takao KONDO
Shaobao Wu Zhihua Wu Meixuan Huang
Koji KAMMA Toshikazu WADA
Dingjie PENG Wataru KAMEYAMA
Zhizhong WANG Wen GU Zhaoxing LI Koichi OTA Shinobu HASEGAWA
Tomoaki YAMAZAKI Seiya ITO Kouzou OHARA
Daihei ISE Satoshi KOBAYASHI
Masanari ICHIKAWA Yugo TAKEUCHI
Shota SUZUKI Satoshi ONO
Reoma MATSUO Toru KOIZUMI Hidetsugu IRIE Shuichi SAKAI Ryota SHIOYA
Hirotaka HACHIYA Fumiya NISHIZAWA
Issa SUGIURA Shingo OKAMURA Naoto YANAI
Mudai KOBAYASHI Mohammad Mikal Bin Amrul Halim Gan Takahisa SEKI Takahiro HIROFUCHI Ryousei TAKANO Mitsuhiro KISHIMOTO
Chi ZHANG Luwei ZHANG Toshihiko YAMASAKI
Jung Min Lim Wonho Lee Jun-Hyeong Choi Jong Wook Kwak
Zhuo ZHANG Donghui LI Kun JIANG Ya LI Junhu WANG Xiankai MENG
Takayoshi SHIKANO Shuichi ICHIKAWA
Shotaro ISHIKURA Ryosuke MINAMI Miki YAMAMOTO
Pengfei ZHANG Jinke WANG Yuanzhi CHENG Shinichi TAMURA
Fengqi GUO Qicheng LIU
Runlong HAO Hui LUO Yang LI
Rongchun XIAO Yuansheng LIU Jun ZHANG Yanliang HUANG Xi HAN
Yong JIN Kazuya IGUCHI Nariyoshi YAMAI Rei NAKAGAWA Toshio MURAKAMI
Toru HASEGAWA Yuki KOIZUMI Junji TAKEMASA Jun KURIHARA Toshiaki TANAKA Timothy WOOD K. K. RAMAKRISHNAN
Rikima MITSUHASHI Yong JIN Katsuyoshi IIDA Yoshiaki TAKAI
Zezhong LI Jianjun MA Fuji REN
Lorenzo Mamelona TingHuai Ma Jia Li Bright Bediako-Kyeremeh Benjamin Kwapong Osibo
Wonho LEE Jong Wook KWAK
Xiaoxiao ZHOU Yukinori SATO
Kento WATANABE Masataka GOTO
Kazuyo ONISHI Hiroki TANAKA Satoshi NAKAMURA
Takashi YOKOTA Kanemitsu OOTSU
Chenbo SHI Wenxin SUN Jie ZHANG Junsheng ZHANG Chun ZHANG Changsheng ZHU
Masateru TSUNODA Ryoto SHIMA Amjed TAHIR Kwabena Ebo BENNIN Akito MONDEN Koji TODA Keitaro NAKASAI
Masateru TSUNODA Takuto KUDO Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI Kenichi MATSUMOTO
Hiroaki AKUTSU Ko ARAI
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA
Takashi YOKOTA Kanemitsu OOTSU
Xiaokang Jin Benben Huang Hao Sheng Yao Wu
Tomoki MIYAMOTO
Ken WATANABE Katsuhide FUJITA
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE
Chi ZHANG Li TAO Toshihiko YAMASAKI
Ann Jelyn TIEMPO Yong-Jin JEONG
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Akira ITO Yoshiaki TAKAHASHI
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI
Chuzo IWAMOTO Ryo TAKAISHI
Koichi FUJII Tomomi MATSUI
Kazuyuki AMANO
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA
Hitoshi MURAKAMI Yutaro YAMAGUCHI
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Ming PAN
Ryo SAEGUSA Hitoshi SAKANO Shuji HASHIMOTO
Principal Component Analysis (PCA) has been applied in various areas such as pattern recognition and data compression. In some cases, however, PCA does not extract the characteristics of the data distribution efficiently. To overcome this problem, we have proposed a novel nonlinear PCA method that preserves the order of the principal components. In this paper, we reduce the dimensionality of image data using the proposed method and examine its effectiveness in the compression and recognition of images.
Din-Yuen CHAN Chih-Hsueh LIN Wen-Shyong HSIEH
This investigation proposes a fast wavelet-based color segmentation (FWCS) technique and a modified directional region-growing (DRG) technique for semantic image segmentation. FWCS applies progressive color truncation followed by histogram-based color extraction to segment color regions in images. Using specialized centroids of the segmented fragments as initial growing seeds, the proposed DRG performs directional 1-D region growing on pairs of color-segmented regions based on those centroids. When DRG positively confirms the two examined regions, the framework then computes texture features extracted from these two regions to further check their relation using texture similarity testing (TST). Any pair of regions that passes both the DRG and TST checks is identified as a pair of associated regions. If two associated regions or areas are connected, they are unified into a single area enclosed by one contour. Otherwise, the framework merely records a linking relation between the associated regions or areas and highlights it with a linking mark. In particular, through the systematic integration of all the proposed processes, the critical issue of deciding the final level of wavelet decomposition for various images can be solved efficiently in FWCS by a newly proposed quasi-linear high-frequency analysis model. The simulations demonstrate that the proposed segmentation framework can achieve quasi-semantic segmentation without a priori high-level knowledge.
Kenichi KANATANI Yasuyuki SUGAYA
We analyze the noise sensitivity of the focal length computation, the principal point estimation, and the orthogonality enforcement for single-view 3-D reconstruction based on vanishing points and orthogonality. We point out that, owing to the nonlinearity of the problem, standard statistical optimization is not very effective. We present a practical compromise that avoids computational failure while preserving high accuracy, yielding a consistent 3-D shape however large the noise.
Atsutada NAKATSUJI Yasuyuki SUGAYA Kenichi KANATANI
In reconstructing 3-D shape from images based on feature points, one usually defines a triangular mesh that has these feature points as vertices and displays the scene as a polyhedron. If the scene itself is a polyhedron, however, some of the displayed edges may be inconsistent with the true shape. This paper presents a new technique for automatically eliminating such inconsistencies by using a special template. We also present a technique for removing spurious occluding edges. None of the procedures requires any threshold to be adjusted. Using real images, we demonstrate that our method is highly capable of correcting inconsistencies.
Haiyuan WU Qian CHEN Toshikazu WADA
This paper describes a sophisticated method to estimate visual direction using iris contours. The method requires only one monocular image taken by a camera with unknown focal length. To estimate the visual direction, we assume that the visual directions of both eyes are parallel and that the iris boundaries are circles in 3D space. In this case, the two planes in which the iris boundaries lie are also parallel. We estimate the normal vector of the two planes from the iris contours extracted from an input image by using an extended "two-circle" algorithm. Unlike most existing gaze estimation algorithms, which require information about eye corners and heuristic knowledge about the 3D structure of the eye in addition to the iris contours, our method uses only the two iris contours. Another contribution of our method is its ability to estimate the focal length of the camera, which allows images to be taken with a zoom lens whose focal length can be adjusted at any time. Extensive experiments on simulated and real images demonstrate the robustness and effectiveness of our method.
Kazuhiro HOTTA Masaru TANAKA Takio KURITA Taketoshi MISHIMA
This paper presents a Dynamic Attention Map based on the Ising model for face detection. In general, a face detector cannot know in advance where faces are or how many there are. The detector must therefore search every region of the image, which requires much computational time. To speed up the search, the information obtained at previous search points should be used effectively. To make effective use of the face likelihood obtained at previous search points, the Ising model is applied to face detection. The Ising model has two-state spins: "up" and "down". The state of a spin is updated depending on the neighboring spins and an external magnetic field. Ising spins are assigned to the "face" and "non-face" states of face detection, and the measured face likelihood is integrated into the energy function of the Ising model as the external magnetic field. It is confirmed that spin-flip dynamics reduce the face candidates effectively. To improve search performance further, the single-level Ising search method is extended to a multilevel Ising search, in which the interactions between two layers, characterized by the renormalization group method, are used to reduce the face candidates. The effectiveness of the multilevel Ising search method is also confirmed by comparison with the single-level method.
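The spin-update idea in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 4-neighbor grid, the deterministic zero-temperature update, the coupling value, the field defined as likelihood minus a threshold, and the names `ising_sweep` and `face_candidates`.

```python
def ising_sweep(spins, field, coupling):
    """One deterministic sweep: each spin aligns with its local field.

    spins: 2-D list of +1 ("face") / -1 ("non-face").
    field: 2-D list acting as the external magnetic field.
    Updates are applied in place, in row-major order (an assumption).
    """
    rows, cols = len(spins), len(spins[0])
    new = [row[:] for row in spins]
    for r in range(rows):
        for c in range(cols):
            neighbor_sum = 0
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    neighbor_sum += new[rr][cc]
            local = coupling * neighbor_sum + field[r][c]
            if local > 0:
                new[r][c] = 1
            elif local < 0:
                new[r][c] = -1
            # local == 0: leave the spin unchanged
    return new


def face_candidates(likelihood, coupling=0.25, threshold=0.5, sweeps=5):
    """Return grid positions whose spin is still "face" after the sweeps."""
    # Hypothetical choice: the external field is the likelihood minus a threshold.
    field = [[p - threshold for p in row] for row in likelihood]
    # Hypothetical initialization: start from the thresholded likelihood.
    spins = [[1 if h > 0 else -1 for h in row] for row in field]
    for _ in range(sweeps):
        spins = ising_sweep(spins, field, coupling)
    return [(r, c) for r, row in enumerate(spins)
            for c, s in enumerate(row) if s == 1]
```

In this sketch an isolated, weakly supported point is flipped to "non-face" by its neighbors, while a cluster of strongly supported points survives, which is the kind of candidate reduction the abstract describes.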
Widhyakorn ASDORNWISED Somchai JITAPUNKUL
In this paper, a source coding model for learning multiple concept descriptions of data is proposed. Our source coding model is based on the concept of transmitting data over multiple channels, called multiple description (MD) coding. In particular, frame expansions are used in our MD coding models for pattern classification. Under this model, a class of multiple classifier algorithms shares several interesting properties with our proposed scheme. Generalizing the MD view, under an extension of the local discriminant basis towards the theory of frames, allows the formulation of a generalized class of low-complexity learning algorithms applicable to high-dimensional pattern classification. To evaluate this approach, performance results for automatic target recognition (ATR) are presented for synthetic aperture radar (SAR) images from the MSTAR public release data set. The experimental results show that our approach outperforms state-of-the-art methods such as the conditional Gaussian signal model, AdaBoost, and ECOC-SVM.
Tomohiko OHTSUKA Takeshi TAKAHASHI
This paper describes a new approach to detecting the fingerprint core location using the extended relational graph, which is generated by segmenting the ridge directional image. The extended relational graph represents the adjacency between segments of the directional image and the boundary information between those segments. The boundary curves generated from the boundary information in the extended relational graph are approximated by straight lines. The fingerprint core location is calculated as the center of gravity of the intersection points of these approximating lines. Experimental results show that the core location was successfully detected in 90.8% of the 130 fingerprint samples.
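The final step described above, taking the center of gravity of the intersection points of the approximating lines, can be sketched as follows. The line representation `(a, b, c)` for `a*x + b*y = c`, the equal weighting of all pairwise intersections, and the function names are assumptions for illustration; the abstract does not specify them.

```python
from itertools import combinations


def intersect(line_a, line_b, eps=1e-12):
    """Intersection of two lines given as (a, b, c) with a*x + b*y = c.

    Returns None when the lines are (numerically) parallel.
    """
    a1, b1, c1 = line_a
    a2, b2, c2 = line_b
    det = a1 * b2 - a2 * b1
    if abs(det) < eps:
        return None
    # Cramer's rule for the 2x2 linear system.
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return (x, y)


def core_estimate(lines):
    """Center of gravity of all pairwise intersection points, or None."""
    points = []
    for line_a, line_b in combinations(lines, 2):
        point = intersect(line_a, line_b)
        if point is not None:
            points.append(point)
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)
```

For example, the lines x = 0, y = 0, and x + y = 3 intersect at (0, 0), (0, 3), and (3, 0), so the estimate is (1, 1).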
We propose a learning method combining query learning and a "genetic translator" we previously developed. Query learning is a useful technique for high-accuracy, high-speed learning and reduction of training sample size. However, it has not been applied to practical optical character readers (OCRs) because human beings cannot recognize queries as character images in the feature space used in practical OCR devices. We previously proposed a character image reconstruction method using a genetic algorithm. This method is applied as a "translator" from feature space for query learning of character recognition. The results of an experiment with hand-written numeral recognition show the possibility of training sample size reduction.
A novel digital redesign methodology based on evolutionary programming (EP) is introduced to find the 'best' digital controller for optimal tracking design of hybrid uncertain multi-input/multi-output (MIMO) input-delay systems with constraints on states and controls. To deal with these concurrent multivariable specifications and system restrictions, the proposed global optimization scheme, used in place of conventional interval methods, can practically implement an optimal digital controller for constrained uncertain hybrid systems with input time delay. An illustrative example is included to demonstrate the efficiency of the proposed method.
In order to solve large Multidimensional Knapsack problems we examine a technique which decomposes a problem instance into two parts. The first part is solved using a traditional technique, such as Dynamic Programming, to reduce the number of variables in the problem by creating a single variable with many non-dominated states. In the second part the remaining variables are determined by an algorithm that repeatedly enumerates them with different constraint and objective requirements. The constraint and objective requirements are imposed by the various non-dominated states of the variable created in the first part of this technique. The main advantage of this approach is that when memory requirements prevent traditional techniques solving a problem instance, the enumeration provides a much less memory-intensive method, enabling a solution to be found. Two approaches are proposed for repeatedly enumerating a 0/1 Multidimensional Knapsack problem. It is demonstrated how these enumeration methods, in conjunction with the Modular Approach, were used to find the optimal solutions to a number of 500-variable, 5-constraint Multidimensional Knapsack problem instances proposed in the literature. The exact solutions to these instances were previously unknown.
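The two-part decomposition described above can be sketched on a toy scale. This is a simplified illustration under stated assumptions, not the authors' Modular Approach: the first `split` items are collapsed by dynamic programming into a set of non-dominated (value, resource-usage) states, and the remaining items are then enumerated by brute force against the residual capacity of each state. The function names and the `split` parameter are hypothetical.

```python
from itertools import product


def dominates(a, b):
    """State a dominates b: at least the value, no more of any resource."""
    (va, ua), (vb, ub) = a, b
    return a != b and va >= vb and all(x <= y for x, y in zip(ua, ub))


def prune(states):
    """Drop duplicates and dominated states."""
    unique = list(dict.fromkeys(states))
    return [s for s in unique if not any(dominates(t, s) for t in unique)]


def non_dominated_states(items, capacities):
    """Part 1: DP over items, keeping only feasible non-dominated states.

    items: list of (value, weights), weights having one entry per constraint.
    """
    states = [(0, tuple(0 for _ in capacities))]
    for value, weights in items:
        extended = list(states)
        for v, used in states:
            new_used = tuple(u + w for u, w in zip(used, weights))
            if all(u <= cap for u, cap in zip(new_used, capacities)):
                extended.append((v + value, new_used))
        states = prune(extended)
    return states


def solve(items, capacities, split):
    """Part 2: enumerate the remaining items once per non-dominated state."""
    states = non_dominated_states(items[:split], capacities)
    rest = items[split:]
    best = 0
    for v, used in states:
        residual = [cap - u for cap, u in zip(capacities, used)]
        for choice in product((0, 1), repeat=len(rest)):
            load = [sum(c * item[1][d] for c, item in zip(choice, rest))
                    for d in range(len(capacities))]
            if all(l <= r for l, r in zip(load, residual)):
                gain = sum(c * item[0] for c, item in zip(choice, rest))
                best = max(best, v + gain)
    return best
```

The brute-force loop in `solve` holds only one candidate at a time, which mirrors the memory argument in the abstract; a practical enumeration would need pruning to scale beyond toy instances.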
Document merging is essential to synchronizing several versions of a document concurrently edited by two or more users. A few methods for merging structured documents have been proposed so far, yet these methods may not always merge the given documents appropriately. As an aid to finding an appropriate merging, we propose a polynomial-time algorithm for merging structured documents that takes a different approach. In this approach, we merge two given documents (treated as ordered trees) by optimally transforming them into isomorphic ones, using operations such as add (add a new node), del (delete an existing node), and upd (make two nodes have the same label).
Yousif A. EL-IMAM Zuraidah Mohd DON
Phonetic transcription of text is an indispensable component of text-to-speech (TTS) systems and is used in acoustic modeling for speech recognition and other natural language processing applications. One approach to the transcription of written text into phonetic entities or sounds is to use a set of well-defined context- and language-dependent rules. The process of transcribing text into sounds starts by preprocessing the text and representing it by lexical items to which the rules are applicable. The rules can be divided into phonemic and phonetic rules. Phonemic rules operate on graphemes to convert them into phonemes. Phonetic rules operate on phonemes and convert them into context-dependent phonetic entities with actual sounds. Converting written text into actual sounds, developing a comprehensive set of rules, and transforming the rules into implementable algorithms for any language raise several problems that originate in the relative lack of correspondence between the spelling of lexical items and their sound contents. For Standard Malay (SM) these problems are not as severe as for languages with complex spelling systems, such as English and French, but they do exist. This paper discusses the development of a comprehensive computerized system for processing SM text and transcribing it into phonetic entities, and the evaluation of the performance of this system irrespective of the application.
In particular, the following issues are dealt with in this paper: (1) the spelling and other problems of SM writing and their impact on converting graphemes into phonemes, (2) the development of a comprehensive set of grapheme-to-phoneme rules for SM, (3) a description of the phonetic variations of SM or how the phonemes of SM vary in context and the development of a set of phoneme-to-phonetic transcription rules, (4) the formulation of the phonemic and phonetic rules into algorithms that are applicable to the computer-based processing of input SM text, and (5) the evaluation of the performance of the process of converting SM text into actual sounds by the above mentioned methods.
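A rule-based grapheme-to-phoneme step of the kind listed in item (2) can be pictured with a small longest-match sketch. The rule table below is a tiny illustrative subset chosen for this example and is not the authors' rule set; it ignores context entirely (for instance, it does not resolve the two pronunciations of the letter "e"), and the function name is hypothetical.

```python
# Illustrative subset only: a few Malay digraphs and single letters.
RULES = {
    "ng": "ŋ",
    "ny": "ɲ",
    "sy": "ʃ",
    "kh": "x",
    "c": "tʃ",
    "j": "dʒ",
}


def graphemes_to_phonemes(word, rules=RULES):
    """Convert a word to a phoneme list by longest-match rule lookup.

    Graphemes without a rule are passed through unchanged.
    """
    word = word.lower()
    max_len = max(len(grapheme) for grapheme in rules)
    phonemes = []
    i = 0
    while i < len(word):
        # Try the longest grapheme first so "ng" wins over "n" + "g".
        for size in range(max_len, 0, -1):
            chunk = word[i:i + size]
            if chunk in rules:
                phonemes.append(rules[chunk])
                i += len(chunk)
                break
        else:
            phonemes.append(word[i])
            i += 1
    return phonemes
```

A full system along the lines of the paper would add context conditions to each rule and a second pass of phoneme-to-phonetic rules; this sketch covers only the context-free lookup.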
This paper describes a pattern classifier for detecting frontal-view faces by learning a decision boundary. The proposed classifier consists of two major parts for improving classification accuracy: the implicit modeling of both the face and the near-face classes, resulting in an extended discriminative feature set, and the subsequent composite Support Vector Machines (SVMs) for speeding up the classification. For the extended discriminative feature set, Principal Component Analysis (PCA) or Independent Component Analysis (ICA) is performed for the face and near-face classes separately. The projections and distances to the two different subspaces are complementary, which significantly enhances the classification accuracy of the SVM. Multiple nonlinear SVMs are trained for the local facial feature spaces, considering the general multi-modal characteristic of the face space. Each component SVM has a simpler boundary than that of a single SVM for the whole face space. The most appropriate component SVM is selected by a gating mechanism based on clustering. Classifying with one of the multiple SVMs guarantees good generalization performance and speeds up face detection. Finally, the proposed classifier is implemented to work in real time by cascading it with a boosting-based face detector.
Yousun KANG Ken'ichi MOROOKA Hiroshi NAGAHASHI
As a representative linear discriminant analysis method, the Fisher method is the most widely used in practice and is very effective in two-class classification. However, when it is extended to multi-class classification problems, its discrimination accuracy may deteriorate. A main reason is that class distributions overlap in the discriminant space built by the Fisher criterion. To take such overlaps among classes into consideration, our approach builds a new discriminant space by hierarchically classifying the overlapped classes. In this paper, we propose a new hierarchical discriminant analysis for texture classification. We divide the discriminant space into subspaces by recursively grouping the overlapped classes. In the experiment, texture images from many classes are classified with the proposed method, which shows outstanding results compared with the conventional Fisher method.
To develop human interfaces such as home information equipment, highly capable word learning ability is required. In particular, in order to realize user-customized and situation-dependent interaction using language, a function is needed that can build new categories online in response to presented objects for an advanced human interface. However, at present, there are few basic studies focusing on the purpose of language acquisition with category formation. In this study, taking hints from an analogy between machine learning and infant developmental word acquisition, we propose a taxonomy-based word-learning model using a neural network. Through computer simulations, we show that our model can build categories and find the name of an object based on categorization.
In this paper, we present a learning approach, positive correlation learning (PCL), that creates a multilayer neural network with good generalization ability. A correlation function is added to the standard error function of back-propagation learning, and the error function is minimized by a steepest-descent method. During training, all the unnecessary units in the hidden layer are positively correlated with the necessary ones. PCL can therefore create positively correlated activities of hidden units in response to input patterns. We show that PCL can reduce the information on the input patterns and decay the weights, which leads to improved generalization ability. Here, the information is defined with respect to hidden unit activity, since the hidden units play a crucial role in storing the information on the input patterns. That is, as previously proposed, the information is defined as the difference between the uncertainty of the hidden unit at the initial stage of learning and its uncertainty at the final stage of learning. After deriving new weight update rules for PCL, we applied the method to several standard benchmark classification problems, such as the breast cancer, diabetes, and glass identification problems. Experimental results confirmed that PCL produces positively correlated hidden units and significantly reduces the amount of information, resulting in improved generalization ability.
The most obvious architectural solution for high-speed fuzzy inference is to exploit the temporal and spatial parallelism inherent in a fuzzy inference execution. In practice, however, the active rules in each fuzzy inference execution are often only a small part of the total rule set. In this paper, we present a new architecture that uses fewer hardware resources by discarding non-active rules in an earlier pipeline stage. Implementation data show that, compared with previous work, the proposed architecture achieves very good results in terms of inference speed and chip area.
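The idea of discarding non-active rules early can be illustrated in software, although the paper concerns a hardware pipeline. The sketch below is an analogy under assumptions of our own: triangular membership functions, `min` as the conjunction, a weighted-average (zero-order Sugeno-style) output, and hypothetical names throughout. Rules with zero firing strength are dropped in the first stage, so the second stage only processes active rules.

```python
def triangle(x, left, peak, right):
    """Triangular membership function; zero outside (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def infer(rules, inputs):
    """Two-stage inference that filters out non-active rules early.

    rules: list of (antecedents, output), where antecedents maps an input
           name to (left, peak, right) and output is a crisp value.
    Returns (result, number_of_active_rules); result is None if no rule fires.
    """
    # Stage 1: compute firing strengths and discard non-active rules.
    active = []
    for antecedents, output in rules:
        strength = min(triangle(inputs[name], *params)
                       for name, params in antecedents.items())
        if strength > 0.0:
            active.append((strength, output))

    # Stage 2: aggregate over the active rules only.
    if not active:
        return None, 0
    total = sum(strength for strength, _ in active)
    result = sum(strength * output for strength, output in active) / total
    return result, len(active)
```

With three rules over overlapping temperature ranges and an input of 15, only two rules fire, so the aggregation stage handles two rules instead of three.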
Kazushi TANIHIRA Hiromi KOBAYASHI
This paper presents properties of role-based access control that were obtained through the development of a prototype teaching management system. These properties relate to the assignment of temporal constraints and to the access control procedure in terms of the corresponding flow of the user's view, and they are considered applicable to other information systems.
An efficient hybrid image vector quantization (VQ) technique based on a classification in the DCT domain is presented in this letter. This algorithm combines two kinds of VQ, predictive VQ (PVQ) and discrete cosine transform domain VQ (DCTVQ), and adopts a simple classifier which employs only three DCT coefficients in the 8
Elhassane IBNELHAJ Driss ABOUTAJDINE
In this paper we present a 3D adaptive nonlinear filter, namely the 3D adaptive CPWLN, based on the Canonical Piecewise-Linear Network with an LMS L-filter type of adaptation. This filter is used to equalize nonlinear channel effects and to remove impulsive noise, or mixed impulsive and additive white Gaussian noise, from video sequences. First, motion compensation is performed by a robust estimator. Then, a 3D CPWLN LMS L-filter is applied. The overall combination adequately removes the undesired effects of the communication channel and noise. Computer simulations on real-world image sequences are included. The algorithm yields promising results in terms of both the objective and subjective quality of the restored sequence.
We propose a system that enables us to gather hundreds of images related to one set of keywords provided by a user from the World Wide Web. The system is called Image Collector II. The Image Collector, which we proposed previously, can gather only one or two hundred images. We propose the following two improvements on our previous system in terms of the number of gathered images and their precision: (1) In an initial image gathering, we extract words that appear with high frequency in all the HTML files in which output images are embedded, and we carry out a second image gathering using them as keywords. Through this process, we can obtain hundreds of images for one set of keywords. (2) The more images we gather, the lower the precision of the gathered images becomes. To improve the precision, we introduce word vectors of the HTML files that embed the images into the image selection process, in addition to image feature vectors.
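Improvement (1), extracting high-frequency words from the HTML files of the first gathering to use as keywords for a second one, can be sketched with the Python standard library. The stopword list, the tokenization by lowercase ASCII letters, counting every occurrence rather than once per file, and the function names are all assumptions for illustration; the abstract does not describe these details.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text content of an HTML document, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_words(html):
    """Lowercase alphabetic tokens found in the text of an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z]+", " ".join(parser.chunks).lower())


# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "a", "of", "and", "in", "is", "to"}


def frequent_words(pages, query_words, top_n=3):
    """Most frequent words across pages, excluding stopwords and the query."""
    counts = Counter()
    for html in pages:
        counts.update(word for word in html_to_words(html)
                      if word not in STOPWORDS and word not in query_words)
    return [word for word, _ in counts.most_common(top_n)]
```

Note that this simple extractor also picks up text inside script and style elements, which a real crawler would want to skip.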