IEICE TRANSACTIONS on Information

Impact Factor

0.59
Eigenfactor

0.002
article influence

0.1
Cite Score

1.4

To the Advance publication
To the Archives

Advance publication (published online immediately after acceptance)

A+Block: Web Security Add-on Service by Just Pointing to Your Website URL
Shota FUJII Shohei KAKEI Masanori HIROTOMO Makoto TAKITA Yoshiaki SHIRAISHI Masami MOHRI Hiroki KUZUNO Masakatu MORII

Pubricized:
2025/02/06
PAPER
- Summary
- Free PDF (647.3KB)
Aspect-level Cross-linguistic Multi-layer Sentiment Analysis: A Case Study on User Preferences for Mask Attributes During the COVID-19 Pandemic
Haoran LUO Tengfei SHAO Tomoji KISHI Shenglei LI

Pubricized:
2025/01/31
PAPER
- Summary
- Free PDF (3.2MB)
Handwritten Character Image Generation for Effective Data Augmentation
Chee Siang LEOW Tomoki KITAGAWA Hideaki YAJIMA Hiromitsu NISHIZAKI

Pubricized:
2025/01/31
PAPER
- Summary
- Free PDF (4.2MB)
Solve Point Competition with Label Reassignment in Unmanned Aerial Vehicle Object Detector
Dengtian YANG Lan CHEN Xiaoran HAO

Pubricized:
2025/01/27
LETTER
- Summary
- Free PDF (2.3MB)
DPDF-AEC: MULTI-SCALE DUAL PATH CONVOLUTION RECURRENT NETWORK WITH DEEP FILTERING BLOCK FOR ACOUSTIC ECHO CANCELLATION
Rong HUANG Yue XIE

Pubricized:
2025/01/23
LETTER
- Summary
- Free PDF (1.5MB)
Detecting Praising Behaviors Based on Multimodal Information
Toshiki ONISHI Asahi OGUSHI Ryo ISHII Akihiro MIYATA

Pubricized:
2025/01/23
PAPER
- Summary
- Free PDF (250.2KB)
PICSU: A Japanese Vocabulary Learning System with Object Recognition and Lexical Database
Meihua XUE Kazuki SUGITA Koichi OTA Wen GU Shinobu HASEGAWA

Pubricized:
2025/01/23
PAPER
- Summary
- Free PDF (1.7MB)
A Spatial Graph Convolutional Network based on Node Semantic for Graph Classification
Jinyong SUN Zhiwei DONG Zhigang SUN Guoyong CAI Xiang ZHAO

Pubricized:
2025/01/20
PAPER
- Summary
- Free PDF (1.5MB)
Societal Bias in Image Captioning: Identifying and Measuring Bias Amplification
Yusuke HIROTA Yuta NAKASHIMA Noa GARCIA

Pubricized:
2025/01/20
PAPER
- Summary
- Free PDF (1.8MB)
Mitigating Gender Bias Amplification in Image Captioning
Yusuke HIROTA Yuta NAKASHIMA Noa GARCIA

Pubricized:
2025/01/20
PAPER
- Summary
- Free PDF (2.1MB)
Toward an Understanding of Musical Factors in Judging a Song on First Listen
Kosetsu TSUKUDA Tomoyasu NAKANO Masahiro HAMASAKI Masataka GOTO

Pubricized:
2025/01/17
PAPER
- Summary
- Free PDF (1.5MB)
Hail Intelligent Recognition Algorithm Based on HAM-Unet
ZhengYu LU PengFei XU

Pubricized:
2025/01/17
PAPER
- Summary
- Free PDF (1.3MB)
Title Information for Transformer-based Japanese Document Emphasis
Binggang ZHUO Ryota HONDA Masaki MURATA

Pubricized:
2025/01/16
PAPER
- Summary
- Free PDF (2.4MB)
Improved Quantum Approximate Optimization Algorithm based on Conditional Value-at-Risk for Portfolio Optimization
Qingqing YU Rong JIN

Pubricized:
2025/01/15
PAPER
- Summary
- Free PDF (2.4MB)
An Interpretable Multi-level Feature Disentanglement Algorithm for Speech Emotion Recognition
Huawei TAO Ziyi HU Sixian LI Chunhua ZHU Peng LI Yue XIE

Pubricized:
2025/01/10
LETTER
- Summary
- Free PDF (1.5MB)
A Lightweight Dendritic ShuffleNet for Medical Image Classification
Qianhang DU Zhipeng LIU Yaotong SONG Ningning WANG Zeyuan JU Shangce GAO

Pubricized:
2025/01/10
PAPER
- Summary
- Free PDF (850.2KB)
Section Min-Hash Approximating Time Series Search based on Dynamic Time Warping
Ryota TOMODA Hisashi KOGA

Pubricized:
2025/01/09
PAPER
- Summary
- Free PDF (872.2KB)
Communication Performance of ROS and ROS 2-based IoT Systems for Smart Home Applications
Reina SASAKI Atsuko TAKEFUSA Hidemoto NAKADA Masato OGUCHI

Pubricized:
2025/01/09
PAPER
- Summary
- Free PDF (1.6MB)
Non-Cooperative Rational Synthesis Problem for Probabilistic Strategies
So KOIDE Yoshiaki TAKATA Hiroyuki SEKI

Pubricized:
2025/01/09
LETTER
- Summary
- Free PDF (114.5KB)
Sports performance prediction for college students through ensemble learning algorithm
Huang Rong Qian Zewen Ma Hao Han Zhezhe Xie Yue

Pubricized:
2025/01/07
PAPER
- Summary
- Free PDF (855.5KB)
Pre-trained BERT Model Retrieval: Inference-Based No-Learning Approach using k-Nearest Neighbour Algorithm
Huu-Long PHAM Ryota MIBAYASHI Takehiro YAMAMOTO Makoto P. KATO Yusuke YAMAMOTO Yoshiyuki SHOJI Hiroaki OHSHIMA

Pubricized:
2025/01/07
PAPER
- Summary
- Free PDF (996KB)
GAMPALv2: An Anomaly Detection Mechanism for Internet Traffic by Predicting Flow Size Range from Time Features
Taku WAKUI Fumio TERAOKA Takao KONDO

Pubricized:
2025/01/07
PAPER
- Summary
- Free PDF (1.9MB)
Copy-Move Forgery Detection via Dimensionality Reduction and Double Quantization Feature
Shaobao Wu Zhihua Wu Meixuan Huang

Pubricized:
2025/01/06
PAPER
- Summary
- Free PDF (5.1MB)
POEM: Pruning with Output Error Minimization for Compressing Deep Neural Networks
Koji KAMMA Toshikazu WADA

Pubricized:
2024/12/27
PAPER
- Summary
- Free PDF (532.8KB)
Structural Relation Multi-class Token Transformer for Weakly Supervised Semantic Segmentation
Dingjie PENG Wataru KAMEYAMA

Pubricized:
2024/12/25
PAPER
- Summary
- Free PDF (1.6MB)
Spotlight Contents Extraction from Text-based Online Discussion
Zhizhong WANG Wen GU Zhaoxing LI Koichi OTA Shinobu HASEGAWA

Pubricized:
2024/12/20
PAPER
- Summary
- Free PDF (776.2KB)
Evaluation of Sense Embeddings with Optimized Vector-Meaning Correspondence
Tomoaki YAMAZAKI Seiya ITO Kouzou OHARA

Pubricized:
2024/12/20
PAPER
- Summary
- Free PDF (1MB)
A Necessary and Sufficient Condition for Controlled Generation of Right Linear Grammars with Unknown Behaviors
Daihei ISE Satoshi KOBAYASHI

Pubricized:
2024/12/19
PAPER
- Summary
- Free PDF (424.5KB)
Online Communication Environment Design for Encouraging Reciprocal Use of Information among Groups
Masanari ICHIKAWA Yugo TAKEUCHI

Pubricized:
2024/12/19
PAPER
- Summary
- Free PDF (3.5MB)
Self-supervised Neural Architecture Search for Multimodal Deep Neural Networks
Shota SUZUKI Satoshi ONO

Pubricized:
2024/12/18
LETTER
- Summary
- Free PDF (361.6KB)
Enhancing GPU Performance through Complexity-Effective Out-of-Order Execution using Distance-based ISA
Reoma MATSUO Toru KOIZUMI Hidetsugu IRIE Shuichi SAKAI Ryota SHIOYA

Pubricized:
2024/12/16
PAPER
- Summary
- Free PDF (4.2MB)
A Transformer-based fully trainable point process
Hirotaka HACHIYA Fumiya NISHIZAWA

Pubricized:
2024/12/12
PAPER
- Summary
- Free PDF (1.4MB)
Removing Mislabeled Data from Trained Models via Machine Unlearning
Issa SUGIURA Shingo OKAMURA Naoto YANAI

Pubricized:
2024/12/11
PAPER
- Summary
- Free PDF (261.5KB)
A component placement mechanism for latency-constrained applications in cloud-edge environments
Mudai KOBAYASHI Mohammad Mikal Bin Amrul Halim Gan Takahisa SEKI Takahiro HIROFUCHI Ryousei TAKANO Mitsuhiro KISHIMOTO

Pubricized:
2024/12/11
LETTER
- Summary
- Free PDF (476.1KB)
AdaCF: Adaptively Updating Adjacency Matrix in Collaborative Filtering
Chi ZHANG Luwei ZHANG Toshihiko YAMASAKI

Pubricized:
2024/12/10
PAPER
- Summary
- Free PDF (1.6MB)
CLOCK-DPP: Hybrid disk buffer replacement policy for SSDs with dirty page preservation for write intensive environments
Jung Min Lim Wonho Lee Jun-Hyeong Choi Jong Wook Kwak

Pubricized:
2024/12/10
LETTER
- Summary
- Free PDF (1.2MB)
Propagation-based Code Clone Analysis for Detecting Smart Contract Vulnerability
Zhuo ZHANG Donghui LI Kun JIANG Ya LI Junhu WANG Xiankai MENG

Pubricized:
2024/12/10
LETTER
- Summary
- Free PDF (1.9MB)
Performance Enhancement of the LFSR-Based Unpredictable Random Number Generator in Rocket Core
Takayoshi SHIKANO Shuichi ICHIKAWA

Pubricized:
2024/12/10
PAPER
- Summary
- Free PDF (198.4KB)
BMR: a new BBR algorithm with moderate delivery rate
Shotaro ISHIKURA Ryosuke MINAMI Miki YAMAMOTO

Pubricized:
2024/12/10
PAPER
- Summary
- Free PDF (1.5MB)
Clinical Learning Order-guided Deep Neural Network For Brain Tumor Segmentation
Pengfei ZHANG Jinke WANG Yuanzhi CHENG Shinichi TAMURA

Pubricized:
2024/12/06
PAPER
- Summary
- Free PDF (1.9MB)
A Boosting Method Based on Center-of-gravity Oversampling and Pruning for Classifying Imbalanced Data
Fengqi GUO Qicheng LIU

Pubricized:
2024/12/05
PAPER
- Summary
- Free PDF (1.7MB)
Multi-modal Fake News Detection Enhanced by Fine-grained Knowledge Graph
Runlong HAO Hui LUO Yang LI

Pubricized:
2024/11/29
PAPER
- Summary
- Free PDF (2.2MB)
FusionReg: LiDAR-Camera Fusion Regression Enhancement for 3D Object Detection
Rongchun XIAO Yuansheng LIU Jun ZHANG Yanliang HUANG Xi HAN

Pubricized:
2024/11/27
PAPER
- Summary
- Free PDF (2.6MB)
An Accelerated Integrity-secured Name Resolution Architecture Using Two Full-service Resolvers with and without DNSSEC Validation in Parallel
Yong JIN Kazuya IGUCHI Nariyoshi YAMAI Rei NAKAGAWA Toshio MURAKAMI

Pubricized:
2024/11/22
PAPER
- Summary
- Free PDF (1010.6KB)
Leveraging Heterogeneous Programmable Data Planes for Security and Privacy of Cellular Networks, 5G & Beyond
Toru HASEGAWA Yuki KOIZUMI Junji TAKEMASA Jun KURIHARA Toshiaki TANAKA Timothy WOOD K. K. RAMAKRISHNAN

Pubricized:
2024/11/21
INVITED PAPER
- Summary
- Free PDF (10.1MB)
DGA-based Malware Communication Detection from DoH Traffic Using Hierarchical Machine Learning Analysis
Rikima MITSUHASHI Yong JIN Katsuyoshi IIDA Yoshiaki TAKAI

Pubricized:
2024/11/21
PAPER
- Summary
- Free PDF (3.8MB)
Exploiting Multi-Level Data Uncertainty for Japanese-Chinese Neural Machine Translation
Zezhong LI Jianjun MA Fuji REN

Pubricized:
2024/11/19
LETTER
- Summary
- Free PDF (275.3KB)
Improving Sentiment Analysis with an Ensemble Transformers Model on Health Pandemic based on Twitter data
Lorenzo Mamelona TingHuai Ma Jia Li Bright Bediako-Kyeremeh Benjamin Kwapong Osibo

Pubricized:
2024/11/19
PAPER
- Summary
- Free PDF (1.1MB)
APW: Asymmetric Padded Winograd to reduce thread divergence for computational efficiency on SIMT architecture
Wonho LEE Jong Wook KWAK

Pubricized:
2024/11/14
LETTER
- Summary
- Free PDF (1.1MB)
Effects of Numerical Method Selection on Fully-Pipelined FPGA Accelerators for Neural Simulations
Xiaoxiao ZHOU Yukinori SATO

Pubricized:
2024/11/13
PAPER
- Summary
- Free PDF (1.5MB)
A Text-to-Lyrics Generation Method Leveraging Image-based Semantics and Reducing Plagiarism Risk
Kento WATANABE Masataka GOTO

Pubricized:
2024/11/13
PAPER
- Summary
- Free PDF (1.6MB)
Multimodal Voice Activity Projection for Turn-taking and Effects on Speaker Adaptation
Kazuyo ONISHI Hiroki TANAKA Satoshi NAKAMURA

Pubricized:
2024/11/13
PAPER
- Summary
- Free PDF (1.6MB)
On a Perturbation Concept in Regular Interconnection Networks
Takashi YOKOTA Kanemitsu OOTSU

Pubricized:
2024/11/12
LETTER
- Summary
- Free PDF (217.3KB)
Bayesian-Optimization-based auto Optical Mark array recognition for flexible paper answer sheet
Chenbo SHI Wenxin SUN Jie ZHANG Junsheng ZHANG Chun ZHANG Changsheng ZHU

Pubricized:
2024/11/12
PAPER
- Summary
- Free PDF (3.9MB)
Selecting Source Code Generation Tools Based on Bandit Algorithms
Masateru TSUNODA Ryoto SHIMA Amjed TAHIR Kwabena Ebo BENNIN Akito MONDEN Koji TODA Keitaro NAKASAI

Pubricized:
2024/11/11
LETTER
- Summary
- Free PDF (130.3KB)
On the Application of Bandit Algorithm for Selecting Clone Detection Methods
Masateru TSUNODA Takuto KUDO Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI Kenichi MATSUMOTO

Pubricized:
2024/11/11
LETTER
- Summary
- Free PDF (1.1MB)
Lightweight Neural Data Sequence Modeling by Scale Causal Blocks
Hiroaki AKUTSU Ko ARAI

Pubricized:
2024/11/08
PAPER
- Summary
- Free PDF (1.6MB)
Recaptured Image Detection Based on Multi-Scale Residual Features of Discriminative Regions
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI

Pubricized:
2024/11/08
PAPER
- Summary
- Free PDF (5.9MB)
Learn Discriminative Features for Small Object Detection through Multi-scale Image Degradation with Contrastive Learning
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG

Pubricized:
2024/11/05
PAPER
- Summary
- Free PDF (1.4MB)
Joint Distribution-Aligned Dual-Sparse Linear Regression for Cross-Stimulus Speech-Based Depression Detection
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG

Pubricized:
2024/11/01
LETTER
- Summary
- Free PDF (229.4KB)
Multi-grained Guaranteeable Requirement Analysis for Iterative Adaptation
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI

Pubricized:
2024/10/31
PAPER
- Summary
- Free PDF (567.6KB)
A fully digital transmitting-receiving platform for MIMO radar waveform diversity experiment
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI

Pubricized:
2024/10/30
PAPER
- Summary
- Free PDF (7MB)
Leveraging Different Boolean Function Decompositions to Reduce T-Count in LUT-based Quantum Circuit Synthesis
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA

Pubricized:
2024/10/30
PAPER
- Summary
- Free PDF (1.3MB)
Criticality and Tolerance in Injection Timing in Cup-Stacking Method for Collective Communication
Takashi YOKOTA Kanemitsu OOTSU

Pubricized:
2024/10/28
PAPER
- Summary
- Free PDF (1.5MB)
An anchor-free Siamese tracker with multi-attention and corner detection mechanism
Xiaokang Jin Benben Huang Hao Sheng Yao Wu

Pubricized:
2024/10/28
PAPER
- Summary
- Free PDF (3MB)
Effect of Politeness on Trust in Re-enter Requests to User by Smart Speaker -Pilot Study-
Tomoki MIYAMOTO

Pubricized:
2024/10/23
LETTER
- Summary
- Free PDF (2.2MB)
Fine-tuning Models for Final Disagreement Anticipation in Negotiation Mid-Dialogues
Ken WATANABE Katsuhide FUJITA

Pubricized:
2024/10/10
PAPER
- Summary
- Free PDF (3.8MB)
Deepfake speech detection: approaches from acoustic features related to auditory perception to deep neural networks
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN

Pubricized:
2024/10/07
INVITED PAPER
- Summary
- Free PDF (965KB)
Video Watermarking Method Based on 3D U-Net Robust Against Re-shooting
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE

Pubricized:
2024/10/07
PAPER
- Summary
- Free PDF (4MB)
UTStyleCap4K: Generating Image Captions with Sentimental Styles
Chi ZHANG Li TAO Toshihiko YAMASAKI

Pubricized:
2024/10/02
PAPER
- Summary
- Free PDF (2.5MB)
FP-GNN: A Graph Neural Network for Hardware Trojan Detection in Gate-Level Netlist
Ann Jelyn TIEMPO Yong-Jin JEONG

Pubricized:
2024/10/01
LETTER
- Summary
- Free PDF (532.3KB)
A Multi-Agent Deep Reinforcement Learning Algorithm for Task offloading in future 6G V2X Network
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU

Pubricized:
2024/09/24
PAPER
- Summary
- Free PDF (1.2MB)
Building Defect Prediction Models by Online Learning Considering Defect Overlooking
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI

Pubricized:
2024/09/09
LETTER
- Summary
- Free PDF (96.5KB)
The Impact of Defect (Re) Prediction on Software Testing
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI

Pubricized:
2024/09/09
LETTER
- Summary
- Free PDF (148.4KB)
Bilaterally Colored Finite Automata and Bilaterally Colored Regular Expressions
Akira ITO Yoshiaki TAKAHASHI

Pubricized:
2024/08/20
PAPER
- Summary
- Free PDF (652.7KB)
Strategies and Equilibria on Indistinguishability of Winning Objectives and Related Decision Problems
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI

Pubricized:
2024/08/20
PAPER
- Summary
- Free PDF (1.5MB)
Computational Complexity of Yajisan-Kazusan and Stained Glass
Chuzo IWAMOTO Ryo TAKAISHI

Pubricized:
2024/08/16
PAPER
- Summary
- Free PDF (682.2KB)
A Bigram Based ILP Formulation for Break Minimization in Sports Scheduling Problems
Koichi FUJII Tomomi MATSUI

Pubricized:
2024/08/08
PAPER
- Summary
- Free PDF (1MB)
(15/14)n Flips are (almost) Sufficient to Sort Heydari and Sudborough's Pancake Stack
Kazuyuki AMANO

Pubricized:
2024/08/05
LETTER
- Summary
- Free PDF (288.7KB)
Overlapping of Lattice Unfolding for Cuboids
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA

Pubricized:
2024/08/05
PAPER
- Summary
- Free PDF (722.9KB)
An FPT Algorithm for the Exact Matching Problem and NP-hardness of Related Problems
Hitoshi MURAKAMI Yutaro YAMAGUCHI

Pubricized:
2024/08/01
PAPER
- Summary
- Free PDF (704.5KB)
Escape from the Room
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO

Pubricized:
2024/07/11
PAPER
- Summary
- Free PDF (1.1MB)
Online combinatorial linear optimization via a Frank-Wolfe-based metarounding algorithm
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO

Pubricized:
2024/07/11
PAPER
- Summary
- Free PDF (1.1MB)
Space-efficient FPT Algorithms for Degeneracy
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI

Pubricized:
2024/05/31
- Summary
- Free PDF (101.3KB)
The Least Core of Routing Game Without Triangle Inequality
Tomohiro KOBAYASHI Tomomi MATSUI

Pubricized:
2024/05/30
- Summary
- Free PDF (232.9KB)
Enumerating floorplans with Aligned Columns
Shin-ichi NAKANO

Pubricized:
2024/05/30
- Summary
- Free PDF (365.4KB)
An IP Core Protection Scheme Based on Hybrid Lightweight Encryption for Neuromorphic Computing System
Ming PAN

The aritcle processing charge of this paper has not been paid.

Pubricized:
2022/09/14
- Summary

Whole issue

Volume E87-D No.7 (Publication Date:2004/07/01)

Special Section on Hardware/Software Support for High Performance Scientific and Engineering Computing

FOREWORD
Minyi GUO Laurence T. YANG

FOREWORD

Page(s):
1615-1616
- HTML
- PDF(67.3KB) >> Buy this Article
Utilization of the On-Chip L2 Cache Area in CC-NUMA Multiprocessors for Applications with a Small Working Set
Sung Woo CHUNG Hyong-Shik KIM Chu Shik JHON

PAPER-Networking and System Architectures

Page(s):
1617-1624
In CC-NUMA multiprocessor systems, it is important to reduce the remote memory access time. Based upon the fact that increasing the size of the LRU second-level (L2) cache more than a certain value does not reduce the cache miss rate significantly, in this paper, we propose two split L2 caches to utilize the surplus of the L2 cache. The split L2 caches are composed of a traditional LRU cache and another cache to reduce the remote memory access time. Both work together to reduce total L2 cache miss time by keeping remote (or long-distance) blocks as well as recently used blocks. For another cache, we propose two alternatives: an L2-RVC (Level 2 - Remote Victim Cache) and an L2-DAVC (Level 2 - Distance-Aware Victim Cache). The proposed split L2 caches reduce total execution time by up to 27%. It is also found that the proposed split L2 caches outperform the traditional single LRU cache of double size.
Optimal Multicast Tree Routing for Cluster Computing in Hypercube Interconnection Networks
Weijia JIA Bo HAN Pui On AU Yong HE Wanlei ZHOU

PAPER-Networking and System Architectures

Page(s):
1625-1632
Cluster computation has been used in the applications that demand performance, reliability, and availability, such as cluster server groups, large-scale scientific computations, distributed databases, distributed media-on-demand servers and search engines etc. In those applications, multicast can play the vital roles for the information dissemination among groups of servers and users. This paper proposes a set of novel efficient fault-tolerant multicast routing algorithms on hypercube interconnection of cluster computers using multicast shared tree approach. We present some new algorithms for selecting an optimal core (root) and constructing the shared tree so as to minimize the average delay for multicast messages. Simulation results indicate that our algorithms are efficient in the senses of short end-to-end average delay, load balance and less resource utilizations over hypercube cluster interconnection networks.
Robust Transmission of Wavelet Video Sequence over Wireless Communication Channels
Joo-Kyong LEE Ki-Dong CHUNG

PAPER-Networking and System Architectures

Page(s):
1633-1640
Bit-errors in a subband of a wavelet-based video frame during network transmission affect not only lower-level subbands within the same frame but also the subsequent frames. This is because the video frame is wavelet-transformed image with multi-levels and referenced from later frames. In this paper, we propose a new motion estimation scheme for wavelet-based video called Intra-frame Motion Estimation (IME), in which each subband except the LL subband refers to the 1-level-lower subband in the same orientation within the same frame. This scheme protects video quality by confining the effects of the bit-errors of all subbands, except the LL subband, within a frame. We evaluated the performance of our proposed scheme in a simulated wireless network environment. As a result of tests, it was shown that the proposed IME algorithm performs better than MRME, a motion-compensated video coding scheme for wavelet video, in a heavy motion video sequence, while IME outperforms MRME at a high bit-rate in small motion video sequence.
Enhancing ICP with P2P Technology: Cost, Availability, and Reconfiguration
Ping-Jer YEH Yu-Chen CHUANG Shyan-Ming YUAN

PAPER-Networking and System Architectures

Page(s):
1641-1648
Traditional Web cache servers based on HTTP and ICP infrastructure tend to have higher hardware and management cost, have difficulty in availability, automatic and dynamic reconfiguration, and may have slow links to some users. We find that peer-to-peer technology can help solve these problems. The peer cache service (PCS) we proposed here leverages each peer's local cache, similar access patterns, fully distributed coordination, and fast communication channels to enhance response time, scale of cacheable objects, and availability. Moreover, incorporating goals and strategies such as making the protocol lightweight and mutually compatible with existing cache infrastructure, supporting mobile devices, undertaking dynamic three-level caching, and exchanging cache meta-information further improve the effectiveness and differentiate our work from other similar-at-first-glance P2P Web cache systems.
Hierarchical Interconnection Networks Based on (3, 3)-Graphs for Massively Parallel Processors
Gene Eu JAN Yuan-Shin HWANG

PAPER-Networking and System Architectures

Page(s):
1649-1656
This paper proposes several novel hierarchical interconnection networks based on the (3, 3)-graphs, namely folded (3, 3)-networks, root-folded (3, 3)-networks, recursively expanded (3, 3)-networks, and flooded (3, 3)-networks. Just as the hypercubes, CCC, Peterson-based networks, and Heawood-based networks, these hierarchical networks have the following nice properties: regular topology, high scalability, and small diameters. Due to these important properties, these hierarchical networks seem to have the potential as alternatives for the future interconnection structures of multicomputer systems, especially massively parallel processors (MPPs). Furthermore, this paper will present the routing and broadcasting algorithms for these proposed networks to demonstrate that these algorithms are as elegant as the algorithms for hypercubes, CCC, and Petersen- or Heawood-based networks.
Configurable Communication Middleware for Clusters with Multiple Interconnections
Nader MOHAMED Jameela AL-JAROODI Hong JIANG

PAPER-Networking and System Architectures

Page(s):
1657-1665
High performance scientific and engineering applications running on clusters have different communication requirements. Current cluster configurations typically provide multiple network interfaces per node and multiple interconnections among nodes. However, transport protocols such as TCP do not utilize existing multiple network interfaces to enhance communication performance. This paper introduces a new configurable communication model utilizing multiple interconnections. The model adds mechanisms to manage and enhance the overall communication performance of clusters. These configurations include the use of parallel message transfers, the separation of the transfer channels between small messages and large messages, and load balancing among the channels. The main advantages of the model are: (1) providing a flexible, enhanced network infrastructure, (2) hiding the technical details of the heterogeneous network resources from the applications, and (3) providing an easy and flexible way to extend the network capacities for specific nodes. To illustrate the advantages and performance enhancements of the model, a prototype was implemented to experimentally evaluate the cluster network performance, which showed considerable gains.
Speculative Selection Routing in 2D Torus Network
Tran CONG SO Shigeru OYANAGI Katsuhiro YAMAZAKI

PAPER-Networking and System Architectures

Page(s):
1666-1673
We have proposed a speculative selection function for adaptive routing, which uses idle cycles of the network physical links to exchange network information between nodes, thus helps to decide the best selection. Previous study on the mesh network showed that SSR gives message selection flexibility that improves network performance by balancing the network traffic in both global and local scopes. This paper evaluates the speculative selection function on 2D torus network with simulation. The simulation compares the network throughput and latency with various traffic patterns. The visualization graphs show how the speculative selection eliminates hotspots and disperses traffic in the global scope. The simulation results demonstrate that by using speculative selection, the network performance is increased by around 7%. Compared to the mesh network, the torus's version has smaller gain due to the high performance nature of the torus network.
Address Computation in Configurable Parallel Memory Architecture
Eero AHO Jarno VANNE Kimmo KUUSILINNA Timo D. HAMALAINEN

PAPER-Networking and System Architectures

Page(s):
1674-1681
Parallel memories increase memory bandwidth with several memory modules working in parallel and can be used to feed a processor with only necessary data. The Configurable Parallel Memory Architecture (CPMA) enables a multitude of access formats and module assignment functions to be used within a single hardware implementation, which has not been possible in prior embedded parallel memory systems. This paper focuses on address computation in CPMA, which is implemented using several configurable computation units in parallel. One unit is dedicated for each type of access formats and module assignment functions that the implementation supports. Timing and area estimates are given for a 0.25-micron CMOS process. The utilized resources are shown to be linearly proportional to the number of memory modules.
TLB Update-Hint: A Scalable TLB Consistency Algorithm for Cache-Coherent Non-uniform Memory Access Multiprocessors
Byeonghag SEONG Donggook KIM Yangwoo ROH Kyuho PARK Daeyeon PARK

PAPER-Networking and System Architectures

Page(s):
1682-1692
Shared memory multiprocessors in which each processor has its own TLB must manage consistency among TLBs and a page table. As the large-scale CC-NUMA (cache-coherent non-uniform memory access) shared memory multiprocessors become popular, it is important for TLB consistency management algorithms to be highly scalable. In this paper, we propose a TLB update-hint algorithm as a scalable TLB consistency management solution for CC-NUMA multiprocessors. By using a lazy TLB invalidation approach, we reduced the number of unnecessary processor interruptions and idle-waiting time, and achieved a high level of scalability. Using a shared memory simulator, we evaluated the TLB update-hint algorithm. For performance comparison, we also simulated the TLB shootdown algorithm, one of the most popular TLB consistency algorithms. The simulations demonstrated that the TLB update-hint algorithm scales well in systems with a large number of processors. At 64 node systems, the TLB update-hint algorithm shows 4787% better performance than the TLB shootdown algorithm.
Programming Support for MPMD Parallel Computing in ClusterGOP
Fan CHAN Jiannong CAO Alvin T.S. CHAN Minyi GUO

PAPER-Software Support and Optimization Techniques

Page(s):
1693-1702
Many parallel applications involve different independent tasks with their own data. Using the MPMD model, programmers can have a modular view and simplified structure of the parallel programs. Although MPI supports both SPMD and MPMD models for programming, MPI libraries do not provide an efficient way for task communication for the MPMD model. We have developed a programming environment, called ClusterGOP, for building and developing parallel applications. Based on the graph-oriented programming (GOP) model, ClusterGOP provides higher-level abstractions for message-passing parallel programming with the support of software tools for developing and running parallel applications. In this paper, we describe how ClusterGOP supports programming of MPMD parallel applications on top of MPI. We discuss the issues of implementing the MPMD model in ClusterGOP using MPI and evaluate the performance by using example applications.
Traditional File Systems versus DualFS: A Performance Comparison Approach
Juan PIERNAS Toni CORTES Jose M. GARCIA

PAPER-Software Support and Optimization Techniques

Page(s):
1703-1711
DualFS is a next-generation journaling file system which has the same consistency guaranties as traditional journaling file systems but better performance. This paper introduces three new enhancements which significantly improve DualFS performance during normal operation, and presents different experimental results which compare DualFS and other traditional file systems, namely, Ext2, Ext3, XFS, JFS, and ReiserFS. The experiments carried out prove, for the first time, that a new file system design based on separation of data and metadata can significantly improve file systems' performance without requiring several storage devices.
VLaTTe: A Java Just-in-Time Compiler for VLIW with Fast Scheduling and Register Allocation
Suhyun KIM Soo-Mook MOON Kemal EBCIOLU Erik ALTMAN

PAPER-Software Support and Optimization Techniques

Page(s):
1712-1720
For network computing on desktop machines, fast execution of Java bytecode programs is essential because these machines are expected to run substantial application programs written in Java. We believe higher Java performance can be achieved by exploiting instruction-level parallelism (ILP) in the context of Java JIT compilation. This paper introduces VLaTTe, a Java JIT compiler for VLIW machines that performs efficient scheduling while doing fast register allocation. It is an extended version of our previous JIT compiler for RISC machines called LaTTe whose translation overhead is low (i.e., consistently taking one or two seconds for SPECJVM98 benchmarks) due to its fast register allocation. VLaTTe adds the scheduling capability onto the same framework of register allocation, with a constraint for precise in-order exception handling which guarantees the same Java exception behavior with the original bytecode program. Our experimental results on the SPECJVM98 benchmarks show that VLaTTe achieves a geometric mean of useful IPC 1.7 (2-ALU), 2.1 (4-ALU), and 2.3 (8-ALU), while the scheduling/allocation overhead is 3.6 times longer than LaTTe's on average, which appears to be reasonable.
Packing/Unpacking Using MPI User-Defined Datatypes for Efficient Data Redistribution
Sheng-Wen BAI Chu-Sing YANG Tsung-Chuan HUANG

PAPER-Software Support and Optimization Techniques

Page(s):
1721-1728
In many parallel programs, run-time data redistribution is usually required to enhance data locality and reduce remote memory access on the distributed memory multicomputers. Research on data redistribution algorithms has recently matured. The time required to generate data sets and processor sets is much lesser than before. Therefore, packing/unpacking has become a relatively high cost in redistribution. In this paper, we present methods to perform BLOCK-CYCLIC(s) to BLOCK-CYCLIC(t) redistribution, using MPI user-defined datatypes. This method reduces the required memory buffers and avoids unnecessary movement of data. Theoretical models are presented to determine the best method for redistribution. The methods were implemented on an IBM SP2 parallel machine to evaluate the performance of the proposed methods. The experimental results indicate that this approach can clearly improve the redistribution in most cases.
Proposal of a Tree Load Balancing Algorithm to Grid Computing Environments
Rodrigo Fernandes de MELLO Erico C. T. de MATTOS Luis Carlos TREVELIN Maria Stela Veludo de PAIVA Laurence T. YANG

PAPER-Software Support and Optimization Techniques

Page(s):
1729-1736
The availability of a low cost hardware has increased the development of distributed systems, by making then more and more accessible. In order to optimize the resources allocation on the distributed systems, some load balancing algorithms have been proposed. These algorithms distribute the application loads over the environment computers, make homogeneous the occupation of the whole environment and increase the application performance. This equal distribution prevents certain computers to get overloaded, to the detriment of the idleness of the other ones. This article proposes and analyzes the TLBAGrid, a load balancing algorithm for Grid computing environments.
Dynamic Code Repositioning for Java
Shinji TANAKA Tetsuyasu YAMADA Satoshi SHIRAISHI

PAPER-Software Support and Optimization Techniques

Page(s):
1737-1742
The sizes of recent Java-based server-side applications, like J2EE containers, have been increasing continuously. Past techniques for improving the performance of Java applications have targeted relatively small applications. Moreover, when the methods of these small target applications are invoked, they are not usually distributed over the entire memory space. As a result, these techniques cannot be applied efficiently to improve the performance of current large applications. We propose a dynamic code repositioning approach to improve the hit rates of instruction caches and translation look-aside buffers. Profiles of method invocations are collected when the application performs with its heaviest processor load, and the code is repositioned based on these profiles. We also discuss a method-splitting technique to significantly reduce the sizes of methods. Our evaluation of a prototype implementing these techniques indicated 5% improvement in the throughput of the application.
Algorithmic Concept Recognition to Support High Performance Code Reengineering
Beniamino DI MARTINO

PAPER-Software Support and Optimization Techniques

Page(s):
1743-1750
Techniques for automatic program recognition, at the algorithmic level, could be of high interest for the area of Software Maintenance, in particular for knowledge based reengineering, because the selection of suitable restructuring strategies is mainly driven by algorithmic features of the code. In this paper an automated hierarchical concept parsing recognition technique, and a formalism for the specification of algorithmic concepts, is presented. Based on this technique, the design and development of ALCOR, a production rule based system for automatic recognition of algorithmic concepts within programs, aimed at support of knowledge based reengineering for high performance, is presented.
A Two-Dimensional Quantum Transport Simulation of Nanoscale Double-Gate MOSFETs Using Parallel Adaptive Technique
Yiming LI Shao-Ming YU

PAPER-Scientific and Engineering Computing with Applications

Page(s):
1751-1758
In this paper we apply a parallel adaptive solution algorithm to simulate nanoscale double-gate metal-oxide-semiconductor field effect transistors (MOSFETs) on a personal computer (PC)-based Linux cluster with the message passing interface (MPI) libraries. Based on a posteriori error estimation, the triangular mesh generation, the adaptive finite volume method, the monotone iterative method, and the parallel domain decomposition algorithm, a set of two-dimensional quantum correction hydrodynamic (HD) equations is solved numerically on our constructed cluster system. This parallel adaptive simulation methodology with 1-irregular mesh was successfully developed and applied to deep-submicron semiconductor device simulation in our recent work. A 10 nm n-type double-gate MOSFET is simulated with the developed parallel adaptive simulator. In terms of physical quantities and refined adaptive mesh, simulation results demonstrate very good accuracy and computational efficiency. Benchmark results, such as load-balancing, speedup, and parallel efficiency are achieved and exhibit excellent parallel performance. On a 16 nodes PC-based Linux cluster, the maximum difference among CPUs is less than 6%. A 12.8 times speedup and 80% parallel efficiency are simultaneously attained with respect to different simulation cases.
A Parallel Implementation of Multi-Domain High-Order Navier-Stokes Equations Using MPI
Hui WANG Minyi GUO Daming WEI

PAPER-Scientific and Engineering Computing with Applications

Page(s):
1759-1765
In this paper, Message Passing Interface (MPI) techniques are used to implement high-order full 3-D Navier-Stokes equations in multi-domain applications. A two-domain interface with five-point overlapping used previously is expanded to a multi-domain computation. There are normally two approaches for this expansion. One is to break up the domain into two parts through domain decomposition (say, one overlapping), then using MPI directives to further break up each domain into n parts. Another is to break the domain up into 2n parts with (2n-1) overlappings. In our present effort, the latter approach is used and finite-size overlappings are employed to exchange data between adjacent multi-dimensional sub-domains. It is an alternative way to parallelize the high-order full 3-D Navier-Stokes equations into multi-domain applications without adding much complexity. Results with high-order boundary treatments show consistency among multi-domain calculations and single-domain results.
An Acceleration Processor for Data Intensive Scientific Computing
Cheong Ghil KIM Hong-Sik KIM Sungho KANG Shin Dug KIM Gunhee HAN

PAPER-Scientific and Engineering Computing with Applications

Page(s):
1766-1773
Scientific computations for diffusion equations and ANNs (Artificial Neural Networks) are data intensive tasks accompanied by heavy memory access; on the other hand, their computational complexities are relatively low. Thus, this type of tasks naturally maps onto SIMD (Single Instruction Multiple Data stream) parallel processing with distributed memory. This paper proposes a high performance acceleration processor of which architecture is optimized for scientific computing using diffusion equations and ANNs. The proposed architecture includes a customized instruction set and specific hardware resources which consist of a control unit (CU), 16 processing units (PUs), and a non-linear function unit (NFU) on chip. They are effectively connected with dedicated ring and global bus structure. Each PU is equipped with an address modifier (AM) and 16-bit 1.5 k-word local memory (LM). The proposed processor can be easily expanded by multi-chip expansion mode to accommodate to a large scale parallel computation. The prototype chip is implemented with FPGA. The total gate count is about 1 million with 530, 432-bit embedded memory cells and it operates at 15 MHz. The functionality and performance of the proposed processor is verified with simulation of oil reservoir problem using diffusion equations and character recognition application using ANNs. The execution times of two applications are compared with software realizations on 1.7 GHz Pentium IV personal computer. Though the proposed processor architecture and the instruction set are optimized for diffusion equations and ANNs, it provides flexibility to program for many other scientific computation algorithms.
A Super-Programming Technique for Large Sparse Matrix Multiplication on PC Clusters
Dejiang JIN Sotirios G. ZIAVRAS

PAPER-Scientific and Engineering Computing with Applications

Page(s):
1774-1781
The multiplication of large spare matrices is a basic operation in many scientific and engineering applications. There exist some high-performance library routines for this operation. They are often optimized based on the target architecture. For a parallel environment, it is essential to partition the entire operation into well balanced tasks and assign them to individual processing elements. Most of the existing techniques partition the given matrices based on some kind of workload estimation. For irregular sparse matrices on PC clusters, however, the workloads may not be well estimated in advance. Any approach other than run-time dynamic partitioning may degrade performance. In this paper, we apply our super-programming approach to parallel large matrix multiplication on PC clusters. In our approach, tasks are partitioned into super-instructions that are dynamically assigned to member computer nodes. Thus, the load balancing logic is separated from the computing logic; the former is taken over by the runtime environment. Our super-programming approach facilitates ease of program development and targets high efficiency in dynamic load balancing. Workloads can be balanced effectively and the optimization overhead is small. The results prove the viability of our approach.
Fast Parallel Solution for Set-Packing and Clique Problems by DNA-Based Computing
Michael (Shan-Hui) HO Weng-Long CHANG Minyi GUO Laurence T. YANG

PAPER-Scientific and Engineering Computing with Applications

Page(s):
1782-1788
This paper shows how to use sticker to construct solution space of DNA for the library sequences in the set-packing problem and the clique problem. Then, with biological operations, we propose DNA-based algorithms to remove illegal solutions and to find legal solutions for the set-packing and clique problems from the solution space of sticker. Any NP-complete problem in Cook's Theorem can be reduced and solved by the proposed DNA-based computing approach if its size is equal to or less than that of the set-packing problem. Otherwise, Cook's Theorem is incorrect on DNA-based computing and a new DNA algorithm should be developed from the characteristics of the NP-complete problem. Finally, the result to DNA simulation is given.
I/O-Efficient Multilevel Graph Partitioning Algorithm for Massive Graph Data
Jun-Ho HER R.S. RAMAKRISHNA

PAPER-Scientific and Engineering Computing with Applications

Page(s):
1789-1794
Graph data in large scientific/engineering applications are often too massive to fit inside the computer's main memory. The resulting input/output (I/O) costs could be a major performance bottleneck. This paper proposes an extension to extant multilevel graph partitioning algorithms with improved I/O-efficiency. The input graph is envisioned as the union of disjoint blocks (subgraphs) of almost the same size. Each block is coarsened in turn. Recursive matching and contraction are the operations in this phase. All the coarsened blocks are then merged in an iterative manner in order to ensure that the resulting graph fits in the main memory. This graph is then treated with an in-core multilevel graph partitioning algorithm in the usual way. Our experimental results show that the larger graph size is, the more dependent on the I/O-efficiency the performance is. And our modification can easily partition very large graphs. It also exhibits considerable improvement in I/O-complexity.
Database Allocation Modeling for Optimal Design of Distributed Systems
Jae-Woo LEE Doo-Kwon BAIK

PAPER-Distributed, Grid and P2P Computing

Page(s):
1795-1804
By using distributed database systems, many advantages can be obtained such as database management cost, efficiency, and high integrity of systems through allocating fragments to many distributed sites with horizontal/vertical fragmentation of global database schema. To minimize costs, distributed algorithms must be applied so that database fragments are allocated to optimal sites. It is useful to replicate fragments, such as allocating many copies in many sites including load balancing. But there are too many possible combinations of each site and fragment, making it impossible to find a solution in real time, i.e., it is an NP-complete problem. This paper proposes near optimal heuristic algorithms for minimizing cost by defining a cost model based on read and update queries that are requested in many sites. Various factors are applied to the proposed algorithms for sizing efficient network resources that compute database transactions as remote query or update requests for consistency in replicated database systems. For network load balancing, incoming network traffic table is defined in each site. A request transaction from unallocated sites to allocated sites can be accessed properly at any other replicated sites by using the network traffic table. Finally, some experimental results verified the proposed algorithms by comparing actual cases of database allocation.
A Distributed 3D Rendering Application for Massive Data Sets
Huabing ZHU Tony K.Y. CHAN Lizhe WANG Reginald C. JEGATHESE

PAPER-Distributed, Grid and P2P Computing

Page(s):
1805-1812
This paper presents a prototype of a distributed 3D rendering system in a hierarchical Grid environment. 3D rendering with massive data sets is a computationally intensive task. In order to make full use of computational resources on Grids, a hierarchical system architecture is designed to run over multiple clusters. This architecture involves both sort-first and sort-last parallel rendering algorithms to achieve excellent scalability, rendering performance and load balance.
High System Availability Using Neighbor Replication on Grid
Mustafa MAT DERIS Noraziah AHMAD Md. Yazid Mohd SAMAN Noraida ALI Youwei YUAN

PAPER-Distributed, Grid and P2P Computing

Page(s):
1813-1819
Data Replication can be used to improve the system availability of distributed systems. In such a system, a mechanism is required to maintain the consistency of the replicated data. The grid structure technique based on quorum is one of the solutions to perform this while providing a high availability of the system. It was shown in the study that, it still requires a bigger number of copies be made available to construct a quorum. So it is not suitable for large systems. In this paper, we propose a technique called the neighbor replication on grid (NRG) technique by considering only neighbors to have the replicated data. In comparison to the grid structure technique, NRG requires a lower communication cost for an operation, while providing a higher system availability, which is preferred for large systems.
MPICH-GF: Transparent Checkpointing and Rollback-Recovery for Grid-Enabled MPI Processes
Namyoon WOO Hyungsoo JUNG Heon Young YEOM Taesoon PARK Hyungwoo PARK

PAPER-Distributed, Grid and P2P Computing

Page(s):
1820-1828
Fault-tolerance is an essential feature of the distributed systems where the possibility of a failure increases with the growth of the system. In spite of extensive researches over two decades, fault-tolerance systems have not succeeded in practical use. It is due to the high overhead and the unhandiness of the previous fault-tolerance systems. In this paper, we propose MPICH-GF, a user-transparent checkpointing system for grid-enabled MPICH. Our objectives are to fill the gap between the theory and the practice of fault-tolerance systems, and to provide a checkpointing-recovery system for grids. To build a fault-tolerant MPICH version, we have designed task migration, dynamic process management, and atomic message transfer. MPICH-GF requires no modification of application source codes, and it affects the MPICH communication characteristics as less as possible. The features of MPICH-GF are that it supports the direct message transfer mode and that all of the implementation has been done at the lower layer, that is, the abstract device level. We have evaluated MPICH-GF using NPB applications on Globus middleware.
Evaluation of the Feedback Guided Dynamic Loop Scheduling (FGDLS) Algorithms
Sabin TABIRCA Tatiana TABIRCA Laurence T. YANG Len FREEMAN

PAPER-Distributed, Grid and P2P Computing

Page(s):
1829-1833
In this paper we consider the Feedback-Guided Dynamic Loop Scheduling (FGDLS) method that was proposed by Bull. The method uses a feedback-guided mechanism to schedule a parallel loop within a sequential outer loop. The execution times and the scheduling bounds at a outer iteration are used to find the scheduling bound of the next outer iteration. In this way FGDLS achieves an optimal load balance. Two algorithms have been proposed so far by Tabirca et al. In this article we will review these two algorithms and will give a comparison between their performances.
Ω Line Problem in Optimistic Log-Based Rollback Recovery Protocol
MaengSoon BAIK SungJin CHOI ChongSun HWANG JoonMin GIL ChanYeol PARK HeonChang YOO

PAPER-Distributed, Grid and P2P Computing

Page(s):
1834-1842
Optimistic log-based rollback recovery protocols have been regarded as an attractive fault-tolerant solution in distributed systems based on message-passing paradigm due to low overhead in failure-free time. These protocols are based on a Piecewise Deterministic (PWD) Assumption model. They, however, assumed that all logged non-deterministic events in a consistent global recovery line must be determinately replayed in recovery time. In this paper, we give the impossibility of deterministic replaying of logged non-deterministic event in a consistent global recovery line as a Ω Line Problem, because of asynchronous properties of distributed systems: no bound on the relative speeds of processes, no bound on message transmission delays and no global time source. In addition, we propose a new optimistic log-based rollback recovery protocol, which guarantees the deterministic replaying of all logged non-deterministic events belonged in a consistent global recovery line and solves a Ω Line Problem in recovery time.
A Grid Portal to Support High-Performance Scientific Computing on Distributed Resources
Jacobo TARRIO Juan TOURIÑO María J. MARTIN Patricia GONZALEZ Ramon DOALLO

PAPER-Distributed, Grid and P2P Computing

Page(s):
1843-1849
Grid computing can help to promote high-performance computing at a low overall cost by encouraging research centers to share their resources. However, research staff usually finds it quite hard to use Grids effectively, due to the need of installing and managing new Grid software. Thus, Grid portals are created, making it easier to take advantage of the full capability of the Grid, favoring in this way its use. The goal of this paper is to describe the process of design and implementation of a Grid Portal with the aim of both supporting distributed high-performance resources and make its use by researchers as transparent as possible. This portal uses standard Grid and Web technologies. We have designed the portal so that it can be adapted to different existing Grid infrastructures, based on the Globus Toolkit, and new functionalities can be easily added. The first prototype of the portal has been tested on an experimental Grid platform, and we present encouraging experiences carried out there.
A Rank-Based Selection Method of Materialized Queries for Efficient Query Evaluation in a Mediator
Kil Hong JOO Won Suk LEE

PAPER-Distributed, Grid and P2P Computing

Page(s):
1850-1858
This paper proposes an efficient query evaluation scheme for a mediator system intended to integrate heterogeneous computing environment in terms of operating systems, database management systems, and other software. Most of mediator systems transform a global query into a set of sub-queries based on their target remote servers. Each sub-query is evaluated by the query modification method to evaluate a global query. However, it is possible to reduce the evaluation cost of a global query when the results of frequently requested sub-queries are materialized in a mediator. In a mediator, its integrating schema can be incrementally modified and the evaluation frequency of a global query can also be continuously varied. In order to select the optimized set of materialized sub-queries with respect to their current evaluation frequencies, the proposed method applies a decay factor for modeling the recent access behavior of each sub-query. In other words, the latest access of a sub-query gets the highest attention in the selection process of materialized sub-queries. As a result, it is possible to adjust the optimized set of materialized sub-queries adaptively according to the recent changes in the evaluation frequencies of sub-queries. Since finding the optimum solution of this problem is NP-hard, it takes too long to be used in practice when the number of sub-queries is large. Consequently, given the size of mediator storage, the rank-based selection algorithm proposed in this paper finds the set of materialized sub-queries which minimizes the total evaluation cost of global queries in linear search complexity.
Allocation of Tasks in a DCS Using a Different Approach with A^* Considering Load
Biplab KUMER SARKER Anil KUMAR TRIPATHI Deo PRAKASH VIDYARTHI Laurence T. YANG Kuniaki UEHARA

PAPER-Distributed, Grid and P2P Computing

Page(s):
1859-1866
In a Distributed Computing Systems (DCS) tasks submitted to it, are usually partitioned into different modules and these modules may be allocated to different processing nodes so as to achieve minimum turn around time of the tasks utilizing the maximum resources of the existing system such as CPU speed, memory capacities etc. The problem lies on how to obtain the optimal allocation of these multiple tasks by keeping in mind that no processing node is overloaded due to this allocation. This paper proposes an algorithm A^*RS, using well-known A^*, which aims to reduce the search space and time for task allocation. It aims at minimization of turn around time of tasks in the way so that processing nodes do not become overloaded due to this allocation. Our experimental results justify the claims with necessary supports by comparing it with the earlier algorithm for multiple tasks allocation.

Regular Section

Efficient Architectures for the Biorthogonal Wavelet Transform by Filter Bank and Lifting Scheme
Yeu-Horng SHIAU Jer Min JOU Chin-Chi LIU

PAPER-VLSI Systems

Page(s):
1867-1877
In this paper, two efficient VLSI architectures for biorthogonal wavelet transform are proposed. One is constructed by the filter bank implementation and another is constructed by the lifting scheme. In the filter bank implementation, due to the symmetric property of biorthogonal wavelet transform, the proposed architecture uses fewer multipliers than the orthogonal wavelet transform. Besides, the polyphase decomposition is adopted to speed up the processing by a factor of 2. In the lifting scheme implementation, the pipeline-scheduling technique is employed to optimize the architecture. Both two architectures are with advantages of lower implementation complexity and higher throughput rate. Moreover, they can also be applied to realize the inverse DWT efficiently. Based on the above properties, the two architectures can be applied to time-critical image compressions, such as JPEG2000. Finally, the architecture constructed by the lifting scheme is implemented into a single chip on 0.35 µm 1P4M CMOS technology, and its area and working performance are 5.005 5.005 mm² and 50 MHz, respectively.
A Statistical Time Synchronization Method for Frequency-Synchronized Network Clocks in Distributed Systems
Takao YAMASHITA Satoshi ONO

PAPER-Computer Systems

Page(s):
1878-1886
In this paper, we propose a statistical method of time synchronization for computer clocks that have precisely frequency-synchronized oscillators. This method not only improves the accuracy of time synchronization but also prevents degradation in the frequency stability of the precise oscillators when the errors in the measured time offsets between computer clocks caused by network traffic possess a Gaussian distribution. Improved accuracy of time synchronization is achieved by estimating the confidence interval of the measured time offsets between the clocks. Degradation in frequency stability is prevented by eliminating unnecessary time correction for the computer clock, because time correction generally causes changes in the frequency accuracy and stability of the precise oscillators. To eliminate unnecessary time correction, our method uses an extended hypothesis test of the difference between the current mean and the mean at the last time adjustment to determine whether time correction is needed. Evaluation by simulating changes in the time offset of the existing ISDN clock synchronization system showed that this method achieves accurate time and stable frequency synchronization.
Dynamic Communication Performance of a Hierarchical Torus Network under Non-uniform Traffic Patterns
M. M. Hafizur RAHMAN Susumu HORIGUCHI

PAPER-Computer Systems

Page(s):
1887-1896
Interconnection networks play a crucial role in the performance of massively parallel computers. Hierarchical interconnection networks provide high performance at low cost by exploring the locality that exists in the communication patterns of massively parallel computers. A Hierarchical Torus Network (HTN) is a 2D-torus network of multiple basic modules, in which the basic modules are 3D-torus networks that are hierarchically interconnected for higher level networks. The static network performance of the HTN has already been studied and has been shown to be good. Dynamic communication performance has been evaluated under uniform traffic pattern but not under non-uniform traffic patterns. In this paper, we present a deadlock-free routing algorithm for the HTN using 3 virtual channels and evaluate the network's dynamic communication performance under three non-uniform traffic patterns, using the proposed routing algorithm. We evaluate the dynamic communication performance of HTN, H3D-mesh, H3D-torus, TESH, and mesh networks by computer simulation. We find that the dynamic communication performance of HTN is better than that of the H3D-mesh, H3D-torus, TESH, and mesh networks.
Design and Implementation of Markup Language for Integrating of Motion Capture Data Formats
Hyun-Sook CHUNG Yillbyung LEE

PAPER-System Programs

Page(s):
1897-1909
Motion capture technology is widely used to make a realistic motion in these days. Different motion capture devices use different motion capture data formats. Because of the lack of compatibility of motion capture data animators can't reuse the already captured motion sequence. In addition, it is difficult for integrating, storing and retrieving motion capture data with different formats in the storage. In this paper, we propose a standard format for integrating a different motion capture data formats. In addition, we propose a framework of a system that manages motion capture data using our standard format. Our standard format is called MCML (Motion Capture Markup Language). It is a markup language for motion capture data and is based on XML (extensible Markup Language). Our system designed to manage motion capture data consists of a several components -- Mocap Syntax Analyzer, MCML Converter, MCML Editor, Motion Viewer, MCML Storage Wrapper.
Multiple DNA Sequences Alignment Using Heuristic-Based Genetic Algorithm
Chih-Chin LAI Shih-Wei CHUNG

PAPER-Artificial Intelligence and Cognitive Science

Page(s):
1910-1916
The alignment of biological sequences is a crucial tool in molecular biology and genome analysis. A wide variety of approaches has been proposed for multiple sequence alignment problem; however, some of them need prerequisites to help find the best alignment or some of them may suffer from the drawbacks of complexity and memory requirement so they can be only applied to cases with a limited number of sequences. In this paper, we view the multiple sequence alignment problem as an optimization problem and propose a heuristic-based genetic algorithm (GA) approach to solve it. The heuristic/GA hybrid yields better results than other well-known packages do. Experimental results are presented to illustrate the feasibility of the proposed approach.
A Statistical Method of Evaluating Pronunciation Proficiency for English Words Spoken by Japanese
Seiichi NAKAGAWA Naoki NAKAMURA Kazumasa MORI

PAPER-Speech and Hearing

Page(s):
1917-1922
In this paper, we propose a statistical method of evaluating the pronunciation proficiency of English words spoken by Japanese. We analyzed statistically the utterances to note a combination that has a high correlation between an English teacher's score and certain acoustic features. We obserbed that the phoneme recognition rates (correct rate and accuracy) were the best measure of pronunciation proficiency, and the likelihood ratio of English phoneme acoustic models to phoneme acoustic models adapted by Japanese was the second best measure. The effective measure which was highly correlated with the English teacher's score was the combination of the likelihood for American native models, likelihood for English models adapted by Japanese, the best likelihood for arbitrary sequences of acoustic models, phoneme recognition rate and the rate of speech. We obtained a correlation coefficient of 0.81 with an open data for vocabulary and 0.69 with open data for speaker at the five words set level, respectively. The coefficient was higher than the correlation between humans' scores, 0.65. In the 15 words set level which corresponds to one or two sentences, we obtained the correlation coefficient of 0.86 with open data for the speaker.
Auto Focusing Algorithm for Iris Recognition Camera Using Corneal Specular Reflection
Kang Ryoung PARK

This paper was deleted on March 10, 2006 because it was found to be a duplicate submission (see details in the pdf file).

PAPER-Image Processing and Video Processing

Page(s):
1923-1934
Iris recognition is used to identify a user based on the iris texture information which exists between the white sclera and the black pupil. For fast iris recognition, it is very important to capture user's focused eye image at fast speed. If not, the total recognition time is increased and it makes the user feel much inconvenience. In previous researches and systems, they use the focusing method which has been used for general landscape scene without considering the characteristics of iris image. So, they take much focusing time sometimes, especially in case of the user with glasses. To overcome such problems, we propose a new iris image acquisition method to capture user's focused eye image at very fast speed based on the corneal specular reflection. Experimental results show that the focusing time for both the users with glasses and without glasses is average 480 ms and we can conclude our method can be used for the real-time iris recognition camera.
Multi-Stage Unsupervised Learning for Multi-Body Motion Segmentation
Yasuyuki SUGAYA Kenichi KANATANI

PAPER-Image Recognition, Computer Vision

Page(s):
1935-1942
Many techniques have been proposed for segmenting feature point trajectories tracked through a video sequence into independent motions, but objects in the scene are usually assumed to undergo general 3-D motions. As a result, the segmentation accuracy considerably deteriorates in realistic video sequences in which object motions are nearly degenerate. In this paper, we propose a multi-stage unsupervised learning scheme first assuming degenerate motions and then assuming general 3-D motions and show by simulated and real video experiments that the segmentation accuracy significantly improves without compromising the accuracy for general 3-D motions.
Multi-Modal Neural Networks for Symbolic Sequence Pattern Classification
Hanxi ZHU Ikuo YOSHIHARA Kunihito YAMAMORI Moritoshi YASUNAGA

PAPER-Biocybernetics, Neurocomputing

Page(s):
1943-1952
We have developed Multi-modal Neural Networks (MNN) to improve the accuracy of symbolic sequence pattern classification. The basic structure of the MNN is composed of several sub-classifiers using neural networks and a decision unit. Two types of the MNN are proposed: a primary MNN and a twofold MNN. In the primary MNN, the sub-classifier is composed of a conventional three-layer neural network. The decision unit uses the majority decision to produce the final decisions from the outputs of the sub-classifiers. In the twofold MNN, the sub-classifier is composed of the primary MNN for partial classification. The decision unit uses a three-layer neural network to produce the final decisions. In the latter type of the MNN, since the structure of the primary MNN is folded into the sub-classifier, the basic structure of the MNN is used twice, which is the reason why we call the method twofold MNN. The MNN is validated with two benchmark tests: EPR (English Pronunciation Reasoning) and prediction of protein secondary structure. The reasoning accuracy of EPR is improved from 85.4% by using a three-layer neural network to 87.7% by using the primary MNN. In the prediction of protein secondary structure, the average accuracy is improved from 69.1% of a three-layer neural network to 74.6% by the primary MNN and 75.6% by the twofold MNN. The prediction test is based on a database of 126 non-homologous protein sequences.
A Chaotic Maximum Neural Network for Maximum Clique Problem
Jiahai WANG Zheng TANG Ronglong WANG

PAPER-Biocybernetics, Neurocomputing

Page(s):
1953-1961
In this paper, based on maximum neural network, we propose a new parallel algorithm that can escape from local minima and has powerful ability of searching the globally optimal or near-optimum solution for the maximum clique problem (MCP). In graph theory a clique is a completely connected subgraph and the MCP is to find a clique of maximum size of a graph. The MCP is a classic optimization problem in computer science and in graph theory with many real-world applications, and is also known to be NP-complete. Lee and Takefuji have presented a very powerful neural approach called maximum neural network for this NP-complete problem. The maximum neural model always guarantees a valid solution and greatly reduces the search space without a burden on the parameter-tuning. However, the model has a tendency to converge to the local minimum easily because it is based on the steepest descent method. By adding a negative self-feedback to the maximum neural network, we proposed a parallel algorithm that introduces richer and more flexible chaotic dynamics and can prevent the network from getting stuck at local minima. After the chaotic dynamics vanishes, the proposed algorithm is then fundamentally reined by the gradient descent dynamics and usually converges to a stable equilibrium point. The proposed algorithm has the advantages of both the maximum neural network and the chaotic neurodynamics. A large number of instances have been simulated to verify the proposed algorithm.
A Low-Power Tournament Branch Predictor
Sung Woo CHUNG Gi Ho PARK Sung Bae PARK

LETTER-Computer Systems

Page(s):
1962-1964
This letter proposes a low-power tournament branch predictor, in which the number of accesses to the branch predictors (local predictor or global predictor) is reduced. Analysis results with Samsung Memory Compiler show that the proposed branch predictor reduces the power consumption by 24-45%, compared to the conventional tournament branch predictor, not requiring any additional storage arrays, not incurring any additional delay and never harming accuracy.
A Method to Preserve Layered Architectural Style in Development Phases
Chanjin PARK Euyseok HONG Chisu WU

LETTER-Software Engineering

Page(s):
1965-1970
This paper proposes a new type of relationship between layers in layered architecture and shows how to concretize the relationship between layers into design constraints. The meaning of layer relationship is explained with examples from design patterns and Microsoft COM. In addition, a prototype tool to check conformance is implemented and the architecture document of an open-source software project is checked against the actual architecture extracted from source code developed by many international developers. As a result of checking, parts that do not conform to the architecture document are investigated and it is pointed out that their modifications should be controlled with caution.
A Local Learning Framework Based on Multiple Local Classifiers
BaekSop KIM HyeJeong SONG JongDae KIM

LETTER-Pattern Recognition

Page(s):
1971-1973
This paper presents a local learning framework in which the local classifiers can be pre-learned and the support size of each classifier can be selected to minimize the error bound. The proposed algorithm is compared with the conventional support vector machine (SVM). Experimental results show that our scheme using the user-defined parameters C and σ is more accurate and less sensitive than the conventional SVM.
Harmonic Model Based Excitation Enhancement for Low-Bit-Rate Speech Coding
Hong Kook KIM Mi Suk LEE Chul Hong KWON

LETTER-Speech and Hearing

Page(s):
1974-1977
A new excitation enhancement technique based on a harmonic model is proposed in this paper to improve the speech quality of low-bit-rate speech coders. This technique is employed only in the decoding process of speech coders and improves high-frequency components of excitation. We develop the procedure of harmonic model parameters estimation and harmonic generation and apply the technique to a current state-of-art low bit rate speech coder. Experiments on spectrum reading and spectrum distortion measurement show that the proposed excitation enhancement technique improves speech quality.
Distorted Speech Rejection for Automatic Speech Recognition in Wireless Communication
Joon-Hyuk CHANG Nam Soo KIM

LETTER-Speech and Hearing

Page(s):
1978-1981
This letter introduces a pre-rejection technique for wireless channel distorted speech with application to automatic speech recognition (ASR). Based on analysis of distorted speech signals over a wireless communication channel, we propose a method to reject the channel distorted speech with a small computational load. From a number of simulation results, we can discover that the pre-rejection algorithm enhances the robustness of speech recognition operation.
Document Genre Classification for User Interface of Web Search Engine
Kong-Joo LEE

LETTER-Natural Language Processing

Page(s):
1982-1986
In this letter we suggest sets of features to classify genres of web documents. Web documents are different from textual documents in that they contain URL and HTML tags within the pages. We introduce the features specific to web documents, which are extracted from URL and HTML tags. Experimental results enable us to evaluate their characteristics and performances. On the basis of the experimental results, we implement a user interface of a web search engine that presents documents grouped by genres.
Continuous Optimization for Item Selection in Collaborative Filtering
Kohei INOUE Kiichi URAHAMA

LETTER-Biocybernetics, Neurocomputing

Page(s):
1987-1988
A method is presented for selecting items asked for new users to input their preference rates on those items in recommendation systems based on the collaborative filtering. Optimal item selection is formulated by an integer programming problem and we solve it by using a kind of the Hopfield-network-like scheme for interior point methods.