IEICE globals.ieice.org Site

Author Search Result

[Author] Yoshinori TAKEUCHI(32hit)

1-20hit(32hit)

Effectiveness of a High Speed Context Switching Method Using Register Bank
Jun-ichi ITO, Takumi NAKANO, Yoshinori TAKEUCHI, Masaharu IMAI

PAPER-LSI Architecture

Vol: E81-A, No: 12, Page(s): 2661-2667
This paper proposes a method to reduce context switching time by using a register bank to store the contexts of working tasks. Hardware cost and performance were measured by modeling the register bank and its controller in VHDL. The following results were obtained: (1) The controller can be implemented with much less hardware than the register bank, which is realized as an SRAM module. (2) Context switching time can be reduced to less than 50% of that of a software implementation. (3) Combining the proposed architecture with our previous work (an RTOS implemented in hardware) yields much higher performance for a hard real-time system.
Acceleration of Genetic Programming by Hierarchical Structure Learning: A Case Study on Image Recognition Program Synthesis
Ukrit WATCHAREERUETAI, Tetsuya MATSUMOTO, Noboru OHNISHI, Hiroaki KUDO, Yoshinori TAKEUCHI

PAPER-Artificial Intelligence and Cognitive Science

Vol: E92-D, No: 10, Page(s): 2094-2102
We propose a learning strategy for accelerating the learning speed of genetic programming (GP), named hierarchical structure GP (HSGP). HSGP exploits multiple learning nodes (LNs) connected in a hierarchical structure, e.g., a binary tree. Each LN runs a conventional evolutionary process to evolve its own population and sends the evolved population to the connected higher-level LN. A lower-level LN evolves its population with a smaller subset of the training data. The higher-level LN then integrates the evolved populations from the connected lower-level LNs and evolves the integrated population further using a larger subset of the training data. In HSGP, evolutionary processes are executed sequentially from the bottom-level LNs to the top-level LN, which evolves with the entire training data. In the experiments, we use conventional GPs and HSGPs to evolve image recognition programs for given training images. The results show that hierarchical structure learning can significantly improve the learning speed of GPs. To achieve the same performance, the HSGPs need only 30-40% of the computation cost needed by conventional GPs.
Heuristic Instruction Scheduling Algorithm Using Available Distance for Partial Forwarding Processor
Takuji HIEDA, Hiroaki TANAKA, Keishi SAKANUSHI, Yoshinori TAKEUCHI, Masaharu IMAI

PAPER-Embedded, Real-Time and Reconfigurable Systems

Vol: E92-A, No: 12, Page(s): 3258-3267
Partial forwarding is a design method that places forwarding paths on only part of a processor pipeline. It can reduce the hardware cost of a processor without performance loss. However, it requires a compiler whose instruction scheduler considers the partial forwarding structure of the target processor, since conventional scheduling algorithms cannot make the most of that structure. In this paper, we propose a heuristic instruction scheduling method for processors with a partial forwarding structure. The proposed algorithm uses available distance to schedule instructions in a way suited to the target partial forwarding processor. Experimental results show that the proposed method generates near-optimal solutions in practical time and that some of the codes optimized for partial forwarding processors run in the shortest time among the target processors. They also show that the proposed method is superior to a hazard detection unit.
Code Efficiency Evaluation for Embedded Processors
Morgan Hirosuke MIKI, Mamoru SAKAMOTO, Shingo MIYAMOTO, Yoshinori TAKEUCHI, Toyohiko YOSHIDA, Isao SHIRAKAWA

PAPER

Vol: E85-A, No: 4, Page(s): 811-818
This paper evaluates the code efficiency of the ARM, Java, and x86 instruction sets by compiling the SPEC CPU95/CPU2000/JVM98 and CaffeineMark benchmarks, in terms of code sizes, basic block sizes, instruction distributions, and average instruction lengths. The code efficiency of Java turns out to be inferior to that of ARM Thumb, mainly because (i) the Java architecture is a stack machine, (ii) only four local variables can be accessed by a 1-byte instruction, and (iii) additional instructions are provided for network security. Moreover, this efficiency analysis indicates that there is high potential for constructing a more efficient code architecture by carefully accounting for the customization of an instruction set as well as the number of registers.
FOREWORD Open Access
Yoshinori TAKEUCHI

FOREWORD

Vol: E105-A, No: 3, Page(s): 436-436
Two Dimensional Space Partition Recursive Filtering Algorithm on Rectangular Processor Array
Yoshinori TAKEUCHI, Hiroaki KUNIEDA

PAPER-Digital Signal Processing

Vol: E74-A, No: 1, Page(s): 42-48
This paper studies a method of parallel processing for two-dimensional recursive filters on a multiprocessor system. Conventional recursive filtering is sequential and iterative local, i.e., global processing. We decompose this global processing into space partition processing with a few global communications. We derive an efficient parallel algorithm for two-dimensional recursive filtering using Roesser's model and investigate its speed-up, rate, efficiency and degradation.
An Optimization Algorithm for High Performance ASIP Design with Considering the RAM and ROM Sizes
Nguyen Ngoc BINH, Masaharu IMAI, Yoshinori TAKEUCHI

PAPER-Co-design

Vol: E81-A, No: 12, Page(s): 2612-2620
In designing ASIPs (Application Specific Integrated Processors), previous studies have focused almost exclusively on the optimization of the CPU core and have not paid enough attention to optimizing the RAM and ROM sizes together. This paper overcomes this limitation and proposes an optimization algorithm that determines the best ratio between the CPU core, RAM and ROM of an ASIP chip to achieve the highest performance while satisfying design constraints on the chip area. The partitioning problem is formalized as a combinatorial optimization problem that partitions the operations into hardware and software so that the performance of the designed ASIP is maximized under a given chip area constraint, where the chip area includes the hardware cost of the register file, for a given application program with an associated input data set. The optimization problem is parameterized so that it can be applied with different technologies to synthesize CPU cores, RAMs or ROMs. The experimental results show that the proposed algorithm is effective and efficient.
Parallel Processing Architecture Design for Two-Dimensional Image Processing Using Spatial Expansion of the Signal Flow Graph
Tsuyoshi ISSHIKI, Yoshinori TAKEUCHI, Hiroaki KUNIEDA

PAPER

Vol: E76-A, No: 3, Page(s): 337-348
In this paper, a methodology for designing the architecture of a processor array for a wide class of image processing algorithms is proposed. The methodology is built on the concept of spatially expanding the signal flow graph (SFG) description, which enables us to handle the problem as merely one-dimensional signal processing. The problem of the I/O interface, which is critical in real-time processing, is also considered.
Synthesizable HDL Generation for Pipelined Processors from a Micro-Operation Description
Makiko ITOH, Yoshinori TAKEUCHI, Masaharu IMAI, Akichika SHIOMI

PAPER

Vol: E83-A, No: 3, Page(s): 394-400
A synthesizable HDL generation method for pipelined processors is proposed. With the proposed method, the data-path and control logic descriptions of a target processor are generated from a clock-based instruction set specification. Experimental results demonstrate the feasibility of the proposed method and show that processor design time was drastically reduced compared with conventional manual RT-level design in HDL.
Segmentation of Depth-of-Field Images Based on the Response of ICA Filters
Andre CAVALCANTE, Allan Kardec BARROS, Yoshinori TAKEUCHI, Noboru OHNISHI

LETTER-Image Recognition, Computer Vision

Vol: E95-D, No: 4, Page(s): 1170-1173
In this letter, a new approach to segmenting depth-of-field (DoF) images is proposed. The methodology is based on a two-stage model of a visual neuron. The first stage is retinal filtering by means of a luminance-normalizing non-linearity. The second stage is V1-like filtering using filters estimated by independent component analysis (ICA). The segmented image is generated from the response activity of the neuron, measured in terms of kurtosis. Results demonstrate that the model can discriminate image parts at different levels of depth-of-field. A comparison with other methodologies and the limitations of the proposed methodology are also presented.
Separation of Narrow Bandwidth Spectral Light from Femtosecond Pulses Using Optical Coupler with Fiber Grating
Asako BABA, Hitomi MORIYA, Shin-ichi WAKABAYASHI, Yukio TOYODA, Yoshinori TAKEUCHI

PAPER-Fibers

Vol: E83-C, No: 6, Page(s): 824-829
We have developed spectral separation devices for processing femtosecond pulses. These devices are based on an optical coupler structure with fiber gratings. In a computer simulation, we confirmed that these devices could extract <1 nm bandwidth light with 80% efficiency. We fabricated the spectral separation devices using single mode fibers and highly Ge-doped fibers. These devices successfully extracted narrow spectral light of 0.3 nm bandwidth with 37% efficiency from femtosecond pulses of 40 nm bandwidth. We also fabricated 2-channel spectral separation devices, which could extract the light from each grating channel.
Optical Encoding and Decoding of Femtosecond Pulses in the Spectral Domain Using Optical Coupler with Fiber Gratings
Shin-ichi WAKABAYASHI, Hitomi MORIYA, Asako BABA, Yoshinori TAKEUCHI

PAPER-OTDM Transmission System, Optical Regeneration and Coding

Vol: E85-C, No: 1, Page(s): 135-140
We have developed optical encoding devices for processing femtosecond pulses. These devices are based on spectral separation devices and light modulators with fiber gratings. Experiments were conducted to encode a light pulse in the spectral domain. These experiments exploit the fact that a femtosecond light pulse has a very broad spectrum. An input femtosecond light pulse is decomposed into a series of wavelength components. Each wavelength component, with a narrow spectrum of <1 nm width, is successfully extracted into a single mode fiber. Light modulators corresponding to the wavelength components are assigned to the 1st bit, the 2nd bit, the 3rd bit, ..., the nth bit, respectively. All of the encoded wavelength components are then recombined into a single time-varying signal and transmitted through an optical fiber. Decoding at the receiving site is performed by the reverse operation. Encoding and decoding of 2-bit and 4-bit signals were demonstrated for a 200 fs input light pulse with a spectral width of about 40 nm.
Distributed Load Balancing Schemes for Parallel Video Encoding System
Zhaochen HUANG, Yoshinori TAKEUCHI, Hiroaki KUNIEDA

PAPER-Parallel/Multidimensional Signal Processing

Vol: E77-A, No: 5, Page(s): 923-930
We present distributed load balancing mechanisms implemented on multiprocessor systems for real-time video encoding, which dynamically equalize the load among PEs to cope with extensive computing requirements. A loosely coupled multiprocessor system, e.g., a torus-connected one, is treated as the target system. Two decentralized load balancing algorithms are proposed, and mathematical analyses are provided to give some insight into our decentralized control mechanisms. We also show, theoretically and experimentally, that the proposed algorithms are steady and effective.
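The abstract above does not spell out the two proposed algorithms. As a rough illustration of what decentralized load balancing on a torus-connected system can look like, the following is a minimal sketch of generic nearest-neighbour diffusion, which is a stand-in technique and not the paper's method. The function names, the grid size, and the coefficient `alpha` are assumptions made for this example.

```python
def diffuse_step(load, alpha=0.2):
    """One diffusion step on a 2D torus of processing elements (PEs).

    Each PE looks only at its four neighbours (no central controller) and
    moves a fraction `alpha` of every pairwise load difference.
    `alpha` is a hypothetical tuning value; it must stay at or below 0.25
    so that each new load is a weighted average of old loads.
    """
    rows, cols = len(load), len(load[0])
    new = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            here = load[r][c]
            neighbours = (
                load[(r - 1) % rows][c],  # wrap-around links form the torus
                load[(r + 1) % rows][c],
                load[r][(c - 1) % cols],
                load[r][(c + 1) % cols],
            )
            new[r][c] = here + alpha * sum(n - here for n in neighbours)
    return new


def spread(load):
    """Difference between the most and least loaded PE."""
    flat = [x for row in load for x in row]
    return max(flat) - min(flat)


def total(load):
    """Total load in the system (diffusion should leave it unchanged)."""
    return sum(sum(row) for row in load)


if __name__ == "__main__":
    # Hypothetical start: all work sits on one PE of a 4 x 4 torus.
    grid = [[0.0] * 4 for _ in range(4)]
    grid[0][0] = 16.0
    for _ in range(50):
        grid = diffuse_step(grid)
    print(total(grid), spread(grid))
```

Because every exchange is symmetric, the total load is conserved while the spread between PEs shrinks; a real video encoder would move discrete blocks of work rather than fractional load.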
A Compiler Generation Method for HW/SW Codesign Based on Configurable Processors
Shinsuke KOBAYASHI, Kentaro MITA, Yoshinori TAKEUCHI, Masaharu IMAI

PAPER-Hardware/Software Codesign

Vol: E85-A, No: 12, Page(s): 2586-2595
This paper proposes a compiler generation method for PEAS-III (Practical Environment for ASIP development), which is a configurable processor development environment for application domain specific embedded systems. Using the PEAS-III system, not only the HDL description of a target processor but also its target compiler can be generated. Therefore, execution cycles and dynamic power consumption can be rapidly evaluated. Two processors and their derivatives were designed using the PEAS-III system in the experiment. Experimental results show that the trade-offs among area, performance and power consumption of processors were analyzed in about twelve hours and the optimal processor was selected under the design constraints by using generated compilers and processors.
A Performance Optimization Method for Pipelined ASIPs in Consideration of Clock Frequency
Katsuya SHINOHARA, Norimasa OHTSUKI, Yoshinori TAKEUCHI, Masaharu IMAI

PAPER

Vol: E82-A, No: 11, Page(s): 2356-2365
This paper proposes an ASIP performance optimization method that takes clock frequency into account. The performance of an instruction set processor can be measured by the execution time of an application program, which is the number of clock cycles needed to run the program divided by the clock frequency. Therefore, the clock frequency should also be tuned in order to maximize the performance of the processor under the given design constraints. Experimental results show that the proposed method determines an optimal combination of FUs considering clock frequency.
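The relation stated in the abstract, execution time = clock cycles / clock frequency, can be illustrated as follows. This is a minimal sketch and not the paper's optimization method: the `Candidate` record, its cycle counts, frequencies and area figures are all hypothetical, and the search is a plain exhaustive minimum over a handful of options.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """A hypothetical processor configuration (all values are made up)."""
    name: str
    cycles: int       # clock cycles needed to run the application
    clock_hz: float   # clock frequency the configuration can reach
    area: float       # area cost in arbitrary units


def execution_time(c: Candidate) -> float:
    """Execution time in seconds: clock cycles divided by clock frequency."""
    return c.cycles / c.clock_hz


def best_candidate(candidates: Iterable[Candidate],
                   max_area: float) -> Optional[Candidate]:
    """Fastest candidate that fits the area budget, or None if none fits."""
    feasible = [c for c in candidates if c.area <= max_area]
    if not feasible:
        return None
    return min(feasible, key=execution_time)


CANDIDATES = [
    Candidate("small", cycles=2_000_000, clock_hz=100e6, area=10.0),
    Candidate("fast_mul", cycles=1_200_000, clock_hz=80e6, area=14.0),
    Candidate("wide", cycles=1_000_000, clock_hz=50e6, area=18.0),
]

if __name__ == "__main__":
    for c in CANDIDATES:
        print(c.name, execution_time(c))
    print(best_candidate(CANDIDATES, max_area=20.0))
```

In this made-up data the configuration with the fewest cycles is not the fastest, because its clock is slower, which is why cycle count alone is not a sufficient performance measure.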
RHINE: Reconfigurable Multiprocessor System for Video CODEC
Yoshinori TAKEUCHI, Zhao-Chen HUANG, Masatomo SAEKI, Hiroaki KUNIEDA

PAPER-Methods and Circuits for Signal Processing

Vol: E76-A, No: 6, Page(s): 947-956
This paper introduces a new application-specific architecture, RHINE (Reconfigurable Hierarchical Image Neo-multiprocessor Engine), which is a multiprocessor system for moving picture CODEC. Array processors are known to be well suited to data-parallel processing such as image signal processing, which requires a vast amount of computation and applies identical instruction sequences to the data. However, the moving picture CODEC algorithm suffers from a large load imbalance when separate sub-images are processed on multiple processors. Load balancing techniques are indispensable in such applications for the highest speed-up. RHINE gives one of the optimal solutions to such load balancing through its self-reconfigurable architecture. RHINE is organized hierarchically from Block Processing Units (BPUs), each of which has a common bus architecture of multiprocessors with a block memory. Processors in a BPU move to another BPU according to the load imbalance between BPUs by switching the bus connection between BPUs. The advantage of the RHINE architecture is demonstrated by performance simulations on real moving pictures.
Incorporating Top-Down Guidance for Extracting Informative Patches for Image Classification
Shuang BAI, Tetsuya MATSUMOTO, Yoshinori TAKEUCHI, Hiroaki KUDO, Noboru OHNISHI

LETTER-Pattern Recognition

Vol: E95-D, No: 3, Page(s): 880-883
In this letter, we introduce a novel patch sampling strategy for the task of image classification, which is fundamentally different from current patch sampling strategies. Top-down guidance learned from training images is used to guide patch sampling towards informative regions. Experimental results show that this approach achieves a noticeable improvement over baseline patch sampling strategies for the classification of both object categories and scene categories.
Deformable Part Model Based Arrhythmia Detection Using Time Domain Features
Yuuka HIRAO, Yoshinori TAKEUCHI, Masaharu IMAI, Jaehoon YU

PAPER-Digital Signal Processing

Vol: E100-A, No: 11, Page(s): 2221-2229
Heart disease is one of the major causes of death in many advanced countries. For the prevention or treatment of heart disease, an early diagnosis based on long-term electrocardiogram (ECG) examination is necessary. However, analyzing this large amount of data can be a heavy burden on medical experts. To reduce the burden and support the analysis, this paper proposes an arrhythmia detection method based on a deformable part model, which absorbs individual variation in ECG waveforms and enables the detection of various arrhythmias. Moreover, to detect arrhythmia with low processing delay, the proposed method uses only time domain features. In experiments, the proposed method achieved an F-measure of 0.91 for arrhythmia detection.
Multi-Objective Genetic Programming with Redundancy-Regulations for Automatic Construction of Image Feature Extractors
Ukrit WATCHAREERUETAI, Tetsuya MATSUMOTO, Yoshinori TAKEUCHI, Hiroaki KUDO, Noboru OHNISHI

PAPER-Biocybernetics, Neurocomputing

Vol: E93-D, No: 9, Page(s): 2614-2625
We propose a new multi-objective genetic programming (MOGP) method for automatic construction of image feature extraction programs (FEPs). The proposed method is derived from a well-known multi-objective evolutionary algorithm (MOEA), NSGA-II. The key differences are that redundancy-regulation mechanisms are applied in three main processes of the MOGP, i.e., population truncation, sampling, and offspring generation, to improve population diversity as well as convergence rate. Experimental results indicate that the proposed MOGP-based FEP construction system outperforms the two conventional MOEAs (i.e., NSGA-II and SPEA2) for a test problem. Moreover, we compared the programs constructed by the proposed MOGP with four human-designed object recognition programs. The results show that the constructed programs are better than two human-designed methods and are comparable with the other two human-designed methods for the test problem.
VLSI Architecture for Real-Time Fractal Image Coding Processors
Hideki YAMAUCHI, Yoshinori TAKEUCHI, Masaharu IMAI

PAPER

Vol: E83-A, No: 3, Page(s): 452-458
This paper proposes an efficient architecture for fractal image coding processors. The proposed architecture achieves high-speed image coding comparable to conventional JPEG processing. It performs fractal image compression coding of a 512 × 512 pixel image in less than 33.3 msec, which enables full-motion fractal image coding. The circuit size of the proposed architecture design is comparable to those of JPEG processors and much smaller than those of previously proposed fractal processors.
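For a sense of scale, the two figures quoted in the abstract (a 512 × 512 pixel image, coded in less than 33.3 msec) can be turned into lower bounds on throughput. This is a minimal arithmetic sketch; the constant and function names are made up for the example, and reading "full-motion" as roughly 30 frames per second is an assumption rather than something the abstract states.

```python
FRAME_TIME_S = 33.3e-3     # upper bound on coding time per frame (from the abstract)
WIDTH, HEIGHT = 512, 512   # image size in pixels (from the abstract)


def min_frame_rate(frame_time_s: float) -> float:
    """Frames per second sustained if every frame takes `frame_time_s`."""
    return 1.0 / frame_time_s


def min_pixel_rate(width: int, height: int, frame_time_s: float) -> float:
    """Pixels per second processed at that frame time."""
    return width * height / frame_time_s


if __name__ == "__main__":
    print(f"{min_frame_rate(FRAME_TIME_S):.2f} frames/s")
    print(f"{min_pixel_rate(WIDTH, HEIGHT, FRAME_TIME_S) / 1e6:.2f} Mpixel/s")
```

Since 33.3 msec is stated as an upper bound on the coding time, both values are lower bounds: a little over 30 frames per second, or somewhat under 8 million pixels per second.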
