Author Search Result

[Author] Tatsuo HIGUCHI(67hit)

1-20hit(67hit)

  • Radix-2-4-8 CORDIC for Fast Vector Rotation

    Takafumi AOKI  Ichiro KITAORI  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E83-A No:6
      Page(s):
    1106-1114

    This paper presents a constant-scale-factor radix-2-4-8 CORDIC algorithm for fast vector rotation and sine/cosine computation. The CORDIC algorithm is a well-known hardware algorithm for computing various elementary functions. Due to the sequential nature of its computation, however, a significant reduction in processing latency is required for real-time signal processing applications. The proposed radix-2-4-8 CORDIC algorithm dynamically changes the radix of computation during operation, and reduces the number of iterations by 37% for 64-bit precision. This paper also describes the hardware implementation of a radix-2-4-8 CORDIC unit that can be installed into practical digital signal processors.
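
    For orientation, the sequential shift-and-add iteration that the abstract refers to can be sketched as follows. This is a minimal floating-point sketch of the conventional radix-2 CORDIC rotation, not the radix-2-4-8 algorithm proposed in the paper; the function names and the default iteration count are assumptions made for illustration.

```python
import math

def cordic_rotate(x, y, angle, iterations=32):
    """Rotate (x, y) by `angle` radians using conventional radix-2 CORDIC.

    Converges for |angle| up to roughly 1.74 rad (the sum of all atan(2^-i)).
    """
    # Elementary rotation angles atan(2^-i), normally stored in a small table.
    alphas = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Constant scale factor accumulated by the un-normalized micro-rotations.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotation direction for this step
        # In hardware the multiplications by 2^-i are plain shifts.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * alphas[i]
    return x / gain, y / gain

def cordic_sin_cos(angle, iterations=32):
    """Return (sin(angle), cos(angle)) by rotating the unit vector (1, 0)."""
    c, s = cordic_rotate(1.0, 0.0, angle, iterations)
    return s, c
```

    Each pass of the loop resolves about one bit of the result, which is why radix-2 CORDIC needs roughly as many iterations as bits of precision and why a higher radix can shorten the sequence.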

  • FOREWORD

    Tatsuo HIGUCHI  

     
    FOREWORD

      Vol:
    E77-A No:9
      Page(s):
    1415-1416
  • Low-Power 8-Valued Cellular Array VLSI for High-Speed Image Processing

    Takahiro HANYU  Maho KUWAHARA  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E77-C No:7
      Page(s):
    1042-1048

    This paper presents a low-power 8-valued cellular array VLSI for high-speed image processing based on logical neighborhood operations with 3×3 windows. This array is useful for performing low-level image processing such as noise removal and edge detection, in intelligent integrated systems where immediate response to input change as well as high throughput is needed. In order to achieve high-speed image processing, template matching for neighborhood operations can be performed in parallel on each row. Each row of the image is processed in a pipelined manner. The direct 8-valued encoding of the matched results for three different 3×3 masks makes it possible to reduce the number of operations by one-third. In the hardware implementation, the matching cell for logical neighborhood operations can be implemented compactly using MOS transistors with different threshold voltages, which are programmed by multiple ion implants. Moreover, a new literal circuit for detecting multiple-valued signals using a dynamic design style eliminates hazards due to timing skews caused by differences among the various input voltage levels, so that the dynamic power dissipation of the proposed circuit is greatly reduced. Finally, it is demonstrated that the processing time of the proposed cellular array is reduced to about 40 percent in comparison with that of a corresponding binary circuit when power dissipation/area = 0.3 W/100 mm².

  • Unified Scheduling of High Performance Parallel VLSI Processors for Robotics

    Bumchul KIM  Michitaka KAMEYAMA  Tatsuo HIGUCHI  

     
    PAPER-Parallel Processor Scheduling

      Vol:
    E76-A No:6
      Page(s):
    904-910

    The performance of processing elements can be improved by the progress of VLSI circuit technology, while the communication overhead cannot be neglected in a parallel processing system. This paper presents a unified scheduling method that allocates tasks having different processing times to multiple processing elements. The objective function is formulated to measure communication time between processing elements. By employing constraint conditions, the scheduling efficiently generates an optimal solution using integer programming so that minimum communication time can be achieved. We also propose a VLSI processor for robotics whose latency is very small. In the VLSI processor, the data transfer between two processing elements can be done very quickly, so that the communication cycle time is greatly reduced.

  • Evolutionary Synthesis of Fast Constant-Coefficient Multipliers

    Naofumi HOMMA  Takafumi AOKI  Tatsuo HIGUCHI  

     
    PAPER-Nonlinear Problems

      Vol:
    E83-A No:9
      Page(s):
    1767-1777

    This paper presents an efficient graph-based evolutionary optimization technique called Evolutionary Graph Generation (EGG), and its application to the design of fast constant-coefficient multipliers using parallel counter-tree architecture. An important feature of EGG is its capability to handle the general graph structures directly in evolution process instead of encoding the graph structures into indirect representations, such as bit strings and trees. This paper also addresses the major problem of EGG regarding the significant computation time required for verifying the function of generated circuits. To solve this problem, a new functional verification technique for arithmetic circuits is proposed. It is demonstrated that the EGG system can create efficient multiplier structures which are comparable or superior to the known conventional designs.

  • Rule-Programmable Multiple-Valued Matching VLSI Processor for Real-Time Rule-Based Systems

    Takahiro HANYU  Koichi TAKEDA  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E76-C No:3
      Page(s):
    472-479

    This paper presents a design of a new multiple-valued matching VLSI processor for high-speed reasoning. It is useful in real-time rule-based systems with large programmable knowledge bases. In order to realize high-speed reasoning, the matching VLSI processor can perform fully parallel pattern matching between input data and rules. On the basis of direct multiple-valued encoding of each attribute in the input data and rules, pattern matching can be described by using only a programmable delta literal. Moreover, the programmable delta literal circuit can be easily implemented using two kinds of floating-gate MOS devices whose threshold voltages are controllable. In fact, it is demonstrated that four kinds of threshold voltages in a practical floating-gate MOS device can be easily programmed by appropriately controlling the gate, the drain and the source voltage. Finally, the inference time of the quaternary matching VLSI processor with 256 rules and conflict resolution circuits is estimated at about 360 ns, and the chip area is reduced to about 30 percent, in comparison with the equivalent binary implementation.

  • Digital Reaction-Diffusion System--A Foundation of Bio-Inspired Texture Image Processing--

    Koichi ITO  Takafumi AOKI  Tatsuo HIGUCHI  

     
    PAPER-Image/Visual Signal Processing

      Vol:
    E84-A No:8
      Page(s):
    1909-1918

    This paper presents a digital reaction-diffusion system (DRDS)--a model of a discrete-time discrete-space reaction-diffusion dynamical system--for designing new image processing algorithms inspired by biological pattern formation phenomena. The original idea is based on Turing's model of pattern formation, which is widely known in mathematical biology. We first show that Turing's morphogenesis can be understood by analyzing the pattern-forming property of the DRDS within the framework of multidimensional digital signal processing theory. This paper also describes the design of an adaptive DRDS for image processing tasks, such as enhancement and restoration of fingerprint images.

  • Highly Parallel Collision Detection Processor for Intelligent Robots

    Michitaka KAMEYAMA  Tadao AMADA  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E75-C No:4
      Page(s):
    398-404

    In intelligent robots capable of autonomous work, the development of a high-performance special-purpose VLSI processor for collision detection will become very important for automatic motion planning. Conventionally, this kind of processing is performed by general-purpose processors. In this paper, a first collision detection VLSI processor is proposed to achieve ultrahigh-performance processing with an ideal parallel processing scheme. A large number of coordinate transformations and memory accesses to the obstacle memory are fully utilized in the processing algorithm, so that direct collision detection can be executed with a VLSI-oriented regular data flow. The structure of each processing element (PE) is very simple because a PE mainly consists of a COordinate Rotation DIgital Computer (CORDIC) arithmetic unit for the coordinate transformation and memories for the storage of manipulator and obstacle information. When 100 PE's are used for parallel processing, the performance is about 10 000 times faster than that of conventional approaches using a single general-purpose microprocessor.

  • Design of a Matrix Multiply-Addition VLSI Processor for Robot Inverse Dynamics Computation

    Somchai KITTICHAIKOONKIT  Michitaka KAMEYAMA  Tatsuo HIGUCHI  

     
    PAPER-Dedicated Processors

      Vol:
    E74-C No:11
      Page(s):
    3819-3828

    This paper proposes the design of a matrix multiply-addition VLSI processor (MMP) for minimum-delay-time inverse dynamics computation based on linear array architecture. The MMP mainly consists of four multiply-adders, thus performing 4×4 matrix multiply-additions with a regular data flow. The delay time becomes minimum based on the concept of "odd-even alternative computation". A VLSI-oriented architecture that supports high-speed odd-even alternative computation both at the MMP level and at the array level is achieved through the use of two types of data-dependence graphs. By layout evaluation, it is demonstrated that the MMP can be easily implemented in a single chip. A linear array of MMPs is capable of performing inverse dynamics computation of any manipulator with minimum delay time. The estimated performance with regard to the delay time is the highest among the architectures reported until now.

  • A Perfect-Reconstruction Encryption Scheme by Using Periodically Time-Varying Digital Filters

    Xuedong YANG  Masayuki KAWAMATA  Tatsuo HIGUCHI  

     
    LETTER-Digital Signal Processing

      Vol:
    E81-A No:1
      Page(s):
    192-196

    This letter proposes a Perfect-Reconstruction (PR) encryption scheme based on a PR QMF bank. Using the proposed scheme, signals can be encrypted and perfectly reconstructed by using two Periodically Time-Varying (PTV) digital filters, respectively. We also find that the proposed scheme has a "good" encryption effect and compares favorably with frequency scrambling in terms of computational complexity, PR property, and degree of security.

  • A Unified Approach to the Minimization of Quantization Effects in Separable Denominator Multi-Dimensional Digital Filters

    ZHAO Qingfu  Masayuki KAWAMATA  Tatsuo HIGUCHI  

     
    LETTER-Digital Signal Processing

      Vol:
    E70-E No:11
      Page(s):
    1092-1095

    This paper proposes a statistical expression of the output error variance due to coefficient quantization in separable denominator M-D digital filters. Using this expression, this paper shows that minimization of overall quantization errors can be performed by minimizing the roundoff noise.

  • Fingerprint Restoration Using Digital Reaction-Diffusion System and Its Evaluation

    Koichi ITO  Takafumi AOKI  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E86-A No:8
      Page(s):
    1916-1924

    This paper presents an algorithm for fingerprint image restoration using Digital Reaction-Diffusion System (DRDS). The DRDS is a model of a discrete-time discrete-space nonlinear reaction-diffusion dynamical system, which is useful for generating biological textures, patterns and structures. This paper focuses on the design of a fingerprint restoration algorithm that combines (i) a ridge orientation estimation technique using an iterative coarse-to-fine processing strategy and (ii) an adaptive DRDS having a capability of enhancing low-quality fingerprint images using the estimated ridge orientation. The phase-only image matching technique is employed for evaluating the similarity between an original fingerprint image and a restored image. The proposed algorithm may be useful for person identification applications using fingerprint images.

  • Design of a Field-Programmable Digital Filter Chip Using Multiple-Valued Current-Mode Logic

    Katsuhiko DEGAWA  Takafumi AOKI  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E86-A No:8
      Page(s):
    2001-2010

    This paper presents a Field-Programmable Digital Filter (FPDF) IC that employs carry-propagation-free redundant arithmetic algorithms for faster computation and multiple-valued current-mode circuit technology for high-density low-power implementation. The original contribution of this paper is to evaluate, through actual chip fabrication, the potential impact of multiple-valued current-mode circuit technology on the reduction of hardware complexity required for DSP-oriented programmable ICs. The prototype FPDF fabrication with 0.6 µm CMOS technology demonstrates that the chip area and power consumption can be reduced to 41% and 71%, respectively, compared with the standard binary logic implementation.

  • Counter Tree Diagrams: A Unified Framework for Analyzing Fast Addition Algorithms

    Jun SAKIYAMA  Naofumi HOMMA  Takafumi AOKI  Tatsuo HIGUCHI  

     
    PAPER-IP Design

      Vol:
    E86-A No:12
      Page(s):
    3009-3019

    This paper presents a unified representation of fast addition algorithms based on Counter Tree Diagrams (CTDs). By using CTDs, we can describe and analyze various adder architectures in a systematic way without using specific knowledge about underlying arithmetic algorithms. Examples of adder architectures that can be handled by CTDs include Redundant-Binary (RB) adders, Signed-Digit (SD) adders, Positive-Digit (PD) adders, carry-save adders, parallel counters (e.g., 3-2 counters and 4-2 counters) and networks of such basic adders/counters. This paper also discusses the CTD-based analysis of carry-propagation-free adders using various number representations.
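
    As a small illustration of one building block named in the abstract, the following sketch models a 3-2 counter (full adder) and uses a row of them as a carry-save adder. It is a behavioral sketch only and does not reproduce the Counter Tree Diagram formalism of the paper; the function names and the `width` parameter are assumptions made for illustration.

```python
def counter_3_2(a, b, c):
    """3-2 counter: count the ones among three input bits.

    Returns (sum_bit, carry_bit), i.e. the 2-bit binary count of a + b + c.
    """
    total = a + b + c
    return total & 1, total >> 1

def carry_save_add(x, y, z, width):
    """Reduce three non-negative `width`-bit integers to a (sum, carry) pair.

    Every bit position uses its own 3-2 counter and no carry ripples between
    positions, so the delay does not depend on the word length.
    """
    sum_word = 0
    carry_word = 0
    for i in range(width):
        s, c = counter_3_2((x >> i) & 1, (y >> i) & 1, (z >> i) & 1)
        sum_word |= s << i
        carry_word |= c << (i + 1)  # a carry has twice the weight of its column
    return sum_word, carry_word
```

    The invariant is that `sum_word + carry_word` equals `x + y + z` whenever the inputs fit in `width` bits; a single conventional adder is then needed only once, at the end of a tree of such counters.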

  • A Palmprint Recognition Algorithm Using Phase-Only Correlation

    Koichi ITO  Takafumi AOKI  Hiroshi NAKAJIMA  Koji KOBAYASHI  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E91-A No:4
      Page(s):
    1023-1030

    This paper presents a palmprint recognition algorithm using Phase-Only Correlation (POC). The use of phase components in 2D (two-dimensional) discrete Fourier transforms of palmprint images makes it possible to achieve highly robust image registration and matching. In the proposed algorithm, POC is used to align scaling, rotation and translation between two palmprint images, and evaluate similarity between them. Experimental evaluation using a palmprint image database clearly demonstrates efficient matching performance of the proposed algorithm.
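
    The idea of matching on phase alone can be sketched in one dimension. The code below is a minimal illustration that assumes a naive DFT and a purely circular shift between two short signals; the paper works on 2D palmprint images and also handles scaling and rotation, which this sketch does not attempt. The helper names are hypothetical.

```python
import cmath

def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def phase_only_correlation(f, g):
    """Return the 1D POC function of two equal-length real signals."""
    F, G = dft(f), dft(g)
    cross = []
    for a, b in zip(F, G):
        c = a * b.conjugate()   # cross spectrum
        mag = abs(c)
        # Keep only the phase; bins with no energy carry no phase information.
        cross.append(c / mag if mag > 1e-12 else 0j)
    return [v.real for v in dft(cross, inverse=True)]

def estimate_shift(f, g):
    """Index of the POC peak: the circular shift of f relative to g."""
    poc = phase_only_correlation(f, g)
    return max(range(len(poc)), key=lambda i: poc[i])
```

    For two signals that differ only by a circular shift, the POC function is a sharp peak at the shift position, and the peak height can serve as a similarity score.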

  • Design of Wave-Parallel Computing Architectures and Its Application to Massively Parallel Image Processing

    Yasushi YUMINAKA  Takafumi AOKI  Tatsuo HIGUCHI  

     
    PAPER-Multiple-Valued Architectures and Systems

      Vol:
    E76-C No:7
      Page(s):
    1133-1143

    This paper proposes new architecture LSIs based on wave-parallel computing to provide an essential solution to the interconnection problems in massively parallel processing. The basic concept is frequency multiplexing of digital information, which enables us to utilize the parallelism of electrical (or optical) waves for parallel processing. This wave-parallel computing concept is capable of performing several independent binary functions in parallel with a single module. In this paper, we discuss the design of a wave-parallel image processing LSI to demonstrate the feasibility of reducing the number of interconnections among modules.

  • Design of Robust-Fault-Tolerant Multiple-Valued Arithmetic Circuits and Their Evaluation

    Takeshi KASUGA  Michitaka KAMEYAMA  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E76-C No:3
      Page(s):
    428-435

    Robust-fault tolerance is the property that a computational result remains nearly equal to the correct one when faults occur in a digital system. There are many cases where the safety of digital control systems can be maintained if this property is satisfied. In this paper, robust-fault-tolerant three-valued arithmetic modules such as an adder and a multiplier are proposed. The positive and negative integers are represented by the number of 1's and -1's, respectively. The design concept of the arithmetic modules is that a fault has a linearly additive effect with a small value on the final result. Each arithmetic module consists of identical submodules linearly connected, so that a multi-stage structure is formed to generate the final output from the last submodule. Between the input and output digits in the submodule, a simple functional relation is satisfied with respect to the number of 1's and -1's. Moreover, the output digit value depends on a very small portion of the submodules including the input digits. These properties produce a linearly additive effect with a small value on the final result in the arithmetic modules even if multiple faults occur at the input and output of any gates in the submodules. Not only direct three-valued representation but also the use of three-valued logic circuits is inherently suitable for efficient implementation of the arithmetic VLSI system. The robust-fault-tolerant three-valued arithmetic modules are evaluated with regard to chip size and speed using the standard CMOS design rule. As a result, it is made clear that the chip size can be greatly reduced.

  • Prospects of Multiple-Valued VLSI Processors

    Takahiro HANYU  Michitaka KAMEYAMA  Tatsuo HIGUCHI  

     
    INVITED PAPER

      Vol:
    E76-C No:3
      Page(s):
    383-392

    Rapid advances in integrated circuit technology based on binary logic have made possible the fabrication of digital circuits or digital VLSI systems with not only a very large number of devices on a single chip or wafer, but also high-speed processing capability. However, the advance of processing speeds and improvement in cost/performance ratio based on conventional binary logic will not always continue unabated in submicron geometry. Submicron integrated circuits can handle multiple-valued signals at high speed rather than binary signals, especially at the data communication level, because of the reduced interconnections. The use of nonbinary logic or discrete-analog signal processing will not be out of the question if multiple-valued hardware algorithms are developed for fast parallel operations. Moreover, in VLSI or ULSI processors, the delay time due to global communications between functional modules or chips, rather than that of each functional module itself, is the most important factor in determining the total performance. Locally computable hardware implementation and new parallel hardware algorithms natural to multiple-valued data representation and circuit technologies are the key properties for developing VLSI processors in submicron geometry. As a result, multiple-valued VLSI processors make it possible to improve the effective chip density together with the processing speed significantly. In this paper, we summarize several potential advantages of multiple-valued VLSI processors in submicron geometry, which stem from the great reduction of interconnections and from their suitability for locally computable hardware implementation, and demonstrate that some examples of special-purpose multiple-valued VLSI processors, namely a signed-digit arithmetic VLSI processor, a residue arithmetic VLSI processor and a matching VLSI processor, can achieve higher performance for real-world computing systems.

  • Multiple-Valued VLSI Image Processor Based on Residue Arithmetic and Its Evaluation

    Makoto HONDA  Michitaka KAMEYAMA  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E76-C No:3
      Page(s):
    455-462

    The demand for high-speed image processing is obvious in many real-world computations such as robot vision. Not only high throughput but also small latency becomes an important factor of the performance, because of the requirement of frequent visual feedback. In this paper, a high-performance VLSI image processor based on the multiple-valued residue arithmetic circuit is proposed for such applications. Parallelism is hierarchically used to realize the high-performance VLSI image processor. First, spatially parallel architecture that is different from pipeline architecture is considered to reduce the latency. Secondly, residue number arithmetic is introduced. In the residue number arithmetic, data communication between the mod m_i arithmetic units is not necessary, so that multiple mod m_i arithmetic units can be completely separated to different chips. Therefore, a number of mod m_i multiply adders can be implemented on a single VLSI chip based on the modulus-slice concept. Finally, each mod m_i arithmetic unit can be effectively implemented in parallel structure using the concept of a pseudoprimitive root and the multiple-valued current-mode circuit technology. Thus, it is made clear that the thorough use of parallelism reduces the latency to 1/3 in comparison with the ordinary binary implementation.
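
    The independence of the per-modulus arithmetic units mentioned in the abstract can be illustrated with a short residue number system sketch. The moduli below are a hypothetical pairwise-coprime set chosen for the example, not the ones used in the paper, and the function names are assumptions; the code requires Python 3.8 or later for `math.prod` and the modular inverse form of `pow`.

```python
from math import prod

# Hypothetical pairwise-coprime moduli; dynamic range is their product (15015).
MODULI = (7, 11, 13, 15)

def to_residues(value):
    """Encode a non-negative integer as one residue per modulus."""
    return tuple(value % m for m in MODULI)

def multiply_add(a, b, c):
    """Compute a*b + c channel by channel.

    Each channel reads only its own residues, so no data is exchanged
    between the per-modulus units.
    """
    return tuple((ra * rb + rc) % m
                 for ra, rb, rc, m in zip(a, b, c, MODULI))

def from_residues(residues):
    """Decode with the Chinese remainder theorem into [0, prod(MODULI))."""
    total = prod(MODULI)
    result = 0
    for r, m in zip(residues, MODULI):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse of partial modulo m.
        result += r * partial * pow(partial, -1, m)
    return result % total
```

    The decoded value is correct as long as the true result of `a*b + c` is smaller than the product of the moduli; only the final conversion back to a conventional number needs information from all channels.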

  • Design of a Multiple-Valued VLSI Processor for Digital Control

    Katsuhiko SHIMABUKURO  Michitaka KAMEYAMA  Tatsuo HIGUCHI  

     
    PAPER-Computer Hardware and Design

      Vol:
    E75-D No:5
      Page(s):
    709-717

    It is well known that multiple-valued signed-digit (SD) arithmetic circuits have the attractive features of compactness and high-speed operation. However, both of these features have yet to be utilized fully. In this paper, we consider their application to a parallel-structure-based VLSI processor. A high-performance parallel-structure-based multiple-valued VLSI processor using the radix-2 SD number system is proposed. Its compactness allows a higher degree of parallelism under chip size limitations than ordinary binary arithmetic circuits. Moreover, the speed of the single arithmetic module is very high in the SD arithmetic circuits, so that we can take advantage of the high-speed operation in the parallel-structure-based VLSI processor chip. The multiple-valued bidirectional current-mode technology is used not only in high-speed small-sized arithmetic circuits, but also in reducing the number of connections in the parallel-structure-based VLSI processor. The proposed processor is specially developed for real-time digital control, where the performance is evaluated by delay time. Performance estimation using SPICE simulators shows that the delay time of the proposed processor for matrix operations such as matrix multiplication is greatly reduced in comparison with a conventional binary processor.
