Keyword Search Result

[Keyword] retiming (7 hits)

High-Performance VLSI Architecture of the LMS Adaptive Filter Using 4-2 Adders
Kyo TAKAHASHI Shingo SATO Tadamichi KUDO Yoshitaka TSUNEKAWA

LETTER-Digital Signal Processing

Vol: E92-A  No: 2  Page(s): 633-637
In this report, we propose a high-performance pipelined VLSI architecture for the LMS adaptive filter, derived by a cut-set retiming technique. The proposed architecture has a particular pipelined form with 3 adaptation delays, and the FIR filter portion uses a particular class of the transposed form that provides minimum output latency and coefficient delay. Both delays, the adaptation delay and the coefficient delay, are compensated by a look-ahead conversion. A new high-speed 4-input, 2-output CSA-type adder with small hardware cost is employed. The proposed architecture simultaneously achieves good convergence, a high sampling rate, minimum output latency, small hardware, and low power dissipation, and is well suited to VLSI implementation.
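The abstract does not describe the internals of the authors' adder. As a rough illustration of what a 4-input, 2-output carry-save (CSA) reduction does at the word level, the following is a minimal sketch built from two generic 3-input carry-save stages. The function names `csa_3_2` and `adder_4_2` are hypothetical, and non-negative integers are assumed.

```python
def csa_3_2(a: int, b: int, c: int) -> tuple[int, int]:
    """Generic 3-input carry-save stage: a + b + c == partial_sum + carry."""
    partial_sum = a ^ b ^ c                      # bitwise sum without carries
    carry = ((a & b) | (b & c) | (a & c)) << 1   # majority bits, shifted left
    return partial_sum, carry


def adder_4_2(a: int, b: int, c: int, d: int) -> tuple[int, int]:
    """Reduce four operands to two whose sum equals a + b + c + d.

    Assumption: this cascades two 3-input stages; it is not the
    specific circuit proposed in the paper.
    """
    s1, c1 = csa_3_2(a, b, c)
    return csa_3_2(s1, c1, d)


if __name__ == "__main__":
    s, c = adder_4_2(5, 9, 3, 12)
    # No carry propagates inside the reduction; only the final s + c needs it.
    print(s + c == 5 + 9 + 3 + 12)
```

The point of the carry-save form is that the carry-propagating addition is deferred to a single final step, which is why such adders are attractive inside a pipelined filter datapath.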
A Fast Gate-Level Register Relocation Method for Circuit Size Reduction in General-Synchronous Framework
Yukihide KOHIRA Atsushi TAKAHASHI

PAPER-VLSI Design Technology and CAD

Vol: E91-A  No: 10  Page(s): 3030-3037
Under the assumption that the clock can be input to each register at an arbitrary timing, the minimum feasible clock period may be reduced by register relocation while maintaining the circuit behavior and topology. However, when the minimum feasible clock period is reduced, the number of registers tends to increase. In this paper, we propose a gate-level register relocation method that reduces the number of registers while keeping the target clock period. In experiments, the proposed method reduces the number of registers within a practical time for most circuits.
Gate-Level Register Relocation in Generalized Synchronous Framework for Clock Period Minimization
Yukihide KOHIRA Atsushi TAKAHASHI

PAPER

Vol: E90-A  No: 4  Page(s): 800-807
Under the assumption that the clock can be input to each register at an arbitrary timing, the minimum feasible clock period can be determined if the delays between registers are given. This minimum feasible clock period may be reduced by register relocation that maintains the circuit behavior and topology. In this paper, we propose a gate-level register relocation method to reduce the minimum feasible clock period. The proposed method is a greedy local circuit modification method. We prove that the proposed method achieves the clock period achieved by retiming with delay decomposition if the delay of each element in the circuit is unique. Experiments show that the computation time of the proposed method, and the number of registers in the circuit it produces, are smaller than those of the retiming method in the conventional synchronous framework.
On Multiple-Voltage High-Level Synthesis Using Algorithmic Transformations
Lan-Rong DUNG Hsueh-Chih YANG

PAPER-Logic Synthesis

Vol: E87-A  No: 12  Page(s): 3100-3108
This paper presents a multiple-voltage high-level synthesis approach for low-power DSP applications using algorithmic transformation techniques. Our approach is motivated by maximizing task mobilities, since increasing mobility raises the possibility of assigning tasks to low-voltage components. Mobility is the freedom to schedule the starting time of a task. It is defined as the distance between the task's as-late-as-possible (ALAP) schedule time and its as-soon-as-possible (ASAP) schedule time. To gain task mobility, we use loop shrinking, retiming, and unfolding techniques. Loop shrinking first reduces the iteration period bound (IPB); retiming and unfolding are then employed to shorten the iteration period (IP) as much as possible. Minimizing the IP results in high task mobilities. Finally, we assign tasks with high mobilities to low-voltage components and thus minimize energy under resource and latency constraints. Even when the overhead of level conversion is taken into account, our approach achieves significant power reduction. In the case of a third-order IIR filter, the proposed approach saves up to 40.2% of power consumption.
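The mobility definition above (ALAP schedule time minus ASAP schedule time) can be illustrated with a minimal sketch. It assumes an acyclic task graph, unlimited resources, integer task durations, and a given latency bound; the function name `mobilities` and its parameters are hypothetical and not taken from the paper.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def mobilities(duration, preds, latency):
    """Return {task: ALAP start - ASAP start} for an acyclic task graph.

    duration: {task: execution time}
    preds:    {task: iterable of predecessor tasks}
    latency:  time by which every task must have finished (assumed given)
    """
    graph = {t: tuple(preds.get(t, ())) for t in duration}
    order = list(TopologicalSorter(graph).static_order())

    # ASAP: start as soon as all predecessors have finished.
    asap = {}
    for t in order:
        asap[t] = max((asap[p] + duration[p] for p in graph[t]), default=0)

    # ALAP: finish just before the earliest successor must start.
    succs = {t: [] for t in duration}
    for t, ps in graph.items():
        for p in ps:
            succs[p].append(t)
    alap = {}
    for t in reversed(order):
        alap[t] = min((alap[s] for s in succs[t]), default=latency) - duration[t]

    return {t: alap[t] - asap[t] for t in order}


if __name__ == "__main__":
    durations = {"a": 1, "b": 1, "c": 1, "d": 1}
    dependencies = {"c": ["a"], "d": ["b", "c"]}
    print(mobilities(durations, dependencies, latency=3))
```

In this toy graph, tasks on the critical path a, c, d have zero mobility, while b can slip by one time step; under the paper's motivation, b would be the candidate for a slower low-voltage component.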
Retiming for Sequential Circuits with a Specified Initial State and Its Application to Testability Enhancement
Hiroyuki YOTSUYANAGI Seiji KAJIHARA Kozo KINOSHITA

PAPER

Vol: E78-D  No: 7  Page(s): 861-867
Retiming is a technique for resynthesizing a synchronous sequential circuit by rearranging flip-flops. From the viewpoint of logic optimization, retiming can potentially derive a simpler and more testable circuit, because it can convert some sequential redundancies into combinational redundancies. Previously proposed retiming methods do not guarantee the same output sequences when the circuit starts from a specified initial state such as the reset state. If a circuit with a specified initial state must produce the same output sequences after retiming, the rearrangement of flip-flops has to be restricted. This paper presents a retiming method for circuits with a specified initial state, so that retimed circuits give the same output sequences as the original circuits for any input sequence. In the proposed method, each flip-flop keeps a value corresponding to the initial state throughout the retiming procedure, and unification of flip-flops with different values is avoided. Our procedure uses 5-valued logic on the gate-level implementation to describe and calculate the values of flip-flops. Therefore, after optimization using our method, the circuit has exactly the same behavior as the original. Experimental results for ISCAS'89 benchmark circuits show that the method can optimize circuits as well as a method that does not consider the initial state, and that the testability of the retimed circuit is higher than that of the original circuit.
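As background for "rearranging flip-flops", the following is a minimal sketch of the standard graph formulation of retiming, in which an edge from u to v carrying w flip-flops carries w + r(v) - r(u) after retiming. It does not model initial states or the 5-valued logic of the paper, and the names `retime`, `edges`, and `lags` are hypothetical.

```python
def retime(edges, lags):
    """Apply retiming lags to a circuit graph.

    edges: {(u, v): number of flip-flops on the connection u -> v}
    lags:  {node: integer lag r(node)}; missing nodes default to 0
    Returns the new flip-flop counts, or raises if any count would be negative.
    """
    retimed = {}
    for (u, v), weight in edges.items():
        new_weight = weight + lags.get(v, 0) - lags.get(u, 0)
        if new_weight < 0:
            raise ValueError(f"illegal retiming: edge {u}->{v} gets {new_weight} flip-flops")
        retimed[(u, v)] = new_weight
    return retimed


if __name__ == "__main__":
    # A three-gate loop with both flip-flops on the feedback connection.
    circuit = {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 2}
    moved = retime(circuit, {"a": -1})
    print(moved)
    # The number of flip-flops around any cycle is unchanged by retiming.
    print(sum(moved.values()) == sum(circuit.values()))
```

The sketch only tracks where flip-flops sit. Deciding which initial values the moved flip-flops must hold, which is the subject of the abstract above, is a separate problem that this code does not address.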
Computer Simulation of Jitter Characteristics of PLL for Arbitrary Data and Jitter Patterns
Kenichi NAKASHI Hiroyuki SHIRAHAMA Kenji TANIGUCHI Osamu TSUKAHARA Tohru EZAKI

PAPER-Analog Circuits and Signal Processing

Vol: E77-A  No: 6  Page(s): 977-984
To investigate the jitter characteristics of PLLs for practical applications, we have developed a computer simulation program for PLLs that can handle arbitrary data and jitter patterns, as well as any conceivable nonlinearity of the circuit performance. We used a time-domain method; that is, we solved the state equation of a charge-pump-type PLL with a constant time step. The jitter transfer characteristics of a conventional PLL were calculated for periodic input data patterns with sinusoidal jitter. The results agreed fairly well with the corresponding experiments. We also found that an ordinary PD (Phase Detector), which detects the phase difference between the input and VCO signals only at rising edges, shows folded jitter transfer characteristics at half the equivalent frequency of the input signal. These folded jitter characteristics increase the total jitter for long runs of successive '1' or '0' data, because of their low equivalent sampling frequency, and might increase the jitter even for random data patterns. Based on the simulation results, we devised an improved phase detector for PLLs with low jitter. We also applied the simulation to an FDD (Frequency Difference Detector) type fast pull-in PLL that we proposed recently, and found that its jitter was 25% smaller than that of a conventional PLL for PRBS (pseudo-random bit sequence) NRZ code.
Optimizing and Scheduling DSP Programs for High Performance VLSI Designs
Frederico Buchholz MACIEL Yoshikazu MIYANAGA Koji TOCHINAI

PAPER

Vol: E75-A  No: 10  Page(s): 1191-1201
The throughput of a parallel execution of a Digital Signal Processing (DSP) algorithm is limited by the iteration bound, which is the minimum period between the starts of consecutive iterations. It is given by T = max_i (T_i/D_i), where T_i and D_i are the total time of the operations and the number of delays in loop i, respectively. A schedule is said to be rate-optimal if its iteration period is T. The throughput of a DSP algorithm execution can be increased by reducing the T_i's, which can be done by taking as many operations as possible out of loops without changing the semantics of the calculation. This paper presents an optimization technique, called Loop Shrinking, which reduces the iteration bound in this way by using commutativity, associativity, and distributivity. This paper also presents a scheduling method, called Period-Driven Scheduling, which gives rate-optimal schedules more efficiently than existing approaches. An implementation of both is then presented for a system under development by the authors. The system achieves reductions in the iteration bound near or equal to those of careful hand-tuning, and hardware-optimal designs in most cases.
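The iteration bound formula above can be evaluated directly once the loops are known. The following is a minimal sketch that assumes the loops have already been enumerated by hand, each given as its operation times and its delay count; the function name `iteration_bound` and the example numbers are illustrative only.

```python
from fractions import Fraction


def iteration_bound(loops):
    """Return max over loops of (total operation time / number of delays).

    loops: iterable of (operation_times, delay_count) pairs, with
           delay_count >= 1 for every loop (assumed).
    """
    return max(Fraction(sum(times), delays) for times, delays in loops)


if __name__ == "__main__":
    # Two hypothetical loops: (2 + 1) / 1 and (2 + 1 + 1) / 2.
    before = [([2, 1], 1), ([2, 1, 1], 2)]
    print(iteration_bound(before))

    # Moving the time-1 operation out of the first loop lowers its T_i,
    # which is the kind of reduction the abstract attributes to Loop Shrinking.
    after = [([2], 1), ([2, 1, 1], 2)]
    print(iteration_bound(after))
```

Exact fractions are used so that loops with non-integer ratios compare without rounding error.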