Keyword Search Result

[Keyword] CORDIC(18hit)

1-18hit
  • Design and VLSI Implementation of a Sorted MMSE QR Decomposition for 4×4 MIMO Detectors

    Lu SUN  Bin WU  Tianchun YE  

     
    LETTER-VLSI Design Technology and CAD

      Publicized:
    2020/10/12
      Vol:
    E104-A No:4
      Page(s):
    762-767

    In this letter, a low-latency, high-throughput, and hardware-efficient sorted MMSE QR decomposition (MMSE-SQRD) for multiple-input multiple-output (MIMO) systems is presented. In contrast to the method of extending the complex matrix to a real model and thereafter applying real-valued QR decomposition (QRD), we develop a highly parallel decomposition scheme based on the coordinate rotation digital computer (CORDIC), which performs the QRD in the complex domain directly and then converts the complex result to its real counterpart. The proposed scheme can greatly improve the processing parallelism and curtail the nullification and sorting procedures. Besides, we also design the corresponding pipelined hardware architecture of the MMSE-SQRD based on a highly parallel Givens rotation structure with the CORDIC algorithm for 4×4 MIMO detectors. The proposed MMSE-SQRD is implemented in SMIC 55 nm CMOS technology, achieving up to 50M QRD/s throughput and a latency of 59 clock cycles with only 218 kilo-gates (KG). Compared to previous works, the proposed design achieves the highest normalized throughput efficiency and the lowest processing latency.

  • A High-Throughput Low-Energy Arithmetic Processor

    Hong-Thu NGUYEN  Xuan-Thuan NGUYEN  Cong-Kha PHAM  

     
    BRIEF PAPER

      Vol:
    E101-C No:4
      Page(s):
    281-284

    In this paper, the hardware architecture of a CORDIC-based arithmetic processor utilizing both the angle recoding (ARD) CORDIC algorithm and the scaling-free (SCFE) CORDIC algorithm is proposed and implemented in 180 nm CMOS technology. The arithmetic processor is capable of calculating the sine, cosine, hyperbolic sine, hyperbolic cosine, and multiplication functions. The experimental results show that the design operates at 100 MHz and consumes 12.96 mW. In comparison with some previous work, the design can be seen as a good choice for high-throughput low-energy applications.

  • A Low-Latency Parallel Pipeline CORDIC

    Hong-Thu NGUYEN  Xuan-Thuan NGUYEN  Cong-Kha PHAM  

     
    PAPER

      Vol:
    E100-C No:4
      Page(s):
    391-398

    COordinate Rotation DIgital Computer (CORDIC) is an efficient algorithm for computing elementary functions such as trigonometric, exponential, and logarithmic functions. However, the main drawback of the conventional CORDIC is that the number of iterations is equal to the number of angle constants. Among a great deal of research to overcome this disadvantage, the angle recoding method is effective because it can reduce the number of iterations by 50%. Nevertheless, the hardware architecture of this algorithm is difficult to implement as a pipeline. Therefore, a low-latency parallel pipeline hybrid adaptive CORDIC (PP-CORDIC) architecture is proposed in this paper. In this design, a hybrid architecture is combined with pipelining and parallel techniques to achieve low latency. The design is able to operate at 122.6 MHz and has a latency of 8, 12, and 15 clock cycles in the best, average, and worst case, respectively. More significantly, the latency of PP-CORDIC in the worst case is 1.1 times lower than that of Altera's commercial floating-point sine and cosine IP cores.

  • Efficient CORDIC-Based Processing Elements in Scalable Complex Matrix Inversion

    Huan HE  Feng YU  Bei ZHAO  

     
    LETTER-Algorithms and Data Structures

      Vol:
    E97-A No:5
      Page(s):
    1144-1148

    In this paper we apply angle recoding to the CORDIC-based processing elements in a scalable architecture for complex matrix inversion. We extend the processing elements from the scalable real matrix inversion architecture to the complex domain and obtain a novel scalable complex matrix inversion architecture, which can significantly reduce computational complexity. We rearrange the CORDIC elements to make one half of the processing elements simple and compact. For the other half of the processing elements, the efficient use of angle recoding reduces the number of microrotation steps of the CORDIC elements to 3/4. Consequently, only 3 CORDIC elements are required for the processing elements with full utilization.

  • Low Complexity Systolic Array Structure for Extended QRD-RLS Equalizer

    Ji-Hye SHIN  Young-Beom JANG  

     
    PAPER-Digital Signal Processing

      Vol:
    E95-A No:12
      Page(s):
    2407-2414

    In this paper, a new systolic array structure for the extended QR decomposition based recursive least-square (QRD-RLS) equalizer is proposed. The fact that the vectoring and rotation mode coordinate rotation digital computer (CORDIC) processors rotate in the same direction is used to show that the hardware complexity of the systolic array can be reduced. Furthermore, since the vectoring and rotation mode CORDIC processors in the proposed structure rotate simultaneously, operation time is also reduced. The performance of the proposed equalizer is analyzed by observing the flatness obtained by multiplying the frequency responses of the unknown channel with the proposed equalizer. Simulation results through hardware description language (HDL) coding and synthesis show that 23.8% of the chip implementation area can be reduced.
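As a rough illustration of the shared-direction idea mentioned in the abstract above, the following minimal sketch runs a vectoring-mode pair and a rotation-mode pair in lockstep, with the rotation pair reusing the direction bit chosen by the vectoring pair. It is a floating-point toy, not the architecture from the paper; the function name `cordic_givens`, its parameters, and the assumption `a > 0` are all hypothetical choices made for this example.

```python
import math

def cordic_givens(a, b, c, d, iterations=24):
    """Zero b in the pair (a, b) and apply the same rotation to (c, d).

    Illustrative sketch only; assumes a > 0 so the angle is within range.
    """
    gain = 1.0
    for i in range(iterations):
        t = 2.0 ** -i  # a right shift by i bits in a hardware datapath
        # Vectoring mode: pick the direction that drives b toward zero.
        sigma = -1.0 if b >= 0 else 1.0
        a, b = a - sigma * b * t, b + sigma * a * t
        # Rotation mode: reuse the same direction bit, no angle is computed.
        c, d = c - sigma * d * t, d + sigma * c * t
        gain *= math.sqrt(1.0 + t * t)
    # Divide out the accumulated micro-rotation gain.
    return a / gain, b / gain, c / gain, d / gain
```

Calling `cordic_givens(3.0, 4.0, 1.0, 2.0)` should return a first pair close to (5, 0), and a second pair rotated by the same angle, so its Euclidean norm is preserved up to rounding.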

  • Low Cost CORDIC-Based Configurable FFT/IFFT Processor for OFDM Systems

    Dongpei LIU  Hengzhu LIU  Botao ZHANG  Jianfeng ZHANG  Shixian WANG  Zhengfa LIANG  

     
    PAPER-OFDM

      Vol:
    E95-A No:10
      Page(s):
    1683-1691

    A high-performance FFT processor is indispensable for real-time OFDM communication systems. This paper presents a CORDIC-based design of a variable-length FFT processor which can perform FFTs of 64/128/256/512/1024/2048/4096/8192 points. The proposed FFT processor employs a memory-based architecture in which a mixed radix-4/2 algorithm, a pipelined CORDIC, and a conflict-free parallel memory access scheme are exploited. Besides, the CORDIC rotation angles are generated internally based on a transform of the butterfly counter, which eliminates the need for a ROM and makes the design memory-efficient. The proposed architecture has a lower hardware complexity because it is ROM-free and has no dedicated complex multiplier. We implemented the proposed FFT processor and verified it on an FPGA development platform. Additionally, the processor was synthesized in 0.18 µm technology; the core area of the processor is 3.47 mm² and the maximum operating frequency can be up to 500 MHz. The proposed FFT processor offers a better trade-off between performance and hardware overhead, and it can meet the speed requirements of most modern OFDM systems, such as IEEE 802.11n, WiMax, 3GPP-LTE and DVB-T/H.

  • Hierarchical MFMO Circuit Modules for an Energy-Efficient SDR DBF

    Jeich MAR  Chi-Cheng KUO  Shin-Ru WU  You-Rong LIN  

     
    PAPER-Application

      Vol:
    E95-D No:2
      Page(s):
    413-425

    The hierarchical multi-function matrix operation (MFMO) circuit modules are designed using the coordinate rotation digital computer (CORDIC) algorithm for realizing the intensive computation of matrix operations. The paper emphasizes that the designed hierarchical MFMO circuit modules can be used to develop a power-efficient software-defined radio (SDR) digital beamformer (DBF). The formulas for the processing time of the scalable MFMO circuit modules implemented in a field programmable gate array (FPGA) are derived to allocate the proper logic resources for the hardware reconfiguration. The hierarchical MFMO circuit modules are scalable to the changing number of array branches employed for the SDR DBF to achieve the purpose of power saving. The efficient reuse of the common MFMO circuit modules in the SDR DBF can also lead to energy reduction. Finally, the power dissipation and reconfiguration function in the different modes of the SDR DBF are observed from the experimental results.

  • Resource and Performance Evaluations of Fixed Point QRD-RLS Systolic Array through FPGA Implementation

    Yoshiaki YOKOYAMA  Minseok KIM  Hiroyuki ARAI  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E91-B No:4
      Page(s):
    1068-1075

    At present, when using space-time processing techniques with multiple antennas for mobile radio communication, real-time weight adaptation is necessary. Due to the progress of integrated circuit technology, dedicated processor implementation with ASIC or FPGA can be employed to implement various wireless applications. This paper presents a resource and performance evaluation of the QRD-RLS systolic array processor based on fixed-point CORDIC algorithm with FPGA. In this paper, to save hardware resources, we propose the shared architecture of a complex CORDIC processor. The required precision of internal calculation, the circuit area for the number of antenna elements and wordlength, and the processing speed will be evaluated. The resource estimation provides a possible processor configuration with a current FPGA on the market. Computer simulations assuming a fading channel will show a fast convergence property with a finite number of training symbols. The proposed architecture has also been implemented and its operation was verified by beamforming evaluation through a radio propagation experiment.

  • Fixed-Point Error Analysis of CORDIC Arithmetic for Special-Purpose Signal Processors

    Tze-Yun SUNG  Hsi-Chin HSIN  

     
    LETTER-Digital Signal Processing

      Vol:
    E90-A No:9
      Page(s):
    2006-2013

    CORDIC (COordinate Rotation DIgital Computer) is a well known algorithm using simple adders and shifters to evaluate various elementary functions. Thus, CORDIC is suitable for the design of high performance chips using VLSI technology. In this paper, a complete analysis of the computation error of both the (conventional) CORDIC algorithm and the CORDIC algorithm with expanded convergence range is derived to facilitate the design task. The resulting formulas regarding the relative and absolute approximation errors and the truncation error are summarized in the tabular form. As the numerical accuracy of CORDIC processors is determined by the word length of operands and the number of iterations, three reference tables are constructed for the optimal choice of these numbers. These tables can be used to facilitate the design of cost-effective CORDIC processors in terms of areas and performances. In addition, two design examples: singular value decomposition (SVD) and lattice filter for digital signal processing systems are given to demonstrate the goal and benefit of the derived numerical analysis of CORDIC.
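The dependence of accuracy on word length and iteration count described in the abstract above can be explored with a small experiment. The following is a minimal sketch of a conventional rotation-mode CORDIC in fixed point, using only shifts and adds in the loop; it is not the error model derived in the paper. The function name `cordic_sin_cos` and the parameters `iterations` and `frac_bits` are hypothetical names chosen for this example, and the input angle is assumed to lie inside the conventional convergence range (roughly |theta| below 1.74 rad).

```python
import math

def cordic_sin_cos(theta, iterations=16, frac_bits=16):
    """Rotation-mode CORDIC in fixed point; returns (sin, cos) as floats."""
    scale = 1 << frac_bits
    # Angle constants atan(2^-i), quantized to the chosen word length.
    angles = [round(math.atan(2.0 ** -i) * scale) for i in range(iterations)]
    # Accumulated gain of the micro-rotations for this iteration count.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x = round(scale / gain)  # start at 1/gain so the gain is pre-compensated
    y = 0
    z = round(theta * scale)  # residual angle, driven toward zero
    for i in range(iterations):
        d = 1 if z >= 0 else -1
        # Datapath uses only arithmetic shifts and additions/subtractions.
        x, y = x - d * (y >> i), y + d * (x >> i)
        z -= d * angles[i]
    return y / scale, x / scale

if __name__ == "__main__":
    for n in (4, 8, 16):
        s, _ = cordic_sin_cos(0.5, iterations=n)
        print(n, abs(s - math.sin(0.5)))
```

Running the script prints the absolute sine error for a few iteration counts, which gives a feel for how the two design parameters interact; varying `frac_bits` shows the truncation side of the trade-off.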

  • Steady-State Properties of a CORDIC-Based Adaptive ARMA Lattice Filter

    Shin'ichi SHIRAISHI  Miki HASEYAMA  Hideo KITAJIMA  

     
    LETTER-Digital Signal Processing

      Vol:
    E89-A No:12
      Page(s):
    3724-3729

    This paper analyzes the steady-state properties of a CORDIC-based adaptive ARMA lattice filter. In our previous study, the convergence properties of the filter in the non-steady state were clarified; however, its behavior in the steady state was not discussed. Therefore, we develop a distinct analysis technique based on a Markov chain in order to investigate the steady-state properties of the filter. By using the proposed technique, the relationship between step size and coefficient estimation error is revealed.

  • A Compact CORDIC Algorithm for Synchronization of Carrier Frequency Offset in OFDM Modems

    Kyu In LEE  Jonghan KIM  Jaekon LEE  Yong Soo CHO  

     
    LETTER-Devices/Circuits for Communications

      Vol:
    E89-B No:3
      Page(s):
    952-954

    In this letter, a compact CORDIC algorithm is proposed to efficiently implement a synchronization block for carrier frequency offset (CFO) in OFDM modems. The compact CORDIC algorithm allows us to combine a CFO estimation block and a CFO compensation block into a single CFO synchronization block. It is shown by FPGA implementation results that the compact CORDIC algorithm can achieve a significant reduction in hardware complexity and latency for implementing the synchronization block in OFDM modems.

  • Convergence Properties of a CORDIC-Based Adaptive ARMA Lattice Filter

    Shin'ichi SHIRAISHI  Miki HASEYAMA  Hideo KITAJIMA  

     
    PAPER-Digital Signal Processing

      Vol:
    E88-A No:8
      Page(s):
    2154-2164

    This paper presents a theoretical convergence analysis of a CORDIC-based adaptive ARMA lattice filter. In the previous literature, several investigation methods for adaptive lattice filters have been proposed; however, they are applicable only to AR-type filters. Therefore, we have developed a distinct technique that can reveal the convergence properties of the CORDIC ARMA lattice filter. The derived technique provides a quantitative convergence analysis, which facilitates an efficient hardware design for the filter. Moreover, our analysis technique can be applied to popular multiplier-based filters with slight modifications. Hence, the presented convergence analysis is significant as a leading attempt to investigate ARMA lattice filters.

  • A Cost-Effective CORDIC-Based Architecture for Adaptive Lattice Filters

    Shin'ichi SHIRAISHI  Miki HASEYAMA  Hideo KITAJIMA  

     
    PAPER-Audio/Speech Coding

      Vol:
    E87-A No:3
      Page(s):
    567-576

    This paper presents a cost-effective CORDIC-based architecture for adaptive lattice filters. An implementation method for an ARMA lattice filter using the CORDIC algorithm has been proposed. The previously proposed method can provide a simple filter architecture; however, it has problems such as redundant structure and numerical inaccuracy. Therefore, by solving each problem we derive a new non-redundant filter architecture with improved numerical accuracy. The obtained filter architecture provides a low cost ARMA lattice filter in which high-precision data processing is feasible. In addition, the proposed architecture can be applied to AR-type lattice filters, so that it may have several applications in adaptive signal processing. The presented filter architecture is useful from a hardware point of view because it facilitates an effective VLSI design of various adaptive lattice filters.

  • A Low-Cost Floating Point Vectoring Algorithm Based on CORDIC

    Jeong-A LEE  Kees-Jan van der KOLK  Ed F. A. DEPRETTERE  

     
    PAPER-Digital Signal Processing

      Vol:
    E83-A No:8
      Page(s):
    1654-1662

    In this paper we develop a CORDIC-based floating-point vectoring algorithm which significantly reduces the number of microrotation steps as compared to the conventional algorithm. The overhead required to accomplish this is minimized by the introduction of an angle selection function which considers only a few of the total number of bits used to represent the vector being rotated. At the same time, the cost of individual microrotations is kept low by the utilization of a fast rotations angle base.

  • Radix-2-4-8 CORDIC for Fast Vector Rotation

    Takafumi AOKI  Ichiro KITAORI  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E83-A No:6
      Page(s):
    1106-1114

    This paper presents a constant-scale-factor radix-2-4-8 CORDIC algorithm for fast vector rotation and sine/cosine computation. The CORDIC algorithm is a well-known hardware algorithm for computing various elementary functions. Due to its sequential nature of computation, however, significant reduction in processing latency is required for real-time signal processing applications. The proposed radix-2-4-8 CORDIC algorithm dynamically changes the radix of computation during operation, and makes possible the reduction in the number of iterations by 37% for 64-bit precision. This paper also describes the hardware implementation of radix-2-4-8 CORDIC unit that can be installed into practical digital signal processors.

  • CORDIC-Based Direct Digital Frequency Synthesizer: Comparison with a ROM-Based Architecture in FPGA Implementation

    Minkyoung PARK  Kiseon KIM  Jeong-A LEE  

     
    LETTER-Digital Signal Processing

      Vol:
    E83-A No:6
      Page(s):
    1282-1285

    This paper describes a CORDIC-based direct digital frequency synthesizer in comparison with a ROM-based architecture. To optimize the hardware design parameters, we perform numerical analysis of the quantization effects for ROM-based and CORDIC-based architectures. Their hardware costs are estimated for an FPGA implementation, which shows that the CORDIC-based architecture becomes better than the ROM-based one when the required accuracy is 9 bits or more.
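For readers unfamiliar with the ROM-based baseline named in the abstract above, the following minimal sketch shows the usual structure of a direct digital frequency synthesizer: a phase accumulator whose top bits address a sine table. It is a generic textbook arrangement, not the design evaluated in the paper; the function name `dds_rom` and the parameters `freq_word`, `phase_bits`, and `addr_bits` are hypothetical. A CORDIC-based variant would replace the table lookup with an iterative phase-to-amplitude conversion.

```python
import math

def dds_rom(freq_word, n_samples, phase_bits=16, addr_bits=8):
    """Phase accumulator plus a sine ROM addressed by the top phase bits."""
    # The ROM holds one full sine period; its size doubles per address bit.
    rom = [math.sin(2 * math.pi * k / (1 << addr_bits))
           for k in range(1 << addr_bits)]
    mask = (1 << phase_bits) - 1
    phase, out = 0, []
    for _ in range(n_samples):
        # Truncate the phase to addr_bits to form the ROM address.
        out.append(rom[phase >> (phase_bits - addr_bits)])
        # Wrap-around addition models the fixed-width accumulator.
        phase = (phase + freq_word) & mask
    return out
```

With `phase_bits=16`, a `freq_word` of `1 << 14` advances the phase by a quarter period per sample, so the output repeats every four samples.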

  • A Transformation Method of a CORDIC ARMA Lattice Filter for Signal Synthesis

    Shin'ichi SHIRAISHI  Miki HASEYAMA  Hideo KITAJIMA  

     
    PAPER

      Vol:
    E82-A No:2
      Page(s):
    230-237

    This paper proposes a method to transform a CORDIC ARMA lattice filter, which is originally realized for signal analysis, into a signal synthesis lattice filter (CORDIC ARMA lattice synthesis filter). In order to perform such a transformation and then obtain the CORDIC ARMA lattice synthesis filter, we must implement the following with CORDIC: (1) the structure of the altered lattice filter; and (2) an angle calculation module. However, we cannot achieve such an implementation as an extension of the CORDIC ARMA lattice filter algorithm. Therefore, this paper proposes CORDIC implementation schemes for both the structure and the module, and then we realize the CORDIC ARMA lattice synthesis filter. By using CORDIC processors, the elementary sections of the CORDIC ARMA lattice synthesis filter are efficiently implemented without any multipliers. Since the obtained signal synthesis lattice filter consists of dedicated CORDIC processors, it keeps the advantage of the CORDIC ARMA lattice filter, namely a simple structure.

  • Design of a CAM-Based Collision Detection VLSI Processor for Robotics

    Masanori HARIYAMA  Michitaka KANEYAMA  

     
    PAPER

      Vol:
    E77-C No:7
      Page(s):
    1108-1115

    Real-time collision detection is one of the most important intelligent processing tasks in robotics. In collision detection, a large storage capacity is usually required to store the 3-dimensional information on the obstacles located in a workspace. Moreover, high computational power is essential not only in coordinate transformation but also in the matching operation. In the proposed collision detection VLSI processor, the matching operation is drastically accelerated by using a content-addressable memory (CAM). A new obstacle representation based on a union of rectangular solids is also used to reduce the obstacle memory capacity, so that the collision detection can be performed by only magnitude comparison in parallel. A parallel architecture using several identical processor elements (PEs) is employed to perform the coordinate transformation at high speed, and each PE performs the coordinate transformation at high speed based on the COordinate Rotation DIgital Computation (CORDIC) algorithms. When 16 PEs and a 144-kb CAM are used, the performance is evaluated to be 90 ms.
