IEICE globals.ieice.org Site

Keyword Search Result

[Keyword] LDPC decoder(9hit)

1-9hit

Self-Adaptive Scaled Min-Sum Algorithm for LDPC Decoders Based on Delta-Min
Keol CHO Ki-Seok CHUNG

LETTER-Coding Theory

Vol:
E99-A No:8
Page(s):
1632-1634
A self-adaptive scaled min-sum algorithm for LDPC decoding based on the difference between the first two minima of the check node messages (Δmin) is proposed. Δmin is utilized for adjusting the scaling factor of the check node messages, and simulation results show that the proposed algorithm improves the error correcting performance compared to existing algorithms.
An Area-Efficient Reconfigurable LDPC Decoder with Conflict Resolution
Changsheng ZHOU Yuebin HUANG Shuangqu HUANG Yun CHEN Xiaoyang ZENG

PAPER

Vol:
E95-C No:4
Page(s):
478-486
Based on Turbo-Decoding Message-Passing (TDMP) and Normalized Min-Sum (NMS) algorithm, an area efficient LDPC decoder that supports both structured and unstructured LDPC codes is proposed in this paper. We introduce a solution to solve the memory access conflict problem caused by TDMP algorithm. We also arrange the main timing schedule carefully to handle the operations of our solution while avoiding much additional hardware consumption. To reduce the memory bits needed, the extrinsic message storing strategy is also optimized. Besides the extrinsic message recover and the accumulate operation are merged together. To verify our architecture, a LDPC decoder that supports both China Multimedia Mobile Broadcasting (CMMB) and Digital Terrestrial/ Television Multimedia Broadcasting (DTMB) standards is developed using SMIC 0.13 µm standard CMOS process. The core area is 4.75 mm2 and the maximum operating clock frequency is 200 MHz. The estimated power consumption is 48.4 mW at 25 MHz for CMMB and 130.9 mW at 50 MHz for DTMB with 5 iterations and 1.2 V supply.
A 6.72-Gb/s 8 pJ/bit/iteration IEEE 802.15.3c LDPC Decoder Chip
Zhixiang CHEN Xiao PENG Xiongxin ZHAO Leona OKAMURA Dajiang ZHOU Satoshi GOTO

PAPER-High-Level Synthesis and System-Level Design

Vol:
E94-A No:12
Page(s):
2587-2596
In this paper, we introduce an LDPC decoder design for decoding a length-672 multi-rate code adopted in IEEE 802.15.3c standard. The proposed decoder features high performances in both data rate and power efficiency. A macro-layer level fully parallel layered decoding architecture is proposed to support the throughput requirement in the standard. For the proposed decoder, it takes only 4 clock cycles to process one decoding iteration. While parallelism increases, the chip routing congestion problem becomes more severe because a more complicated interconnection network is needed for message passing during the decoding process. This problem is nicely solved by our proposed efficient message permutation scheme utilizing exploited parity check matrix features. The proposed message permutation network features high compatibility and zero-logic-gate VLSI implementation, which contribute to the remarkable improvements in both area utilization ratio and total gate count. Meanwhile, frame-level pipeline decoding is applied in the design to shorten the critical path. To verify the above techniques, the proposed decoder is implemented on a chip fabricated using Fujitsu 65 nm 1P12L LVT CMOS process. The chip occupies a core area of 1.30 mm2 with area utilization ratio 86.3%. According to the measurement results, working at 1.2 V, 400 MHz and 10 iterations the proposed decoder delivers a 6.72 Gb/s data throughput and dissipates a power of 537.6 mW, resulting in an energy efficiency 8.0 pJ/bit/iteration. Moreover, a decoder of the same architecture but with no pipeline stage for low-profile application is also implemented and evaluated at post-layout level.
A Fast Systematic Optimized Comparison Algorithm for CNU Design of LDPC Decoders
Jui-Hui HUNG Sau-Gee CHEN

PAPER-Communication Theory and Signals

Vol:
E94-A No:11
Page(s):
2246-2253
This work first investigates two existing check node unit (CNU) architectures for LDPC decoding: self-message-excluded CNU (SME-CNU) and two-minimum CNU (TM-CNU) architectures, and analyzes their area and timing complexities based on various realization approaches. Compared to TM-CNU architecture, SME-CNU architecture is faster in speed but with much higher complexity for comparison operations. To overcome this problem, this work proposes a novel systematic optimization algorithm for comparison operations required by SME-CNU architectures. The algorithm can automatically synthesize an optimized fast comparison operation that guarantees a shortest comparison delay time and a minimized total number of 2-input comparators. High speed is achieved by adopting parallel divide-and-conquer comparison operations, while the required comparators are minimized by developing a novel set construction algorithm that maximizes shareable comparison operations. As a result, the proposed design significantly reduces the required number of comparison operations, compared to conventional SME-CNU architectures, under the condition that both designs have the same speed performance. Besides, our preliminary hardware simulations show that the proposed design has comparable hardware complexity to low-complexity TM-CNU architectures.
Generic Permutation Network for QC-LDPC Decoder
Xiao PENG Xiongxin ZHAO Zhixiang CHEN Fumiaki MAEHARA Satoshi GOTO

PAPER-High-Level Synthesis and System-Level Design

Vol:
E93-A No:12
Page(s):
2551-2559
Permutation network plays an important role in the reconfigurable QC-LDPC decoder for most modern wireless communication systems with multiple code rates and various code lengths. This paper presents the generic permutation network (GPN) for the reconfigurable QC-LDPC decoder. Compared with conventional permutation networks, this proposal could break through the input number restriction, such as power of 2 and other limited number, and optimize the network for any application in demand. Moreover, the proposed scheme could greatly reduce the latency because of less stages and efficient control signal generating algorithm. In addition, the proposed network processes the nature of high parallelism which could enable several groups of data to be cyclically shifted simultaneously. The synthesis results using the 90 nm technology demonstrate that this architecture can be implemented with the gate count of 18.3k for WiMAX standard at the frequency of 600 MHz and 10.9k for WiFi standard at the frequency of 800 MHz.
Permutation Network for Reconfigurable LDPC Decoder Based on Banyan Network
Xiao PENG Zhixiang CHEN Xiongxin ZHAO Fumiaki MAEHARA Satoshi GOTO

PAPER

Vol:
E93-C No:3
Page(s):
270-278
Since the structured quasi-cyclic low-density parity-check (QC-LDPC) codes for most modern wireless communication systems include multiple code rates, various block lengths, and the corresponding different sizes of submatrices in parity check matrix (PCM), the reconfigurable LDPC decoder is desirable and the permutation network is needed to accommodate any input number (IN) and shift number (SN) for cyclic shift. In this paper, we propose a novel permutation network architecture for the reconfigurable QC-LDPC decoders based on Banyan network. We prove that Banyan network has the nonblocking property for cyclic shift when the IN is power of 2, and give the control signal generating algorithm. Through introducing the bypass network, we put forward the nonblocking scheme for any IN and SN. In addition, we present the hardware design of the control signal generator, which can greatly reduce the hardware complexity and latency. The synthesis results using the TSMC 0.18 µm library demonstrate that the proposed permutation network can be implemented with the area of 0.546 mm2 and the frequency of 292 MHz.
Efficient Fully-Parallel LDPC Decoder Design with Improved Simplified Min-Sum Algorithms
Qi WANG Kazunori SHIMIZU Takeshi IKENAGA Satoshi GOTO

PAPER-VLSI Architecture for Communication/Server Systems

Vol:
E90-C No:10
Page(s):
1964-1971
In this paper we introduce an area and power efficient fully-parallel LDPC decoder design, which keeps the BER performance while consuming less hardware resources and lower power compared with conventional decoders. For this decoder, we firstly propose two improved simplified min-sum algorithms, which enable the decoder to reduce the hardware implementation complexity and area: hardware consumption of check operation module is reduced by 40%, while achieving a negligible performance loss compared with the general min-sum algorithm. To reduce the power dissipation of the decoder, we also proposed a power-saved strategy, according to which the message evolution halts as the parity-check condition is satisfied. This strategy reduces more than 50% power under good channel condition. The synthesis result in 0.18 µm CMOS technology shows our decoder based on (648,540) irregular LDPC code of WLAN (802.11n) protocol achieves 810 [Mbps] throughput with 283 [mW] power consumption.
Power-Efficient LDPC Decoder Architecture Based on Accelerated Message-Passing Schedule
Kazunori SHIMIZU Tatsuyuki ISHIKAWA Nozomu TOGAWA Takeshi IKENAGA Satoshi GOTO

PAPER-VLSI Architecture

Vol:
E89-A No:12
Page(s):
3602-3612
In this paper, we propose a power-efficient LDPC decoder architecture based on an accelerated message-passing schedule. The proposed decoder architecture is characterized as follows: (i) Partitioning a pipelined operation not to read and write intermediate messages simultaneously enables the accelerated message-passing schedule to be implemented with single-port SRAMs. (ii) FIFO-based buffering reduces the number of SRAM banks and words of the LDPC decoder based on the accelerated message-passing schedule. The proposed LDPC decoder keeps a single message for each non-zero bit in a parity check matrix as well as a classical schedule while achieving the accelerated message-passing schedule. Implementation results in 0.18 [µm] CMOS technology show that the proposed decoder architecture reduces an area of the LDPC decoder by 43% and a power dissipation by 29% compared to the conventional architecture based on the accelerated message-passing schedule.
Partially-Parallel LDPC Decoder Achieving High-Efficiency Message-Passing Schedule
Kazunori SHIMIZU Tatsuyuki ISHIKAWA Nozomu TOGAWA Takeshi IKENAGA Satoshi GOTO

PAPER

Vol:
E89-A No:4
Page(s):
969-978
In this paper, we propose a partially-parallel LDPC decoder which achieves a high-efficiency message-passing schedule. The proposed LDPC decoder is characterized as follows: (i) The column operations follow the row operations in a pipelined architecture to ensure that the row and column operations are performed concurrently. (ii) The proposed parallel pipelined bit functional unit enables the column operation module to compute every message in each bit node which is updated by the row operations. These column operations can be performed without extending the single iterative decoding delay when the row and column operations are performed concurrently. Therefore, the proposed decoder performs the column operations more frequently in a single iterative decoding, and achieves a high-efficiency message-passing schedule within the limited decoding delay time. Hardware implementation on an FPGA and simulation results show that the proposed partially-parallel LDPC decoder improves the decoding throughput and bit error performance with a small hardware overhead.