1-13hit |
Takuya SAWADA Taku TOSHIKAWA Kumpei YOSHIKAWA Hidehiro TAKATA Koji NII Makoto NAGATA
The susceptibility of a static random access memory (SRAM) core against static and dynamic variation of power supply voltage is evaluated, by using on-chip diagnosis structures of memory built-in self testing (MBIST) and on-chip voltage waveform monitoring (OCM). The SRAM core of interest in this paper is a synthesizable version applicable to general systems-on-a-chip (SoC) design, and fabricated in a 90 nm CMOS technology. RF power injection to power supply networks is quantified by OCM. The number of resultant erroneous bits as well as their distribution in the cell array is given by MBIST. The frequency-dependent sensitivity reflects the highly capacitive nature of densely integrated SRAM cells.
Kazutami ARIMOTO Toshihiro HATTORI Hidehiro TAKATA Atsushi HASEGAWA Toru SHIMIZU
Many embedded system application in ubiquitous network strongly require the high performance SoC with overcoming the physical limitations in the advanced CMOS. To develop these SoC, the continuous design efforts have been done. The initial efforts are the primitive level circuit technique and power switching control method for suppressing the standby currents. However, the additional physical limitations and system enhancements becomes main factors, the new design efforts have been proposed. These design efforts are the application-oriented technologies from the system level to device level. This paper introduces the self voltage controlled technique to cancel the PVT (process, voltage, and temperature) variation, power distribution and power management for cellular phone application, parallel algorithm and optimized layout DSP, and massively parallel fine-grained SIMD processor for next multimedia application. The high performance SoC for the embedded are achieved by providing the components of the system level IPs and making the application oriented SoC platform.
Takashi KURAFUJI Yasunobu NAKASE Hidehiro TAKATA Yukinaga IMAMURA Rei AKIYAMA Tadao YAMANAKA Atsushi IWABU Shutarou YASUDA Toshitsugu MIWA Yasuhiro NUNOMURA Niichi ITOH Tetsuya KAGEMOTO Nobuharu YOSHIOKA Takeshi SHIBAGAKI Hiroyuki KONDO Masayuki KOYAMA Takahiko ARAKAWA Shuhei IWADE
We apply a selective-sets resizable cache and a complete hierarchy SRAM for the high-performance and low-power RISC CPU core. The selective-sets resizable cache can change the cache memory size by varying the number of cache sets. It reduces the leakage current by 23% with slight degradation of the worst case operating speed from 213 MHz to 210 MHz. The complete hierarchy SRAM enables the partial swing operation not only in the bit lines, but also in the global signal lines. It reduces the current consumption of the memory by 4.6%, and attains the high-speed access of 1.4 ns in the typical case.
Akira YAMADA Yasuhiro NUNOMURA Hiroaki SUZUKI Hisakazu SATO Niichi ITOH Tetsuya KAGEMOTO Hironobu ITO Takashi KURAFUJI Nobuharu YOSHIOKA Jingo NAKANISHI Hiromi NOTANI Rei AKIYAMA Atsushi IWABU Tadao YAMANAKA Hidehiro TAKATA Takeshi SHIBAGAKI Takahiko ARAKAWA Hiroshi MAKINO Osamu TOMISAWA Shuhei IWADE
A high-speed 32-bit RISC microcontroller has been developed. In order to realize high-speed operation with minimum hardware resource, we have developed new design and analysis methods such as a clock distribution, a bus-line layout, and an IR drop analysis. As a result, high-speed operation of 400 MHz has been achieved with power dissipation of 0.96 W at 1.8 V.
Toyohiko YOSHIDA Akira YAMADA Edgar HOLMANN Hidehiro TAKATA Atsushi MOHRI Yukihiko SHIMAZU Kiyoshi NAKAKIMURA Keiichi HIGASHITANI
A dual-issue VLIW processor, running at 250 MHz, is enhanced with multimedia instructions for a sustained peak performance of 1000MOPS. The multimedia processor integrates 300 K transistors in an 8 mm2 core area and it is fabricated onto a 6 mm6. 2 mm chip with 32 kB instruction and 32 kB data RAMs in a 0. 3-micrometer, four-layer metal CMOS process. It consumes 1. 2 W at 2. 0 V running at 250 MHz. The VLIW processor achieves a speed-up of more than 4 times over a single-issue RISC for MPEG video block decoding. A decoder implemented on the multimedia processor with a small amount of dedicated hardware, such as the Huffman decoder and a DMA controller will decode the worst case 88 video block data in 754 cycles, leading to a real-time MPEG-2 system, video, and audio decoding system.
Mitsuya FUKAZAWA Masanori KURIMOTO Rei AKIYAMA Hidehiro TAKATA Makoto NAGATA
Logical operations in CMOS digital integration are highly prone to fail as the amount of power supply (PS) drop approaches to failure threshold. PS voltage variation is characterized by built-in noise monitors in a 32-bit microprocessor of 90-nm CMOS technology, and related with operation failures by instruction-level programming for logical failure analysis. Combination of voltage drop size and activated logic path determines failure sensitivity and class of failures. Experimental observation as well as simplified simulation is applied for the detailed understanding of the impact of PS noise on logical operations of digital integrated circuits.
Hidehiro TAKATA Rei AKIYAMA Tadao YAMANAKA Haruyuki OHKUMA Yasue SUETSUGU Toshihiro KANAOKA Satoshi KUMAKI Kazuya ISHIHARA Atsuo HANAMI Tetsuya MATSUMURA Tetsuya WATANABE Yoshihide AJIOKA Yoshio MATSUDA Syuhei IWADE
An on-chip, 64-Mb, embedded, DRAM MPEG-2 encoder LSI with a multimedia processor has been developed. To implement this large-scale and high-speed LSI, we have developed the hierarchical skew control of multi-clocks, with timing verification, in which cross-talk noise is considered, and simple measures taken against the IR drop in the power lines through decoupling capacitors. As a result, the target performance of 263 MHz at 1.5 V has been successfully attained and verified, the cross-talk noise has been considered, and, in addition, it has become possible to restrain the IR drop to 166 mV in the 162 MHz operation block.
Fumiyasu ASAI Shinji KOMORI Toshiyuki TAMURA Hisakazu SATO Hidehiro TAKATA Yoshihiro SEGUCHI Takeshi TOKUDA Hiroaki TERADA
This paper details a unique VLSI design scheme which employs self-timed circuits. A 32-bit 50-MFLOPS data-driven microprocessor has been designed using a self-timed clocking scheme. This high performance data-driven microprocessor with sophisticated functions has been designed by a combination of several kinds of self-timed components. All functional blocks in the microprocessor are driven by self-timed clocks. The microprocessor integrates 700,000 devices in a 14.65 mm14.65 mm die area using double polysilicon double metal 0.8 µm CMOS technology.
Hisakazu SATO Yasuhiro NUNOMURA Niichi ITOH Koji NII Kanako YOSHIDA Hironobu ITO Jingo NAKANISHI Hidehiro TAKATA Yasunobu NAKASE Hiroshi MAKINO Akira YAMADA Takahiko ARAKAWA Toru SHIMIZU Yuichi HIRANO Takashi IPPOSHI Shuhei IWADE
A low-power microcontroller has been developed with 0.10 µm bulk compatible body-tied SOI technology. For this work, only two new masks are required. For the other layers, existing masks of a prior work developed with 0.18 µm bulk CMOS technology can be applied without any changes. With the SOI technology, the high-speed operation of over 600 MHz has been achieved at a supply voltage of 1.2 V, which is 1.5 times faster than prior work. Also, a five times improvement in the power-delay product has been achieved at a supply voltage 0.8 V. Moreover, the compatibility of the SOI technology with bulk CMOS has been verified, because all circuit blocks of the chip, including logic, memory, analog circuit, and PLL, are completely functional, even though only two new masks are used.
Jinmyoung KIM Toru NAKURA Hidehiro TAKATA Koichiro ISHIBASHI Makoto IKEDA Kunihiro ASADA
This paper presents an on-chip resonant supply noise canceller utilizing parasitic capacitance of sleep blocks. The test chip was fabricated in a 0.18 µm CMOS process and measurement results show 43.3% and 12.5% supply noise reduction on the abrupt supply voltage switching and the abrupt wake-up of a sleep block, respectively. The proposed method requires 1.5% area overhead for four 100 k-gate blocks, which is 7.1 X noise reduction efficient comparing with the conventional decap for the same power supply noise, while achieves 47% improvement of settling time. These results make fast switching of power mode possible for dynamic voltage scaling and power gating.
Tetsuya MATSUMURA Satoshi KUMAKI Hiroshi SEGAWA Kazuya ISHIHARA Atsuo HANAMI Yoshinori MATSUURA Stefan SCOTZNIOVSKY Hidehiro TAKATA Akira YAMADA Shu MURAYAMA Tetsuro WADA Hideo OHIRA Toshiaki SHIMADA Ken-ichi ASANO Toyohiko YOSHIDA Masahiko YOSHIMOTO Koji TSUCHIHASHI Yasutaka HORIBA
A single-chip MPEG-2 video, audio, and system encoder LSI has been developed. It performs concurrent real-time processing of MPEG-2 422P@ML video encoding, 2-channel Dolby Digital or MPEG-1 audio encoding, and system encoding that generates a multiplexed transport stream (TS) or a program stream (PS). Advanced hybrid architecture, which combines a high performance VLIW media-processor D30V and hardwired video processing circuits, has been adopted to satisfy the demands of both high flexibility and enormous computational capability. A unified control scheme has been newly proposed that hierarchically manages adaptive task priority control over asynchronous video, audio, and system encoding processes in order to achieve real-time concurrent processing using a single D30V. Dual dedicated motion estimation cores consisting of a coarse ME core (CME) for wide range searches and a fine ME core (FME) for precise searches have been integrated to produce high picture quality while using a small amount of hardware. Adopting these features, a single-chip encoder has been fabricated using 0.25-micron 4-layer metal CMOS technology, and integrated into a 14.2 mm 14.2 mm die with 11 million transistors.
Jinmyoung KIM Toru NAKURA Hidehiro TAKATA Koichiro ISHIBASHI Makoto IKEDA Kunihiro ASADA
Switched parasitic capacitors of sleep blocks with a tri-mode power gating structure are implemented to reduce on-chip resonant supply noise in 1.2 V, 65 nm standard CMOS process. The tri-mode power gating structure makes it possible to store charge into the parasitic capacitance of the power gated blocks. The proposed method achieves 53.1% and 57.9% noise reduction for wake-up noise and 130 MHz periodic supply noise, respectively. It also realizes noise cancelling without discharging time before using parasitic capacitors of sleep blocks, and shows 8.4x boost of the effective capacitance value with 2.1% chip area overhead. The proposed method can save the chip area for reducing resonant supply noise more effectively.
Taketora SHIRAISI Koji KAWAMOTO Kazuyuki ISHIKAWA Eiichi TERAOKA Hidehiro TAKATA Takeshi TOKUDA Kouichi NISHIDA
A low power consumption 16-bit fixed point Digital Signal Processor (DSP) has been developed to realize a half-rate CODEC for the Personal Digital Cellular (PDC) system. Dual datapath architecture has been employed to execute multiply-accumulate (MAC) operations with a high degree of efficiency. With this architecture. 86.3% of total MAC operations in the Pitch Synchronous Innovation Code Excited Linear Prediction (PSI-CELP) program are executed in parallel, so that total instruction cycles are reduced by 23.1%. The area overhead for the dual datapath architecture is only 3.0% of the total area. Furthermore, in order to reduce power consumption, circuit design techniques are also extensively applied to RAMs. ROMs, and clock circuits, which consume the great majority of power. By reducing the number of precharging bit lines, a power reduction of 49.8% is achieved in RAMs, and above 40% in ROMs. By applying gated clock to clock lines, a power reduction of 5.0% is achieved in the DSP that performs the PSI-CELP algorithm. The DSP is fabricated in 0.5 µm single-poly, double-metal CMOS technology. The PSI-CELP algorithm for the PDC half-rate CODEC can operate at 22.5 MHz instruction frequency and 1.6 V supply voltage. resulting in a low-power consumption of 28 mW.