Junnosuke HOSHIDO Tonan KAMATA Tsutomu ANSAI Ryuhei UEHARA
Shin-ichi NAKANO
Shang LU Kohei HATANO Shuji KIJIMA Eiji TAKIMOTO
Lin ZHOU Yanxiang CAO Qirui WANG Yunling CHENG Chenghao ZHUANG Yuxi DENG
Zhen WANG Longye WANG
Naohiro TODA Tetsuya NAKAGAMI
Haijun Wang Tao Hu Dongdong Chen Huiwei Yao Runze He Di Wu Zhifu Tian
Jianqiang NI Gaoli WANG Yingxin LI Siwei SUN
Rui CHENG Yun JIANG Qinglin ZHANG Qiaoqiao XIA
Ren TOGO Rintaro YANAGI Masato KAWAI Takahiro OGAWA Miki HASEYAMA
Naoki TATTA Yuki SAKATA Rie JINKI Yuukou HORITA
Kundan LAL DAS Munehisa SEKIKAWA Naohiko INABA
Menglong WU Tianao YAO Zhe XING Jianwen ZHANG Yumeng LIN
Jian ZHANG Zhao GUANG Wanjuan SONG Zhiyan XU
Shinya Matsumoto Daiki Ikemoto Takuya Abe Kan Okubo Kiyoshi Nishikawa
Kazuki HARADA Yuta MARUYAMA Tomonori TASHIRO Gosuke OHASHI
Zezhong WANG Masayuki SHIMODA Atsushi TAKAHASHI
Pierpaolo AGAMENNONE
Jianmao XIAO Jianyu ZOU Yuanlong CAO Yong ZHOU Ziwei YE Xun SHAO
Kazumasa ARIMURA Ryoichi MIYAUCHI Koichi TANNO
Shinichi NISHIZAWA Shinji KIMURA
Zhe LIU Wu GUAN Ziqin YAN Liping LIANG
Shuichi OHNO Shenjian WANG Kiyotsugu TAKABA
Yindong CHEN Wandong CHEN Dancheng HUANG
Xiaohe HE Zongwang LI Wei HUANG Junyan XIANG Chengxi ZHANG Zhuochen XIE Xuwen LIANG
Conggai LI Feng LIU Yingying LI Yanli XU
Siwei Yang Tingli Li Tao Hu Wenzhi Zhao
Takahiro FUJITA Kazuyuki WADA
Kazuma TAKA Tatsuya ISHIKAWA Kosei SAKAMOTO Takanori ISOBE
Quang-Thang DUONG Kohei MATSUKAWA Quoc-Trinh VO Minoru OKADA
Sihua LIU Xiaodong ZHU Kai KANG Li WAN Yong WANG
Kazuya YAMAMOTO Nobukazu TAKAI
Yasuhiro Sugimoto Nobukazu Takai
Ho-Lim CHOI
Weibang DAI Xiaogang CHEN Houpeng CHEN Sannian SONG Yichen SONG Shunfen LI Tao HONG Zhitang SONG
Duo Zhang Shishan Qi
Young Ghyu Sun Soo Hyun Kim Dong In Kim Jin Young Kim
Hongbin ZHANG Ao ZHAN Jing HAN Chengyu WU Zhengqiang WANG
Yuli YANG Jianxin SONG Dan YU Xiaoyan HAO Yongle CHEN
Kazuki IWAHANA Naoto YANAI Atsuo INOMATA Toru FUJIWARA
Rikuto KURAHARA Kosei SAKAMOTO Takanori ISOBE
Elham AMIRI Mojtaba JOODAKI
Qingqi ZHANG Xiaoan BAO Ren WU Mitsuru NAKATA Qi-Wei GE
Jiaqi Wang Aijun Liu Changjun Yu
Ruo-Fei Wang Jia Zhang Jun-Feng Liu Jing-Wei Tang
Yingnan QI Chuhong TANG Haiyang LIU Lianrong MA
Yi XIONG Senanayake THILAK Daisuke ARAI Jun IMAOKA Masayoshi YAMAMOTO
Zhenhai TAN Yun YANG Xiaoman WANG Fayez ALQAHTANI
Chenrui CHANG Tongwei LU Feng YAO
Takuma TSUCHIDA Rikuho MIYATA Hironori WASHIZAKI Kensuke SUMOTO Nobukazu YOSHIOKA Yoshiaki FUKAZAWA
Shoichi HIROSE Kazuhiko MINEMATSU
Toshimitsu USHIO
Yuta FUKUDA Kota YOSHIDA Takeshi FUJINO
Qingping YU Yuan SUN You ZHANG Longye WANG Xingwang LI
Qiuyu XU Kanghui ZHAO Tao LU Zhongyuan WANG Ruimin HU
Lei Zhang Xi-Lin Guo Guang Han Di-Hui Zeng
Meng HUANG Honglei WEI
Yang LIU Jialong WEI Shujian ZHAO Wenhua XIE Niankuan CHEN Jie LI Xin CHEN Kaixuan YANG Yongwei LI Zhen ZHAO
Ngoc-Son DUONG Lan-Nhi VU THI Sinh-Cong LAM Phuong-Dung CHU THI Thai-Mai DINH THI
Lan XIE Qiang WANG Yongqiang JI Yu GU Gaozheng XU Zheng ZHU Yuxing WANG Yuwei LI
Jihui LIU Hui ZHANG Wei SU Rong LUO
Shota NAKAYAMA Koichi KOBAYASHI Yuh YAMASHITA
Wataru NAKAMURA Kenta TAKAHASHI
Chunfeng FU Renjie JIN Longjiang QU Zijian ZHOU
Masaki KOBAYASHI
Shinichi NISHIZAWA Masahiro MATSUDA Shinji KIMURA
Keisuke FUKADA Tatsuhiko SHIRAI Nozomu TOGAWA
Yuta NAGAHAMA Tetsuya MANABE
Baoxian Wang Ze Gao Hongbin Xu Shoupeng Qin Zhao Tan Xuchao Shi
Maki TSUKAHARA Yusaku HARADA Haruka HIRATA Daiki MIYAHARA Yang LI Yuko HARA-AZUMI Kazuo SAKIYAMA
Guijie LIN Jianxiao XIE Zejun ZHANG
Hiroki FURUE Yasuhiko IKEMATSU
Longye WANG Lingguo KONG Xiaoli ZENG Qingping YU
Ayaka FUJITA Mashiho MUKAIDA Tadahiro AZETSU Noriaki SUETAKE
Xingan SHA Masao YANAGISAWA Youhua SHI
Jiqian XU Lijin FANG Qiankun ZHAO Yingcai WAN Yue GAO Huaizhen WANG
Sei TAKANO Mitsuji MUNEYASU Soh YOSHIDA Akira ASANO Nanae DEWAKE Nobuo YOSHINARI Keiichi UCHIDA
Kohei DOI Takeshi SUGAWARA
Yuta FUKUDA Kota YOSHIDA Takeshi FUJINO
Mingjie LIU Chunyang WANG Jian GONG Ming TAN Changlin ZHOU
Hironori UCHIKAWA Manabu HAGIWARA
Atsuko MIYAJI Tatsuhiro YAMATSUKI Tomoka TAKAHASHI Ping-Lun WANG Tomoaki MIMOTO
Kazuya TANIGUCHI Satoshi TAYU Atsushi TAKAHASHI Mathieu MOLONGO Makoto MINAMI Katsuya NISHIOKA
Masayuki SHIMODA Atsushi TAKAHASHI
Yuya Ichikawa Naoko Misawa Chihiro Matsui Ken Takeuchi
Katsutoshi OTSUKA Kazuhito ITO
Rei UEDA Tsunato NAKAI Kota YOSHIDA Takeshi FUJINO
Motonari OHTSUKA Takahiro ISHIMARU Yuta TSUKIE Shingo KUKITA Kohtaro WATANABE
Iori KODAMA Tetsuya KOJIMA
Yusuke MATSUOKA
Yosuke SUGIURA Ryota NOGUCHI Tetsuya SHIMAMURA
Tadashi WADAYAMA Ayano NAKAI-KASAI
Li Cheng Huaixing Wang
Beining ZHANG Xile ZHANG Qin WANG Guan GUI Lin SHAN
Soh YOSHIDA Nozomi YATOH Mitsuji MUNEYASU
Ryo YOSHIDA Soh YOSHIDA Mitsuji MUNEYASU
Nichika YUGE Hiroyuki ISHIHARA Morikazu NAKAMURA Takayuki NAKACHI
Ling ZHU Takayuki NAKACHI Bai ZHANG Yitu WANG
Toshiyuki MIYAMOTO Hiroki AKAMATSU
Yanchao LIU Xina CHENG Takeshi IKENAGA
Kengo HASHIMOTO Ken-ichi IWATA
Hiroshi FUJISAKI
Tota SUKO Manabu KOBAYASHI
Akira KAMATSUKA Koki KAZAMA Takahiro YOSHIDA
Manabu HAGIWARA
Shinya HONDA Takayuki WAKABAYASHI Hiroyuki TOMIYAMA Hiroaki TAKADA
With the growing design complexity of contemporary embedded systems, real-time operating systems (RTOSs) have become one of the important components of such systems. This paper presents an RTOS-centric hardware/software cosimulator which we have developed for embedded system design. One of the most remarkable features of our cosimulator is that it has a complete simulation model of an RTOS which is widely used in industry, so that application tasks including RTOS service calls are natively executed on a host computer. Our cosimulator also features cosimulation with functional simulation models of hardware written in C/C++ and cosimulation with HDL simulators. A case study with a JPEG decoder application demonstrates the effectiveness of our cosimulator.
Kazunori SHIMIZU Jumpei UCHIDA Yuichiro MIYAOKA Nozomu TOGAWA Masao YANAGISAWA Tatsuo OHTSUKI
In this paper, we propose a reconfigurable adaptive FEC system. In adaptive FEC schemes, the error correction capability t is changed dynamically according to the communication channel condition. If a particular error correction capability t is given, we can implement an FEC decoder which is optimal for t by taking the number of operations into consideration. Thus, reconfiguring the optimal FEC decoder dynamically for each error correction capability allows us to maximize the throughput of each decoder within a limited hardware resource. Based on this concept, our reconfigurable adaptive FEC system can reduce the packet dropping rate more efficiently than conventional fixed hardware systems. We can improve data transmission throughput for a reliable transport protocol. Practical simulation results are also shown.
This paper presents a VLSI design methodology for the MAC-level DWT/IDWT processor based on a novel limited-resource scheduling algorithm. The r-split Fully-specified Signal Flow Graph (FSFG) of limited-resource FIR filtering has been developed for the scheduling of the MAC-level DWT/IDWT signal processing. Given a set of architecture constraints and DWT parameters, the scheduling algorithm can generate four scheduling matrices that drive the data path to perform the DWT computation. Because the inter-octave memory is considered together with the registers of the FIR filter, the memory size is smaller than in the traditional architecture. In addition, based on the limited-resource scheduling algorithm, an automated DWT processor synthesizer has been developed that generates constrained DWT processors in the form of silicon intellectual property (SIP). The DWT SIP can be embedded into an SOC or mapped to program codes for commercial off-the-shelf (COTS) DSP processors with programmable devices. As a result, it has been successfully proven that a variety of DWT SIPs can be efficiently realized by tuning the parameters and applied to signal processing applications.
Michiaki MURAOKA Hiroaki NISHI Rafael K. MORIZAWA Hideaki YOKOTA Yoichi ONISHI
In this paper, we propose a synthesis methodology for SoC (System-on-Chip) architectures from the system-level specification, based on reusable high-level IPs called Virtual Cores (VCores). This synthesis methodology generates an initial architecture that consists of a CPU, buses, IPs, peripherals, I/Os and an RTOS (Real Time Operating System), and makes hardware/software tradeoffs on the architecture through the assignment of software VCores and hardware VCores. The results of an architecture-level design experiment using the proposed methodology show that the partial automation of the architecture synthesis process, combined with design reuse, accelerates architecture design and thereby reduces the time required to design an SoC architecture.
Fumio ARAKAWA Motokazu OZAWA Osamu NISHII Toshihiro HATTORI Takeshi YOSHINAGA Tomoichi HAYASHI Yoshikazu KIYOSHIGE Takashi OKADA Masakazu NISHIBORI Tomoyuki KODAMA Tatsuya KAMEI Makoto ISHIKAWA
A SuperH™ embedded processor core implemented in a 130-nm CMOS process running at 400 MHz achieved 720 MIPS and 2.8 GFLOPS at a power of 250 mW in worst-case conditions. It has a dual-issue seven-stage pipeline architecture but maintains the 1.8 MIPS/MHz of the previous five-stage processor. The processor meets the requirements of a wide range of applications, and is suitable for digital appliances aimed at the consumer market, such as cellular phones, digital still/video cameras, and car navigation systems.
Jumpei UCHIDA Nozomu TOGAWA Masao YANAGISAWA Tatsuo OHTSUKI
This paper proposes a thread partitioning algorithm in low power high-level synthesis. The algorithm is applied to high-level synthesis systems. In the systems, we can describe parallel behaving circuit blocks (threads) explicitly. First it focuses on a local register file RF in a thread. It partitions a thread into two sub-threads, one of which has RF and the other does not have RF. The partitioned sub-threads need to be synchronized with each other to keep the data dependency of the original thread. Since the partitioned sub-threads have waiting time for synchronization, gated clocks can be applied to each sub-thread. Then we can synthesize a low power circuit with a low area overhead, compared to the original circuit. Experimental results demonstrate effectiveness and efficiency of the algorithm.
Luca FANUCCI Riccardo LOCATELLI Andrea MINGHI
This paper presents the definition and implementation of a low power data bus encoding scheme dedicated to system-on-chip video architectures. Trends in CMOS technologies focus attention on the energy consumption of on-chip global communication; this is especially true for data-dominated applications such as video processing. Taking scaling effects into account, a novel coupling-aware bus power model is used to investigate the statistical properties of video data collected on the system bus of a reference hardware/software H.263/MPEG-4 video coder architecture. The results of this analysis and the low complexity requirements drive the definition of a bus encoding scheme called CDSPBI (Coupling Driven Separated Partial Bus Invert), optimized ad hoc for video data. A VLSI implementation of the coding circuits completes the work with an area/delay/power characterization that shows the effectiveness of the proposed scheme in terms of global power saving for a small circuit area overhead.
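The abstract does not give the details of CDSPBI, so the following is only a minimal sketch of classic bus-invert coding, the general idea that partial bus-invert schemes build on. It is not the authors' scheme and it ignores coupling between adjacent wires. The function names, the all-zero initial bus state and the 8-bit example are assumptions made for illustration.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, width):
    """Classic bus-invert coding (NOT CDSPBI).

    Returns a list of (bus_value, invert_flag) pairs. A word is sent
    inverted whenever that toggles fewer than half of the bus wires.
    """
    mask = (1 << width) - 1
    prev = 0  # assumed initial bus state
    encoded = []
    for w in words:
        if 2 * hamming(prev, w) > width:
            bus, inv = ~w & mask, 1  # send the complement, raise the flag
        else:
            bus, inv = w, 0
        encoded.append((bus, inv))
        prev = bus
    return encoded


def bus_invert_decode(encoded, width):
    """Recover the original words from (bus_value, invert_flag) pairs."""
    mask = (1 << width) - 1
    return [bus ^ mask if inv else bus for bus, inv in encoded]


def transitions(values, start=0):
    """Total number of wire toggles when 'values' are sent in sequence."""
    total, prev = 0, start
    for v in values:
        total += hamming(prev, v)
        prev = v
    return total


words = [0x00, 0xFF, 0x0F, 0xF0]
enc = bus_invert_encode(words, 8)
raw = transitions(words)
coded = transitions([b for b, _ in enc]) + transitions([i for _, i in enc])
print(raw, coded)
```

The toggle count of the encoded stream includes the extra invert line, which is the overhead any bus-invert variant has to pay for.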
Kyung Tae DO Yang Hyo KIM Young Hwan KIM Jung Yun CHOI
We present a new approach to the power modeling of synthesizable soft macros, which uses the characteristics of individual input signals for high accuracy. We also present the parameterized power model, developed using the proposed approach, which can relieve us from the power characterization for all possible macro sizes. Extensive experiments illustrate that the proposed approaches exhibit the overall modeling errors below 4.24% and 4.71% for benchmark macros before and after parameterization, when compared with the results of gate-level analysis.
This paper presents a multiple-voltage high-level synthesis approach for low power DSP applications using algorithmic transformation techniques. Our approach is motivated by the maximization of task mobilities, since increasing mobilities raises the possibility of assigning tasks to low-voltage components. The mobility of a task is the freedom available in scheduling its starting time. It is defined as the distance between its as-late-as-possible (ALAP) schedule time and its as-soon-as-possible (ASAP) schedule time. To gain task mobilities, we use loop shrinking, retiming and unfolding techniques. Loop shrinking first reduces the iteration period bound (IPB), and then the other techniques are employed to shorten the iteration period (IP) as much as possible. Minimizing the IP results in high task mobilities. Finally, we assign tasks with high mobilities to low-voltage components and thus minimize energy under resource and latency constraints. Taking the overhead of level conversion into account, our approach can achieve significant power reduction. In the case of the third-order IIR filter, the proposed approach can save up to 40.2% of power consumption.
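The mobility definition above (ALAP time minus ASAP time) can be illustrated with a small sketch. The task graph, the delays and the latency bound below are invented for illustration, and the function name is hypothetical; loop shrinking, retiming and unfolding are not modeled.

```python
from collections import defaultdict


def asap_alap_mobility(delays, edges, latency):
    """Return {task: (asap, alap, mobility)} for an acyclic task graph.

    delays  : {task: execution time}
    edges   : [(u, v)] meaning v may start only after u has finished
    latency : time by which every task must have finished
    """
    succs, preds = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)

    # Topological order (Kahn's algorithm).
    indeg = {t: len(preds[t]) for t in delays}
    ready = [t for t in delays if indeg[t] == 0]
    order = []
    while ready:
        t = ready.pop()
        order.append(t)
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)

    # ASAP: start as soon as all predecessors have finished.
    asap = {}
    for t in order:
        asap[t] = max((asap[p] + delays[p] for p in preds[t]), default=0)

    # ALAP: start as late as the successors and the latency bound allow.
    alap = {}
    for t in reversed(order):
        alap[t] = min((alap[s] for s in succs[t]), default=latency) - delays[t]

    return {t: (asap[t], alap[t], alap[t] - asap[t]) for t in delays}


# Hypothetical example: two multiplications feeding two additions.
delays = {"m1": 2, "m2": 2, "a1": 1, "a2": 1}
edges = [("m1", "a1"), ("m2", "a2"), ("a1", "a2")]
for task, (s, l, mob) in asap_alap_mobility(delays, edges, latency=4).items():
    print(task, "ASAP", s, "ALAP", l, "mobility", mob)
```

In this example only `m2` has nonzero mobility, so under the abstract's reasoning it would be the natural candidate for a slower, low-voltage component.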
In this paper, we investigate a low-power architecture for designs modeled as an Extended Finite State Machine (EFSM). It is based on the general dynamic power management concept, in which redundant computation can be dynamically disabled to reduce the overall power dissipation. The contribution of this paper is mainly a systematic procedure to identify an almost maximal amount of redundant computation in a design given as an EFSM. There are two levels of redundant computation to be exploited--one is based on the machine state information, while the other is based on the transition information. After the extraction of the redundant computation, a low-power architecture using input gating is proposed to synthesize the final circuit. We tested the technique on a design computing the modular inverse of a number. Experimental results show that a 31% power reduction can be achieved at the cost of a 2% timing penalty and a 16% area overhead.
Kimiyoshi USAMI Hiroshi YOSHIOKA
Leakage power is predicted to become dominant in the total operation power as transistor technology advances. Even in the current technology, the dramatic increase of leakage power at elevated temperature is a big problem. Burn-in testing, which is typically performed at 125
Katsunori TANAKA Yahiko KAMBAYASHI
The Transduction Method is a powerful way to design logic circuits, utilizing already existing circuits. A set of permissible functions (SPF) plays an essential role in such circuit transformation/reduction, and is computed at each point (connection or gate output). Currently, two types of SPFs have been used: the maximum SPFs (MSPFs) and compatible SPFs (CSPFs). At each point, the MSPF is literally the set of all PF's, and CSPF is a subset of the MSPF. When CSPFs are calculated, priorities are first assigned to all gates in the circuit. Based on the priorities, it is decided which subset is to be selected as the CSPF. The quality of the results depends on the priorities. In this paper, the concept of super-sets of permissible functions (SSPFs) is introduced to reduce the effect of the priorities that CSPFs depend on. In order to loosen the dependency, each SSPF is computed to contain CSPFs which are candidates to be selected. The experimental results show that the SSPF-based Transduction Method has intermediate reduction capability and takes an intermediate computation time between the MSPF-based and CSPF-based ones. The capability and the time are considered as an acceptably good trade-off. In addition, without any transformations, since SSPFs are the maximum super-set, SSPFs are applicable for analyzing the maximum performance of the CSPF-based transformation, for comparison with the MSPF-based one. Theoretically, the number of connectable gate pairs detected by the MSPFs is 100%. According to the experimental results obtained using SSPFs, on average, 99% are detectable by SSPFs and 1% are detectable only by using the MSPFs. The results show that by using CSPFs, 72% of connectable gate pairs are detectable with any priority assignment and 99% (SSPFs capability) are detectable on average even when the best priorities are assigned. According to the experimental results of CSPF calculation with five priorities, 82% to 93% are practically detectable on average. 
This is the first quantitative analysis realized by SSPFs which compares the CSPF-based and MSPF-based Transduction Methods with respect to the coverage of PF's.
Debatosh DEBNATH Tsutomu SASAO
Checking the equivalence of two Boolean functions under permutation of the variables is an important problem in the synthesis of multiplexer-based field-programmable gate arrays (FPGAs), and the problem is known as Boolean matching. This paper presents an efficient breadth-first search technique for computing a canonical form--namely P-representative--of Boolean functions under permutation of the variables. Two functions match if they have the same P-representative. On an ordinary workstation, the method requires several microseconds on average to check the Boolean matching of functions with up to eight variables against a library with tens of thousands of cells.
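To make the matching criterion concrete, here is a brute-force sketch that computes a canonical form under variable permutation by trying all n! permutations. This is not the breadth-first search technique of the paper. Taking the smallest truth-table integer as the representative is an assumption of this sketch, as are the helper names and the bit ordering of the truth table.

```python
from itertools import permutations


def truth_table(func, n):
    """Encode an n-input Boolean function as an integer.

    Bit x of the result is func(x_0, ..., x_{n-1}), where x_j is bit j of x.
    """
    tt = 0
    for x in range(1 << n):
        if func(*[(x >> j) & 1 for j in range(n)]):
            tt |= 1 << x
    return tt


def permute_truth_table(tt, n, perm):
    """Truth table of g(x) = f(y) with y_j = x_{perm[j]}."""
    out = 0
    for x in range(1 << n):
        y = 0
        for j in range(n):
            if (x >> perm[j]) & 1:
                y |= 1 << j
        if (tt >> y) & 1:
            out |= 1 << x
    return out


def p_representative(tt, n):
    """Smallest truth table among all variable permutations (brute force)."""
    return min(permute_truth_table(tt, n, p) for p in permutations(range(n)))


f = truth_table(lambda a, b, c: (a & b) | c, 3)
g = truth_table(lambda a, b, c: (b & c) | a, 3)  # f with its inputs renamed
h = truth_table(lambda a, b, c: a & b & c, 3)
print(p_representative(f, 3) == p_representative(g, 3))
print(p_representative(f, 3) == p_representative(h, 3))
```

The exhaustive loop grows as n!·2^n, which is exactly the cost a smarter search such as the one described in the abstract is meant to avoid.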
Hui QIN Tsutomu SASAO Munehiro MATSUURA Shinobu NAGAYAMA Kazuyuki NAKAMURA Yukihiro IGUCHI
A look-up table (LUT) cascade is a new type of programmable logic device (PLD) that provides an alternative way to realize multiple-output functions. An LUT ring is an emulator for an LUT cascade. Compared with an LUT cascade, the LUT ring is more flexible. In this paper we discuss the realization of multiple-output functions with the LUT ring. Unlike an FPGA realization of a logic function, accurate prediction of the delay time is easy in an LUT ring realization. A prototype of an LUT ring has been custom-designed with 0.35 µm CMOS technology. Simulation results show that the LUT ring is 80 to 241 times faster than software programs on an SH-1, and 36 to 93 times faster than software programs on a Pentium III when the frequencies for the LUT ring and the MPUs are the same, but is slightly slower than commercial FPGAs.
Ko YOSHIKAWA Keisuke KANAMARU Yasuhiko HAGIHARA Shigeto INUI Yuichi NAKAMURA Takeshi YOSHIMURA
Latch-based circuits have advantages for timing and are widely used for high-speed custom circuits. ASIC design flows, however, are based on circuits with flip-flops. This paper describes a new timing optimization algorithm by replacing the flip-flops in high-end ASICs by latches without changing the functionality of the circuits. Timing is optimized by using a fixed-phase retiming minimizing the impact of clock skew and jitter. A formal equivalence verification method that assures the logical correctness of the latch-replaced circuits is also proposed. Experimental results show that the optimization algorithm decreases the delay of benchmark circuits by as much as 17%.
Signal integrity has become one of the main issues in digital circuits manufactured with today's deep submicron technology. The coupling capacitance of neighboring lines may cause circuit delays and may affect the functionality of the circuit. These effects are usually referred to as crosstalk. Since fixing crosstalk noise requires additional design cost, false aggressor nodes that cannot affect the victim node have to be eliminated. In this paper, we propose an efficient heuristic algorithm that considers functional correlation for false aggressor pruning in crosstalk noise analysis. The false aggressors are detected by a path sensitization algorithm and logic implication. The efficiency of our algorithm has been verified on benchmark circuits with a 0.18 µm standard cell library. Experimental results show an average of 5.4% false aggressor detection and an average improvement of 14.6% in the accuracy of timing analysis.
Suk-Jin KIM Jeong-Gun LEE Kiseon KIM
Inter-domain communications on a chip require a synchronizer to resolve the timing problems between an input and the clock of a destination. This paper presents a parallel flop synchronizer and its interface circuit for transferring asynchronous data to the clock domain. The proposed scheme uses a bank of independent two-flops in parallel and supports a two-phase handshake protocol. Compared to the conventional two-flop synchronizer, performance analysis shows that the proposed scheme can reduce latency by up to one and a half clock cycles while keeping its safety at a tolerable level. All designs have been implemented in a 0.25 µm CMOS technology to verify the performance analysis of the proposed synchronization.
Makoto SUGIHARA Kazuaki MURAKAMI Yusuke MATSUNAGA
In this paper, a test architecture optimization for system-on-a-chip under floorplanning constraints is proposed. The models of previous test architecture optimizations were too ideal to be applied to industrial SOCs. To make matters worse, they couldn't treat topological locality of cores, that is, floorplanning constraints. The optimization proposed in this paper can avoid long wires for TAMs in consideration of floorplanning constraints and finish optimizing test architectures within reasonable computation time.
A large memory is typically designed with multiple identical memory blocks for reducing delay and power. The circuit verification of individual memory blocks can be effectively handled by the Symbolic Trajectory Evaluation (STE) approach. However, if multiple memory blocks are integrated into a single system, the STE approach cannot verify it economically. This paper introduces algorithms for verifying block-level connectivity of memories. The verification time of a large memory can be reduced drastically by using a bottom-up verification scheme. That is, a memory block is first verified thoroughly, and then only the interconnection between memory blocks of the large memory needs to be verified. The proposed verification algorithms require (3n+2(
Youhua SHI Shinji KIMURA Masao YANAGISAWA Tatsuo OHTSUKI
In this paper, we present a test data compression technique to reduce test data volume for multiscan-based designs. In our method, the internal scan chains are divided into equal-sized groups and two dictionaries are built to encode either an entire slice or a subset of the slice. Depending on the codeword, the decompressor may load all scan chains or only one group of the scan chains, which enhances the effectiveness of dictionary-based compression. In contrast to previous dictionary coding techniques, even for a CUT with a large number of scan chains, the proposed approach can achieve a satisfactory reduction in test data volume with a reasonably small dictionary. Experimental results show that the proposed test scheme works particularly well for the large ISCAS'89 benchmarks.
Tsuyoshi IWAGAKI Satoshi OHTAKE Hideo FUJIWARA
This paper presents a non-scan design scheme to enhance delay fault testability of controllers. In this scheme, we utilize a given state transition graph (STG) to test delay faults in its synthesized controller. The original behavior of the STG is used during test application. For faults that cannot be detected by using the original behavior, we design an extra logic, called an invalid test state and transition generator, to make those faults detectable. Our scheme allows achieving short test application time and at-speed testing. We show the effectiveness of our method by experiments.
Youhua SHI Shinji KIMURA Masao YANAGISAWA Tatsuo OHTSUKI
Test data volume and power consumption for scan-based designs are two major concerns in system-on-a-chip testing. However, test set compaction by filling the don't-cares invariably increases the scan-in power dissipation for scan testing, so the goals of test data reduction and low-power scan testing appear to conflict. Therefore, in this paper we present a selective scan chain reconfiguration method for test data compression and scan-in power reduction. The proposed method analyzes the compatibility of the internal scan cells for a given test set and then divides the scan cells into compatible classes. After the scan chain reconfiguration, a dictionary is built to indicate the run-length of each compatible class, and only the scan-in data for each class needs to be transferred from the ATE to the CUT, which reduces test data volume. Experimental results for the larger ISCAS'89 benchmarks show that the proposed approach overcomes the limitations of traditional run-length coding techniques, and leads to highly reduced test data volume with significant power savings during scan testing in all cases.
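The grouping of scan cells into compatible classes can be sketched as follows. The abstract does not state how the classes are formed, so the greedy first-fit strategy, the string representation of test cubes (`0`, `1`, `X` for don't-care) and all names below are assumptions made for illustration. The run-length dictionary and the chain reconfiguration itself are not modeled.

```python
def merge_columns(col_a, col_b):
    """Merge the care bits of two scan-cell columns.

    Returns the merged column, or None if the two cells require opposite
    values in at least one test cube (i.e. they are incompatible).
    """
    merged = []
    for a, b in zip(col_a, col_b):
        if a == "X":
            merged.append(b)
        elif b == "X" or a == b:
            merged.append(a)
        else:
            return None
    return "".join(merged)


def compatible_classes(test_cubes):
    """Greedy first-fit grouping of scan cells into compatible classes.

    test_cubes: list of equal-length strings over {'0', '1', 'X'};
    character i of every cube is the value required in scan cell i.
    Returns a list of (shared_column, [cell indices]) pairs.
    """
    n_cells = len(test_cubes[0])
    columns = ["".join(cube[i] for cube in test_cubes) for i in range(n_cells)]
    classes = []  # each entry: [merged column, list of cell indices]
    for cell, col in enumerate(columns):
        for entry in classes:
            merged = merge_columns(entry[0], col)
            if merged is not None:
                entry[0] = merged
                entry[1].append(cell)
                break
        else:
            classes.append([col, [cell]])
    return [(col, cells) for col, cells in classes]


cubes = ["0X10", "1X0X", "X110"]
for shared, cells in compatible_classes(cubes):
    print(cells, shared)
```

All cells in one class can receive the same scan-in value in every test cube, which is why only one data stream per class has to come from the tester.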
Soo-Hyun KIM Ho-Yong CHOI Kiseon KIM Dong-Ik LEE
In this paper, the use of undefined states on a State Transition Graph (STG) is addressed to obtain high fault coverage in the area of Synthesis For Testability (SFT) of synchronous sequential circuits. Basically, a given STG can be modified by adding undefined states and distinguishable transitions so that as many states as possible are included in one strongly-connected component. Such modification decreases the number of redundant faults caused by the existence of unreachable states on an STG. For the modification, we propose two algorithms, one for incompletely-specified STGs and one for completely-specified STGs. In the case of incompletely-specified STGs, undefined states are added using unspecified transitions of defined states. In the case of completely-specified STGs, undefined states are added by changing transitions specified on an STG while preserving state equivalence. Experimental results with MCNC benchmarks show that the number of redundant faults of gate-level circuits synthesized from our modified STGs is reduced, resulting in high fault coverage as well as short test generation time.
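The notion of unreachable states that motivates the modification can be shown with a short reachability check. This is only an illustration of the problem, not either of the proposed algorithms. The integer state encoding, the `reset_state` parameter and the three-state example are assumptions.

```python
from collections import deque


def unreachable_states(transitions, reset_state, n_state_bits):
    """List the state codes that cannot be reached from reset_state.

    transitions  : {state code: iterable of next-state codes}
    n_state_bits : number of flip-flops, so 2**n_state_bits codes exist
    """
    seen = {reset_state}
    queue = deque([reset_state])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(set(range(1 << n_state_bits)) - seen)


# Three defined states encoded in two flip-flops: code 3 is undefined.
stg = {0: [1], 1: [2], 2: [0]}
print(unreachable_states(stg, reset_state=0, n_state_bits=2))

# Hypothetical modification: route a transition through code 3.
stg_modified = {0: [1], 1: [2], 2: [0, 3], 3: [0]}
print(unreachable_states(stg_modified, reset_state=0, n_state_bits=2))
```

Once code 3 lies on a cycle through the defined states, logic that is exercised only in that state becomes testable instead of producing redundant faults.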
Ning FU Shigetoshi NAKATAKE Yasuhiro TAKASHIMA Yoji KAJITANI
The success of top-down design of recent huge system LSIs lies in a seamless transfer of the information resulting from the high-level design to the lower level of floorplanning. For this purpose, we introduce a new concept, the abstract floorplan, which is included in the output of high-level design. From the abstract floorplan, the pillar blocks are derived, which are critical sets of blocks that are expected to determine the width and height of the chip, named the frame. Since the frame and pillar blocks are obtained in the high-level stage, they are useful for keeping consistency in the low-level physical design if we apply optimization regarding them as constraints. Experiments on MCNC benchmarks showed that abstract floorplanning by pillar blocks outputs a placement faithful to the physically optimized block placement with respect to the chip area and the wire-length.
Hua-An ZHAO Chen LIU Yoji KAJITANI Keishi SAKANUSHI
A floorplan specifies the layout of modules in very large scale integration (VLSI) design, and a new code, called the EQ-sequence, for representing a floorplan is presented in this paper. The EQ-sequence is based on a Q-sequence. The EQ-sequence can preserve the adjacent relationships of rooms on a floorplan, but the Q-sequence cannot. The algorithms for encoding, moving and decoding of an EQ-sequence are introduced. With the EQ-sequence, we can check whether two modules abut each other on a floorplan. It has been proved that any floorplan of n rooms is uniquely encoded by an EQ-sequence and any EQ-sequence is uniquely decoded to a floorplan, both in O(n) time.
Kazuki FUKUOKA Masaaki IIJIMA Kenji HAMADA Masahiro NUMA Akira TADA
This paper presents a novel layout approach using a dual supply voltage technique. In the Placing and Routing (P&R) phase, conventional approaches for dual supply voltages need to separate low supply voltage cells from high voltage ones. Consequently, the layout tends to be complex compared with a single supply voltage layout. Our layout approach uses cells having two supply voltage rails. Making these cells is difficult in bulk due to the increase in area caused by n-well isolation or in delay caused by the negative body bias that results from sharing an n-well. On the other hand, making cells with two supply voltage rails is easy in body-tied PD-SOI owing to trench isolation of each transistor body. Since our approach for dual supply voltages offers as much freedom for placement as conventional ones for single supply voltage, existing P&R tools can be used without special operation. Simulation results with MCNC circuits and adders show that our approach reduces power by 23% and 25%, respectively, with almost the same delay as a single supply voltage layout.
Masanori HASHIMOTO Hidetoshi ONODERA
This paper proposes a post-layout transistor sizing method for crosstalk noise reduction. The proposed method downsizes the drivers of aggressor wires for noise reduction, utilizing the precise interconnect information extracted from the detail-routed layouts. We develop a transistor sizing algorithm for crosstalk noise reduction under delay constraints, and construct a crosstalk noise optimization method utilizing an analytic crosstalk noise model and a transistor sizing framework that have been developed. Our method exploits the transistor sizing framework that can vary transistor widths inside cells with interconnects unchanged. Our optimization method therefore never causes a new crosstalk noise problem, and does not need iterative layout optimization. The effectiveness of the proposed method is experimentally examined using 2 circuits. The maximum noise voltage is reduced by more than 50% without delay violation. These results show that the risk of crosstalk noise problems can be considerably reduced after detail-routing.
Keiji KIDA Xiaoke ZHU Changwen ZHUANG Yasuhiro TAKASHIMA Shigetoshi NAKATAKE
This paper presents a novel algorithm for crosspoint assignment (CPA) that takes into consideration crosstalk noise and shielding effects in deep sub-micron design. We introduce a conditional constraint which is imposed on a sensitive net-pair to detach one net from the other or to put another insensitive net between them for shielding. We provide two algorithms which can handle the conditional constraint. One is based on an ILP and outputs an exact optimum solution. The other is a fast heuristic whose time complexity is O(n^2 log n), where n is the number of pins. In experiments, we tested these algorithms on industrial examples. The results showed that the conditional constraint for shielding released the algorithms from a tight space of feasible assignments. Our heuristic ran quickly and attained near-optimum solutions.
Weikun GUO Sheldon X.-D. TAN Zuying LUO Xianlong HONG
This paper proposes a new simulation algorithm for analyzing large power distribution networks, modeled as linear RLC circuits, based on a novel partial random walk concept. The random walk simulation method has been shown to be an efficient way to solve for the voltages of a small number of nodes in a large power distribution network, but the algorithm becomes expensive when the voltages of more than a few nodes must be solved with high accuracy. In this paper, we combine direct methods like LU factorization with the random walk concept to solve power distribution networks when voltage waveforms from a large number of nodes are required. We extend the random walk algorithm to deal with general RLC networks and show that Norton companion models for capacitors and self-inductors are more amenable to transient analysis by random walks than Thevenin companion models. We also show that by using a nodal analysis (NA) formulation for all the voltage sources, LU-based direct simulations of subcircuits can be sped up. Experimental results demonstrate that the resulting algorithm, called partial random walk (PRW), has significant advantages over the existing random walk method, especially when the VDD/GND nodes are sparse and the accuracy requirement is high.
Jingjing FU Zuying LUO Xianlong HONG Yici CAI Sheldon X.-D. TAN Zhu PAN
In this paper, we present an efficient method to budget on-chip decoupling capacitors (decaps) to optimize power delivery networks in an area efficient way. Our algorithm is based on an efficient gradient-based non-linear programming method for searching the solution. Our contributions are an efficient gradient computation method (time-domain merged adjoint network method) and a novel equivalent circuit modeling technique to speed up the optimization process. Experimental results demonstrate that the algorithm is capable of efficiently optimizing very large scale P/G networks.
Herng-Jer LEE Ming-Hong LAI Chia-Chi CHU Wu-Shiung FENG
A new moment computation technique for general lumped R(L)C interconnect circuits with multiple resistor loops is proposed. Using the concept of tearing, a lumped R(L)C network can be partitioned into a spanning tree and several resistor links. The contributions of network moments from each tree and the corresponding links can be determined independently. By combining the conventional moment computation algorithms and the reduced ordered binary decision diagram (ROBDD), the proposed method can compute system moments efficiently. Experimental results have demonstrated that the proposed method can indeed obtain accurate moments and is more efficient than the conventional approach.
Tetsuya IIZUKA Makoto IKEDA Kunihiro ASADA
This paper proposes a cell layout synthesis method via Boolean Satisfiability (SAT). Cell layout synthesis problems are first transformed into SAT problems by our formulations. Our method realizes high-speed layout synthesis for CMOS logic cells and guarantees the generation of minimum-width cells with routability under our layout styles. It considers complementary P-/N-MOSFETs individually during transistor placement, and can generate smaller-width layouts than when the complementary P-/N-MOSFETs are paired. To demonstrate the effectiveness of our SAT-based cell synthesis, we present experimental results which compare it with the 0-1 ILP-based transistor placement method and a commercial cell generation tool. The experimental results show that our SAT-based method can generate minimum-width placements in a much shorter run time than the 0-1 ILP-based transistor placement method, and can generate the cell layouts of 32 static dual CMOS logic circuits in 54% of the run time of the commercial tool. The area increase of our method without compaction is only 3% compared with the commercial tool with compaction.
Takashi NOJIMA Xiaoke ZHU Yasuhiro TAKASHIMA Shigetoshi NAKATAKE Yoji KAJITANI
The challenge of automated layout of analog ICs starts with insight into the high-quality placements crafted by experts. We observe first that matched devices or elemental functions such as input, output, amplifiers, etc. are clustered. Second, devices in the same cluster are located faithfully to the drawn schema. Third, these two features are simultaneously fulfilled in a well-compacted placement. This paper proposes a novel device-level placement that simulates the above features based on Sequence-Pair. A slight modification of the meaning, say, of the relation "A is left-of B" to the relation "A is not right-of B" enlarges the freedom and allows a neater compaction of clusters with zigzag border curves. As a consequence, clusters are placed faithfully to their relative positions in the schema. We tested our algorithm on industrial instances and compared the results with those of manual design. The results were better than those of manual designs by, on average, 13.5% and 21.2% with respect to the area and total net-length, respectively.
Zhao LI Ravikanth SURAVARAPU Kartikeya MAYARAM C.-J. Richard SHI
This paper presents CrtSmile--a CAD tool for the automatic extraction of layout-dependent substrate effects for RF MOSFET modeling. CrtSmile incorporates a new scalable substrate model, which depends not only on the geometric layout information of a transistor (the number of gate fingers, finger width, channel length and bulk contact location), but also on the transistor layout and bulk patterns. We show that this model is simple to extract and has good agreement with measured data for a 0.35 µm CMOS process. CrtSmile reads in the layout information of RF transistors in the CIF/GDSII format and performs a pattern-based layout extraction to recognize the transistor layout and bulk patterns. A scalable layout-dependent substrate model is automatically generated and attached to the standard BSIM3 device model as a sub-circuit for use in circuit simulation. A low noise amplifier is evaluated with the proposed CrtSmile tool, showing the importance of layout effects for RF transistor substrate modeling.
Kazumi HATAYAMA Michinobu NAKAO Yoshikazu KIYOSHIGE Koichiro NATSUME Yasuo SATO Takaharu NAGUMO
This letter presents a practical approach for high-quality built-in test using a test pattern generator called neighborhood pattern generator (NPG). NPG is practical mainly because its structure is independent of circuit under test and it can realize high fault coverage not only for stuck-at faults but also for transition faults. Some techniques are also proposed for further improvement in practical applicability of NPG. Experimental results for large industrial circuits illustrate the efficiency of the proposed approach.
Pao-Lung CHEN Ching-Che CHUNG Chen-Yi LEE
In this paper, a novel digitally-controlled varactor (DCV) for portable delay cell design is presented. The proposed varactor uses the gate capacitance differences of NAND/NOR gates under different digital control inputs to build up a digitally-controlled varactor. The proposed varactor is then applied to design a high-resolution delay cell and to achieve a fine delay resolution. Different types of NAND/NOR gates (2-input or 3-input) for DCV design are also investigated in this paper. The proposed DCV can be implemented with standard cells, so it can be easily ported to different processes in a short time. A test chip fabricated on a standard 0.35 µm CMOS 2P4M process proves that the proposed delay cell has a fine delay resolution of about 1.55 ps. As a result, the proposed DCV exhibits finer resolution, better linearity, and better portability than traditional delay elements, and is very suitable for portable delay cell design.
Mitsuhiko YAGYU Akinori NISHIHARA
This paper presents optimum and sub-optimal designs of noise-shaping FIR filters for single- and multi-bit data converters. In the designs, only three parameters are specified (the number of taps, the oversampling ratio (OSR), and the l1-norm of the filter coefficients), and the in-band peak of the amplitude response is minimized under these specifications. The minimization problem is formulated with the overload-free condition, which guarantees rigorous stability, and an overload-free converter generates no distortion in any output signals. In the optimum design, the minimization problem is directly and exactly solved, whereas the sub-optimal method solves this problem by iteratively utilizing the simplex method. The iterative sub-optimal method, which gives up exact optimality, is far faster and more efficient than the optimum method. In design examples, optimum and sub-optimal noise-shaping FIR filters for single- and multi-bit data converters are designed, and their optimal performance is revealed. For single-bit data converters with OSR 64, a noise-shaping FIR filter is designed and then shown to achieve a signal to noise and distortion ratio (SNDR) of 107.6 dB in the band of interest.
Bong Gyun ROH Chang-Su KIM Sang-Uk LEE
In this paper, we propose a progressive encoding algorithm for binary voxel models, which represent 3D object shapes. For progressive transmission, multi-resolution models are generated by decimating an input voxel model. Then, each resolution model is encoded by employing the pattern code representation (PCR). In PCR, the voxel model is represented with a series of pattern codes. The pattern of a voxel describes the local shape of the model around that voxel. PCR can achieve a coding gain, since the pattern codes are highly correlated. In the multi-resolution framework, the coding gain can be further improved by exploiting the decimation constraints from the lower resolution models. Furthermore, a shell classification scheme is proposed to reduce the number of pattern codes needed to represent the whole voxel model. Simulation results show that the proposed algorithm provides about 1.1-1.3 times higher coding gain than the conventional PCR algorithm.
Yong XIANG Wensheng YU Jingxin ZHANG Senjian AN
This paper presents a new method for blind source separation by exploiting phase and frequency redundancy of cyclostationary signals in a complementary way. It requires a weaker separation condition than those methods which only exploit the phase diversity or the frequency diversity of the source signals. The separation criterion is to diagonalize a polynomial matrix whose coefficient matrices consist of the correlation and cyclic correlation matrices, at time delay τ = 0, of multiple measurements. An algorithm is proposed to perform the blind source separation. Computer simulation results illustrate the performance of the new algorithm in comparison with the existing ones.
Seiichi NAKAMORI Raquel CABALLERO-AGUILA Aurora HERMOSO-CARAZO Josefa LINARES-PEREZ
This paper treats the least-squares linear filtering and smoothing problems of discrete-time signals from uncertain observations when the random interruptions in the observation process are modelled by a sequence of independent Bernoulli random variables. Using an innovation approach we obtain the filtering algorithm and a general expression for the smoother which leads to fixed-point, fixed-interval and fixed-lag smoothing recursive algorithms. The proposed algorithms do not require the knowledge of the state-space model generating the signal, but only the covariance information of the signal and the observation noise, as well as the probability that the signal exists in the observed values.
Nari TANABE Toshihiro FURUKAWA Kohichi SAKANIWA Shigeo TSUJII
We propose a practical blind channel identification algorithm based on principal component analysis. The algorithm estimates (1) the channel order and (2) the noise variance, and then identifies (3) the channel impulse response, from the autocorrelation of the channel output signal without using eigenvalue or singular-value decomposition. The special features of the proposed algorithm are (1) a practical method to find the channel order and (2) a reduction of computational complexity. Numerical examples show the effectiveness of the proposed algorithm.
Jae-Hun KIM Hyunseok SHIN Euntai KIM Mignon PARK
This paper presents a fuzzy model-based approach for the synchronization of time-delay chaotic systems with input saturation. The time-delay chaotic drive and response systems are each represented by a Takagi-Sugeno (T-S) fuzzy model. Specifically, the response system contains input saturation. Using unidirectional linear error feedback and the parallel distributed compensation (PDC) scheme, we design a fuzzy chaotic synchronization system and analyze the local stability of the synchronization error dynamics. Since a time delay always exists in the transmission channel, we also take it into consideration. A sufficient condition for the local stability of the fuzzy synchronization system with input saturation and channel time delay is derived by applying Lyapunov-Krasovskii theory and solving a linear matrix inequality (LMI) problem. Numerical examples are given to demonstrate the validity of the proposed approach.
A moment matrix analysis (MMA) method can derive macroscopic statistical properties such as moments, response time, and power spectra of non-linear equations without solving the equations. MMA expands a non-linear equation into simultaneous linear equations of moments, and reduces it to a linear equation of their coefficient matrix and a moment vector. We can analyze the statistical properties from the eigenvalues and eigenvectors of the coefficient matrix. This paper presents (1) a systematic procedure to linearize non-linear equations and (2) an expansion of the previous work of MMA to derive the statistical properties of various non-linear equations. The statistical properties of the logistic map were evaluated by using MMA and computer simulation, and it is shown that the proposed systematic procedure was effective and that MMA could accurately approximate the statistical properties of the logistic map even though such a map had strong non-linearity.
Estimating component reliabilities is an important problem. For a series system, because of the cost and time constraints associated with failure analysis, not all components can be investigated, and in some cases the cause of failure is only narrowed to a subset of components. When such a case occurs, we say that the cause of failure is masked. It is also necessary in some cases to take account of the influence of an environmental stress on all components. In this paper, we consider 2- and 3-component series systems when the component lifelengths are exponentially distributed and the environmental stress follows either a gamma or an inverse Gaussian distribution. We show that the lifelength of the system and the cause of failure are independent of each other. By comparing the hazard functions in both models, we see that quite short and quite long lifelengths are more likely to occur in the gamma model than in the inverse Gaussian one. Assuming that the masking probabilities do not depend on which component actually fails, we show that the likelihood function can be factorized into three parts by a reparametrization. For some special cases, some estimators are given in closed form. We use computer failure data to show that our model is useful for analyzing real masked data. In comparison with the Kaplan-Meier estimator, our models fit these computer data better than the model with no environmental stress. Further, we determine a suitable model using AIC. We see that the gamma model fits the data better than the inverse Gaussian one. From a limited simulation study for a 3-component series system, we see that the relative errors of some estimators are inversely proportional to the square root of the expected number of systems whose cause of failure is identified.
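The independence of the system lifelength and the cause of failure can be checked in a few lines under one assumed stress model. The sketch below assumes (the abstract does not state the model explicitly) that, given the stress Z = z, component i has an exponential lifelength with rate zλ_i, that the components are conditionally independent, and that the gamma stress is parametrized by shape α and rate β. Here T denotes the system lifelength, K the index of the failed component, and Λ the sum of the λ_j; all of these symbols are introduced for illustration.

```latex
\begin{align*}
P(T > t,\, K = i)
  &= E_Z\!\left[\int_t^{\infty} Z\lambda_i\, e^{-Z\Lambda u}\, du\right],
  \qquad \Lambda = \sum_j \lambda_j \\
  &= \frac{\lambda_i}{\Lambda}\, E_Z\!\left[e^{-Z\Lambda t}\right]
   = P(K = i)\, P(T > t). \\
\text{Gamma stress:}\quad
P(T > t) &= E_Z\!\left[e^{-Z\Lambda t}\right]
          = \left(1 + \frac{\Lambda t}{\beta}\right)^{-\alpha}.
\end{align*}
```

The joint probability factorizes because the stress multiplies every component rate by the same factor, so the ratio λ_i/Λ does not depend on Z.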
Hristo KOSTADINOV Hiroyoshi MORITA Nikolai MANEV
In this paper we present the exact expressions for the bit error probability over a Gaussian noise channel of coded QAM using single error correcting integer codes. It is shown that the proposed integer codes have a better performance with respect to the lower bound on the bit error probability for trellis coded modulation.
Jinil HONG Woo Suk YANG Dongmin KIM Young-Ju KIM
In this paper, we introduce a new technique that uses scale-space filtering to extract unique features from an iris image. The resulting iris code can be used to develop a system for rapid and automatic human identification with high reliability and confidence levels. First, the iris part is separated from the whole image, and the radius and center of the iris are evaluated. Next, the regions that have a high possibility of being noise are discriminated, and the features present in the highly detailed pattern are then extracted. In order to conserve the original signal while minimizing the effect of noise, scale-space filtering is applied. Experiments are performed using a set of 272 iris images taken from 18 persons. Test results show that the iris feature patterns of different persons are clearly discriminated from those of the same person.
Junji SUZUKI Isao FURUKAWA Sadayasu ONO
Digital cinema will continue, for some time, to use image signals converted from the density values of film stock through some form of digitization. This paper investigates the required numbers of quantization bits for both intensity and density. Equations for the color differences created by quantization distortion are derived on the premise that the uniform color space L* a* b* can be used to evaluate color differences in digitized pictorial color images. The location of the quantized sample that yields the maximum color difference in the color gamut is theoretically analyzed with the proviso that the color difference must be below the perceivable limit of human visual systems. The result shows that the maximum color difference is located on a ridge line or a surface of the color gamut. This can reduce the computational burden for determining the required precision for color quantization. Design examples of quantization resolution are also shown by applying the proposed evaluation method to three actual color spaces: NTSC, HDTV, and ROMM.
The Fourier integrals are treated as a rigorous extension of the Fourier series expansion. The reward for this is that the so-called singular functions in the Fourier integrals, i.e., functions that are not absolutely integrable such as trigonometric functions, can be discussed within the field of ordinary functions, giving a foundation for the delta function as a distribution.
Yasunari YOKOTA Hideaki IWATA Motoki SHIGA
This study investigates the effect of the method of time division in frequency domain ICA on estimation accuracy of ICA. We show that source signals expressed in the frequency domain lose non-Gaussianity and independence because of the long and overlapping window function, respectively, in time division. Consequently, the estimation accuracy of ICA decreases.
A new construction of sequences having both a low peak factor (crest factor) and flat power spectrum is proposed. The flat power spectrum provides zero auto-correlation except for the case of zero shift. The proposed construction is based on a systematic scheme that does not require a search, and affords sequences of length 4n(2n+1) for an arbitrary integer n.
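The link between a flat power spectrum and zero periodic auto-correlation at every non-zero shift can be checked numerically. The sketch below does not reproduce the proposed construction; as a stand-in it uses a Zadoff-Chu sequence of odd length with root 1, a well-known unit-magnitude sequence with the same two properties, and the function names are illustrative.

```python
import cmath


def zadoff_chu(n):
    """Zadoff-Chu sequence of odd length n with root 1 (all samples have magnitude 1)."""
    return [cmath.exp(-1j * cmath.pi * k * (k + 1) / n) for k in range(n)]


def periodic_autocorrelation(x, shift):
    """Periodic (cyclic) auto-correlation of x at the given shift."""
    n = len(x)
    return sum(x[k] * x[(k + shift) % n].conjugate() for k in range(n))


def power_spectrum(x):
    """Squared magnitude of the DFT of x, computed directly from the definition."""
    n = len(x)
    return [abs(sum(x[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))) ** 2
            for f in range(n)]


seq = zadoff_chu(7)
spectrum = power_spectrum(seq)
off_peak = [abs(periodic_autocorrelation(seq, s)) for s in range(1, len(seq))]
print("spectrum spread:", max(spectrum) - min(spectrum))
print("largest off-peak correlation:", max(off_peak))
```

Both printed values are zero up to floating-point rounding, because the power spectrum is the DFT of the periodic auto-correlation: a constant spectrum corresponds to a single impulse at zero shift.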
Jeongpyo KIM Yongchul SONG Beomsup KIM
This paper describes a technique for background digital multistage calibration in the removal of nonlinearities caused by design limitations in pipelined analog-to-digital converters (ADCs). Foreground initialization reduces the calibration time. Furthermore, an improved background skip-and-fill method enables the ADC to trace environmental changes. This method uses a least mean square adaptive algorithm that is digitally implemented with a significantly reduced number of tap coefficients.
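For readers unfamiliar with the least mean square (LMS) update mentioned above, a generic sketch follows. It is not the calibration scheme of this paper: it assumes a hypothetical two-tap FIR system to be identified from noiseless data, and the function name, step size, and tap values are chosen only for illustration.

```python
import random


def lms_identify(inputs, desired, n_taps, step):
    """Adapt FIR tap weights so that the filter output tracks `desired` (plain LMS)."""
    weights = [0.0] * n_taps
    history = [0.0] * n_taps  # most recent input sample first
    for x, d in zip(inputs, desired):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))
        err = d - y
        # Move each weight along the negative gradient of the squared error.
        weights = [w + step * err * h for w, h in zip(weights, history)]
    return weights


rng = random.Random(1)  # fixed seed so the run is repeatable
true_taps = [0.5, -0.3]  # hypothetical system to be identified
xs = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
ds = [true_taps[0] * xs[i] + (true_taps[1] * xs[i - 1] if i > 0 else 0.0)
      for i in range(len(xs))]
estimated = lms_identify(xs, ds, n_taps=2, step=0.1)
print([round(w, 4) for w in estimated])
```

Each update costs one multiply-accumulate per tap, so the cost per sample scales with the number of tap coefficients.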
Jung-Su KIM Tae-Woong YOON Claudio DE PERSIS
A switched nonlinear system is considered, and the interval between two consecutive switchings is assumed to be greater than a value called "the dwell time." When switching among nonlinear systems, using a constant dwell time generally fails to lead to stability. In this letter, a state dependent dwell time function with convergence guarantees is presented for discrete-time stable nonlinear systems.
The orthogonal function approach is developed in this paper to solve the Takagi-Sugeno (TS) fuzzy-model-based dynamic equations. The new method reduces the procedure of solving the TS-fuzzy-model-based dynamic equations to the successive solution of a system of recursive formulae involving only matrix algebra. Based on the presented recursive formulae, an algorithm involving only straightforward algebraic computation is also proposed in this paper. The computational complexity can therefore be reduced remarkably. An illustrative example shows that the proposed method based on the orthogonal functions can obtain satisfactory results.
Shuhong WANG Guilin WANG Feng BAO Jie WANG
In 2000, Wang et al. proposed a (t,n) threshold signature scheme with (k,l) threshold shared verification, and a (t,n) threshold authenticated encryption scheme with (k,l) threshold shared verification. Later, Tseng et al. mounted some attacks against Wang et al.'s schemes. At the same time, they also presented improvements. In this paper, we first point out that Tseng et al.'s attacks are actually invalid due to their misunderstanding of Wang et al.'s schemes. Then, we show that both Wang et al.'s schemes and Tseng et al.'s improvements are indeed insecure by demonstrating several effective attacks.
Hyunduk KANG Insoo KOO Vladimir KATKOVNIK Kiseon KIM
In cellular systems, code division multiple access (CDMA) technology with array antennas can significantly reduce interference by taking advantage of the combination of spread spectrum and spatial filtering. We investigate the performance of cellular CDMA systems that adopt two types of array antennas, switched beam forming (SBF) and tracking beam forming (TBF), in the base station. Through Monte-Carlo simulations, we evaluate the average bit-error-rate (BER) and outage probability of the systems under log-normal shadowing channels in a multi-cell environment. When we consider 2 beams and 4 beams per sector for the SBF method, it is observed that the TBF method gives at least 10% and 30% capacity improvement over the SBF method in terms of 10^-3 BER and 1% outage probability, respectively.