Aram KAWEWONG Yutaro HONDA Manabu TSUBOYAMA Osamu HASEGAWA
Robot path-planning is one of the important issues in robotic navigation. This paper presents a novel robot path-planning approach based on the associative memory using Self-Organizing Incremental Neural Networks (SOINN). By the proposed method, an environment is first autonomously divided into a set of path-fragments by junctions. Each fragment is represented by a sequence of preliminarily generated common patterns (CPs). In an online manner, a robot regards the current path as the associative path-fragments, each connected by junctions. The reasoning technique is additionally proposed for decision making at each junction to speed up the exploration time. Distinct from other methods, our method does not ignore the important information about the regions between junctions (path-fragments). The resultant number of path-fragments is also less than other method. Evaluation is done via Webots physical 3D-simulated and real robot experiments, where only distance sensors are available. Results show that our method can represent the environment effectively; it enables the robot to solve the goal-oriented navigation problem in only one episode, which is actually less than that necessary for most of the Reinforcement Learning (RL) based methods. The running time is proved finite and scales well with the environment. The resultant number of path-fragments matches well to the environment.
Toshihiro MATSUDA Shinsuke ISHIMARU Shingo NOHARA Hideyuki IWATA Kiyotaka KOMOKU Takayuki MORISHITA Takashi OHZONE
MOS capacitors with Si-implanted thermal oxide and CVD deposited oxide of 30 nm thickness were fabricated for applications of non-volatile memory and electroluminescence devices. Current-voltage (I-V) and I-V hysteresis characteristics were measured, and the hysteresis window (HW) and the integrated charge of HW (ICHW) extracted from the hysteresis data were discussed. The HW characteristics of high Si dose samples showed the asymmetrical double-peaks curves with the hump in both tails. The ICHW almost converged after the 4th cycle and had the voltage sweep speed dependence. All +ICHW and -ICHW characteristics were closely related to the static (+I)-(+VG) and (-I)-(-VG) curves, respectively. For the high Si dose samples, the clear hump currents in the static I-VG characteristics contribute to lower the rising voltage and to steepen the ICHW increase, which correspond to the large stored charge in the oxide.
Florin BALASA Ilie I. LUICAN Hongwei ZHU Doru V. NASUI
Many signal processing systems, particularly in the multimedia and telecommunication domains, are synthesized to execute data-intensive applications: their cost related aspects -- namely power consumption and chip area -- are heavily influenced, if not dominated, by the data access and storage aspects. This paper presents an energy-aware memory allocation methodology. Starting from the high-level behavioral specification of a given application, this framework performs the assignment of the multidimensional signals to the memory layers -- the on-chip scratch-pad memory and the off-chip main memory -- the goal being the reduction of the dynamic energy consumption in the memory subsystem. Based on the assignment results, the framework subsequently performs the mapping of signals into both memory layers such that the overall amount of data storage be reduced. This software system yields a complete allocation solution: the exact storage amount on each memory layer, the mapping functions that determine the exact locations for any array element (scalar signal) in the specification, and an estimation of the dynamic energy consumption in the memory subsystem.
Yanni ZHAO Jinian BIAN Shujun DENG Zhiqiu KONG Kang ZHAO
Despite the growing research effort in formal verification, industrial verification often relies on the constrained random simulation methodology, which is supported by constraint solvers as the stimulus generator integrated within simulator, especially for the large design with complex constraints nowadays. These stimulus generators need to be fast and well-distributed to maintain simulation performance. In this paper, we propose a dynamic method to guide stimulus generation by SAT solvers. An adjusting strategy named Tabu Search with Memory (TSwM) is integrated in the stimulus generator for the search and prune processes along with the constraint solver. Experimental results show that the method proposed in this paper could generate well-distributed stimuli with good performance.
Shanq-Jang RUAN Jui-Yuan HSIEH Chia-Han LEE
This paper presents a gate-block selection algorithm, which can synthesize a proper parameter extractor of the pre-computation-based content-addressable memory (PB-CAM) to enhance power efficiency for specific applications such as embedded systems, microprocessor and SOC, etc. Furthermore, a novel CAM cell design with single bit-line is proposed. The proposed CAM cell design requires only one heavy loading bit-line and merely is constructed with eight transistors. The whole PB-CAM design was described in Spice with TSMC 0.35 µm double-poly quadruple-metal CMOS process. We used Synopsys Nanosim to estimate power consumption. With a 128 words by 32 bits CAM size, the experimental results showed that our proposed PB-CAM effectively reduces 18.21% of comparison operations in the CAM and saves 16.75% in power reduction by synthesizing a proper parameter extractor of the PB-CAM compared with the 1's count PB-CAM. This implies that our proposed PB-CAM is more flexible and adaptive for specific applications.
Shinya KAJIYAMA Masamichi FUJITO Hideo KASAI Makoto MIZUNO Takanori YAMAGUCHI Yutaka SHINAGAWA
A novel 300 MHz embedded flash memory for dual-core microcontrollers with a shared ROM architecture is proposed. One of its features is a three-stage pipeline read operation, which enables reduced access pitch and therefore reduces performance penalty due to conflict of shared ROM accesses. Another feature is a highly sensitive sense amplifier that achieves efficient pipeline operation with two-cycle latency one-cycle pitch as a result of a shortened sense time of 0.63 ns. The combination of the pipeline architecture and proposed sense amplifiers significantly reduces access-conflict penalties with shared ROM and enhances performance of 32-bit RISC dual-core microcontrollers by 30%.
Shuzhuang ZHANG Hao LUO Binxing FANG Xiaochun YUN
Scanning packet payload at a high speed has become a crucial task in modern network management due to its wide variety applications on network security and application-specific services. Traditionally, Deterministic finite automatons (DFAs) are used to perform this operation in linear time. However, the memory requirements of DFAs are prohibitively high for patterns used in practical packet scanning, especially when many patterns are compiled into a single DFA. Existing solutions for memory blow-up are making a trade-off between memory requirement and memory access of processing per input character. In this paper we proposed a novel method to drastically reduce the memory requirements of DFAs while still maintain the high matching speed and provide worst-case guarantees. We removed the duplicate transitions between states by dividing all the DFA states into a number of groups and making each group of states share a merged transition table. We also proposed an efficient algorithm for transition sharing between states. The high efficiency in time and space made our approach adapted to frequently updated DFAs. We performed several experiments on real world rule sets. Overall, for all rule sets and approach evaluated, our approach offers the best memory versus run-time trade-offs.
A channel code is constructed using sparse matrices for stationary memoryless channels that do not necessarily have a symmetric property like a binary symmetric channel. It is also shown that the constructed code has the following remarkable properties. 1. Joint source-channel coding: Combining channel code with lossy source code, which is also constructed by sparse matrices, a simpler joint source-channel code can be constructed than that constructed by the ordinary block code. 2. Universal coding: The constructed channel code has a universal property under a specified condition.
Hiroki SUGANO Takahiko MASUZAKI Hiroshi TSUTSUI Takao ONOYE Hiroyuki OCHI Yukihiro NAKAMURA
The encoding/decoding process of JPEG2000 requires much more computation power than that of conventional JPEG mainly due to the complexity of the entropy encoding/decoding. Thus usually multiple entropy codec hardware modules are implemented in parallel to process the entropy encoding/decoding. This module, however, requests many small-size memories to store intermediate data, and when multiple modules are implemented on a chip, employment of the large number of SRAMs increases difficulty of whole chip layout. In this paper, an efficient memory organization framework for the entropy encoding/decoding module is proposed, in which not only existing memory organizations but also our proposed novel memory organization methods are attempted to expand the design space to be explored. As a result, the efficient memory organization for a target process technology can be explored.
In this letter, we propose a two-bit representation method for turbo decoder extrinsic information based on bit error count minimization and parameter reset. We show that the performance of the proposed system approaches that of the full precision decoder within 0.17 dB and 0.48 dB at 1 % packet error rate for packet lengths of 500 and 10,000 information bits. The idea of parameter reset we introduce can be used not only in turbo decoder but also in many other iterative algorithms.
Hideki TANAKA Takashi MORIE Kazuyuki AIHARA
In this paper, we propose an analog CMOS circuit which achieves spiking neural networks with spike-timing dependent synaptic plasticity (STDP). In particular, we propose a STDP circuit with symmetric function for the first time, and also we demonstrate associative memory operation in a Hopfield-type feedback network with STDP learning. In our spiking neuron model, analog information expressing processing results is given by the relative timing of spike firing events. It is well known that a biological neuron changes its synaptic weights by STDP, which provides learning rules depending on relative timing between asynchronous spikes. Therefore, STDP can be used for spiking neural systems with learning function. The measurement results of fabricated chips using TSMC 0.25 µm CMOS process technology demonstrate that our spiking neuron circuit can construct feedback networks and update synaptic weights based on relative timing between asynchronous spikes by a symmetric or an asymmetric STDP circuits.
Takashi MORI Yuuki SATO Hitoshi KAWAGUCHI
Optical buffer memory for 10-Gb/s data signal is demonstrated experimentally using a polarization bistable vertical-cavity surface-emitting laser (VCSEL). The optical buffer memory is based on an optical AND gate function and the polarization bistability of the VCSEL. Fast AND gate operation responsive to 50-ps-width optical pulses is achieved experimentally by increasing the detuning frequency between an injection light into the VCSEL and a lasing light from the VCSEL. A specified bit is extracted from the 10-Gb/s data signal by the fast AND gate operation and is stored as the polarization state of the VCSEL by the polarization bistability. The corresponding numerical simulations are also performed using two-mode rate equations taking into account the detuning frequency. The simulation results confirm the fast AND gate operation by increasing the detuning frequency as well as the experimental results.
Checker synthesis for assertion based verification becomes popular because of the recent progress on the FPGA prototyping environment. In the paper, we propose a checker synthesis method based on the finite input-memory automaton suitable for embedded RAM modules in FPGA. There are more than 1 Mbit memories in medium size FPGA's and such embedded memory cells have the capability to be used as the shift registers. The main idea is to construct a checker circuit using the finite input-memory automata and implement shift register chain by logic elements or embedded RAM modules. When using RAM module, the method does not consume any logic element for storing the value. Note that the shift register chain of input memory can be shared with different assertions and we can reduce the hardware resource significantly. We have checked the effectiveness of the proposed method using several assertions.
Yoon KIM Seongjae CHO Gil Sung LEE Il Han PARK Jong Duk LEE Hyungcheol SHIN Byung-Gook PARK
We propose a 3-dimensional terraced NAND flash memory. It has a vertical channel so it is possible to make a long enough channel in 1F2 size. And it has 3-dimensional structure whose channel is connected vertically along with two stairs. So we can obtain high density as in the stacked array structure, without silicon stacking process. We can make NAND flash memory with 3F2 cell size. Using SILVACO ATLAS simulation, we study terraced NAND flash memory characteristics such as program, erase, and read. Also, its fabrication method is proposed.
Chul-Woong YANG Ki YONG LEE Myoung HO KIM Yoon-Joon LEE
In this paper, we present an efficient index structure for NAND flash memory, called the Dynamic Forest (D-Forest). Since write operations incur high overhead on NAND flash memory, D-Forest is designed to minimize write operations for index updates. The experimental results show that D-Forest significantly reduces write operations compared to the conventional B+-tree.
Doo-Hyun KIM Il Han PARK Seongjae CHO Jong Duk LEE Hyungcheol SHIN Byung-Gook PARK
This paper presents a detailed study of the retention characteristics in scaled multi-bit SONOS flash memories. By calculating the oxide field and tunneling currents, we evaluate the charge trapping mechanism. We calculate transient retention dynamics with the ONO fields, trapped charge, and tunneling currents. All the parameters were obtained by physics-based equations and without any fitting parameters or optimization steps. The results can be used with nanoscale nonvolatile memory. This modeling accounts for the VT shift as a function of trapped charge density, time, silicon fin thickness and type of trapped charge, and can be used for optimizing the ONO geometry and parameters for maximum performance.
Seongjae CHO Jung Hoon LEE Gil Sung LEE Jong Duk LEE Hyungcheol SHIN Byung-Gook PARK
Recently, various types of 3-D nonvolatile memory (NVM) devices have been researched to improve the integration density [1]-[3]. The NVM device of pillar structure can be considered as one of the candidates [4],[5]. When this is applied to a NAND flash memory array, bottom end of the device channel is connected to the bulk silicon. In this case, the current in vertical direction varies depending on the thickness of silicon channel. When the channel is thick, the difference of saturation current levels between on/off states of individual device is more obvious. On the other hand, when the channel is thin, the on/off current increases simultaneously whereas the saturation currents do not differ very much. The reason is that the channel potential barrier seen by drain electrons is lowered by read voltage on the opposite sidewall control gate. This phenomenon that can occur in 3-D structure devices due to proximity can be called gate-induced barrier lowering (GIBL). In this work, the dependence of GIBL on silicon channel thickness is investigated, which will be the criteria in the implementation of reliable ultra-small NVM devices.
Kazuhiro SHIMANOE Katsunori MAKIHARA Mitsuhisa IKEDA Seiichi MIYAZAKI
We have studied the formation of Pd-nanodots on SiO2 from ultrathin Pd films being exposed to remote hydrogen plasma at room temperature, in which parameters such as the gas pressure and input power to generate H2 plasma and the Pd film thickness were selected to get some insights into surface migration of Pd atoms induced with atomic hydrogen irradiation and resultant agglomeration with cohesive action. The areal dot density was controlled in the range from 3.4 to 6.51011 cm - 2 while the dot size distribution was changed from 7 to 1.5 in average dot height with 40% variation in full-width at half maximum. We also fabricated MOS capacitors with a Pd-nanodots floating gate and confirmed the flat-band voltage shift in capacitance-voltage characteristic due to electron injection to and emission from the dots floating gate.
Ryusuke NEBASHI Noboru SAKIMURA Tadahiko SUGIBAYASHI Naoki KASAI
We propose an MRAM macro architecture for SoCs to reduce their area size. The shared write-selection transistor (SWST) architecture is based on 2T1MTJ MRAM cell technology, which enables the same fast access time with a smaller cell area than that of 6T SRAMs. We designed a 4-Mb macro using the SWST architecture with a 0.15-µm CMOS process and a 0.24-µm MRAM process. The macro cell array consists of 81T64MTJ cell array elements, each storing 64 bits of data. The area size is reduced by more than 30%. By introducing a leakage-replication (LR) read scheme, a wide read margin on a test chip is accomplished and 50-ns access time is achieved with SPICE simulation. The 2T1MTJ macro and 81T64MTJ macro can be integrated into a single SoC.
Hiroki SHIMANO Fukashi MORISHITA Katsumi DOSAKA Kazutami ARIMOTO
The proposed built-in Power-Cut scheme intended for a wide range of dynamically data retaining memories including embedded SoC memories enables the system-level power management to handle SoC on which the several high density and low voltage scalable memory macros are embedded. This scheme handles the deep standby mode in which the SoC memories keep the stored data in the ultra low standby current, and quick recovery to the normal operation mode and precise power management are realized, in addition to the conventional full power-off mode in which the SoC memories stay in the negligibly low standby current but allow the stored data to disappear. The unique feature of the statically or dynamically changeable internal voltages of memory in the deep standby mode brings about much further reduction of the standby current. This scheme will contribute to the further lowering power of the mobile applications requiring larger memory capacity embedded SoC memories.