IEICE globals.ieice.org Site

Author Search Result

[Author] Fumio ARAKAWA(14hit)

1-14hit

Parallel Design of Feedback Control Systems Utilizing Dead Time for Embedded Multicore Processors
Yuta SUZUKI Kota SATA Jun'ichi KAKO Kohei YAMAGUCHI Fumio ARAKAWA Masato EDAHIRO

PAPER-Electronic Instrumentation and Control

Vol:
E99-C No:4
Page(s):
491-502
This paper presents a parallelization method utilizing dead time to implement higher precision feedback control systems in multicore processors. The feedback control system is known to be difficult to parallelize, and it is difficult to deal with the dead time in control systems. In our method, the dead time is explicitly represented as delay elements. Then, these delay elements are distributed to the overall systems with equivalent transformation so that the system can be simulated or executed in parallel pipeline operation. In addition, we introduce a method of delay-element addition for parallelization. For a spring-mass-damper model with a dead time, parallel execution of the model using our technique achieves 3.4 times performance acceleration compared with its sequential execution on an ideal four-core simulation and 1.8 times on a cycle-accurate simulator of a four-core embedded processor as a threaded application on a real-time operating system.
Reducing Consuming Clock Power Optimization of a 90 nm Embedded Processor Core
Tetsuya YAMADA Masahide ABE Yusuke NITTA Kenji OGURA Manabu KUSAOKE Makoto ISHIKAWA Motokazu OZAWA Kiwamu TAKADA Fumio ARAKAWA Osamu NISHII Toshihiro HATTORI

PAPER-Low Power Techniques

Vol:
E89-C No:3
Page(s):
287-294
A low-power SuperHTM embedded processor core, the SH-X2, has been designed in 90-nm CMOS technology. The power consumption was reduced by using hierarchical fine-grained clock gating to reduce the power consumption of the flip-flops and the clock-tree, synthesis and a layout that supports the implementation of the clock gating, and several-level power evaluations for RTL refinement. With this clock gating and RTL refinement, the power consumption of the clock-tree and flip-flops was reduced by 35% and 59%, including the process shrinking effects, respectively. As a result, the SH-X2 achieved 6,000 MIPS/W using a Renesas low-power process with a lowered voltage. Its performance-power efficiency was 25% better than that of a 130-nm-process SH-X.
A Low-Power Embedded RISC Microprocessor with an Integrated DSP for Mobile Applications
Tetsuya YAMADA Makoto ISHIKAWA Yuji OGATA Takanobu TSUNODA Takahiro IRITA Saneaki TAMAKI Kunihiko NISHIYAMA Tatsuya KAMEI Ken TATEZAWA Fumio ARAKAWA Takuichiro NAKAZAWA Toshihiro HATTORI Kunio UCHIYAMA

INVITED PAPER

Vol:
E85-C No:2
Page(s):
253-262
A 32-bit embedded RISC microprocessor core integrating a DSP has been developed using a 0.18-µm five-layer-metal CMOS technology. The integrated DSP has a single-MAC and exploits CPU resources to reduce hardware. The DSP occupies only 0.5 mm2. The processor core includes a large on-chip 128 kB SRAM called U-memory. A large capacity on-chip memory decreases the amount of traffic with an external memory. And it is effective for low-power and high-performance operation. To realize low-power dissipation for the U-memory access, the active ratio of U-memory's access is reduced. The critical path is a load path from the U-memory, and we optimized the path through the whole chip. The chip achieves 0.79 mA/MHz executing Dhrystone 1.1 at 108 MHz, which is suitable for mobile applications.
A 4500 MIPS/W, 86 µA Resume-Standby, 11 µA Ultra-Standby Application Processor for 3G Cellular Phones
Makoto ISHIKAWA Tatsuya KAMEI Yuki KONDO Masanao YAMAOKA Yasuhisa SHIMAZAKI Motokazu OZAWA Saneaki TAMAKI Mikio FURUYAMA Tadashi HOSHI Fumio ARAKAWA Osamu NISHII Kenji HIROSE Shinichi YOSHIOKA Toshihiro HATTORI

PAPER-Digital

Vol:
E88-C No:4
Page(s):
528-535
We have developed an application processor optimized for 3G cellular phones. It provides high energy efficiency by using various low power techniques. For low active power consumption, we use a hierarchical clock gating technique with a static clock gating controlled by software and a two-level dynamic clock gating controlled by hardware. This technique reduces clock power consumption by 35%. And we also apply a pointer-based pipeline to in the CPU core, which reduces the pipeline latch power by 25%. This processor contains 256 kB of on-chip user RAM (URAM) to reduce the external memory access power. The URAM read buffer (URB) enables high-throughput, low latency access to the URAM while keeping the CPU clock frequency high because the URAM read data is transferred to the URB in 256-bit widths at half the frequency of the CPU. The average miss penalty is 3.5 cycles at the CPU clock frequency, hit rate is 89% and the energy used for URAM reads is 8% less that what it would be for URAM without a URB. These techniques reduce the power consumption of the CPU core, and achieve 4500 MIPS/W at 1.0 V power supply (Dhrystone 2.1). For the low leakage requirements, we use internal power switches, and provides resume-standby (R-standby) and ultra-standby (U-standby) modes. Signals across a power boundary are transmitted through µI/O circuits to prevent invalid signal transmission. In the R-standby mode, the power supply to almost all the CPU core area, except for the URAM is cut off and the URAM is set to a retention mode. In the U-standby mode, the power supply to the URAM is also turned off for less leakage current. The leakage currents in the R-standby and in the U-standby modes are respectively only 98 and 12 µA. For quick recovery from the R-standby mode, the boot address register (BAR) and control register contents needed immediately after wake-up are saved by hardware into backup latches. The other contents are saved by software into URAM. It takes 2.8 ms to fully recover from R-standby.
Embedded Processor Core with 64-Bit Architecture and Its System-On-Chip Integration for Digital Consumer Products
Kunio UCHIYAMA Fumio ARAKAWA Yasuhiko SAITO Koki NOGUCHI Atsushi HASEGAWA Shinichi YOSHIOKA Naohiko IRIE Takeshi KITAHARA Mark DEBBAGE Andy STURGES

PAPER

Vol:
E84-C No:2
Page(s):
139-149
A 64-bit architecture for an embedded processor targeted for next-generation digital consumer products has been developed. It has dual-mode instruction sets and is optimized for high multimedia performance, provided by SIMD/floating-point vector instructions in 32-bit length ISA, and small code size, provided by a conventional 16-bit length ISA. Large register files, (6464b and 6432b), a split-branch mechanism, and virtual cache are also adopted in the architecture. A 714MIPS/9.6 GOPS/400 MHz processor core with the 64-bit architecture and a system LSI containing the core are developed using 0.15-µm technology. The LSI includes a 3.2 GB/sec high-bandwidth on-chip bus, a high-speed DRAM interface, a SRAM/Flash/ROM/Multiplexed-bus interface, and a 66 MHz PCI interface that provide the performance required for next-generation multimedia applications.
An Embedded Processor Core for Consumer Appliances with 2.8GFLOPS and 36 M Polygons/s FPU
Fumio ARAKAWA Motokazu OZAWA Osamu NISHII Toshihiro HATTORI Takeshi YOSHINAGA Tomoichi HAYASHI Yoshikazu KIYOSHIGE Takashi OKADA Masakazu NISHIBORI Tomoyuki KODAMA Tatsuya KAMEI Makoto ISHIKAWA

PAPER-System Level Design

Vol:
E87-A No:12
Page(s):
3068-3074
A SuperHTM embedded processor core implemented in a 130-nm CMOS process running at 400 MHz achieved 720 MIPS and 2.8 GFLOPS at a power of 250 mW in worst-case conditions. It has a dual-issue seven-stage pipeline architecture but maintains the 1.8 MIPS/MHz of the previous five-stage processor. The processor meets the requirements of a wide range of applications, and is suitable for digital appliances aimed at the consumer market, such as cellular phones, digital still/video cameras, and car navigation systems.
FOREWORD Open Access
Fumio ARAKAWA Makoto IKEDA

FOREWORD

Vol:
E104-C No:6
Page(s):
213-214
Branch Micro-Architecture of an Embedded Processor with Split Branch Architecture for Digital Consumer Products
Naohiko IRIE Fumio ARAKAWA Kunio UCHIYAMA Shinichi YOSHIOKA Atsushi HASEGAWA Kevin IADONATE Mark DEBBAGE David SHEPHERD Margaret GEARTY

PAPER-High-Performance Technologies

Vol:
E85-C No:2
Page(s):
315-322
An embedded processor core using split branch architecture has been developed. This processor core targets 400 MHz using 0.18 µm technology, and its higher frequency needs deeper pipeline than the conventional processor. To solve the increasing branch penalty problem caused by a deeper pipeline, this processor takes an active preload mechanism to preload the target instructions to internal buffers in order to hide the instruction cache latency. The processor also uses multiple instruction buffers to reduce branch penalty cycles of branch misprediction. The performance estimation result shows that about 70% of branch overhead cycles can be reduced from the conventional implementation. The area for this branch mechanism consumes only 1% of the total core, which is smaller than the conventional branch target buffer (BTB) scheme, and helps to achieve low power and low cost.
An Exact Leading Non-Zero Detector for a Floating-Point Unit
Fumio ARAKAWA Tomoichi HAYASHI Masakazu NISHIBORI

PAPER-Digital

Vol:
E88-C No:4
Page(s):
570-575
Parallel execution of the carry propagate adder (CPA) and leading non-zero (LNZ) detector that processes the CPA result is a common way to reduce the latencies of floating-point instructions. However, the conventional methods usually cause one-bit errors. We developed an exact LNZ detection circuit operating in parallel with the CPA. The circuit is implemented in the floating-point unit of our newly developed embedded processor core. Circuit simulation results show that the LNZ circuit has a similar speed to the CPA, and it contributes to make a small low-power FPU for an embedded processor core.
FOREWORD Open Access
Fumio ARAKAWA Makoto IKEDA

FOREWORD

Vol:
E105-C No:6
Page(s):
207-208
FOREWORD Open Access
Fumio ARAKAWA Makoto IKEDA

FOREWORD

Vol:
E98-C No:7
Page(s):
534-535
FOREWORD Open Access
Fumio ARAKAWA Makoto IKEDA

FOREWORD

Vol:
E99-C No:8
Page(s):
899-900
FOREWORD Open Access
Fumio ARAKAWA Makoto IKEDA

FOREWORD

Vol:
E103-C No:3
Page(s):
66-67
FOREWORD Open Access
Fumio ARAKAWA Makoto IKEDA

FOREWORD

Vol:
E100-C No:3
Page(s):
221-222

Author Search Result

[Author] Fumio ARAKAWA(14hit)

Parallel Design of Feedback Control Systems Utilizing Dead Time for Embedded Multicore Processors

Reducing Consuming Clock Power Optimization of a 90 nm Embedded Processor Core

A Low-Power Embedded RISC Microprocessor with an Integrated DSP for Mobile Applications

A 4500 MIPS/W, 86 µA Resume-Standby, 11 µA Ultra-Standby Application Processor for 3G Cellular Phones

Embedded Processor Core with 64-Bit Architecture and Its System-On-Chip Integration for Digital Consumer Products

An Embedded Processor Core for Consumer Appliances with 2.8GFLOPS and 36 M Polygons/s FPU

FOREWORD Open Access

Branch Micro-Architecture of an Embedded Processor with Split Branch Architecture for Digital Consumer Products

An Exact Leading Non-Zero Detector for a Floating-Point Unit

FOREWORD Open Access

FOREWORD Open Access

FOREWORD Open Access

FOREWORD Open Access

FOREWORD Open Access

Latest Issue

FlyerIEICE has prepared a flyer regarding multilingual services. Please use the one in your native language.

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles