INVITED PAPER Special Section on Circuits and Design Techniques for Advanced Large Scale Integration

# Low Power Platform for Embedded Processor LSIs

Toru SHIMIZU<sup>†a)</sup>, Member, Kazutami ARIMOTO<sup>†</sup>, Senior Member, Osamu NISHII<sup>†</sup>, Member, Sugako OTANI<sup>†</sup>, and Hiroyuki KONDO<sup>†</sup>, Nonmembers

**SUMMARY** Various low power technologies have been developed and applied to LSIs at the device and circuit design levels. Today, many more CPU cores and function IPs are integrated on a single-chip LSI. Therefore, not only device and circuit low power technologies but also software power control technologies are becoming more important for reducing the active power of application systems. This paper overviews the low power technologies and defines a power management platform as a combination of hardware functions and a software programming interface. It then discusses the importance of the power management platform and the direction of its development.

key words: low power, processor, operating system, distributed processing

#### 1. Introduction

Recent developments in semiconductor circuit and design technology enable the integration of multiple CPUs and media IPs in a single chip. The evolution of process technology nodes reduces the power dissipation of a single logic gate; however, the increase in the number of gates integrated in a chip makes power reduction of the chip as a whole more difficult. In total, the typical power consumption of a chip is increasing.

Against this basic trend of increasing power, many design technologies have been proposed to reduce the power of a logic LSI. An overview of these existing low power technologies shows that combinations of hardware and software technologies are more effective than hardware technology alone, because control of low power sleep and standby states is a large source of power reduction, and these states should be controlled based on the running status of software on a CPU.

While the logic gate count of a single LSI has increased, LSI design technology has become more module oriented. Logic LSIs are designed as integrations of many functional modules, i.e. IPs, rather than as simple integrations of a huge number of gates. A CPU is the typical IP used in LSI design. As a result of CPU integration in logic LSIs and the trend toward multiple-CPU integration, the low power technologies should be modularized based on the combination of a CPU and power control software. Modularizing low power control based on CPU hardware and software makes application software for the low power control of logic LSIs reusable. In addition, the modularization of low power control can be extended to cover not only single chip integrations but also multiple chip system configurations.

<sup>†</sup>The authors are with Renesas Electronics Corporation, Itami-shi, 664-0005 Japan.

a) E-mail: toru.shimizu.xn@renesas.com

DOI: 10.1587/transele.E94.C.394

Manuscript received November 8, 2010. Manuscript revised December 13, 2010.

To make application software design for low power control reusable, it is necessary to standardize the hardware mechanism and the software programming interface for low power control. This paper calls the standardized combination the "low power platform". The functions and properties required of the platform are the main topic of this paper.

This paper is organized as follows. Section two overviews the current low power technologies and their classification. Section three describes the low power technologies we have developed, to show the current status of the technology development as the basis of the low power platform development. In section four, we describe functions and properties for the platform, especially for distributed parallel software systems. Section five proposes a future direction of the platform development.

# 2. Overview of Low Power Technologies

Low power technologies for LSI design can be described based on various metrics. Section 2.1 classifies the low power technologies based on the method of saving power. Section 2.2 describes the granularity of hardware control for low power consumption. Section 2.3 describes the low power control protocol. The low power control protocol is defined as procedures for entering into or exiting from low power states.

# 2.1 Classification of the Low Power Technologies

The power of a CMOS circuit is calculated as follows:

$$P = \sum \frac{1}{2}\alpha C V^2 f + \sum I_{\text{leak}} V \tag{1}$$

$\alpha$: activity ratio (0-1)

$C$: capacitance

$V$: voltage

$f$: operating frequency

$I_{\text{leak}}$: leak current of each circuit

The first term is the dynamic power term and the second term is the static (leak) power term. Many existing low power technologies can be classified by which of these terms, and which factor within it, they reduce. Clock signal stopping reduces $f$ during a long waiting period. Clock frequency control also reduces $f$ to match the required processing performance. Clock gating reduces $\alpha$ through fine-grained control of gate operation. Power gating [1], [2], in other words switching the power supply on and off for all or part of a chip, reduces $I_{\text{leak}}$ during non-operating periods of some part of an LSI. Dynamic Voltage Frequency Scaling (DVFS) [3] reduces $V$ and $f$. Adaptive Voltage Scaling (AVS) [4] reduces $V$ at constant $f$.
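As a rough numerical illustration of Eq. (1), the following sketch evaluates the power of a single circuit block and compares clock frequency control alone with DVFS. All parameter values are hypothetical, and the leak current is treated as independent of the voltage, which is a simplifying assumption.

```python
def cmos_power(alpha, capacitance_f, voltage_v, freq_hz, i_leak_a):
    """Power of one circuit block following Eq. (1): dynamic term + leak term."""
    dynamic = 0.5 * alpha * capacitance_f * voltage_v ** 2 * freq_hz
    static = i_leak_a * voltage_v
    return dynamic + static


# Hypothetical block: alpha = 0.2, C = 1 nF, I_leak = 10 mA.
BLOCK = dict(alpha=0.2, capacitance_f=1e-9, i_leak_a=10e-3)

nominal = cmos_power(voltage_v=1.2, freq_hz=400e6, **BLOCK)
# Clock frequency control: f is halved, V is unchanged.
freq_scaled = cmos_power(voltage_v=1.2, freq_hz=200e6, **BLOCK)
# DVFS: both V and f are reduced.
dvfs = cmos_power(voltage_v=0.9, freq_hz=200e6, **BLOCK)
# Power gating: the supply is cut, so neither term contributes.
power_gated = 0.0

for name, watts in [("nominal", nominal), ("f scaled", freq_scaled),
                    ("DVFS", dvfs), ("power gated", power_gated)]:
    print(f"{name:12s} {watts * 1e3:6.1f} mW")
```

Because the dynamic term depends on the square of the voltage, lowering the voltage together with the frequency saves more power than lowering the frequency alone.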

#### 2.2 Granularity of Hardware Control for Low Power

Most of the technologies described in Sect. 2.1 share the following common characteristics. In general, if a low power technology is applied to smaller elements of LSI hardware, the expected power-down period of any specific element becomes longer, and the expected power saving for the LSI as a whole becomes larger. On the other hand, applying the technology to smaller elements increases the chip area overhead, because extra circuits are needed for low power control. The tradeoff between the granularity and the area overhead of low power control is one of the most important issues in low power LSI design.

Device selection is one of the key factors in satisfying system requirements, because since the 130 nm CMOS standard process technology, several process variants, such as high performance, low leakage, and low operating voltage, have been provided within the same technology node [5]. Additionally, the multi-Vt (threshold voltage) technique [6] has been applied to each process to enhance speed and power characteristics. Multi-Vt technology is embedded in the automatic CMOS design flow [7]; for example, low-Vt transistors are used in critical paths to obtain the optimum tradeoff between power and speed.

As the next step, further power reduction has been obtained by combining device and circuit techniques. Two such techniques are introduced here: back-gate bias [8] and negative gate voltage bias [10]. Back-gate bias controls the threshold voltage of a MOS transistor by means of a triple-well device structure, and dynamic control of the back-gate bias achieves both suppression of standby leakage current and high speed switching. Negative gate voltage bias drastically suppresses sub-threshold leakage current by means of the power switch [9] and hierarchical power line [10] circuit technologies. These technologies are currently used in many kinds of battery-operated mobile applications.

Clock signal stopping for long waiting periods is generally applied to LSI chips on a functional-module-by-module basis. Clock stopping for a whole IP has been implemented since the 1990s. A typical example is a moving picture codec controlled by a gated clock whose gating condition is determined by a flip-flop (FF) value: the clock signal for the codec is stopped when the FF value is set to zero. Clock gating can be applied to part of an IP as well as to a whole IP.
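The gated clock of the codec example can be modeled at a purely logical level as follows. This is a minimal sketch that ignores timing; it is an assumption of this illustration that the gate is a simple AND, whereas practical gating cells typically add a latch to avoid glitches. The sample values are hypothetical.

```python
def gated_clock(clock, enable_ff):
    """AND each clock sample with the flip-flop value sampled at the same time."""
    return [c & e for c, e in zip(clock, enable_ff)]


def rising_edges(signal):
    """Count 0 -> 1 transitions, i.e. the edges that cause switching activity."""
    return sum(1 for prev, cur in zip(signal, signal[1:]) if prev == 0 and cur == 1)


clock = [0, 1] * 8                  # eight clock cycles
enable_ff = [1] * 8 + [0] * 8       # the FF is set to zero after four cycles
codec_clock = gated_clock(clock, enable_ff)

print(rising_edges(clock), rising_edges(codec_clock))
```

Once the FF value is zero, the codec sees no further clock edges, so its activity ratio in Eq. (1) drops to zero for that period.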

Power gating is applied at a larger granularity. A System-on-a-Chip (SOC) designed for a cellular phone codec [1] has 20 power domains in a single chip.

#### 2.3 Low Power Control Protocol

In general, the clock gating mechanism of an IP is activated by a specific sequence of signals from another IP. This sequence is defined as a kind of protocol for low power control.

For the power gating mechanism, a more complicated protocol is required for low power control, as follows:

- (i) In the case of power down, the physical cut-off of the power supply must wait for the completion of a logical shutdown procedure, so that the procedure can successfully save the internal running state of the IP into registers and memory. If the power supply is cut off in the middle of the logical shutdown procedure, the internal state of the IP cannot be saved correctly, and the IP cannot resume operation when the power supply is turned on again.
- (ii) In the case of power on, the logical start-up procedure must wait for certain physical conditions, such as a stable voltage, stable clock signal oscillation, and the reduction of rush currents [10].
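The two ordering rules above can be sketched as a small state machine. This is an illustrative model only; the class, state, and method names are hypothetical and do not correspond to any actual hardware interface.

```python
from enum import Enum, auto


class State(Enum):
    ON = auto()
    SHUTTING_DOWN = auto()
    OFF = auto()
    POWERING_UP = auto()


class PowerDomain:
    """Toy model of one power-gated IP (all names are hypothetical)."""

    def __init__(self, context):
        self.state = State.ON
        self.context = dict(context)   # live, volatile state of the IP
        self.retained = None           # copy kept in always-on storage

    def _expect(self, state):
        if self.state is not state:
            raise RuntimeError(f"expected {state.name}, got {self.state.name}")

    def begin_shutdown(self):
        self._expect(State.ON)
        self.retained = None           # discard any stale copy
        self.state = State.SHUTTING_DOWN

    def save_context(self):
        self._expect(State.SHUTTING_DOWN)
        self.retained = dict(self.context)

    def cut_power(self):
        # Rule (i): refuse the physical cut-off until the logical shutdown is done.
        self._expect(State.SHUTTING_DOWN)
        if self.retained is None:
            raise RuntimeError("logical shutdown has not completed")
        self.context = None            # volatile state is lost
        self.state = State.OFF

    def apply_power(self):
        self._expect(State.OFF)
        self.state = State.POWERING_UP

    def start_up(self, voltage_stable, clock_stable):
        # Rule (ii): the logical start-up waits for the physical conditions.
        self._expect(State.POWERING_UP)
        if not (voltage_stable and clock_stable):
            return False
        self.context = dict(self.retained)
        self.state = State.ON
        return True
```

Calling `cut_power()` before `save_context()` raises an error, which mirrors the requirement that the power supply must not be cut off in the middle of the logical shutdown procedure.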

Like clock gating and power gating, other power saving technologies, such as DVFS [3] and AVS [4], have their own protocols for low power control. These protocols are generally implemented by a combination of hardware and software. The back-gate bias and negative gate bias control circuits are the basic hardware solutions for DVFS and AVS. Software stops the clock signal or the power supply by writing a certain command or special data into a clock or power control register field. In some cases, a special instruction, such as a sleep instruction, is provided for this purpose. Software also checks the hardware status to confirm that the preparation conditions are met. The part of the protocol sequence that takes place while software is not running is controlled by hardware.

# 3. LSI Examples of the Low Power Technology Application

This section introduces some LSI examples of the low power technologies described in the previous section.

#### 3.1 Adaptive Clock Delay for AVS

A multi-core LSI [11] implements an adaptive clock delay mechanism for AVS over a wider operating voltage range. The chip has separate clock signals for its multiple CPUs and accelerators, and the delay of each is controlled adaptively by hardware and software. This adaptive control enables operation at a lower voltage.

Figure 1 shows the chip micrograph. Sixteen delay monitor blocks (marked with stars in Fig. 1), each consisting of a digital counter and phase status flags, are placed on the chip.



Fig. 1 Chip micrograph of a multi-core LSI which implements adaptive clock delay mechanism [11].



Fig. 2 Power tree of a hierarchical power domain chip [10].

#### 3.2 Power-Gating for a Specific Purpose LSI

A 90 nm 3G cellular phone processor [1], [10] implements 20 power domains to support many low power use cases of cellular phones. The ground line is divided among the power domains, and power transistors connect each domain's ground line to the chip ground line. Turning off these power transistors enables partial power-down of the LSI.

To implement this multi-power-domain chip efficiently, hierarchical power domain rules are defined. Figure 2 shows an example of the hierarchical power control rules as a power tree.

Fig. 3 Chip micrograph of eight-processor chip [2].

**Table 1** Power state of each CPU and its URAM [2].

| Power mode | Normal | Light Sleep | Sleep | Resume Power-off | Full Power-off |
|------------|--------|-------------|-------|------------------|----------------|
| CPU | active | Clock: off, Power: on | Clock: off, Power: on | Clock: off, Power: off | Clock: off, Power: off |
| Cache | active | active | Clock: off, Power: on | Clock: off, Power: off | Clock: off, Power: off |
| URAM | active | active | active | Clock: off, Power: on | Clock: off, Power: off |

One of the basic rules is "inclusion". In this power tree, a root state governs its branch states, i.e. branch states are effective only while their root state is power-on. The power tree is used to visualize the available power states and to minimize the design complexity of low power control by explicitly showing the valid states and valid state transitions.
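The inclusion rule of the power tree can be checked mechanically. The sketch below uses a small hypothetical tree (the domain names are not taken from the actual chip) and enumerates the power states that the rule permits.

```python
from itertools import combinations

# Hypothetical power tree: each domain maps to its root (parent) domain.
PARENT = {
    "chip": None,
    "cpu_subsystem": "chip",
    "cpu0": "cpu_subsystem",
    "media": "chip",
    "codec": "media",
}


def is_valid(powered_on):
    """Inclusion rule: a powered-on branch requires its root to be powered on."""
    return all(PARENT[d] is None or PARENT[d] in powered_on for d in powered_on)


def valid_states():
    """Enumerate every set of powered-on domains that satisfies the rule."""
    domains = list(PARENT)
    return [set(c)
            for n in range(len(domains) + 1)
            for c in combinations(domains, n)
            if is_valid(set(c))]


print(len(valid_states()), "valid states out of", 2 ** len(PARENT))
```

Restricting the design to the states permitted by the tree is what keeps the number of cases to verify small compared with all on/off combinations.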

#### 3.3 Power-Gating of a General Purpose LSI

A 90 nm, 8-CPU embedded multiprocessor [2] implements 17 power domains. The chip micrograph is shown in Fig. 3. The 17 domains consist of eight CPUs, eight 64 KB SRAMs, one for each CPU, called URAM (SRAM with a power switch for negative voltage bias), and a common power domain. Each of the eight CPUs independently chooses a power control state from five options: Normal, Light Sleep, Sleep, Resume Power-off, and Full Power-off. As Table 1 shows, the Resume Power-off state cuts off the power supply to the CPU and its cache memory while keeping the power supply to the URAM, so that the URAM data is retained. The Full Power-off state cuts off the power supply to both the URAM and the CPU; the URAM data cannot be retained, but the standby power of the URAM and CPU becomes lower (Table 1).
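Table 1 can be encoded directly as data, which makes it easy for software to select the deepest mode that still keeps a given component powered. The following is a sketch only; the helper names are hypothetical.

```python
ACTIVE = "active"
CLOCK_OFF = "clock off, power on"
POWER_OFF = "clock off, power off"

# Rows of Table 1, ordered from the shallowest to the deepest mode.
POWER_MODES = {
    "Normal":           {"CPU": ACTIVE,    "Cache": ACTIVE,    "URAM": ACTIVE},
    "Light Sleep":      {"CPU": CLOCK_OFF, "Cache": ACTIVE,    "URAM": ACTIVE},
    "Sleep":            {"CPU": CLOCK_OFF, "Cache": CLOCK_OFF, "URAM": ACTIVE},
    "Resume Power-off": {"CPU": POWER_OFF, "Cache": POWER_OFF, "URAM": CLOCK_OFF},
    "Full Power-off":   {"CPU": POWER_OFF, "Cache": POWER_OFF, "URAM": POWER_OFF},
}


def retains_uram_data(mode):
    """URAM contents survive as long as the URAM power stays on."""
    return POWER_MODES[mode]["URAM"] != POWER_OFF


def deepest_mode(keep_powered):
    """Return the deepest mode in which every listed component keeps its power."""
    chosen = "Normal"
    for mode, parts in POWER_MODES.items():
        if all(parts[p] != POWER_OFF for p in keep_powered):
            chosen = mode
    return chosen


print(deepest_mode({"URAM"}))
```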

## 3.4 Software Power Management for a Multiprocessor

A software power management technology is applied to a chip multiprocessor [12] to control the power supply on a processor-by-processor basis. The power management driver software is developed and provided as a package with a version of the Linux operating system that supports dynamic power management (DPM). Linux and the power management driver can control:

- (i) Power on and off for each CPU independently,
- (ii) Operating clock frequency for each CPU independently,
- (iii) Operating voltage of the four CPUs.

This combination of hardware and software mechanisms realizes effective power control based on variations in the CPU processing load. The resulting leveling of power consumption (suppression of peak current and IR drop) gives a wide operational margin, higher reliability, and longer battery life.

## 3.5 Low Power Mechanism for Many-Core Chip

The integration trend will enable many-core architectures that integrate 64 CPUs into a chip. In such a many-core chip, the significance of the inter-CPU interconnect will increase. To decrease the total power of the many-core chip (assumed to use a fat-tree type interconnect), the following low power mechanisms are proposed [13]:

- (i) Single cluster stop (where a cluster consists of four CPUs),
- (ii) Four-cluster (16 CPUs) stop, which also stops the interconnect router and memory controller that belong to the four-cluster group.
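The two stop levels can be illustrated with a short planning routine. This is a sketch under the assumption that CPUs are numbered consecutively so that CPU `n` belongs to cluster `n // 4` and cluster `c` to group `c // 4`; the numbering scheme and function name are hypothetical.

```python
CPUS_PER_CLUSTER = 4
CLUSTERS_PER_GROUP = 4


def plan_stops(busy_cpus, num_cpus=64):
    """Return (clusters, groups) that can be stopped given the busy CPU ids."""
    num_clusters = num_cpus // CPUS_PER_CLUSTER
    busy_clusters = {cpu // CPUS_PER_CLUSTER for cpu in busy_cpus}
    # (i) A cluster can stop when none of its four CPUs is busy.
    idle_clusters = [c for c in range(num_clusters) if c not in busy_clusters]
    # (ii) A four-cluster group can stop, together with its router and
    # memory controller, only when all four of its clusters are idle.
    idle_groups = [
        g for g in range(num_clusters // CLUSTERS_PER_GROUP)
        if all(g * CLUSTERS_PER_GROUP + i not in busy_clusters
               for i in range(CLUSTERS_PER_GROUP))
    ]
    return idle_clusters, idle_groups


clusters, groups = plan_stops({0, 17})
print(len(clusters), "clusters and groups", groups, "can be stopped")
```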

# 3.6 Low Power Implementation of Massively Parallel Architecture Chip

For a larger number of processors on a chip, total optimization of the architecture and of the circuit and layout design is necessary and effective. A massively parallel processor with a Single-Instruction-Multiple-Data-stream (SIMD) architecture is presented in [14]; it was developed for data-massive applications and optimizes processing performance and power. This processor integrates 2048-way parallel 2-bit processing elements (PEs) and a 1 Mbit data register. It achieves both a high performance of 40 GOPS, based on 16-bit integer addition, and a low power dissipation of 250 mW in a 90 nm CMOS process. To obtain higher energy efficiency, conventional CPU-based signal processing algorithms used in many kinds of applications should be converted to massively parallel algorithms. The co-design of hardware and software is the key issue for parallel architectures [15].
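Taking the two figures quoted above at face value, the energy efficiency of this processor works out as follows (simple arithmetic on the stated 40 GOPS and 250 mW, not an additional measured result):

$$\frac{40\ \text{GOPS}}{0.25\ \text{W}} = 160\ \text{GOPS/W}, \qquad \frac{0.25\ \text{W}}{40 \times 10^{9}\ \text{operations/s}} = 6.25\ \text{pJ per operation}$$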

# 3.7 Low Power Techniques by In-Situ Monitoring and Feedback Techniques against Advanced Process Variations

Systematic and random variations have become serious problems that degrade chip performance in advanced CMOS processes. Initially, wafer-by-wafer variations were tuned by a firmware mechanism. However, it is difficult to compensate for the variations of advanced processes in this way.

To overcome these variations, various kinds of in-situ monitors (temperature sensors, power line noise monitors) and self-feedback controls, together with AVS techniques [4] that combine hardware and software, are implemented on a chip to secure the operating margin and reduce power consumption.

## 4. Challenges for Introducing Low Power Platform

#### 4.1 Low Power Platform for Embedded Systems

As we have mentioned above, there are two power management technologies in hardware platforms such as SoCs and Microcontroller Units (MCUs):

- (1) Runtime power management strategies based on DVFS and AVS.
- (2) Standby power reduction by using clock gating and power gating.

As pointed out in [16], the energy management capabilities of the underlying hardware can be fully exploited through software-based schemes. Highly integrated processors with peripherals, buses and other special purpose circuits often include software-controlled clock management, voltage control capabilities and a power supply on-off function to reduce power consumption. We are committed to enabling clock gating and power gating, which can yield significant standby power reductions in system-wide energy consumption.

In general, the dynamic power management strategy comes from outside the operating system. We expect this strategy to be defined in advance for each application by a system designer familiar with the characteristics of the application system and its special features and requirements [17]. The program that realizes this strategy provides the low-level implementation of the dynamic power management capabilities under the management of the operating system. We call this program the "power management driver", or "power driver" for short. By collecting information on the tasks under its control, the power driver lets the system execute the following operations:

- (a) The power driver changes the LSI operating state to a higher frequency and higher voltage state when a large number of tasks run simultaneously or when high priority real-time tasks run.
- (b) It changes the LSI operating state to a lower frequency and lower voltage state to reduce power consumption when only a few tasks run and none of them is critical.
- (c) It activates the circuits required for executing a task and supplies power and a clock signal to them.
- (d) It deactivates those circuits and stops their power and clock when the task described in (c) ends.
- (e) It stops the power and clock signal of the entire system, so that the system enters a sleep mode with ultra low power. Even the operating system stops operating in this state.
- (f) It resumes the power supply and clock supply to the circuits described in (e) in response to a wake-up event for a task. The operating system also restarts operation to wake up the task.

Fig. 4 The dynamic power management and the operating system.

An overview of the dynamic power management and the operating system is given in Fig. 4.
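The decision logic behind operations (a) to (e) can be sketched as follows. This is a minimal illustration, not the actual power driver: the `Task` fields, the state names, and the `busy_threshold` value are all hypothetical, since the text does not quantify how many tasks count as a large number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    realtime: bool = False
    circuits: frozenset = frozenset()   # circuits the task needs, for (c) and (d)


def select_operating_state(runnable, busy_threshold=4):
    """Pick an LSI operating state from the current set of runnable tasks."""
    if not runnable:
        return "sleep"                  # (e): nothing to run, stop power and clock
    if any(t.realtime for t in runnable) or len(runnable) >= busy_threshold:
        return "high"                   # (a): higher frequency and voltage
    return "low"                        # (b): lower frequency and voltage


def powered_circuits(runnable):
    """(c)/(d): only circuits needed by some runnable task stay powered."""
    needed = set()
    for task in runnable:
        needed |= task.circuits
    return needed


tasks = [Task("ui"), Task("video", circuits=frozenset({"codec"}))]
print(select_operating_state(tasks), powered_circuits(tasks))
```

Operation (f) corresponds to re-running the same selection when a wake-up event adds a task to the runnable set.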

The power management architecture described above, which uses the operating system and hardware and is standardized as a programming interface, is defined as a set of primitives for task programming. We call it the "low power platform". Defining the low power platform makes power-aware task programs portable across multiple hardware variations.
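The portability argument can be illustrated with an abstract interface. The primitives below are hypothetical and chosen only for this sketch; the paper does not define concrete function names.

```python
from abc import ABC, abstractmethod


class LowPowerPlatform(ABC):
    """Hypothetical primitives that power-aware task code is written against."""

    @abstractmethod
    def set_operating_point(self, level): ...

    @abstractmethod
    def set_circuit_power(self, circuit, on): ...


class RecordingPlatform(LowPowerPlatform):
    """Stand-in backend that records calls instead of touching registers."""

    def __init__(self):
        self.log = []

    def set_operating_point(self, level):
        self.log.append(("operating_point", level))

    def set_circuit_power(self, circuit, on):
        self.log.append((circuit, "on" if on else "off"))


def decode_video(platform):
    """Task code depends only on the interface, not on a specific chip."""
    platform.set_circuit_power("codec", True)
    platform.set_operating_point("high")
    # ... decoding work would run here ...
    platform.set_operating_point("low")
    platform.set_circuit_power("codec", False)


platform = RecordingPlatform()
decode_video(platform)
print(platform.log)
```

A backend for a different LSI would implement the same two methods with its own register accesses, and `decode_video` would run unchanged.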

# 4.2 Expansion of the Low Power Platform for Distributed Operating System

This low power platform is based on a single operating system that controls a whole hardware system such as an SOC or MCU. However, as described in the previous section, SOCs that integrate a large number of processors, with many CPU cores and a CPU network on a chip (NOC), have been realized recently. Furthermore, very large hardware systems that consist of multiple SOCs and MCUs are becoming popular. These kinds of hardware systems are distributed processing systems on which multiple operating systems run; no single operating system controls the whole system. Therefore, information for power management must be transferred between operating systems. The low power platform must extend its features to support distributed processing systems.

Fig. 5 Low power platform extension for multiple operating systems.

The main purpose of extending the low power platform to distributed processing systems is to realize efficient dynamic power management through power supply on-off control of the hardware units that each operating system manages. Each operating system performs power management using operations (e) and (f) in Sect. 4.1. Power is supplied only to operating systems that have active tasks and to their related hardware units. Cutting off the power supply to operating systems that are not executing tasks makes it possible to supply only the power that is truly needed. In other words, this extension achieves an on-demand power supply mode that responds to task execution requests. The low power platforms for a single operating system and for multiple operating systems in a distributed processing system are illustrated in Fig. 5.
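The on-demand power supply mode can be sketched with a toy model in which each operating system node is powered only while it holds an active task. The class and node names are hypothetical, and message transfer between operating systems is reduced to a direct method call.

```python
class OsNode:
    """One operating system and the hardware unit it manages."""

    def __init__(self, name):
        self.name = name
        self.powered = False
        self.tasks = []

    def submit(self, task):
        # A request arriving from another operating system wakes the node: (f).
        self.powered = True
        self.tasks.append(task)

    def finish(self, task):
        self.tasks.remove(task)
        if not self.tasks:
            self.powered = False      # (e): no active task, cut the power


def powered_nodes(nodes):
    """Names of the nodes that currently draw power."""
    return sorted(n.name for n in nodes if n.powered)


nodes = [OsNode("audio"), OsNode("video"), OsNode("comm")]
nodes[1].submit("decode")
print(powered_nodes(nodes))
```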

# 5. Functions Required for Low Power Platform for Distributed Processing Systems

The low power platform for distributed processing systems effectively assists operations (e) and (f), i.e. clock gating and power gating, through the following three critical functions:

- (1) Operating system halt function.
- (2) Nonvolatile RAM, which keeps context and status information while the operating system is halted.
- (3) Operating system resuming function.

Figure 6 shows functions in the low power platform for distributed processing systems.

#### 5.1 Operating System Halt Function

The operating system halt function halts the operating system and cuts off the power supply to the operating system and its related hardware circuits when no active tasks remain on the operating system. To realize this function, it is necessary to save not only the context of the operating system but also the contexts of the multiple tasks running on it. These data should be saved in the nonvolatile memory discussed in Sect. 5.2. Consequently, the key elements of this scheme are large-capacity nonvolatile memory and high-speed data saving.


Fig. 6 Low power platform functions for distributed processing systems.

## 5.2 Nonvolatile Memory for Keeping Context Information

Nonvolatile memory that is readable and writable as RAM can save and restore operating system contexts within a very short time and keep the contexts while power is cut off. Nonvolatile RAM meets the demand for high-speed response, because it can transfer massive data to and from memory and can be rewritten without limitation.

#### 5.3 Operating System Restore Function

The operating system restore function senses external triggers that request task execution. It resumes the power supply to the related operating system and hardware circuits and then resumes the operating system. Because the sensing element that monitors task execution requests stays active even while the operating system is in the halt state, it is required to run at a minimum power specification. When the operating system is resumed, massive data, including operating system contexts and task contexts, are restored from nonvolatile RAM.
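The interplay of the halt function, nonvolatile RAM, and the restore function can be sketched as follows. This is a toy model under stated assumptions: the class names and context fields are hypothetical, and a Python dictionary stands in for the nonvolatile RAM.

```python
import copy


class NonvolatileRam:
    """Stand-in for nonvolatile RAM: contents survive a power cut."""

    def __init__(self):
        self._cells = {}

    def write(self, key, value):
        self._cells[key] = copy.deepcopy(value)

    def read(self, key):
        return copy.deepcopy(self._cells[key])


class OperatingSystem:
    def __init__(self, nvram):
        self.nvram = nvram
        self.powered = True
        self.os_context = {"tick": 0}
        self.task_contexts = {}          # task name -> saved registers etc.
        self.active_tasks = set()

    def halt_if_idle(self):
        """Halt function: save all contexts, then cut the power."""
        if self.active_tasks:
            return False
        self.nvram.write("os", self.os_context)
        self.nvram.write("tasks", self.task_contexts)
        self.os_context = None           # volatile state is lost with the power
        self.task_contexts = None
        self.powered = False
        return True

    def resume(self, requested_task):
        """Restore function: called by the always-on sensing element."""
        self.powered = True
        self.os_context = self.nvram.read("os")
        self.task_contexts = self.nvram.read("tasks")
        self.active_tasks = {requested_task}
```

The time spent in `halt_if_idle` and `resume` grows with the amount of context data, which is why the speed of the nonvolatile RAM matters for this scheme.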

# 5.4 Power Control Efficiency for Distributed Processing Systems

In summary, three functions, the operating system halt function, the operating system resume function, and nonvolatile RAM, can achieve effective power control in distributed processing systems. The time and power spent on stopping and resuming operating systems and on saving and restoring contexts to and from nonvolatile RAM are regarded as system overhead. It is important to minimize this time and power overhead for power efficiency, because fine-grain on-off control of the power supply increases the overhead in time and area. The smaller this overhead is, the lower the power consumption of the whole system.
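One common way to reason about this overhead is a break-even analysis: halting pays off only if the idle period is long enough for the saved power to repay the energy spent on halting and resuming. The sketch below is an assumption of this illustration rather than a model given in the paper, and all parameter values are hypothetical.

```python
def break_even_idle_time(overhead_energy_j, idle_power_w, halted_power_w):
    """Idle time beyond which halting the operating system saves energy."""
    saved_per_second = idle_power_w - halted_power_w
    if saved_per_second <= 0:
        raise ValueError("halting must consume less power than idling")
    return overhead_energy_j / saved_per_second


def should_halt(expected_idle_s, overhead_energy_j, idle_power_w, halted_power_w):
    """Halt only when the expected idle period exceeds the break-even time."""
    threshold = break_even_idle_time(overhead_energy_j, idle_power_w, halted_power_w)
    return expected_idle_s > threshold


# Hypothetical figures: 2 mJ to save and restore contexts,
# 50 mW when idle but powered, 10 mW when halted.
print(break_even_idle_time(2e-3, 50e-3, 10e-3))
```

Reducing the overhead energy shortens the break-even time, so shorter idle periods become worth exploiting.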

#### 6. Conclusions

Various low power technologies, such as clock gating, power gating, DVFS and AVS, have been proposed and applied to LSIs at the device and circuit level. They are effective when simply applied to LSI design, but they are more effective when combined with software control of dynamic power reduction, especially for embedded processor LSIs. The programming interface, i.e. a set of software functions, should serve as a platform for application software programming. Development of the low power platform becomes more important for distributed system LSIs, and modeling the basic common functions of the platform is effective for its standardization.

#### References

- [1] T. Hattori, T. Irita, M. Ito, E. Yamamoto, H. Kato, G. Sado, T. Yamada, K. Nishiyama, H. Yagi, T. Koike, Y. Tsuchihashi, M. Higashida, H. Asano, I. Hayashibara, K. Tatezawa, Y. Shimazaki, N. Morino, K. Hirose, S. Tamaki, S. Yoshioka, R. Tsuchihashi, N. Arai, T. Akiyama, and K. Ohno, "A power management scheme controlling 20 power domains for a single chip mobile processor," ISSCC Dig. Tech. Papers, pp.542–543, Feb. 2006.
- [2] M. Ito, T. Hattori, Y. Yoshida, K. Hayase, T. Hayashi, O. Nishii, Y. Yasu, A. Hasegawa, M. Ito, H. Mizuno, K. Uchiyama, T. Odaka, J. Shirako, M. Mase, K. Kimura, and H. Kasahara, "An 8640 MIPS SoC with independent power-off control of 8 CPUs and 8 RAMs by automatic parallelizing compiler," ISSCC Dig. Tech. Papers, pp.90–91, Feb. 2008.
- [3] J. Howard, S. Dighe, Y. Hoskote, S. Vangal, D. Finan, G. Ruhl, D. Jenkins, H. Wilson, N. Borkar, G. Schrom, F. Pailet, S. Jain, T. Jacob, S. Yada, S. Marella, P. Salihundam, V. Erraguntla, M. Konow, M. Riepen, G. Droege, J. Lindemann, M. Gries, T. Apel, K. Henriss, T. Larsen, S. Steibl, S. Borkar, V. De, R. Wijngaart, and T. Mattson, "A 48-core IA-32 message-passing processor with DVFS in 45 nm CMOS," ISSCC Dig. Tech. Papers, pp.108–109, Feb. 2010.
- [4] H. Mair, A. Wang, G. Gammie, D. Scott, P. Royannez, S. Gururajarao, M. Chau, R. Lagerquist, L. Ho, M. Basude, N. Culp, A. Sadate, D. Wilson, F. Dahan, J. Song, B. Carlson, and U. Ko, "A 65-nm mobile multimedia applications processor with an adaptive power management scheme to compensate for variations," Symposium on VLSI Circ. Digest of Tech. Papers, pp.224–225, June 2007.
- [5] T. Yamada, M. Abe, Y. Nitta, K. Ogura, M. Kusaoke, M. Ishikawa, M. Ozawa, K. Takada, F. Arakawa, O. Nishii, and T. Hattori, "A low-power design of a 90-nm superH processor core: SH-X2," Proc. ICCD, pp.258–263, Oct. 2005.
- [7] H. Hattori and K. Ogura, "A physical synthesis methodology for multi-threshold-voltage design in low-power embedded processor," IEICE Trans. Electron., vol.E87-C, no.4, pp.520–526, April 2004.
- [8] H. Mizuno, K. Ishibashi, T. Shimura, T. Hattori, S. Narita, K. Shiozawa, S. Ikeda, and K. Uchiyama, "A 18μA-standby-current 1.8 V 200 MHz microprocessor with self substrate-biased data-retention mode," ISSCC Dig. Tech. Papers, pp.280–281, Feb. 1999.
- [9] Y. Kanno, H. Mizuno, N. Oodaira, Y. Yasu, and K. Yanagisawa, "μI/O architecture for 0.13-μm wide-voltage-range System-on-a-Package (SoP) designs," Symp. VLSI Circuits Dig. Tech. Papers, pp.168–169, June 2002.

- [10] Y. Kanno, H. Mizuno, Y. Yasu, K. Hirose, Y. Shimazaki, T. Hoshi, Y. Miyairi, T. Ishii, T. Yamada, T. Irita, T. Hattori, K. Yanagisawa, and N. Irie, "Hierarchical power distribution with power tree in dozens of power domains for 90-nm low-power multi-CPU SoCs," IEEE J. Solid-State Circuits, vol.42, no.1, pp.74–83, Jan. 2007.
- [11] M. Nakajima, H. Kondo, N. Okumura, N. Masui, Y. Takata, T. Nasu, H. Takata, T. Higuchi, M. Sakugawa, H. Yoneda, H. Fujiwara, K. Ishida, K. Ishimi, S. Kaneko, T. Itoh, M. Sato, O. Yamamoto, and K. Arimoto, "Design of a multi-core SoC with configurable heterogeneous 9 CPUs and 2 matrix processors," Symposium on VLSI Circ. Digest of Tech. Papers, pp.14–15, June 2007.
- [12] A. Idehara, Y. Tawara, H. Yamamoto, H. Ohtani, and S. Ochiai, "Idle reduction: Dynamic power manager for embedded multicore processor," Embedded System Symposium, Oct. 2009.
- [13] A. Hasegawa, "A scalar-type many core processor architecture," Symposium on Low Power Many Core Processor System Technology, Feb. 2010.
- [14] H. Noda, M. Nakajima, K. Dosaka, K. Nakata, M. Higashida, O. Yamamoto, K. Mizumoto, T. Tanizaki, T. Gyohten, Y. Okuno, H. Kondo, Y. Shimazu, K. Arimoto, K. Saito, and T. Shimizu, "The design and implementation of the massively parallel processor based on the matrix architecture," IEEE J. Solid-State Circuits, vol.42, no.1, pp.183–192, Jan. 2007.
- [15] T. Sugimura, H. Yamasaki, H. Noda, O. Yamamoto, Y. Okuno, and K. Arimoto, "A high-performance and energy-efficient FFT implementation on super parallel processor (MX) for mobile multimedia applications," ISPACS 2008, pp.1–4, Feb. 2009.
- [16] T. Hattori, "EXREAL platform: SOC design challenges for embedded systems," Cool Chips X, April 2007.
- [17] IBM and Montavista, Dynamic power management for embedded systems, (http://www.research.ibm.com/arl/publications/papers/DPM\_V1.1.pdf) (accessed 2010-10-25)



in the world. He moved to Renesas Electronics Corporation in 2010. He is now in charge of the R&D of LSI development platforms and EDA tools for the product design, as general manager of Platform Integration Division of Renesas Electronics Corporation. He had been serving the International Solid-State Circuit Conference (ISSCC) from 2003 to 2009 in the technical program committee. He has been serving the Asian Solid-State Circuit Conference (A-SSCC) from 2005, and he is the organization committee chair of the A-SSCC 2012. He has some experience of lecturer in the University of Tokyo and Waseda University. He is a senior member of the IEEE.



Kazutami Arimoto received the B.S., M.S., and Ph.D. degrees in electric engineering from Osaka University, Osaka, Japan, in 1979, 1981, and 1993, respectively. He joined the LSI Laboratory, Mitsubishi Electric Corporation, Itami, Hyogo, Japan, in 1981. Since then he has been engaged in the design and development of DRAM's, and IP's for system LSI. He transferred to Renesas Technology Corp. in 2003. Currently, he focuses on wire-line/wireless communication IP's, memory based design IP's and reconfigurable processor for multi-media, security, network, mobile applications and future intelligent systems. He is currently interested in mixed signal design for system LSI and solution, low power/low voltage design, high bandwidth architecture, high speed wire/wireless communication, intelligent memory-IP, SOI circuits technology, reconfigurable processor and safety and secure dependable systems in Renesas Electronics Corp. Dr. Arimoto has 192 USP and 69 JP patents issued. He has been a guest professor at Ritsumei University since 2007 and serves as a member of ISSCC and A-SSCC technical program committees. He is a Fellow member of IEEE.



Osamu Nishii received the B.S. and M.S. degrees in mathematical engineering and instrumentation physics from the University of Tokyo, Japan, in 1985 and 1987, respectively. In 1987, he joined Hitachi, Ltd. He moved to Renesas Technology Corporation in 2004. Since 2010, he has been with Renesas Electronics Corporation. He is currently engaged in the development of microcontrollers.



Sugako Otani received a B.S. degree in Applied Physics and an M.S. degree in Physics from Waseda University, Tokyo, Japan, in 1993 and 1995, respectively. She joined Mitsubishi Electric Corp., Itami, Japan, in 1995. She is currently working in the System Core Development Div. of Renesas Electronics Corp. From 2005 to 2006, she was a Visiting Scholar at Stanford University. Her research interests include computer architecture and microprocessors. She is a member of the IEEE.



Hiroyuki Kondo received a B.S. degree in Physics from Kyoto University, Kyoto, Japan, in 1986. He joined Mitsubishi Electric Corporation, Itami, Japan, in 1986. He is currently working in the System Core Development Div. of Renesas Electronics Corporation. His research interests include microprocessor architecture and Operating Systems.