Akira FUJIMAKI Daiki HASEGAWA Yuto TAKESHITA Feng LI Taro YAMASHITA Masamitsu TANAKA
Yihao WANG Jianguo XI Chengwei XIE
Feng TIAN Zhongyuan ZHOU Guihua WANG Lixiang WANG
Yukihiro SUZUKI Mana SAKAMOTO Taiyou NAGASHIMA Yosuke MIZUNO Heeyoung LEE
Yo KUMANO Tetsuya IIZUKA
Wisansaya JAIKEANDEE Chutiparn LERTVACHIRAPAIBOON Dechnarong PIMALAI Kazunari SHINBO Keizo KATO Akira BABA
Satomitsu Imai Shoya Ishii Nanako Itaya
Satomitsu Imai Takekusu Muraoka Kaito Tsujioka
Takahide Mizuno Hirokazu Ikeda Hiroki Senshu Toru Nakura Kazuhiro Umetani Akihiro Konishi Akihito Ogawa Kaito Kasai Kosuke Kawahara
Yongshan Hu Rong Jin Yukai Lin Shunmin Wu Tianting Zhao Yidong Yuan
Kewen He Kazuya Kobayashi
Tong Zhang Kazuya Kobayashi
Yuxuan PAN Dongzhu LI Mototsugu HAMADA Atsutake KOSUGE
Shigeyuki Miyajima Hirotaka Terai Shigehito Miki
Xiaoshu CHENG Yiwen WANG Hongfei LOU Weiran DING Ping LI
Akito MORITA Hirotsugu OKUNO
Chunlu WANG Yutaka MASUDA Tohru ISHIHARA
Dai TAGUCHI Takaaki MANAKA Mitsumasa IWAMOTO
Kento KOBAYASHI Riku IMAEDA Masahiro MORIMOTO Shigeki NAKA
Yoshinao MIZUGAKI Kenta SATO Hiroshi SHIMADA
Baoquan ZHONG Zhiqun CHENG Minshi JIA Bingxin LI Kun WANG Zhenghao YANG Zheming ZHU
Kazuya TADA
Suguru KURATOMI Satoshi USUI Yoko TATEWAKI Hiroaki USUI
Yoshihiro NAKA Masahiko NISHIMOTO Mitsuhiro YOKOTA
Tsuneki YAMASAKI
Kengo SUGAHARA
Cuong Manh BUI Hiroshi SHIRAI
Hiroyuki DEGUCHI Masataka OHIRA Mikio TSUJI
Yongzhe Wei Zhongyuan Zhou Zhicheng Xue Shunyu Yao Haichun Wang
Mio TANIGUCHI Akito IGUCHI Yasuhide TSUJI
Kouji SHIBATA Masaki KOBAYASHI
Zhi Earn TAN Kenjiro MATSUMOTO Masaya TAKAGI Hiromasa SAEKI Masaya TAMURA
Koya TANIKAWA Shun FUJII Soma KOGURE Shuya TANAKA Shun TASAKA Koshiro WADA Satoki KAWANISHI Takasumi TANABE
Shunichi ISHIWATA Takayasu SAKURAI
Media processors have emerged so that a single LSI can realize multiple multimedia functions, such as graphics, video, audio, and telecommunication, with effectively shared hardware and flexible software. First, the difference between media processors and general-purpose microprocessors with multimedia extensions is clarified. The features of processes and data in multimedia applications are summarized, followed by the multimedia enhancements used in recent general-purpose microprocessors. The architecture of media processors exploits these features further and achieves a better price-performance ratio than general-purpose microprocessors. Finally, the future directions of media processors are projected, based on the performance, power dissipation, and die size of present microprocessors with multimedia extensions and present media processors. The demand to improve the price-performance ratio of the whole system and to reduce power consumption drives the media processor to evolve into a system processor, which integrates not only the media processor but also the functions of a general-purpose microprocessor, various interfaces, and DRAMs.
Osamu YAMADA Hiroshi MIYAZAWA Junji KUMADA
Almost all broadcasting systems and their equipment will be digitized in the near future. In Japan, digital broadcasting has been investigated for a long time, with the aim of improving picture quality and realizing new services, system flexibility, etc. Compared with conventional digital systems, the Japanese digital broadcasting systems under development have many technical merits, for example, a high transmission capacity and a hierarchical transmission scheme for satellite broadcasting, and mobile reception for terrestrial digital broadcasting.
MPEG4, a multimedia coding standard, froze its Committee Draft (CD) as the MPEG4 version 1 CD last October. It defines audio-visual (AV) coding algorithms and their system multiplex/composition formats. Founded on an object-based concept, the Video part adopts shape coding technology in addition to conventional texture coding techniques. The Audio part consists of voice coding tools (HVXC and CELP core) and audio coding tools (HILN and MPEG2 AAC or Twin VQ). Error resilience technologies and Synthetic and Natural Hybrid Coding (SNHC) technologies are MPEG4-specific features. The System part defines flexible multiplexing of audio-visual bitstreams and scene composition for user-interactive reconstruction of scenes at the decoder side. The version 1 standardization will be finalized in 1998, possibly with some minor changes. The expected application areas are real-time communication, mobile multimedia, internet/intranet access, broadcasting, storage media, surveillance, and so on.
Toyohiko YOSHIDA Akira YAMADA Edgar HOLMANN Hidehiro TAKATA Atsushi MOHRI Yukihiko SHIMAZU Kiyoshi NAKAKIMURA Keiichi HIGASHITANI
A dual-issue VLIW processor, running at 250 MHz, is enhanced with multimedia instructions for a sustained peak performance of 1000 MOPS. The multimedia processor integrates 300 K transistors in an 8 mm² core area and it is fabricated onto a 6 mm
Akihiko HASHIGUCHI Masuyoshi KUROKAWA Ken'ichiro NAKAMURA Hiroshi OKUDA Koji AOYAMA Mitsuharu OHKI Katsunori SENO Ichiro KUMATA Masatoshi AIKAWA Hirokazu HANAKI Takao YAMAZAKI Mitsuo SONEDA Seiichiro IWASE
A programmable DSP with linear array architecture for real-time video processing is reported. It achieves a processing rate of 5.4 GOPS and 81 GB/s memory bandwidth using Dual Sense Amplifier architecture. A low-power-supply pipeline decreases power consumption and a time-shared bit line reduces chip area. It has 4320 processor elements and a 1.1 Mbit 3-port memory. The DSP can be applied to HDTV signals with its 75 MHz peak I/O rate. Sufficient programmability is provided to execute video format conversion such as image size conversion and Y/C separation, and picture quality improvement such as noise reduction and image enhancement. The chip was fabricated using 0.4 µm CMOS triple metal technology with a 15.12 mm
Yasunori KIMURA Akira ASATO Toshihiro OZAWA Hiroshi NAKAYAMA
This paper describes the 'Procyon' processor, which is to be used for geometry processing. The objective is to provide a high-performance geometry processor to support next-generation 3D graphics such as games and CAD applications. The Procyon processor is a four-parallel VLIW processor, which keeps the hardware logic simple. We are pursuing performance improvement through compiler optimization. Procyon has a unique feature called 'Software bypass' as well as special hardware to support 3D graphics processing. Software bypass enables the compiler to access data on the hardware bypass lines. By using this information, the compiler can schedule instructions much more freely and generate efficient VLIW code. Other features of Procyon are a multiply-add-accumulate instruction, SIMD instructions, and clipping instructions. Procyon VLIW code is held in compacted form, which improves memory performance. A program development environment, including a pipeline simulator and an assembly code parallelizer, is also provided for system and application programmers. Preliminary simulation results demonstrate that Procyon attains 2.6 M polygons per second at 125 MHz.
Tetsuya MATSUMURA Hiroshi SEGAWA Satoshi KUMAKI Yoshinori MATSUURA Atsuo HANAMI Kazuya ISHIHARA Shin-ichi NAKAGAWA Tadashi KASEZAWA Yoshihide AJIOKA Atsushi MAEDA Masahiko YOSHIMOTO Tadashi SUMI
This paper describes a chip set architecture and its implementation for a programmable MPEG2 MP@ML (main profile at main level) video encoder. The chip set features a functional partitioning architecture based on the MPEG2 layer structure. Using this partitioning scheme, an optimized system configuration with a double bus structure is proposed. In addition, a hybrid architecture with dual video-oriented on-chip RISC processors and dedicated hardware, and a hierarchical pipeline scheme covering all layers, are newly introduced to realize flexibility. Effective motion estimation is also achieved by a scalable solution for high picture quality. Adopting these features, three kinds of VLSI have been developed using 0.5 micron double metal CMOS technology. The chip set consists of a controller-LSI (C-LSI), a macroblock level pixel processor-LSI (P-LSI) and a motion estimation-LSI (ME-LSI). The chip set combined with synchronous DRAMs (SDRAM) supports all the layer processing including rate-control and realizes real-time encoding for ITU-R-601 resolution video (720
This paper proposes polling-based real-time software for MPEG2 System protocol LSIs, which are typical embedded real-time systems on a chip, and demonstrates its performance and usefulness. The polling-based real-time software is designed and optimized by analyzing application-specific function requirements and deciding the scheduling intervals and execution cycles of each task. It requires neither hardware for multiple interrupt handling nor software for heavy context switching. The polling-based approach provides sufficient performance without any hardware or software overhead for a real-time application like the MPEG2 System protocol.
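As a rough illustration of the polling idea described above, the following minimal sketch runs each task at a fixed interval from a single loop, so no interrupt controller or context switch is involved. The task names, the intervals, and the `run_polling_loop` helper are hypothetical and are not taken from the paper.

```python
def run_polling_loop(tasks, total_ticks):
    """Run a fixed-interval polling loop and return the execution log.

    tasks: list of (name, interval_in_ticks, callable) tuples.
    """
    log = []
    for tick in range(total_ticks):
        # One pass of the main loop: check every task in a fixed order.
        for name, interval, action in tasks:
            if tick % interval == 0:
                action()  # runs to completion: no preemption, no context switch
                log.append((tick, name))
    return log


counters = {"demux": 0, "clock": 0}


def demux_packet():
    counters["demux"] += 1


def update_clock():
    counters["clock"] += 1


# Hypothetical task table: names and intervals are illustrative only.
TASKS = [("demux", 1, demux_packet), ("clock", 4, update_clock)]

if __name__ == "__main__":
    print(run_polling_loop(TASKS, 8))
```

In such a design the intervals would be chosen offline from the worst-case execution cycles of each task, which is the kind of analysis the abstract refers to.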
Gen FUJITA Takao ONOYE Isao SHIRAKAWA
A VLSI architecture of a motion estimator dedicated to H.263 low-bitrate video coding is described. By adopting an efficient hierarchical search algorithm, the new motion estimator yields high-quality vectors with a small area and at a low operating frequency. A one-dimensional PE (Processing Element) array is devised and tuned to H.263 encoding, handling both the advanced prediction mode and the PB-frame mode. The proposed motion estimation core is integrated in 1.55 mm² using 0.35 µm CMOS 3LM technology and operates at 15 MHz, enabling real-time motion estimation of QCIF pictures.
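For readers unfamiliar with block-matching motion estimation, the sketch below shows the basic operation that such a motion estimator accelerates: finding the displacement that minimizes the sum of absolute differences (SAD) between a block of the current frame and a candidate block in the reference frame. It is a plain exhaustive search, not the hierarchical algorithm of the paper, and all function and parameter names are assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustive block matching; returns ((dx, dy), cost) with the lowest SAD.

    Frames are lists of rows of integer pixel values.
    """
    height, width = len(ref), len(ref[0])
    best_vec, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_vec, best_cost = (dx, dy), cost
    return best_vec, best_cost
```

A hierarchical search would typically run a search like this on a downsampled frame first and then refine the result at full resolution, which reduces the number of SAD evaluations.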
Hisashi INOUE Shiro IWASAKI Takashi KATSURA Hitoshi FUJIMOTO Shun-ichi KUROHMARU Masatoshi MATSUO Yasuo KOHASHI Masayoshi TOUJIMA Tomonori YONEZAWA Kiyoshi OKAMOTO Yasuo IIZUKA Hiromasa NAKAJIMA Junji MICHIYAMA
We have developed a low bit-rate video coding system using a video digital signal processor (DSP) called VDSP1χ, which performs real-time encoding and decoding for discrete cosine transform (DCT)-based algorithms, such as ITU-T H.261 and H.263, and for wavelet-based subband encoding algorithms. This LSI features a processing unit which implements wavelet filters at high speeds, a compact DCT circuit, and a fast, flexible DRAM interface for low-cost systems. This system is capable of processing quarter common intermediate format (QCIF) (176
Kazutoshi KOBAYASHI Noritsugu NAKAMURA Kazuhiko TERADA Hidetoshi ONODERA Keikichi TAMARU
We have developed and fabricated an LSI called the FMPP-VQ64. The LSI is a memory-based shared-bus SIMD parallel processor containing 64 PEs, intended for low bit-rate image compression using vector quantization. It accelerates the nearest neighbor search (NNS) during vector quantization. The computation time does not depend on the number of code vectors. The FMPP-VQ64 performs 53,000 NNSs per second, while its power dissipation is 20 mW. It can be applied to mobile telecommunication systems.
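The nearest neighbor search mentioned above can be sketched in software as follows. This is a minimal sequential version using squared Euclidean distance; the distance measure and the function name are assumptions for illustration, not details from the paper.

```python
def nearest_code_vector(vector, codebook):
    """Return (index, squared_distance) of the code vector closest to `vector`.

    codebook: list of code vectors, each the same length as `vector`.
    """
    best_index, best_dist = -1, None
    for index, code in enumerate(codebook):
        # Squared Euclidean distance between the input and this code vector.
        dist = sum((a - b) ** 2 for a, b in zip(vector, code))
        if best_dist is None or dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, best_dist


if __name__ == "__main__":
    codebook = [[0, 0], [10, 10], [3, 4]]
    print(nearest_code_vector([2, 5], codebook))
```

The running time of a sequential loop like this grows linearly with the codebook size, which is the cost that the abstract says the parallel hardware avoids.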
Katsuhiko SEO Hisao KOIZUMI Barry SHACKLEFORD Masashi MORI Takashi KUSUHARA Hirotaka KIMURA Fumio SUZUKI
This paper proposes a top-down co-verification approach for the design of embedded systems composed of both hardware and software for multimedia applications. To realize an embedded system optimized in cost, performance, power consumption, and flexibility, hardware/software co-design becomes essential. In this top-down co-design flow, a target design is verified at three different levels: (1) algorithmic, (2) implementation, and (3) experimental. We have developed a methodology of top-down co-verification, which consists of system-level simulation at the algorithmic level, two types of co-simulation at the implementation level, and co-emulation at the experimental level. We have realized an environment optimized for verification performance by employing verification models appropriate to each verification stage, and an efficient top-down environment by introducing the component logical bus architecture as the interface between hardware and software. Through actual application to an image compression and expansion system, the possibility of efficient co-verification was demonstrated.
Increasingly, 3D graphics is becoming a mainstream feature of the PC platform rather than an early adopter's unique advantage. In such circumstances, most graphics chips focus on accelerating rendering. However, few if any attempts have been made to accelerate geometry processing. This universal 3D graphics geometry processor offers a unique and optimized performance advantage for such 3D geometry calculations. By offloading these operations from the CPU, this 3D graphics geometry processor (hereinafter called 3DGP) delivers a well-balanced 3D graphics acceleration environment in the PC.
Yusuke OHTOMO Sadayuki YASUDA Masafumi NOGAWA Jun-ichi INOUE Kimihiro YAMAKOSHI Hirotoshi SAWADA Masayuki INO Shigeki HINO Yasuhiro SATO Yuichiro TAKEI Takumi WATANABE Ken TAKEYA
The switch LSI described here takes advantage of the special characteristics of fully-depleted CMOS/SIMOX devices
Takehiko NAKAO Masanori KUWAHARA Yasuo OHARA Reiji ARIYOSHI Toshihiko KITAZUME Naoki SUGAWA Takeshi OGAWARA Satoshi ODA Shoji NOMURA Yuichi MIYAZAWA Akira KANUMA
The Phase Locked Loop (PLL) for clock recovery used in a single chip 155.52 Mb/s
Akira YAMAZAKI Tadato YAMAGATA Yutaka ARITA Makoto TANIGUCHI Michihiro YAMADA
The features for the integration of 1Tr/1C DRAM and logic for graphic and multimedia applications are surveyed. The key circuit/process technology for large scale embedded DRAM cores is described. The methods to improve transistor performance and gate density are shown. Noise immunity design and easy customization techniques are also introduced.
Yoshiharu AIMOTO Tohru KIMURA Yoshikazu YABE Hideki HEIUCHI Youetsu NAKAZAWA Masato MOTOMURA Takuya KOGA Yoshihiro FUJITA Masayuki HAMADA Takaho TANIGAWA Hajime NOBUSAWA Kuniaki KOYAMA
We have developed a parallel image processing RAM (PIP-RAM) which integrates a 16-Mb DRAM and 128 processor elements (PEs) by means of 0.38-µm CMOS 64-Mb DRAM process technology. It achieves 7.68-GIPS processing performance and 3.84-GB/s memory bandwidth with only 1-W power dissipation (@ 30-MHz), and the key to this performance is the DRAM design. This paper presents the key circuit techniques employed in the DRAM design: 1) a paged-segmentation accessing scheme that reduces sense amplifier power dissipation, and 2) a clocked low-voltage-swing differential-charge-transfer scheme that reduces data line power dissipation with the help of a multi-phase synchronization DRAM control scheme. These techniques have general importance for the design of LSIs in which DRAMs and logic are tightly integrated on single chips.
Theoretical calculations of the pulsing operation and the intensity noise under optical feedback are presented for self-sustained pulsation lasers. Two alternative models for the optical feedback effect, namely the time-delayed injection model and the external cavity model, are applied in a combined manner to analyze the phenomena. The calculation starts from an assumed geometrical structure of the laser and its material parameters, and ends with an evaluation of the noise. The characteristics of the feedback-induced noise are examined for variations of the operating parameters, such as the injection current, the feedback distance, and the feedback ratio. A comparison with experimental data is also given to confirm the accuracy of the calculation.
In this paper, a new simulation approach to the analysis of the reflection characteristics of nonuniform transmission lines (NTLs) is presented. The input and output responses in the time domain and the reflection coefficients in the frequency domain are effectively obtained by using the modified central difference (MCD) simulation and the fast Fourier transform (FFT) technique for Gaussian pulse responses. The simulated results for the reflection characteristics of the NTL transformers are in excellent agreement with the theoretical values. By presenting both the reflected voltage and the reflection coefficient, it is shown that this approach is useful for analyzing various types of tapered and stepped NTLs.
Vladimir A. VANKE Hiroshi MATSUMOTO Naoki SHINOHARA
Physical principles of a new type of microwave input amplifier are described. The cyclotron wave electrostatic amplifier (CWESA) has a low noise level, broad band, switchable gain, very high self-protection against microwave overloads, rapid recovery, and small DC consumption. CWESAs are widely used in Russian pulse Doppler radars and other systems.