This letter proposes a new hardware/software partitioning algorithm for processor cores with SIMD instructions. Given compiled assembly code that includes SIMD instructions and a timing constraint, the proposed algorithm synthesizes an area-optimized processor core together with new assembly code. First, we assume for each operation type a super SIMD functional unit that can execute all the SIMD instructions. Second, we remove SIMD instructions, or "sub-functions," from each super functional unit one by one, as long as the timing constraint remains satisfied. At the same time, we update the assembly code so that it can run on the new processor configuration. By repeating this process, we finally obtain a SIMD functional unit configuration as well as a processor core architecture. Promising experimental results are also shown.
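The reduction loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cost model (`area`, `hw_cycles`, `sw_cycles`), the instruction names, and the "remove the largest-area feasible sub-function first" rule are all assumptions made for illustration. Rewriting the assembly code is modeled only as a per-instruction cycle penalty when a sub-function is emulated in software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubFunction:
    """One SIMD sub-function of a super functional unit (hypothetical toy model)."""
    name: str
    area: int        # hardware area saved if the sub-function is removed
    hw_cycles: int   # cycles per use when implemented in hardware
    sw_cycles: int   # cycles per use when emulated by other instructions


def total_cycles(program, kept, table):
    """Cycle count of `program` on a core that keeps only the sub-functions in `kept`."""
    return sum(
        table[op].hw_cycles if op in kept else table[op].sw_cycles
        for op in program
    )


def reduce_units(program, sub_functions, max_cycles):
    """Greedily drop sub-functions while the timing constraint still holds."""
    table = {sf.name: sf for sf in sub_functions}
    kept = set(table)  # start from the "super" unit that supports everything
    if total_cycles(program, kept, table) > max_cycles:
        raise ValueError("timing constraint cannot be met even with all sub-functions")
    while True:
        # Sub-functions whose removal still satisfies the timing constraint.
        feasible = [
            name for name in kept
            if total_cycles(program, kept - {name}, table) <= max_cycles
        ]
        if not feasible:
            break
        # Assumed heuristic: remove the feasible sub-function with the largest area.
        victim = max(feasible, key=lambda n: (table[n].area, n))
        kept.remove(victim)
    area = sum(table[n].area for n in kept)
    return kept, area, total_cycles(program, kept, table)


if __name__ == "__main__":
    # Hypothetical instruction mix and costs, chosen only to exercise the loop.
    units = [
        SubFunction("padd", area=10, hw_cycles=1, sw_cycles=4),
        SubFunction("pmul", area=30, hw_cycles=2, sw_cycles=10),
        SubFunction("psat", area=5, hw_cycles=1, sw_cycles=3),
    ]
    program = ["padd"] * 4 + ["pmul"] * 2 + ["psat"]
    print(reduce_units(program, units, max_cycles=30))
```

A tighter `max_cycles` leaves more sub-functions in hardware and a looser one lets more of them be removed, which is the area/timing trade-off the abstract describes.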
Nozomu TOGAWA, Koichi TACHIKAKE, Yuichiro MIYAOKA, Masao YANAGISAWA, Tatsuo OHTSUKI, "A Hardware/Software Partitioning Algorithm for Processor Cores with Packed SIMD-Type Instructions," in IEICE TRANSACTIONS on Fundamentals, vol. E86-A, no. 12, pp. 3218-3224, December 2003.
URL: https://globals.ieice.org/en_transactions/fundamentals/10.1587/e86-a_12_3218/_p
@ARTICLE{e86-a_12_3218,
author={Nozomu TOGAWA and Koichi TACHIKAKE and Yuichiro MIYAOKA and Masao YANAGISAWA and Tatsuo OHTSUKI},
journal={IEICE TRANSACTIONS on Fundamentals},
title={A Hardware/Software Partitioning Algorithm for Processor Cores with Packed SIMD-Type Instructions},
year={2003},
volume={E86-A},
number={12},
pages={3218-3224},
keywords={},
doi={},
ISSN={},
month={December},}
TY - JOUR
TI - A Hardware/Software Partitioning Algorithm for Processor Cores with Packed SIMD-Type Instructions
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 3218
EP - 3224
AU - Nozomu TOGAWA
AU - Koichi TACHIKAKE
AU - Yuichiro MIYAOKA
AU - Masao YANAGISAWA
AU - Tatsuo OHTSUKI
PY - 2003
DO -
JO - IEICE TRANSACTIONS on Fundamentals
SN -
VL - E86-A
IS - 12
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - December 2003
ER -