This paper focuses on SIMD processor synthesis and proposes a SIMD instruction set and functional unit synthesis algorithm. Given initial assembly code and a timing constraint, the proposed algorithm synthesizes an area-optimized processor core with optimal SIMD functional units, together with a SIMD instruction set. The initial assembly code is assumed to run on a full-resource SIMD processor (a virtual processor) that has all possible SIMD functional units. The algorithm introduces SIMD operation decomposition and applies it to the initial assembly code and the full-resource SIMD processor. By gradually reducing or decomposing SIMD operations, it finds a processor core with small area under the given timing constraint. Promising experimental results are also shown.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Nozomu TOGAWA, Koichi TACHIKAKE, Yuichiro MIYAOKA, Masao YANAGISAWA, Tatsuo OHTSUKI, "A SIMD Instruction Set and Functional Unit Synthesis Algorithm with SIMD Operation Decomposition" in IEICE TRANSACTIONS on Information,
vol. E88-D, no. 7, pp. 1340-1349, July 2005, doi: 10.1093/ietisy/e88-d.7.1340.
Abstract: This paper focuses on SIMD processor synthesis and proposes a SIMD instruction set/functional unit synthesis algorithm. Given an initial assembly code and a timing constraint, the proposed algorithm synthesizes an area-optimized processor core with optimal SIMD functional units. It also synthesizes a SIMD instruction set. The input initial assembly code is assumed to run on a full-resource SIMD processor (virtual processor) which has all the possible SIMD functional units. In our algorithm, we introduce the SIMD operation decomposition and apply it to the initial assembly code and the full-resource SIMD processor. By gradually reducing SIMD operations or decomposing SIMD operations, we can finally find a processor core with small area under the given timing constraint. The promising experimental results are also shown.
URL: https://globals.ieice.org/en_transactions/information/10.1093/ietisy/e88-d.7.1340/_p
@ARTICLE{e88-d_7_1340,
author={Nozomu TOGAWA and Koichi TACHIKAKE and Yuichiro MIYAOKA and Masao YANAGISAWA and Tatsuo OHTSUKI},
journal={IEICE TRANSACTIONS on Information},
title={A SIMD Instruction Set and Functional Unit Synthesis Algorithm with SIMD Operation Decomposition},
year={2005},
volume={E88-D},
number={7},
pages={1340-1349},
abstract={This paper focuses on SIMD processor synthesis and proposes a SIMD instruction set/functional unit synthesis algorithm. Given an initial assembly code and a timing constraint, the proposed algorithm synthesizes an area-optimized processor core with optimal SIMD functional units. It also synthesizes a SIMD instruction set. The input initial assembly code is assumed to run on a full-resource SIMD processor (virtual processor) which has all the possible SIMD functional units. In our algorithm, we introduce the SIMD operation decomposition and apply it to the initial assembly code and the full-resource SIMD processor. By gradually reducing SIMD operations or decomposing SIMD operations, we can finally find a processor core with small area under the given timing constraint. The promising experimental results are also shown.},
keywords={},
doi={10.1093/ietisy/e88-d.7.1340},
ISSN={},
month={July},}
TY - JOUR
TI - A SIMD Instruction Set and Functional Unit Synthesis Algorithm with SIMD Operation Decomposition
T2 - IEICE TRANSACTIONS on Information
SP - 1340
EP - 1349
AU - Nozomu TOGAWA
AU - Koichi TACHIKAKE
AU - Yuichiro MIYAOKA
AU - Masao YANAGISAWA
AU - Tatsuo OHTSUKI
PY - 2005
DO - 10.1093/ietisy/e88-d.7.1340
JO - IEICE TRANSACTIONS on Information
SN -
VL - E88-D
IS - 7
JA - IEICE TRANSACTIONS on Information
Y1 - July 2005
AB - This paper focuses on SIMD processor synthesis and proposes a SIMD instruction set/functional unit synthesis algorithm. Given an initial assembly code and a timing constraint, the proposed algorithm synthesizes an area-optimized processor core with optimal SIMD functional units. It also synthesizes a SIMD instruction set. The input initial assembly code is assumed to run on a full-resource SIMD processor (virtual processor) which has all the possible SIMD functional units. In our algorithm, we introduce the SIMD operation decomposition and apply it to the initial assembly code and the full-resource SIMD processor. By gradually reducing SIMD operations or decomposing SIMD operations, we can finally find a processor core with small area under the given timing constraint. The promising experimental results are also shown.
ER -