Copy
Hideki KAWAZU, Jumpei UCHIDA, Yuichiro MIYAOKA, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI, "Sub-operation Parallelism Optimization in SIMD Processor Core Synthesis" in IEICE TRANSACTIONS on Fundamentals,
vol. E88-A, no. 4, pp. 876-884, April 2005, doi: 10.1093/ietfec/e88-a.4.876.
Abstract: A b-bit SIMD functional unit has n k-bit sub-functional units in itself, where b = k n. It can execute n-parallel k-bit operations. However, all the b-bit functional units in a processor core do not necessarily execute n-parallel operations. Depending on an application program, some of them just execute n/2-parallel operations or even n/4-parallel operations. This means that we can modify a b-bit SIMD functional unit so that it has n/2 k-bit sub-functional units or n/4 k-bit sub-functional units. The number of k-bit sub-functional units in a SIMD functional unit is called sub-operation parallelism. We incorporate a sub-operation parallelism optimization algorithm into SIMD functional unit optimization. Our proposed algorithm gradually reduces sub-operation parallelism of a SIMD functional unit while the timing constraint of execution time satisfied. Thereby, we can finally find a processor core with small area under the given timing constraint. We expect that we can obtain processor core configurations of smaller area in the same timing constraint rather than a conventional system. The promising experimental results are also shown.
URL: https://globals.ieice.org/en_transactions/fundamentals/10.1093/ietfec/e88-a.4.876/_p
Copy
@ARTICLE{e88-a_4_876,
author={Hideki KAWAZU, Jumpei UCHIDA, Yuichiro MIYAOKA, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI, },
journal={IEICE TRANSACTIONS on Fundamentals},
title={Sub-operation Parallelism Optimization in SIMD Processor Core Synthesis},
year={2005},
volume={E88-A},
number={4},
pages={876-884},
abstract={A b-bit SIMD functional unit has n k-bit sub-functional units in itself, where b = k n. It can execute n-parallel k-bit operations. However, all the b-bit functional units in a processor core do not necessarily execute n-parallel operations. Depending on an application program, some of them just execute n/2-parallel operations or even n/4-parallel operations. This means that we can modify a b-bit SIMD functional unit so that it has n/2 k-bit sub-functional units or n/4 k-bit sub-functional units. The number of k-bit sub-functional units in a SIMD functional unit is called sub-operation parallelism. We incorporate a sub-operation parallelism optimization algorithm into SIMD functional unit optimization. Our proposed algorithm gradually reduces sub-operation parallelism of a SIMD functional unit while the timing constraint of execution time satisfied. Thereby, we can finally find a processor core with small area under the given timing constraint. We expect that we can obtain processor core configurations of smaller area in the same timing constraint rather than a conventional system. The promising experimental results are also shown.},
keywords={},
doi={10.1093/ietfec/e88-a.4.876},
ISSN={},
month={April},}
Copy
TY - JOUR
TI - Sub-operation Parallelism Optimization in SIMD Processor Core Synthesis
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 876
EP - 884
AU - Hideki KAWAZU
AU - Jumpei UCHIDA
AU - Yuichiro MIYAOKA
AU - Nozomu TOGAWA
AU - Masao YANAGISAWA
AU - Tatsuo OHTSUKI
PY - 2005
DO - 10.1093/ietfec/e88-a.4.876
JO - IEICE TRANSACTIONS on Fundamentals
SN -
VL - E88-A
IS - 4
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - April 2005
AB - A b-bit SIMD functional unit has n k-bit sub-functional units in itself, where b = k n. It can execute n-parallel k-bit operations. However, all the b-bit functional units in a processor core do not necessarily execute n-parallel operations. Depending on an application program, some of them just execute n/2-parallel operations or even n/4-parallel operations. This means that we can modify a b-bit SIMD functional unit so that it has n/2 k-bit sub-functional units or n/4 k-bit sub-functional units. The number of k-bit sub-functional units in a SIMD functional unit is called sub-operation parallelism. We incorporate a sub-operation parallelism optimization algorithm into SIMD functional unit optimization. Our proposed algorithm gradually reduces sub-operation parallelism of a SIMD functional unit while the timing constraint of execution time satisfied. Thereby, we can finally find a processor core with small area under the given timing constraint. We expect that we can obtain processor core configurations of smaller area in the same timing constraint rather than a conventional system. The promising experimental results are also shown.
ER -