This paper proposes a thread partitioning algorithm in low power high-level synthesis. The algorithm is applied to high-level synthesis systems. In the systems, we can describe parallel behaving circuit blocks (threads) explicitly. First it focuses on a local register file RF in a thread. It partitions a thread into two sub-threads, one of which has RF and the other does not have RF. The partitioned sub-threads need to be synchronized with each other to keep the data dependency of the original thread. Since the partitioned sub-threads have waiting time for synchronization, gated clocks can be applied to each sub-thread. Then we can synthesize a low power circuit with a low area overhead, compared to the original circuit. Experimental results demonstrate effectiveness and efficiency of the algorithm.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Jumpei UCHIDA, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI, "High-Level Power Optimization Based on Thread Partitioning" in IEICE TRANSACTIONS on Fundamentals,
vol. E87-A, no. 12, pp. 3075-3082, December 2004, doi: .
Abstract: This paper proposes a thread partitioning algorithm in low power high-level synthesis. The algorithm is applied to high-level synthesis systems. In the systems, we can describe parallel behaving circuit blocks (threads) explicitly. First it focuses on a local register file RF in a thread. It partitions a thread into two sub-threads, one of which has RF and the other does not have RF. The partitioned sub-threads need to be synchronized with each other to keep the data dependency of the original thread. Since the partitioned sub-threads have waiting time for synchronization, gated clocks can be applied to each sub-thread. Then we can synthesize a low power circuit with a low area overhead, compared to the original circuit. Experimental results demonstrate effectiveness and efficiency of the algorithm.
URL: https://globals.ieice.org/en_transactions/fundamentals/10.1587/e87-a_12_3075/_p
Copy
@ARTICLE{e87-a_12_3075,
author={Jumpei UCHIDA, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI, },
journal={IEICE TRANSACTIONS on Fundamentals},
title={High-Level Power Optimization Based on Thread Partitioning},
year={2004},
volume={E87-A},
number={12},
pages={3075-3082},
abstract={This paper proposes a thread partitioning algorithm in low power high-level synthesis. The algorithm is applied to high-level synthesis systems. In the systems, we can describe parallel behaving circuit blocks (threads) explicitly. First it focuses on a local register file RF in a thread. It partitions a thread into two sub-threads, one of which has RF and the other does not have RF. The partitioned sub-threads need to be synchronized with each other to keep the data dependency of the original thread. Since the partitioned sub-threads have waiting time for synchronization, gated clocks can be applied to each sub-thread. Then we can synthesize a low power circuit with a low area overhead, compared to the original circuit. Experimental results demonstrate effectiveness and efficiency of the algorithm.},
keywords={},
doi={},
ISSN={},
month={December},}
Copy
TY - JOUR
TI - High-Level Power Optimization Based on Thread Partitioning
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 3075
EP - 3082
AU - Jumpei UCHIDA
AU - Nozomu TOGAWA
AU - Masao YANAGISAWA
AU - Tatsuo OHTSUKI
PY - 2004
DO -
JO - IEICE TRANSACTIONS on Fundamentals
SN -
VL - E87-A
IS - 12
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - December 2004
AB - This paper proposes a thread partitioning algorithm in low power high-level synthesis. The algorithm is applied to high-level synthesis systems. In the systems, we can describe parallel behaving circuit blocks (threads) explicitly. First it focuses on a local register file RF in a thread. It partitions a thread into two sub-threads, one of which has RF and the other does not have RF. The partitioned sub-threads need to be synchronized with each other to keep the data dependency of the original thread. Since the partitioned sub-threads have waiting time for synchronization, gated clocks can be applied to each sub-thread. Then we can synthesize a low power circuit with a low area overhead, compared to the original circuit. Experimental results demonstrate effectiveness and efficiency of the algorithm.
ER -