Speculative Execution and Reducing Branch Penalty on a Superscalar Processor

Hideki ANDO; Chikako NAKANISHI; Hirohisa MACHIDA; Tetsuya HARA; Masao NAKAYA

Summary:

Superscalar processors improve performance by exploiting instruction-level parallelism (ILP). In non-numerical applications, however, the ILP within a basic block is not sufficient for substantial speedup. To improve performance dramatically, instructions must be executed in parallel across branches; that is, speculative execution is required. Boosting is a general solution for achieving speculative execution: it labels an instruction to be executed speculatively, and the hardware handles the side effects. This paper describes an efficient implementation of boosting in terms of cost/performance trade-offs. Our implementation policy is beneficial with respect to code scheduling heuristics, the penalties imposed by the code duplication needed to maintain program semantics, and area cost. This paper also describes a branch scheme that minimizes branch penalty. Branch delay severely penalizes the performance of superscalar processors, since even a single delay cycle contains multiple delay slots. Our scheme fetches both the sequential and the target instructions and selects one of them at a branch, so that no delay cycle is imposed. The scheme is realized by a combination of static code movement and hardware support, and it reduces branch penalty at small cost. Simulation results show that these ideas are highly effective in improving the performance of a superscalar processor.
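
The two branch-related points in the summary can be illustrated with a small sketch. It is a toy model only, not the paper's implementation: the function names, the issue widths, and the list-based instruction groups are all assumptions made for illustration. The first function shows why one delay cycle costs several delay slots on a superscalar machine; the second shows the idea of having both fetch paths ready, so that the branch outcome only selects between them.

```python
from typing import List


def delay_slots(issue_width: int, delay_cycles: int) -> int:
    """Issue slots that elapse during a branch delay.

    Assumption: every delay cycle exposes one slot per issue port.
    """
    return issue_width * delay_cycles


def select_fetch_group(sequential: List[str], target: List[str], taken: bool) -> List[str]:
    """Hypothetical model of dual-path fetch.

    Both instruction groups are assumed to be fetched already, so the
    branch outcome merely picks one of them and no refetch is modeled.
    """
    return target if taken else sequential


if __name__ == "__main__":
    # A scalar pipeline versus an assumed 4-issue superscalar pipeline,
    # each with a single delay cycle.
    print(delay_slots(issue_width=1, delay_cycles=1))
    print(delay_slots(issue_width=4, delay_cycles=1))

    seq = ["add r1, r2, r3", "sub r4, r1, r5"]  # fall-through path (made-up instructions)
    tgt = ["ld r6, 0(r7)", "mul r8, r6, r6"]    # branch-target path (made-up instructions)
    print(select_fetch_group(seq, tgt, taken=True))
```

Under these assumptions, the number of slots to fill grows linearly with issue width, which is why a single delay cycle weighs more heavily on a superscalar processor than on a scalar one.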

Publication: IEICE TRANSACTIONS on Electronics Vol.E76-C No.7 pp.1080-1093

Publication Date: 1993/07/25

Type of Manuscript: Special Section PAPER (Special Issue on New Architecture LSIs)

Category: Improved Binary Digital Architectures

Hideki ANDO, Chikako NAKANISHI, Hirohisa MACHIDA, Tetsuya HARA, Masao NAKAYA, "Speculative Execution and Reducing Branch Penalty on a Superscalar Processor," in IEICE TRANSACTIONS on Electronics, vol. E76-C, no. 7, pp. 1080-1093, July 1993.
URL: https://globals.ieice.org/en_transactions/electronics/10.1587/e76-c_7_1080/_p

@ARTICLE{e76-c_7_1080,
author={Hideki ANDO and Chikako NAKANISHI and Hirohisa MACHIDA and Tetsuya HARA and Masao NAKAYA},
journal={IEICE TRANSACTIONS on Electronics},
title={Speculative Execution and Reducing Branch Penalty on a Superscalar Processor},
year={1993},
volume={E76-C},
number={7},
pages={1080--1093},
abstract={Superscalar processors improve performance by exploiting instruction-level parallelism (ILP). ILP in a basic block is, however, not sufficient on non-numerical applications for gaining substantial speedup. Instructions across branches are required to be executed in parallel to dramatically improve performance. That is, speculative execution is strongly required. Boosting is a general solution to achieving speculative execution. Boosting labels an instruction to be speculatively executed, and the hardware handles side-effects. This paper describes the efficient implementation of boosting in terms of cost/performance trade-offs. Our policy in implementation is beneficial in code scheduling heuristics, penalties imposed by code duplication to maintain program semantics, and area cost. This paper also describes a branch scheme which minimizes branch penalty. Branch delay causes crucial penalties on the performance of superscalar processors since multiple delay slots exist even in a single delay cycle. Our scheme is the fetching of both sequential and target instructions, and either of them is selected on a branch. No delay cycle can be imposed. This scheme is realized by a combination of static code movement and hardware support. As a result, we reduce branch penalty with small cost. Simulation results show that our ideas are highly effective in improving the performance of a superscalar processor.},
month={July},
}

TY - JOUR
TI - Speculative Execution and Reducing Branch Penalty on a Superscalar Processor
T2 - IEICE TRANSACTIONS on Electronics
SP - 1080
EP - 1093
AU - Hideki ANDO
AU - Chikako NAKANISHI
AU - Hirohisa MACHIDA
AU - Tetsuya HARA
AU - Masao NAKAYA
PY - 1993
JO - IEICE TRANSACTIONS on Electronics
VL - E76-C
IS - 7
JA - IEICE TRANSACTIONS on Electronics
Y1 - 1993/07/25/
AB - Superscalar processors improve performance by exploiting instruction-level parallelism (ILP). ILP in a basic block is, however, not sufficient on non-numerical applications for gaining substantial speedup. Instructions across branches are required to be executed in parallel to dramatically improve performance. That is, speculative execution is strongly required. Boosting is a general solution to achieving speculative execution. Boosting labels an instruction to be speculatively executed, and the hardware handles side-effects. This paper describes the efficient implementation of boosting in terms of cost/performance trade-offs. Our policy in implementation is beneficial in code scheduling heuristics, penalties imposed by code duplication to maintain program semantics, and area cost. This paper also describes a branch scheme which minimizes branch penalty. Branch delay causes crucial penalties on the performance of superscalar processors since multiple delay slots exist even in a single delay cycle. Our scheme is the fetching of both sequential and target instructions, and either of them is selected on a branch. No delay cycle can be imposed. This scheme is realized by a combination of static code movement and hardware support. As a result, we reduce branch penalty with small cost. Simulation results show that our ideas are highly effective in improving the performance of a superscalar processor.
ER -