[Branch Prediction] Literature Notes on Processor Branch Prediction (3)

Article link: https://blog.csdn.net/xeonmm1/article/details/82263644

[Reference] Fisher, Joseph A., and Stefan M. Freudenberger. “Predicting Conditional Branch Directions from Previous Runs of a Program.” In Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, 85–95. ASPLOS V. New York, NY, USA: ACM, 1992. doi:10.1145/143365.143493.

[Key points]

Branch prediction is an important capability for high performance CPUs.

Branch predictors may perform very well on vectorizable code, but not necessarily on systems and commercial code.

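Definition of static vs. dynamic prediction: If one predicts conditional branch directions while a program is running, one is said to be doing dynamic branch prediction. If, instead, one tries to predict branch directions before the program runs, one is doing static branch prediction.

The paper notes that static methods consume few resources.

1. It collects statistics on branch directions and frequencies.

2. It argues that percentage accuracy is not the only measure of prediction quality, because different programs have different branch densities; relating mispredicted branches to the number of instructions executed is a better measure.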
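The point about branch density can be made concrete with a small calculation. The sketch below uses invented instruction and branch counts (hypothetical numbers, not taken from the paper) and expresses the alternative metric as instructions executed per mispredicted branch.

```python
def summarize(instructions, branches, correct):
    """Return (prediction accuracy, instructions per mispredicted branch)."""
    mispredicts = branches - correct
    return correct / branches, instructions / mispredicts

# Two hypothetical programs, both predicted with 90% accuracy,
# but with different branch densities (numbers invented for illustration).
dense = summarize(1_000_000, 200_000, 180_000)   # one branch every 5 instructions
sparse = summarize(1_000_000, 50_000, 45_000)    # one branch every 20 instructions

print("dense :", dense)
print("sparse:", sparse)
```

Both programs report the same accuracy, yet the branch-dense one is interrupted by a misprediction four times as often, which is why accuracy alone can mislead.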

The experiments reported upon here show that static prediction can be done almost as well as is possible by taking previous runs of a program, and using those runs to make decisions about which way branches will go in future runs.
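A minimal sketch of profile-based static prediction, under stated assumptions: the traces, the branch names `loop` and `err`, and the majority-vote rule are illustrative choices, not the paper's exact procedure. A training run is profiled, each branch gets a fixed hint, and the hints are then scored on a later run.

```python
from collections import defaultdict

def build_profile(trace):
    """Count outcomes per branch in a training run and pick the majority direction."""
    counts = defaultdict(lambda: [0, 0])          # branch id -> [not_taken, taken]
    for branch, taken in trace:
        counts[branch][int(taken)] += 1
    return {b: c[1] >= c[0] for b, c in counts.items()}

def static_accuracy(hints, trace, default=True):
    """Fraction of branches in `trace` predicted correctly by the fixed hints."""
    correct = sum(hints.get(b, default) == taken for b, taken in trace)
    return correct / len(trace)

# Hypothetical traces of (branch id, taken?) pairs from two runs.
training_run = [("loop", True)] * 9 + [("loop", False)] + [("err", False)] * 5
later_run = ([("loop", True)] * 19 + [("loop", False)]
             + [("err", False)] * 4 + [("err", True)])

hints = build_profile(training_run)
print(hints, static_accuracy(hints, later_run))
```

The hints never change at run time, so the only hardware cost is a prediction bit per branch; the price is that a later run with different branch tendencies cannot be corrected.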

[Reference] Pan, Shien-Tai, Kimming So, and Joseph T. Rahmeh. “Improving the Accuracy of Dynamic Branch Prediction Using Branch Correlation.” In Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, 76–84. ASPLOS V. New York, NY, USA: ACM, 1992. doi:10.1145/143365.143490.

[Key points]

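1. The efficiency of handling branches is important.

2. Generally, dynamic prediction gives better results than static prediction, but at the cost of increased hardware complexity.

3. The paper points out that earlier dynamic predictors considered only each branch's own outcome history.

4. Self-history prediction schemes generally work well for scientific/engineering applications where program execution is dominated by inner-loops.

5. Branch correlation can be traced back to the control structures of the high-level language.

6. Because branches are correlated, the direction of a later branch can be determined in advance from earlier ones.

7. In essence, this is the distinction between global and local history.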
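The correlation idea in the points above can be sketched with three branches: `if (x)`, `if (y)` and `if (x && y)`. The third outcome is fully determined by the first two. The code below is an illustration under assumptions (random inputs with a fixed seed, a single 2-bit counter as the simplest self-history scheme, and one counter per outcome pair as the correlated scheme); it is not the paper's exact design.

```python
import random

def bump(counter, taken):
    """Update a 2-bit saturating counter (range 0..3)."""
    return min(counter + 1, 3) if taken else max(counter - 1, 0)

def simulate(n=2000, seed=0):
    rng = random.Random(seed)
    self_counter = 0              # self-history: one counter for b3 alone
    corr_counters = [0] * 4       # correlation: one counter per (b1, b2) outcome pair
    self_hits = corr_hits = 0
    for _ in range(n):
        b1 = rng.random() < 0.5   # outcome of `if (x)`
        b2 = rng.random() < 0.5   # outcome of `if (y)`
        b3 = b1 and b2            # outcome of `if (x && y)`
        idx = (int(b1) << 1) | int(b2)
        self_hits += (self_counter >= 2) == b3
        corr_hits += (corr_counters[idx] >= 2) == b3
        self_counter = bump(self_counter, b3)
        corr_counters[idx] = bump(corr_counters[idx], b3)
    return self_hits / n, corr_hits / n

self_acc, corr_acc = simulate()
print(f"self-history: {self_acc:.3f}, correlated: {corr_acc:.3f}")
```

Selecting the counter by the outcomes of the two preceding branches leaves only the warm-up mispredictions, while the counter that sees only the third branch's own outcomes cannot do much better than guessing the more frequent direction.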

[Reference] Yeh, Tse-Yu, and Yale N. Patt. “Two-Level Adaptive Training Branch Prediction.” In Proceedings of the 24th Annual International Symposium on Microarchitecture, 51–61. MICRO 24. New York, NY, USA: ACM, 1991. doi:10.1145/123465.123475.

[Key points]

1. Branches impede machine performance due to pipeline stalls for unresolved branches.

2. However, program profiling has to be performed in advance with certain sample data sets which may have different branch tendencies than the data sets that occur at run-time.

3. The first level is the history of the last n branches. The second is the branch behavior for the last s occurrences of that unique pattern of the last n branches.

4. Note that the paper's figure also contains many history registers (HRs) that record branch history.

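5. The paper notes that, because of lookup latency, a prediction bit is kept in the first level, and the second clock cycle is used for the pattern prediction.

6. Policy for densely packed branches: predict taken while waiting for the earlier predictions to resolve.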

7. Conditional branches make up the overwhelming majority of branches.
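The two levels described in point 3 can be sketched as follows. This is a minimal illustration, assuming a per-address history register table as the first level and a pattern table of 2-bit saturating counters as the second; the class name, the table organization and the address `0x400` are hypothetical choices.

```python
class TwoLevelPredictor:
    def __init__(self, history_bits=4):
        self.n = history_bits
        self.history = {}                           # level 1: address -> last n outcomes
        self.pattern = [2] * (1 << history_bits)    # level 2: 2-bit counter per pattern

    def predict(self, addr):
        return self.pattern[self.history.get(addr, 0)] >= 2

    def update(self, addr, taken):
        h = self.history.get(addr, 0)
        c = self.pattern[h]
        self.pattern[h] = min(c + 1, 3) if taken else max(c - 1, 0)
        # Shift the newest outcome into the branch's history register.
        self.history[addr] = ((h << 1) | int(taken)) & ((1 << self.n) - 1)

pred = TwoLevelPredictor(history_bits=4)
outcomes = [True, True, True, False] * 50           # hypothetical loop branch: T, T, T, N
hits = []
for taken in outcomes:
    hits.append(pred.predict(0x400) == taken)
    pred.update(0x400, taken)
print(sum(hits[-100:]), "of the last 100 predicted correctly")
```

Once each history pattern has been seen, the pattern uniquely determines the next outcome, so the loop exit that defeats a plain counter is predicted correctly.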

[Reference] Yeh, Tse-Yu, and Yale N. Patt. “A Comparison of Dynamic Branch Predictors That Use Two Levels of Branch History.” In Proceedings of the 20th Annual International Symposium on Computer Architecture, 257–66. ISCA ’93. New York, NY, USA: ACM, 1993. doi:10.1145/165123.165161.

[Key points]

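1. Global: the outcomes of all branches are aggregated into a single history that is used for indexing.

2. Per-address: a separate branch history record is kept for each branch address.

3. Per-set: branches are grouped into sets according to their characteristics, and each set has its own history.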

Global history schemes make effective predictions for if-then-else branches due to their correlation with previous branches, but require higher implementation costs to be effective overall. A long branch history is needed to eliminate mutual interference.

Per-address history schemes perform better than other schemes on floating point programs and require lower implementation costs to be effective overall. This periodic behavior is better retained with a per-address branch history table.

To be effective, however, per-set history schemes require even higher implementation costs than global history schemes due to the separate pattern history tables of each set.
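The three schemes differ in which first-level history register a branch uses. The sketch below makes that choice explicit and compares the schemes at the same short history length. Several things are assumptions for illustration: sets are formed from address bits (only one possible set definition), pattern counters are kept per history register, and the trace and addresses are invented.

```python
def history_key(scheme, addr, set_bits=4):
    """Select which first-level history register a branch uses."""
    if scheme == "global":
        return 0                    # one register shared by all branches
    if scheme == "per_address":
        return addr                 # one register per branch address
    if scheme == "per_set":
        return addr >> set_bits     # one register per group of nearby addresses
    raise ValueError(scheme)

def accuracy(scheme, trace, history_bits=2):
    histories, counters, hits = {}, {}, 0
    mask = (1 << history_bits) - 1
    for addr, taken in trace:
        key = history_key(scheme, addr)
        h = histories.get(key, 0)
        c = counters.get((key, h), 2)           # 2-bit counter, starts weakly taken
        hits += (c >= 2) == taken
        counters[(key, h)] = min(c + 1, 3) if taken else max(c - 1, 0)
        histories[key] = ((h << 1) | int(taken)) & mask
    return hits / len(trace)

# Hypothetical trace: branch A repeats T,T,N; branch B alternates T,N; they interleave.
A, B = 0x100, 0x200
trace = []
for i in range(300):
    trace.append((A, i % 3 != 2))
    trace.append((B, i % 2 == 0))

for scheme in ("global", "per_address", "per_set"):
    print(scheme, round(accuracy(scheme, trace), 3))
```

With only two history bits, the shared global register mixes the two branches' outcomes and can no longer tell where each periodic pattern stands, which matches the note that global schemes need long histories. In this trace the two addresses fall into different sets, so the per-set result coincides with the per-address one.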

[Reference] Chang, Po-Yung, E. Hao, and Y.N. Patt. “Alternative Implementations of Hybrid Branch Predictors.” In Proceedings of the 28th Annual International Symposium on Microarchitecture, 252–57, 1995. doi:10.1109/MICRO.1995.476833.

[Key points]

Two predictors work together and a selector decides which one to trust; the resulting accuracy is higher than that of gshare.
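The selector mechanism can be sketched as a 2-bit counter per branch that is trained only when the two components disagree. The component predictors chosen here (a per-address counter and a small global-history predictor), the class names and the alternating trace are illustrative assumptions, not the configurations evaluated in the paper.

```python
class Bimodal:
    """One 2-bit saturating counter per branch address."""
    def __init__(self):
        self.counters = {}
    def predict(self, addr):
        return self.counters.get(addr, 2) >= 2
    def update(self, addr, taken):
        c = self.counters.get(addr, 2)
        self.counters[addr] = min(c + 1, 3) if taken else max(c - 1, 0)

class GlobalHistory:
    """2-bit counters indexed by the last `bits` global outcomes."""
    def __init__(self, bits=2):
        self.mask = (1 << bits) - 1
        self.history = 0
        self.counters = [2] * (1 << bits)
    def predict(self, addr):
        return self.counters[self.history] >= 2
    def update(self, addr, taken):
        c = self.counters[self.history]
        self.counters[self.history] = min(c + 1, 3) if taken else max(c - 1, 0)
        self.history = ((self.history << 1) | int(taken)) & self.mask

class Hybrid:
    """A selector counter per branch chooses which component to trust."""
    def __init__(self, first, second):
        self.parts = (first, second)
        self.selector = {}                # addr -> 0..3; values >= 2 favor `second`
    def predict(self, addr):
        use_second = self.selector.get(addr, 0) >= 2
        return self.parts[use_second].predict(addr)
    def update(self, addr, taken):
        ok = [p.predict(addr) == taken for p in self.parts]
        s = self.selector.get(addr, 0)
        if ok[0] != ok[1]:                # train the selector only on disagreement
            self.selector[addr] = min(s + 1, 3) if ok[1] else max(s - 1, 0)
        for p in self.parts:
            p.update(addr, taken)

def run(predictor, trace):
    hits = 0
    for addr, taken in trace:
        hits += predictor.predict(addr) == taken
        predictor.update(addr, taken)
    return hits / len(trace)

trace = [(0x40, i % 2 == 0) for i in range(100)]    # hypothetical branch alternating T, N
print(run(Bimodal(), trace), run(Hybrid(Bimodal(), GlobalHistory()), trace))
```

The selector starts out trusting the per-address counter, sees it fail on the alternating branch while the history-based component succeeds, and switches over after a few disagreements.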

[Reference] Chen, I-Cheng K., John T. Coffey, and Trevor N. Mudge. “Analysis of Branch Prediction via Data Compression.” In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, 128–37. ASPLOS VII. New York, NY, USA: ACM, 1996. doi:10.1145/237090.237171.

[Key points]

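1. The paper describes the structure of two-level predictors: they have one or more shift-registers (branch history registers) to store history information in the first level and have one or more tables of 2-bit counters (pattern history tables) in their second level.

2. It points out what Markov predictors and two-level predictors have in common. A Markov predictor can serve as an idealized analysis model of a two-level predictor and be used to estimate the upper limit of prediction accuracy.

3. Two-bit saturating counters are used because a loop has only one exit, so a single exit should not flip the prediction for the next execution of the loop.

4. However, since PPM is optimal, it is unlikely that significant improvement can be made by improving the predictor alone, except for the cases noted. Therefore, to further increase branch prediction accuracy, the focus should be on improving the information processor and the source.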
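The loop-exit argument in point 3 can be checked with a small count. The sketch below compares a 1-bit and a 2-bit saturating counter on a loop-closing branch; the loop length and repeat count are arbitrary choices for illustration.

```python
def mispredictions(trace, bits):
    """Count mispredictions of a single saturating counter that is `bits` wide."""
    top = (1 << bits) - 1
    counter = top                          # start at strongly taken
    misses = 0
    for taken in trace:
        predicted = counter > top // 2     # upper half of the range predicts taken
        misses += predicted != taken
        counter = min(counter + 1, top) if taken else max(counter - 1, 0)
    return misses

# Loop-closing branch: taken 4 times, then falls through once; the loop runs 10 times.
trace = ([True] * 4 + [False]) * 10
print(mispredictions(trace, 1), mispredictions(trace, 2))
```

The 1-bit counter is wrong at every exit and again at the next entry, while the 2-bit counter only drops to weakly taken at the exit and is wrong once per loop execution.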

[Reference] Lee, Chih-Chieh, I-C.K. Chen, and T.N. Mudge. “The Bi-Mode Branch Predictor.” In Proceedings of the Thirtieth Annual IEEE/ACM International Symposium on Microarchitecture, 4–13, 1997. doi:10.1109/MICRO.1997.645792.

[Key points]

1. The ability to minimize stalls or pipeline bubbles that may result from branches is becoming increasingly critical as microprocessor designs implement greater degrees of instruction level parallelism.

2. As a result, two-level dynamic branch predictors have been incorporated in several recent high-performance microprocessors. Perhaps the best known examples, at the time of writing, are the Pentium Pro [Gwennap95] and Alpha 21264 [Gwennap96].

3. In current designs, dynamic predictors spend large amounts of hardware to memorize this branch outcome history.

4. Global history (the outcomes of neighboring branches) is a common way to identify special branch conditions. Previous studies have shown that the global history indexed schemes achieve good performance by storing the outcomes of global history patterns in two-bit counters, e.g.,

5. gshare randomizes the index by XOR-ing the global history with the branch address.

6. The bi-mode predictor reduces aliasing by splitting the second-level table.
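A minimal sketch of points 5 and 6, under assumptions: tiny 4-entry tables, counters initialised to weak states, and two invented branch addresses chosen so that they collide in gshare. The bi-mode part follows the general idea of a choice table indexed by address plus two direction banks indexed gshare-style; table sizes and initial values are illustrative.

```python
def bump(c, taken):
    """Update a 2-bit saturating counter (range 0..3)."""
    return min(c + 1, 3) if taken else max(c - 1, 0)

class Gshare:
    def __init__(self, bits=2):
        self.mask = (1 << bits) - 1
        self.history = 0
        self.pht = [2] * (1 << bits)
    def index(self, addr):
        return (addr ^ self.history) & self.mask    # XOR address with global history
    def predict(self, addr):
        return self.pht[self.index(addr)] >= 2
    def update(self, addr, taken):
        i = self.index(addr)
        self.pht[i] = bump(self.pht[i], taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask

class BiMode(Gshare):
    def __init__(self, bits=2):
        super().__init__(bits)
        size = 1 << bits
        self.choice = [2] * size                    # indexed by address only
        self.banks = ([1] * size, [2] * size)       # (not-taken bank, taken bank)
    def predict(self, addr):
        bank = self.banks[self.choice[addr & self.mask] >= 2]
        return bank[self.index(addr)] >= 2
    def update(self, addr, taken):
        a, i = addr & self.mask, self.index(addr)
        chose_taken = self.choice[a] >= 2
        bank = self.banks[chose_taken]
        correct = (bank[i] >= 2) == taken
        bank[i] = bump(bank[i], taken)              # only the selected bank learns
        if not (chose_taken != taken and correct):  # keep the choice if it still worked
            self.choice[a] = bump(self.choice[a], taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask

def run(predictor, trace):
    hits = 0
    for addr, taken in trace:
        hits += predictor.predict(addr) == taken
        predictor.update(addr, taken)
    return hits / len(trace)

# Hypothetical aliasing case: an always-taken and a never-taken branch
# whose gshare indices collide once the history settles.
trace = [(0b00, True), (0b11, False)] * 100
print(run(Gshare(), trace), run(BiMode(), trace))
```

In this contrived trace the two branches fight over one gshare counter, while the bi-mode split sends the taken-biased branch and the not-taken-biased branch to different banks, so they stop interfering.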

[Reference] Sprangle, Eric, Robert S. Chappell, Mitch Alsup, and Yale N. Patt. “The Agree Predictor: A Mechanism for Reducing Negative Branch History Interference.” In Proceedings of the 24th Annual International Symposium on Computer Architecture, 284–91. ISCA ’97. New York, NY, USA: ACM, 1997. doi:10.1145/264107.264210.

[Key points]

1. Definition of the two levels: The first level function was originally the value of an N-bit shift register, called the Branch History Register (BHR), that kept track of the directions of the previous N branches.

The second level is an array of PHT entries (in this paper, 2-bit saturating counters) which tracks the outcomes of branches mapped to it by the first level’s indexing function. Figure 1 depicts a general two-level predictor.

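2. The paper identifies the aliasing (interference) problem.

3. Definition: We define an instance of PHT interference as a branch accessing a PHT entry that was previously updated by a different branch.

4. It notes that the transistor budget is limited, so simply enlarging the table is impractical.

5. It notes that gshare is one remedy; Chang et al. instead filter out easily predicted branches.

6. A few programs still perform poorly.

7. It uses a "biasing bit" to identify each branch's likely direction and to separate branches with different directions.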
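The biasing-bit idea in point 7 can be sketched as follows. Assumptions are stated loudly: the biasing bit is set to the branch's first observed outcome, the PHT counters record "agree" or "disagree" with that bit instead of taken or not taken, the tables are tiny, and the two branch addresses are invented so that they share a PHT entry.

```python
class Agree:
    def __init__(self, bits=2):
        self.mask = (1 << bits) - 1
        self.history = 0
        self.pht = [2] * (1 << bits)   # counters >= 2 mean "agree with the biasing bit"
        self.bias = {}                 # branch address -> biasing bit

    def index(self, addr):
        return (addr ^ self.history) & self.mask

    def predict(self, addr):
        bias = self.bias.get(addr, True)          # assumed default before first execution
        agree = self.pht[self.index(addr)] >= 2
        return bias if agree else not bias

    def update(self, addr, taken):
        bias = self.bias.setdefault(addr, taken)  # set once, on the first execution
        i = self.index(addr)
        agreed = taken == bias
        self.pht[i] = min(self.pht[i] + 1, 3) if agreed else max(self.pht[i] - 1, 0)
        self.history = ((self.history << 1) | int(taken)) & self.mask

def run(predictor, trace):
    hits = 0
    for addr, taken in trace:
        hits += predictor.predict(addr) == taken
        predictor.update(addr, taken)
    return hits / len(trace)

# Hypothetical aliasing case: an always-taken and a never-taken branch
# that end up sharing one PHT entry.
trace = [(0b00, True), (0b11, False)] * 100
print(run(Agree(), trace))
```

Both branches keep agreeing with their own biasing bits, so they push the shared counter in the same direction and the interference becomes harmless instead of destructive.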