[Branch Prediction] Literature Notes on Processor Branch Prediction (2)

[Reference] Casey, Kevin, M. Anton Ertl, and David Gregg. “Optimizing Indirect Branch Prediction Accuracy in Virtual Machine Interpreters.” ACM Trans. Program. Lang. Syst. 29, no. 6 (October 2007). doi:10.1145/1286821.1286828.

[Key points]

1. For some applications, such as interpreters, a BTB achieves a prediction accuracy of only 2%-50%.

2. An optimization-oriented paper.
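
To make point 1 concrete, here is a toy model of why a BTB struggles with interpreter dispatch. It is only a sketch, not the paper's experiment: the BTB is modeled as "predict the last target seen for this branch", and the bytecode program, opcode names, and branch labels are all made up for illustration.

```python
def btb_accuracy(trace):
    """Simulate a BTB that predicts each branch's last-seen target.

    trace is a list of (branch_address, actual_target) pairs.
    """
    btb = {}  # branch address -> last target
    hits = 0
    for branch, target in trace:
        if btb.get(branch) == target:
            hits += 1
        btb[branch] = target
    return hits / len(trace)


# Hypothetical VM program: the opcode sequence "A B A C" repeated.
program = ["A", "B", "A", "C"] * 250

# Switch dispatch: one shared indirect branch jumps to every opcode handler.
switch_trace = [("dispatch", op) for op in program]

# Threaded code: each handler ends with its own indirect branch to the next op.
threaded_trace = [
    (f"end_of_{program[i - 1]}", program[i]) for i in range(1, len(program))
]

print(f"switch dispatch: {btb_accuracy(switch_trace):.1%}")
print(f"threaded code  : {btb_accuracy(threaded_trace):.1%}")
```

With a single shared dispatch branch the target changes on every execution, so a last-target predictor almost never hits. Giving each handler its own branch helps, but an opcode followed by different successors (here `A`) still mispredicts.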

 

[Reference] Li, Tao, L.K. John, Anand Sivasubramaniam, N. Vijaykrishnan, and J. Rubio. “OS-Aware Branch Prediction: Improving Microprocessor Control Flow Prediction for Operating Systems.” IEEE Transactions on Computers 56, no. 1 (January 2007): 2–17. doi:10.1109/TC.2007.250619.

[Key points]

1. T. Yeh and Y.N. Patt, “Two-Level Adaptive Branch Prediction,” Proc. 24th Int’l Symp. Microarchitecture, pp. 51-61, 1991.

Most current high-performance processors use dynamic branch predictions.

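2. Points out that running under an operating system means the branch prediction resources are shared. With the operating system's interference in the mix, prediction accuracy drops (in a fair number of cases mispredictions rise by about half).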

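3. Adding more resources brings no noticeable improvement either.

4. Rather than developing a new predictor, the work improves existing predictors in a compatible way.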

5.

 

6. States: however, the fixed sizes of branch predictor tables, constrained by chip die area and access latency, make it impossible to hold all of the dynamic branch information.

7. [6], [21]: because resources are limited, branch aliasing can be severely destructive.

8. On Gshare: Gshare [12] uses the “exclusive or” (XOR) of the global history with the low-order address bits of a branch to form a more randomized BHT index.

9. On Agree: The Agree predictor [23] converts instances of destructive aliasing into either constructive or neutral aliasing by attaching each branch with a biasing bit that predicts the most likely outcome of that branch. The 2-bit BHT counter is then evaluated as to whether or not the branch will go in the direction indicated by the biasing bit. The concept behind the Agree predictor is that most branches are highly biased.

10. The main technique is partitioning the predictor resources.
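
The Gshare and Agree descriptions in points 8 and 9 can be sketched as follows. This is a minimal illustration under several assumptions that are mine, not the paper's: the table size, the initial counter values, setting the biasing bit from a branch's first outcome, and defaulting an unseen branch's bias to taken. The one-entry table at the end is deliberately extreme so that two branches are forced to alias.

```python
class Gshare:
    """Gshare sketch: BHT of 2-bit counters indexed by PC XOR global history."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.history = 0                      # global history register
        self.bht = [1] * (1 << index_bits)    # 2-bit counters, weakly not-taken

    def index(self, pc):
        # XOR of the global history with the low-order address bits
        return (pc ^ self.history) & self.mask

    def _bump(self, pc, up):
        i = self.index(pc)
        self.bht[i] = min(3, self.bht[i] + 1) if up else max(0, self.bht[i] - 1)

    def _shift(self, taken):
        self.history = ((self.history << 1) | int(taken)) & self.mask

    def predict(self, pc):
        return self.bht[self.index(pc)] >= 2

    def update(self, pc, taken):
        self._bump(pc, taken)
        self._shift(taken)


class Agree(Gshare):
    """Agree sketch: counters predict "agrees with the biasing bit", not "taken"."""

    def __init__(self, index_bits=10):
        super().__init__(index_bits)
        self.bht = [2] * (1 << index_bits)    # assumption: start weakly "agree"
        self.bias = {}                        # pc -> biasing bit

    def predict(self, pc):
        bias = self.bias.get(pc, True)        # assumption: unseen branch biased taken
        agree = self.bht[self.index(pc)] >= 2
        return bias if agree else not bias

    def update(self, pc, taken):
        bias = self.bias.setdefault(pc, taken)  # assumption: bias = first outcome
        self._bump(pc, taken == bias)           # train on agreement, not direction
        self._shift(taken)                      # history still records direction


def accuracy(predictor, trace):
    correct = 0
    for pc, taken in trace:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct / len(trace)


# Two oppositely biased branches forced onto the same counter (one-entry table).
trace = [(0x40, True), (0x80, False)] * 100
print("gshare:", accuracy(Gshare(index_bits=0), trace))
print("agree :", accuracy(Agree(index_bits=0), trace))
```

In this contrived trace the shared counter is pushed in opposite directions by the two branches under Gshare, while under Agree both branches push it toward "agree", which is the destructive-to-constructive conversion the quote describes.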

 

[Reference] Sendag, R., J.J. Yi, and Peng-fei Chuang. “Branch Misprediction Prediction: Complementary Branch Predictors.” Computer Architecture Letters 6, no. 2 (February 2007): 49–52. doi:10.1109/L-CA.2007.13.

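[Key points]

1. Uses an MPBT to record mispredictions and correct them.

2. Correction works well on the floating-point benchmarks.

3. Very effective for loop bodies that vary.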

4. Still within the cache-style framework.

 

[Reference] Biggar, Paul, Nicholas Nash, Kevin Williams, and David Gregg. “An Experimental Study of Sorting and Branch Prediction.” J. Exp. Algorithmics 12 (June 2008): 1.8:1–1.8:39. doi:10.1145/1227161.1370599.

[Key points]

1. For example, Intel Pentium 4 processors [Intel 2004, 2001] have pipelines of up to 31 stages.

2. Read the “quantitative approach” textbook (Computer Architecture: A Quantitative Approach).

3. Static heuristic: backward branches are predicted taken, forward branches are predicted not taken.

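4. Semi-static: a hint bit, with the branch direction written in advance at compile time.

5. Also mentions the aliasing problem caused by limited resources.

6. True random number site: random.org

7. Resources used: We used a variety of cache configurations; generally speaking we used an 8-KB level-1 data cache, 8-KB level-1 instruction cache, and a shared 2-MB instruction and data level-2 cache, all with 32-byte cache lines.

8. When the values are distributed chaotically, pattern-based methods perform poorly.

9. For particular algorithms such as bubble sort, mispredictions get worse.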
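
A small sketch of points 3, 8, and 9: the static heuristic as a one-line rule, and a count of how often a single 2-bit saturating counter mispredicts the comparison branch of bubble sort. This is my own toy model, not the paper's simulator setup; the counter's starting state, the input size, and the seed are arbitrary choices.

```python
import random


def btfnt(pc, target):
    """Static heuristic: predict taken only for backward branches (e.g. loops)."""
    return target < pc


def bubble_sort_mispredictions(data):
    """Bubble-sort a copy of data, counting mispredictions of the
    "swap needed?" comparison branch under one 2-bit saturating counter."""
    a = list(data)
    counter = 1  # 0-1 predict not-taken, 2-3 predict taken
    branches = misses = 0
    for end in range(len(a) - 1, 0, -1):
        for j in range(end):
            taken = a[j] > a[j + 1]  # the data-dependent branch
            misses += (counter >= 2) != taken
            counter = min(3, counter + 1) if taken else max(0, counter - 1)
            branches += 1
            if taken:
                a[j], a[j + 1] = a[j + 1], a[j]
    return misses, branches


ordered = list(range(200))
shuffled = ordered[:]
random.Random(0).shuffle(shuffled)  # fixed seed so the run is repeatable

print("sorted input   (misses, branches):", bubble_sort_mispredictions(ordered))
print("shuffled input (misses, branches):", bubble_sort_mispredictions(shuffled))
```

On already sorted input the branch is never taken and the counter never misses. On shuffled input the outcome depends on the data, which is the situation point 8 describes.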

 

[Reference] Kwak, J.W., and C. S. Jhon. “High-Performance Embedded Branch Predictor by Combining Branch Direction History and Global Branch History.” IET Computers Digital Techniques 2, no. 2 (March 2008): 142–54. doi:10.1049/iet-cdt:20060130.

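[Key points]

1. Mentions the importance of branch prediction for mobile devices.

2. Earlier predictors were indexed by the address and the global history; this work adds “branch direction history” (BDH) as a further input.

3. New predictor: direction-gshare

4. States that a predictor whose PHT is indexed by the branch address is called a Bimodal predictor.

5. Points out the PHT aliasing problem, which leads into a discussion of applying XOR or other functions to the inputs.

6. The Bi-mode predictor separates the taken table from the not-taken table.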

 

7.

[7] SPRANGLE E, CHAPPELL RS, ALSUP M, ET AL.: ‘The agree predictor: a mechanism for reducing negative branch history interference’. IEEE ISCA ‘97, pp. 284–291

[8] LEE C-C, CHEN I-CK, MUDGE TN: ‘The bi-mode branch predictor’. Int. Symp. Microarchitecture IEEE’97, pp. 4–13

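8. Contains many ARM technical references.

9. LOH G, HENRY DS: ‘Predicting conditional branches with fusion-based hybrid predictors’. 11th Conf. Parallel Architectures and Compilation Techniques (PACT), September 2002

10. Points out that earlier ARM processors, including ARM7, ARM9, ARM10, and ARM11, all use static prediction or a simple Bimodal predictor, while some high-end mobile processors have adopted dynamic branch prediction.

11. Cites several papers to establish that neural predictors achieve high accuracy.

12. Uses code to argue that branch direction information is usable: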

 

the low-level assembly code of the loop-style branch instruction is usually backward-taken, whereas the if-style branch instruction is usually forward-taken. Therefore we propose the additional use of the BDH information as a new component of input vectors for the branch prediction.

 

13. The McFarling paper
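
As a rough illustration of points 4 and 12 above: a Bimodal predictor (PHT of 2-bit counters indexed by the branch address) and a helper that maintains a branch direction history register. The encoding of BDH as "1 = backward branch", the register width, and the table size are assumptions of mine; the paper's actual direction-gshare hashing is not reproduced here.

```python
class Bimodal:
    """Bimodal sketch: PHT of 2-bit counters indexed by low-order address bits."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.pht = [1] * (1 << index_bits)  # start weakly not-taken

    def predict(self, pc):
        return self.pht[pc & self.mask] >= 2

    def update(self, pc, taken):
        i = pc & self.mask
        self.pht[i] = min(3, self.pht[i] + 1) if taken else max(0, self.pht[i] - 1)


def push_direction(bdh, pc, target, bits=8):
    """Shift one direction bit into a BDH register (assumed: 1 = backward branch)."""
    return ((bdh << 1) | int(target < pc)) & ((1 << bits) - 1)


# A loop-style branch: taken nine times, then falls through, repeated ten times.
loop_pc = 0x400
outcomes = ([True] * 9 + [False]) * 10
predictor = Bimodal()
correct = 0
for taken in outcomes:
    correct += predictor.predict(loop_pc) == taken
    predictor.update(loop_pc, taken)
print(f"bimodal accuracy on the loop branch: {correct / len(outcomes):.0%}")

bdh = 0
bdh = push_direction(bdh, pc=0x400, target=0x3C0)  # backward (loop-style)
bdh = push_direction(bdh, pc=0x410, target=0x440)  # forward (if-style)
print(f"BDH register: {bdh:08b}")
```

The Bimodal counter settles on "taken" for the loop branch and only misses at loop exits. A predictor that also sees the BDH bits could in principle tell loop-style from if-style branches apart before any outcome history exists, which is the argument quoted in point 12.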

 

[Reference] Jiménez, Daniel A. “Generalizing Neural Branch Prediction.” ACM Trans. Archit. Code Optim. 5, no. 4 (March 2009): 17:1–17:27. doi:10.1145/1498690.1498692.

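[Key points]

1. Mentions the large impact on deep pipelines: Sprangle and Carmean 2002

Has many IEEE references

2. Hardware budgets of 32KB and 256KB were used for the tests.

3. Origin of the perceptron: The perceptron predictor [Jiménez and Lin 2001]

4. Mentions that neural approaches have high latency

5. Points out that neural approaches have the highest accuracy among single-component predictors

6. The earliest neural designs could not be deployed because of their high latency

7. Includes a literature review of neural methods

8. Uses a three-dimensional array to record information, which gives the predictor piecewise-linear decision surfaces. Ideally the array would be large enough; in the experiments it is scaled down.

9. Path-Based Neural Predictor. We simulate the path-based neural predictor [Jiménez 2003].

10. Some applications still perform poorly, with misprediction rates above 10%, which gives a paper something to build on

Problems I identified myself: high resource consumption and latency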
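
For orientation, here is a minimal sketch of the original perceptron predictor mentioned in point 3 (not the piecewise-linear design of point 8). The table size, history length, and indexing by `pc % n` are illustrative assumptions, and weight saturation is omitted; the threshold uses the commonly cited form, floor(1.93 * h + 14).

```python
class PerceptronPredictor:
    """Sketch of a perceptron branch predictor (after Jiménez and Lin 2001)."""

    def __init__(self, history_len=16, num_perceptrons=256):
        self.n = num_perceptrons
        self.theta = int(1.93 * history_len + 14)  # training threshold
        # One weight vector per table entry; weights[k][0] is the bias weight.
        self.weights = [[0] * (history_len + 1) for _ in range(num_perceptrons)]
        self.history = [-1] * history_len          # +1 = taken, -1 = not taken

    def output(self, pc):
        w = self.weights[pc % self.n]
        return w[0] + sum(wi * xi for wi, xi in zip(w[1:], self.history))

    def predict(self, pc):
        return self.output(pc) >= 0

    def update(self, pc, taken):
        y = self.output(pc)
        t = 1 if taken else -1
        # Train on a misprediction or when the output is not confident enough.
        if (y >= 0) != taken or abs(y) <= self.theta:
            w = self.weights[pc % self.n]
            w[0] += t
            for i, xi in enumerate(self.history, start=1):
                w[i] += t * xi
        self.history = self.history[1:] + [t]


# A strictly alternating branch: taken, not taken, taken, ...
p = PerceptronPredictor()
results = []
for k in range(400):
    taken = k % 2 == 0
    results.append(p.predict(0x40) == taken)
    p.update(0x40, taken)
print("correct in the last 100 predictions:", sum(results[-100:]))
```

Every prediction needs a dot product over the whole history, which hints at the latency and storage costs noted above.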

 

[Reference] Kim, Hyesoon, J. Joao, O. Mutlu, Chang Joo Lee, Y.N. Patt, and R. Cohn. “Virtual Program Counter (VPC) Prediction: Very Low Cost Indirect Branch Prediction Using Conditional Branch Prediction Hardware.” IEEE Transactions on Computers 58, no. 9 (September 2009): 1153–70. doi:10.1109/TC.2008.227.

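[Key points]

1. Points out that a BTB can handle indirect branches, but it is only about 50% effective.

2. A VPC predictor treats an indirect branch as a sequence of multiple conditional branches.

3. Reuses the existing PHT for prediction.

4. Also frontier research, probing the limits of prediction performance.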
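
The idea in points 2 and 3 can be sketched as a prediction loop. This is a simplified illustration and not the paper's hardware algorithm: `MAX_ITER`, the hash in `vpc_address`, and passing the conditional predictor in as a callable are all assumptions, and training is left out.

```python
MAX_ITER = 4  # assumed cap on virtual branches tried per indirect branch


def vpc_address(pc, i):
    """Hypothetical hash for the i-th virtual PC; iteration 0 uses the real PC."""
    return pc if i == 0 else pc ^ ((0x9E3779B1 * i) & 0xFFFFFFFF)


def vpc_predict(pc, ghr, btb, predict_taken):
    """Predict an indirect branch target by walking its virtual conditional branches.

    btb maps a (virtual) PC to a target; predict_taken(vpca, vghr) stands in for
    the existing conditional predictor. Returns a target or None (no prediction).
    """
    vghr = ghr
    for i in range(MAX_ITER):
        vpca = vpc_address(pc, i)
        target = btb.get(vpca)
        if target is None:
            return None        # BTB miss: nothing to predict with
        if predict_taken(vpca, vghr):
            return target      # first virtual branch predicted taken wins
        vghr <<= 1             # record a not-taken virtual branch in the history
    return None


pc = 0x4000
btb = {vpc_address(pc, i): t for i, t in enumerate([0x5000, 0x6000, 0x7000])}
likely = {vpc_address(pc, 1)}  # pretend the predictor favors the second target
print(hex(vpc_predict(pc, 0, btb, lambda vpca, vghr: vpca in likely)))
```

Each candidate target lives in an ordinary BTB entry under its virtual PC, so no separate indirect-branch table is needed, which is where the "very low cost" in the title comes from.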

 

[Reference] Panda, R., P.V. Gratz, and D.A Jimenez. “B-Fetch: Branch Prediction Directed Prefetching for In-Order Processors.” Computer Architecture Letters 11, no. 2 (July 2012): 41–44. doi:10.1109/L-CA.2011.33.

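[Key points]

1. Points out that processor frequencies keep rising while memory speed has not kept pace.

2. Cites several processor technology papers

3. Points out that some processors aggregate in-order cores to obtain low power and better throughput.

4. Uses a dedicated structure for cache prefetching.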

 

 
