Value Locality and Load Value Prediction
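Preface
The authors of the paper observe that values also exhibit locality. Load instructions in particular often fetch the same value from memory again and again. Based on this, they propose a new mechanism that predicts the value a load will return, much as a branch predictor predicts branch outcomes.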
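I. Value Locality
Value locality shows up in the following ways:
The authors also point out, however, that value locality depends heavily on the compiler, because optimizations such as common subexpression elimination may already remove some of these redundant loads.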
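One rough way to quantify the idea is to count how often a load returns the same value it returned the last time the same instruction executed. The sketch below does this over a small trace. The trace format (pairs of load PC and loaded value), the function name `value_locality`, and the sample data are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def value_locality(trace):
    """Fraction of dynamic loads whose value equals the previous value
    loaded by the same static load (identified by its PC)."""
    last_value = {}  # load PC -> most recent value loaded at that PC
    hits = 0
    for pc, value in trace:
        if pc in last_value and last_value[pc] == value:
            hits += 1
        last_value[pc] = value
    return hits / len(trace) if trace else 0.0


# Hypothetical trace: the load at 0x400 keeps returning 7,
# while the load at 0x404 returns a different value every time.
trace = [(0x400, 7), (0x404, 1), (0x400, 7), (0x404, 2), (0x400, 7), (0x404, 3)]
locality = value_locality(trace)
```

In this made-up trace only the repeated executions of the load at `0x400` count as hits, so a compiler that hoisted that load out of the loop would lower the measured locality.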
II. How It Works
1. Load Value Prediction Table
The table is indexed by the load instruction address and outputs the predicted value stored in the indexed entry.
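The indexing scheme can be sketched as a small direct-mapped table. This is a minimal illustration, not the paper's exact design: the table size, the assumption of 4-byte instructions, the absence of tags, and the class and method names are all assumptions made here.

```python
class LoadValuePredictionTable:
    """Direct-mapped table: load PC -> last value loaded (sketch)."""

    def __init__(self, num_entries=1024):
        # num_entries is an assumed size, not a figure from the paper.
        self.num_entries = num_entries
        self.values = [0] * num_entries

    def index(self, pc):
        # Assumes 4-byte instructions: drop the two low bits,
        # then keep the low-order bits as the table index.
        return (pc >> 2) % self.num_entries

    def predict(self, pc):
        # No tag check in this sketch, so two loads that map to the
        # same entry share (and can disturb) each other's prediction.
        return self.values[self.index(pc)]

    def update(self, pc, actual_value):
        self.values[self.index(pc)] = actual_value


lvpt = LoadValuePredictionTable(num_entries=16)
lvpt.update(0x1000, 42)
prediction = lvpt.predict(0x1000)
aliased_prediction = lvpt.predict(0x1000 + 16 * 4)  # maps to the same entry
```

Because the sketch has no tags, the aliased load receives the value written by the first load, which is the price of keeping the table small and simple.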
2. Dynamic Load Classification
Loads can be classified dynamically into the following categories:
- don’t predict
- predict: verification
- highly predicted
- constant
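For a load in the predict category, the prediction is verified against the value read from memory. For a highly predicted load, the prediction is verified against the Constant Verification Unit instead. Constant loads are handled the same way as highly predicted loads.
These four categories behave much like the not taken → taken state transitions in branch prediction.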
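The analogy with branch prediction suggests a 2-bit saturating counter per load. The sketch below maps the four counter values onto the four categories listed above. That mapping, the per-PC dictionary, and the names `LoadClassifier`, `classify`, and `train` are assumptions for illustration; the paper's exact thresholds may differ.

```python
CATEGORIES = ["don't predict", "predict", "highly predicted", "constant"]


class LoadClassifier:
    """One 2-bit saturating counter per load PC (sketch)."""

    def __init__(self):
        self.counters = {}  # load PC -> counter value in 0..3

    def classify(self, pc):
        return CATEGORIES[self.counters.get(pc, 0)]

    def train(self, pc, prediction_correct):
        counter = self.counters.get(pc, 0)
        if prediction_correct:
            counter = min(counter + 1, 3)  # saturate at "constant"
        else:
            counter = max(counter - 1, 0)  # saturate at "don't predict"
        self.counters[pc] = counter


classifier = LoadClassifier()
history = [classifier.classify(0x400)]
for outcome in [True, True, True, True, False]:
    classifier.train(0x400, outcome)
    history.append(classifier.classify(0x400))
```

In this run the load climbs one category per correct prediction, stays at the top once saturated, and drops back one step after the misprediction.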
3. Constant Verification Unit
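This unit stores constant values and is kept coherent with the cache. Entries are filled when a load executes. If the CPU executes a store whose address matches an address held in the unit, the matching entry is invalidated. This guarantees that any value still present in the unit is up to date and therefore constant.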
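The fill and invalidate behaviour described here can be sketched with a dictionary keyed by data address. This is a simplification under stated assumptions: real hardware would use a small fixed-size associative structure, and the class and method names are hypothetical.

```python
class ConstantVerificationUnit:
    """Holds values that no store has touched since they were loaded (sketch)."""

    def __init__(self):
        # data address -> value; unbounded here, unlike real hardware.
        self.entries = {}

    def on_load(self, address, value):
        # Entries are filled when a load executes.
        self.entries[address] = value

    def on_store(self, address):
        # A store to a matching address invalidates the entry.
        self.entries.pop(address, None)

    def verify(self, address, predicted_value):
        # True only if the entry survived and matches the prediction,
        # in which case no memory access is needed to confirm it.
        return address in self.entries and self.entries[address] == predicted_value


cvu = ConstantVerificationUnit()
cvu.on_load(0x2000, 99)
before_store = cvu.verify(0x2000, 99)
cvu.on_store(0x2000)
after_store = cvu.verify(0x2000, 99)
```

After the store, verification fails and the load has to fall back to reading memory.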
4. The Load Value Prediction Unit
This part shows the interfaces between the individual units.
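To make the interaction between the three units concrete, the sketch below runs a short trace through a prediction table, a classification counter, and a constant verification store. Everything about it is an assumption for illustration: the trace format, the `run_loads` name, the counter thresholds, and the use of plain dictionaries in place of fixed-size hardware tables.

```python
def run_loads(trace, memory):
    """Simulate loads and stores through simplified LVP structures."""
    lvpt = {}      # load PC -> last loaded value
    counters = {}  # load PC -> 2-bit saturating counter
    cvu = {}       # data address -> value known to be unchanged
    stats = {"memory_verified": 0, "cvu_verified": 0, "mispredicted": 0}

    for op, pc, addr, data in trace:
        if op == "store":
            memory[addr] = data
            cvu.pop(addr, None)  # a matching store invalidates the entry
            continue

        state = counters.get(pc, 0)  # 0 = don't predict ... 3 = constant
        predicted = lvpt.get(pc)

        # Highly predicted / constant loads: verify against the CVU.
        if state >= 2 and addr in cvu and cvu[addr] == predicted:
            stats["cvu_verified"] += 1
            counters[pc] = min(state + 1, 3)
            continue  # no memory access needed

        # Otherwise read memory and compare with the prediction.
        actual = memory[addr]
        correct = predicted == actual
        if state >= 1:
            stats["memory_verified" if correct else "mispredicted"] += 1
        counters[pc] = min(state + 1, 3) if correct else max(state - 1, 0)
        lvpt[pc] = actual
        if correct and counters[pc] >= 2:
            cvu[addr] = actual  # fill the CVU once the load looks stable

    return stats


memory = {0x100: 5}
trace = [("load", 0x40, 0x100, None)] * 6
trace += [("store", 0x80, 0x100, 9)]
trace += [("load", 0x40, 0x100, None)] * 2
stats = run_loads(trace, memory)
```

In this made-up trace the load warms up, is verified through the CVU three times without touching memory, mispredicts once after the store changes the value, and then has to be verified through memory again.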
5. Framework
The overall block diagram:
Notes
We see that instructions in the branch (BRU) and multi-cycle integer (MCFX) units experience the least reductions in true dependency resolution time. This makes sense, since both branches and move-from-special-purpose-register (mfspr) instructions are waiting for operand types (link register, count register, and condition code registers) that the LVP mechanism does not predict.
Conversely, the dramatic reductions seen for floating-point (FPU), single-cycle fixed point (SCFX), and load/store (LSU) instructions correspond to the fact that operands for them are predicted.