CPU峰值性能就是CPU运算能力满打满算最最理想情况下的性能,这只有理论意义,实际性能要以软件实测为准。有人问寡人峰值性能怎么算,这里就很简单地说两句。搞计算化学的一般只关注浮点性能,所以这里只提峰值浮点性能。
峰值浮点性能=CPU核数CPU频率每周期执行的浮点操作数
时下普通的CPU的单精度(SP)浮点性能是双精度(DP)浮点性能的两倍。目前常见的几类CPU内核的每周期浮点操作数以及细节如下(引自网络,见http://stackoverflow.com/questions/15655835/flops-per-cycle-for-sandy-bridge-and-haswell-sse2-avx-avx2)
Intel Core 2 and Nehalem:
4 DP FLOPs/cycle: 2-wide SSE2 addition + 2-wide SSE2 multiplication
8 SP FLOPs/cycle: 4-wide SSE addition + 4-wide SSE multiplication
Intel Sandy Bridge/Ivy Bridge:
8 DP FLOPs/cycle: 4-wide AVX addition + 4-wide AVX multiplication
16 SP FLOPs/cycle: 8-wide AVX addition + 8-wide AVX multiplication
Intel Haswell:
16 DP FLOPs/cycle: two 4-wide FMA (fused multiply-add) instructions
32 SP FLOPs/cycle: two 8-wide FMA (fused multiply-add) instructions
AMD K10:
4 DP FLOPs/cycle: 2-wide SSE2 additi