A Way to Reduce Power Consumption While Maintaining CPU Performance

I just noticed that this technique appears to have been used in IBM POWER7 :)

We get the CPI from the equation below (assuming there are only two cache levels and L2 is an inclusive, unified cache):

 

CPI = exe + 0.67 * (hit_LI1 + LI1_missrate * (L2_hit + L2_missrate * L2_penalty)) + 0.33 * (hit_L1 + L1_missrate * (L2_hit + L2_missrate * L2_penalty))

exe: average execution cycles per instruction.

hit_LI1: cycles for a hit in the L1 instruction cache (1 cycle when the pipeline is full).

hit_L1: cycles for a hit in the L1 data cache (1 cycle when the pipeline is full).

LI1_missrate: miss ratio of the L1 instruction cache.

L1_missrate: miss ratio of the L1 data cache.

L2_hit: cycles for a hit in the L2 cache.

L2_missrate: miss ratio of the L2 cache.

L2_penalty: miss penalty in cycles when an L2 cache miss occurs.
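
The equation above can be written as a small function. This is only a sketch: the function name `cpi` and every number in the usage line are made up for illustration, and the 0.67/0.33 weights are taken directly from the equation.

```python
def cpi(exe, hit_li1, li1_missrate, hit_l1, l1_missrate,
        l2_hit, l2_missrate, l2_penalty):
    """Evaluate the two-level-cache CPI equation from the text."""
    # Average cost of going to L2 after an L1 miss (hit time plus
    # the expected L2 miss penalty).
    l2_cost = l2_hit + l2_missrate * l2_penalty
    # Average instruction-cache and data-cache access cost.
    icache = hit_li1 + li1_missrate * l2_cost
    dcache = hit_l1 + l1_missrate * l2_cost
    # Weights 0.67 and 0.33 are the ones used in the equation above.
    return exe + 0.67 * icache + 0.33 * dcache


# Hypothetical numbers, not measurements.
example = cpi(exe=1.0, hit_li1=1, li1_missrate=0.02,
              hit_l1=1, l1_missrate=0.05,
              l2_hit=10, l2_missrate=0.2, l2_penalty=100)
print(round(example, 3))
```

With these assumed inputs the cache terms contribute almost twice as much as exe itself, which is the situation the rest of this post is about.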

 

The L1/L2 hit latencies are determined by the cache unit, and L2_penalty by the system bus.

Total execution time = CPI * clock cycle time * instruction count.
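
As a quick sanity check of the execution-time formula, here is a minimal sketch. The function name `execution_time_s` and the input numbers are hypothetical; the clock cycle time is expressed as 1 / frequency.

```python
def execution_time_s(cpi, freq_hz, instruction_count):
    """Total execution time = CPI * clock cycle time * instruction count."""
    clock_cycle_time_s = 1.0 / freq_hz
    return cpi * clock_cycle_time_s * instruction_count


# Hypothetical workload: 1e9 instructions, CPI of 3.0, 2 GHz clock.
t = execution_time_s(cpi=3.0, freq_hz=2e9, instruction_count=1e9)
print(t)  # seconds
```

Note that lowering freq_hz only leaves the result unchanged if the CPI drops by the same factor.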

 

If a program's speed depends heavily on cache misses, then lowering the CPU frequency MAY save power without hurting performance. For example, at a lower frequency the core needs more time to process the current data; if the dcache unit runs on an independent clock, it fetches the next line more effectively in the meantime, and the cache miss penalty, measured in CPU cycles, also becomes smaller.

Because the CPI from the equation above gets smaller, the total execution time at the lower CPU frequency could stay about the same as at the higher frequency.

Typically the cache access latency is 4 to 6 cycles. If the CPU frequency drops to half of the original while the cache keeps its own clock, the same access needs only 2 to 3 CPU cycles, so the CPI becomes smaller.
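
To make the argument concrete, the sketch below keeps the cache and memory latencies fixed in nanoseconds (the "independent clock" assumption) and converts them to CPU cycles at a given core frequency. All names and numbers here are assumptions chosen for illustration, not data from a real processor.

```python
import math


def cycles(latency_ns, freq_ghz):
    """CPU cycles needed to cover a fixed latency (at least one cycle)."""
    return max(1, math.ceil(latency_ns * freq_ghz))


def cpi_at(freq_ghz, exe=1.0, l1_ns=2.0, l2_ns=5.0, mem_ns=100.0,
           li1_missrate=0.02, l1_missrate=0.05, l2_missrate=0.2):
    """CPI from the equation in the text, with latencies fixed in ns."""
    hit_l1 = cycles(l1_ns, freq_ghz)       # used for both L1 I and L1 D
    l2_hit = cycles(l2_ns, freq_ghz)
    l2_penalty = cycles(mem_ns, freq_ghz)
    l2_cost = l2_hit + l2_missrate * l2_penalty
    return (exe
            + 0.67 * (hit_l1 + li1_missrate * l2_cost)
            + 0.33 * (hit_l1 + l1_missrate * l2_cost))


def ns_per_instruction(freq_ghz):
    """Average time per instruction: CPI * clock cycle time."""
    return cpi_at(freq_ghz) / freq_ghz


for f in (2.0, 1.0):
    print(f, "GHz: CPI =", round(cpi_at(f), 4),
          "ns/instr =", round(ns_per_instruction(f), 4))
```

Under these assumed numbers, halving the frequency from 2 GHz to 1 GHz cuts the L1 hit from 4 cycles to 2 and lowers the CPI from roughly 6.5 to roughly 3.75, while the time per instruction rises by only about 15%. The exe term still scales with the clock, so the gap shrinks only as the workload becomes more memory-bound; whether the power saved is worth it depends on the real miss rates.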

 

 

 
