Reevaluating Amdahl's Law in the Multicore Era

tatetian

Article link: https://blog.csdn.net/likehightime/article/details/4498318

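On August 27, Professor Xian-He Sun, chair of the Department of Computer Science at IIT (Illinois Institute of Technology), gave a talk at Tsinghua titled "Reevaluating Amdahl's Law in the Multicore Era". I attended and found it thought-provoking. The main points of the talk:

Title: Reevaluating Amdahl's Law in the Multicore Era

Speaker: Xian-He Sun

Date: 2009-08-27

Venue: Tsinghua University, FIT Building, Room 1-415

Content: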

High Performance Computing

Scalable computing: the way to high performance --> Multicore

Amdahl's Law: SpeedUp = 1 / ((1-f) + f/n) < 1/(1-f), where f is the parallelizable fraction of the work and n is the number of cores

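Sun's opening question was: "Why haven't CPUs with hundreds of cores become mainstream?" According to Sun, the reason is not that they are technically infeasible. Rather, computer scientists long believed, based on the speedup predicted by Amdahl's Law, that once n > 8 the speedup grows only marginally as n increases, so a 100-core CPU is not much faster than a 10-core one. Under that view, a CPU with hundreds of cores looks impressive but impractical.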
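To make the diminishing returns concrete, here is a minimal sketch that evaluates the Amdahl formula above for a few core counts. The parallel fraction f = 0.9 is an assumed value chosen only for illustration; it is not a number from the talk.

```python
def amdahl_speedup(f: float, n: int) -> float:
    """Amdahl's Law speedup for parallel fraction f on n cores."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be in [0, 1]")
    if n < 1:
        raise ValueError("n must be >= 1")
    return 1.0 / ((1.0 - f) + f / n)


if __name__ == "__main__":
    f = 0.9  # assumed parallel fraction, for illustration only
    for n in (1, 2, 4, 8, 10, 100, 1000):
        print(f"n={n:>4}  speedup={amdahl_speedup(f, n):6.2f}")
    # The speedup can never exceed this bound, no matter how many cores.
    print(f"upper bound 1/(1-f) = {1.0 / (1.0 - f):.2f}")
```

With f = 0.9, going from 10 to 100 cores raises the speedup only from about 5.3 to about 9.2, and the bound 1/(1-f) = 10 is never reached.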

Hill & Marty, "Amdahl's Law in the Multicore Era", IEEE Computer, 2008

Scalable Computing:

Gustafson's law[1988]: fixed-time speedup model SpeedUp = (1-f) + nf

Sun and Ni's Law[1990]: memory-bounded speedup model

Gustafson's law changed that view: it shows that, within a fixed amount of time, the speedup can grow linearly with n.
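The two scalable-computing models can be compared side by side in a short sketch. The Gustafson formula is the one given above. The Sun-Ni formula is not written out in these notes; the form used here is the commonly cited one, in which a hypothetical function g(n) describes how much the parallel workload grows when memory grows n times. Treat it as an assumption rather than a transcription of the talk.

```python
def gustafson_speedup(f: float, n: int) -> float:
    """Fixed-time (Gustafson) speedup: the parallel part scales with n."""
    return (1.0 - f) + n * f


def sun_ni_speedup(f: float, n: int, g) -> float:
    """Memory-bounded (Sun-Ni) speedup in its commonly cited form.

    g(n) is the factor by which the parallel workload grows when the
    available memory grows n times (an assumed, caller-supplied model).
    """
    scaled = f * g(n)
    return ((1.0 - f) + scaled) / ((1.0 - f) + scaled / n)


if __name__ == "__main__":
    f, n = 0.9, 100  # assumed values, for illustration only
    print("Gustafson:        ", gustafson_speedup(f, n))
    print("Sun-Ni, g(n) = 1: ", sun_ni_speedup(f, n, lambda k: 1))  # fixed-size workload
    print("Sun-Ni, g(n) = n: ", sun_ni_speedup(f, n, lambda k: k))  # workload grows with n
```

Under this form, g(n) = 1 reduces to Amdahl's Law and g(n) = n reduces to Gustafson's law, so the memory-bounded model sits between the two depending on how the workload scales with memory.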

Memory-wall: speed gap between CPU and memory access => data access became the bottleneck

Multicore is scalable, under the assumption that the access time of memory is fixed (but this assumption does not hold all the time).

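With memory taken into account, saying that multicore is scalable means that the speedup can grow linearly with n (the number of cores). Sun explained that, as long as the memory access time is kept fixed, multicore can be proven to be scalable. The core of Sun's research therefore becomes how to keep the access time of memory fixed. In general, the access time varies because of cache misses, so the problem turns into how to reduce cache misses.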
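Why cache misses make the access time vary can be sketched with the standard textbook average memory access time (AMAT) model. This model and all cycle counts below are assumptions for illustration; they are not figures from the talk.

```python
def average_access_time(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Textbook AMAT model: every access pays hit_time, misses also pay miss_penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be in [0, 1]")
    return hit_time + miss_rate * miss_penalty


if __name__ == "__main__":
    hit_time = 1.0        # cycles, hypothetical
    miss_penalty = 200.0  # cycles to reach main memory, hypothetical
    for miss_rate in (0.0, 0.01, 0.05, 0.20):
        amat = average_access_time(hit_time, miss_rate, miss_penalty)
        print(f"miss rate {miss_rate:4.0%} -> {amat:5.1f} cycles per access")
```

In this model the average access time is only constant if the miss rate is constant, which is why reducing and stabilizing cache misses is the natural target.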

Sun's research group published a paper at Supercomputing that proposes both hardware and software solutions. The core idea is data prefetching and prediction.

Hardware: Data Access History Cache --> use different cache strategies according to the cache history to facilitate different applications

Software (I got distracted and missed this part): Push IO
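The general idea of predicting accesses from history can be illustrated with a toy simulation. The sketch below is NOT the Data Access History Cache design from the paper; it is a generic stride prefetcher that remembers only the last stride, paired with a tiny LRU cache. All names and sizes are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fully associative LRU cache holding block addresses."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def contains(self, block: int) -> bool:
        return block in self.blocks

    def touch(self, block: int) -> None:
        """Insert the block or mark it most recently used; evict the LRU block if full."""
        self.blocks[block] = True
        self.blocks.move_to_end(block)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)


def count_misses(trace, capacity: int = 8, prefetch: bool = False) -> int:
    """Count demand misses for a block trace, optionally with stride prefetching."""
    cache = LRUCache(capacity)
    misses = 0
    last = None
    last_stride = None
    for block in trace:
        if not cache.contains(block):
            misses += 1
        cache.touch(block)
        if last is not None:
            stride = block - last
            # The same non-zero stride was seen twice in a row:
            # predict the next block and bring it in ahead of time.
            if prefetch and stride != 0 and stride == last_stride:
                cache.touch(block + stride)
            last_stride = stride
        last = block
    return misses


if __name__ == "__main__":
    trace = list(range(0, 64, 4))  # 16 accesses with a constant stride of 4
    print("misses without prefetching:", count_misses(trace))
    print("misses with prefetching:   ", count_misses(trace, prefetch=True))
```

On this constant-stride trace, every access misses without prefetching, while the prefetcher misses only on the first three accesses it needs to learn the stride. Irregular traces would need a different predictor, which is the motivation for choosing strategies based on access history.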

Conclusions:

1. Cloud Computing & multicore/manycore architecture lead the future of computing.

2. Scaling up the number of cores can continually improve performance if the data access delay is fixed.

3. Data access is the killing factor of performance.

4. Mitigating memory-wall by data prefetching:

- Data Access History Cache

- Server-based Push Prefetching

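Future work:

Because different applications have different memory access patterns, achieving the best results requires studying application-specific memory access acceleration.

Takeaways:

The memory-wall problem, i.e. the speed gap between CPU and memory, looks unlikely to improve in the short term and will probably get worse. The traditional approach has always been to design algorithms that consume the least CPU time, i.e. to optimize time complexity, without considering memory access time at all. Yet it is precisely this factor that will increasingly limit how fast programs run. The most important reason Judy arrays can be faster than hash tables (when the table is very, very large) is that they treat reducing cache misses as the top priority of the implementation (which is why the implementation is so complex...).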
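The point that time complexity alone misses memory behavior can be seen in a small simulation. The sketch below is not related to the Judy array implementation; it counts misses in a hypothetical fully associative LRU cache for two traversals of the same 64 x 64 row-major matrix. Both traversals perform exactly the same number of accesses. The cache geometry is assumed for illustration.

```python
from collections import OrderedDict


def simulate_misses(addresses, line_size: int = 8, num_lines: int = 16) -> int:
    """Count misses in an LRU cache of num_lines lines, each holding line_size elements."""
    cache = OrderedDict()
    misses = 0
    for addr in addresses:
        line = addr // line_size
        if line in cache:
            cache.move_to_end(line)  # hit: mark the line most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses


def row_major(rows: int, cols: int):
    """Visit elements in the order they are laid out in memory."""
    return (r * cols + c for r in range(rows) for c in range(cols))


def column_major(rows: int, cols: int):
    """Visit the same elements column by column, jumping cols elements each step."""
    return (r * cols + c for c in range(cols) for r in range(rows))


if __name__ == "__main__":
    rows = cols = 64
    print("row-major misses:   ", simulate_misses(row_major(rows, cols)))
    print("column-major misses:", simulate_misses(column_major(rows, cols)))
```

In this model the row-major traversal misses once per cache line (512 misses), while the column-major traversal misses on every one of its 4096 accesses, even though both have identical time complexity.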
