Very brief notes on Kris Kaspersky's book Code Optimization: Effective Memory Usage

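I read Code Optimization: Effective Memory Usage by Kris Kaspersky (no relation to the Kaspersky antivirus company, despite the name).
The book mainly targets C/C++, but much of it applies more broadly. It explains not only how to optimize code but also why each optimization works, grounding the explanation in how computers operate: how the CPU, RAM, and cache behave, and how compilers work.
It also gives before-and-after benchmark results, which makes it quite useful for programmers who care about efficiency.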

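Criteria for deciding whether to optimize
    Optimization should be as hardware-independent as possible and should port easily to other operating systems, without extra work or a significant loss of efficiency.
    Optimization should not add more than 15% to the development effort (including testing). Ideally, all critical algorithms should be implemented as separate libraries.
    An optimized algorithm should deliver a performance gain of at least 20%.
    Optimization should leave room for code changes without side effects. In other words, it should not come at the cost of code flexibility.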

Basic rules of optimization
Rule 1
Before proceeding with code optimization, develop a reliable, nonoptimized version of the same code. This means that before you start optimizing code, make sure that the program works correctly.
Rule 2
Use algorithmic optimization, rather than features of the system, to achieve the greatest performance gain.
Rule 3
Don't confuse code optimization with assembly implementation.
Rule 4
Before you try to rewrite a program in the assembly language, review the assembly code generated by the compiler and evaluate its efficiency.
Rule 5
If the assembly listing produced by the compiler is perfect, but the program still runs slowly, load it into a disassembler.
Rule 6
If the available processor commands allow you to implement the algorithm more efficiently, leave the compiler alone and start implementing assembly code.
Rule 7
When developing assembly code, create an elegant and efficient solution, free of bells and whistles.

Common myths
Myth 1
My compiler will optimize everything for me.
Myth 2
Maximum efficiency can be achieved only when programming in the pure assembly language; programming in a high-level language doesn't allow such a result.
Myth 3
Humans, unlike an optimizing compiler, are unable to account for all of the features of processor architecture.
Myth 4
The x86 processors are not worth using; PowerPC must be used to understand what true performance is.

Steps:
1.    Profiling. Going by the 90-90 rule ("The first 90% of the code accounts for the first 90% of the development time. The remaining 10% of the code accounts for the other 90% of the development time."),
    the purpose of profiling is to find the code that has the greatest impact on performance, also called the hotspot. Among C/C++ profiling tools, Intel VTune is still the most powerful.
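
When a full profiler is not at hand, a suspected hotspot can be timed by hand. The following is only a minimal sketch under my own assumptions: the names `time_call`, `sum_job`, and `sum_job_run` are hypothetical, and `clock()` measures CPU time at coarse resolution, so the call is repeated and averaged.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical workload: sums an array so there is something to time. */
struct sum_job {
    const int *data;
    size_t n;
    long result;
};

static void sum_job_run(void *arg)
{
    struct sum_job *job = arg;
    long s = 0;
    for (size_t i = 0; i < job->n; i++)
        s += job->data[i];
    job->result = s;
}

/* Hypothetical helper: runs fn(arg) `repeats` times and returns the
   average CPU time per call in seconds, as reported by clock(). */
static double time_call(void (*fn)(void *), void *arg, int repeats)
{
    assert(repeats > 0);
    clock_t start = clock();
    for (int i = 0; i < repeats; i++)
        fn(arg);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / repeats;
}
```

Calling `time_call(sum_job_run, &job, 1000)` on two versions of the same routine gives a rough comparison; it does not replace a profiler, which shows where in the whole program the time goes.
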
2.    RAM Subsystem: RAM-level optimization
    Optimizing Memory Operation    Recommendations
            Here is a brief list of recommendations that will have the greatest influence on memory performance:

            Unroll loops that read memory
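                (Fewer iterations of a memory-reading loop: cutting the iteration count to about 1/8 reduces processing time by roughly 1/2. It makes little difference for memory writes.)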

            Eliminate data dependence
                (Depending on data produced earlier in the same block of code lowers efficiency.)
            Send several queries to the memory controller simultaneously
                (Issue several requests at once.)
            Request data for reading with increments of no less than 32 bytes

            Use all requested pages

            Process data with an increment that eliminates hits to the same DRAM page

            Create virtual data flows

            Process data in double words

            Align the addresses of data sources

            Combine code execution with memory reads

            Group read and write operations

            Access the memory only when necessary

            Never optimize a program for a specific platform

            In the following few sections, each tip will be covered in detail.
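
The first two tips in the list, unrolling read loops and eliminating data dependence, can be sketched together as follows. This is my own illustration, not code from the book: the function names and the unroll factor of 4 are assumptions, and an optimizing compiler may already perform a similar transformation.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward version: every addition depends on the previous one. */
long sum_simple(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by 4 with four independent accumulators, so the four loads
   and additions in one iteration do not depend on each other. */
long sum_unrolled(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)      /* leftover elements when n is not a multiple of 4 */
        s0 += a[i];
    return s0 + s1 + s2 + s3;
}
```

Both functions return the same result; whether the unrolled one is actually faster depends on the compiler and the hardware, so it should be measured rather than assumed.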

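Linear sort vs. quicksort
            Quicksort is O(n log n) on average and O(n^2) in the worst case.
            Linear sort exploits the natural ordering of array indices to achieve O(n). It needs an array that spans the theoretical maximum range of the values, so it arguably takes advantage of the way big-O notation discards constants, even enormous ones.
            Still, it is workable in practice, particularly if the values can be mapped to individual bits.
            The author's benchmark results are indeed very good: sorting 2,000,000 integers took only 1/250 of the time quicksort needed.
            For values that consume a large amount of address space (integers of 32 bits or more), the options, depending on the situation, include narrowing the value range or running in parallel across multiple machines and processes.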
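
The idea of sorting through array indices can be sketched as a counting sort. This is a minimal sketch and not the book's code: the function name `linear_sort` and the `max_value` parameter are my own assumptions, and it uses one counter per possible value rather than one bit, so duplicates are preserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sorts a[0..n-1] in place, assuming every value is <= max_value.
   Runs in O(n + max_value) time and needs max_value + 1 counters.
   Returns 0 on success, -1 if the counter table cannot be allocated
   or a value is out of range (the array is left unchanged then). */
int linear_sort(unsigned *a, size_t n, unsigned max_value)
{
    assert(a != NULL || n == 0);
    size_t slots = (size_t)max_value + 1;
    if (slots == 0)                     /* size_t overflow guard */
        return -1;

    size_t *count = calloc(slots, sizeof *count);
    if (count == NULL)
        return -1;

    /* Pass 1: the value itself is the index of its counter. */
    for (size_t i = 0; i < n; i++) {
        if (a[i] > max_value) {
            free(count);
            return -1;
        }
        count[a[i]]++;
    }

    /* Pass 2: walk the counters in index order and write values back. */
    size_t out = 0;
    for (size_t v = 0; v < slots; v++)
        for (size_t c = count[v]; c > 0; c--)
            a[out++] = (unsigned)v;

    free(count);
    return 0;
}
```

The counter table is where the address-space cost mentioned above shows up: covering all 32-bit values would need about 4 billion counters, which is why narrowing the range matters.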

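3. Cache Subsystem: optimization at the processor cache level
            Use the L1 and L2 caches fully and effectively. In general, raising the hit rate and reducing the number of loads is still the way to go.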
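
A common illustration of hit rate is the traversal order of a matrix stored row by row. The sketch below is my own example with hypothetical function names, not taken from the book; both functions compute the same sum and differ only in access pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Row order: the inner loop walks adjacent addresses, so each cache
   line that is loaded gets fully used before the loop moves on. */
long sum_by_rows(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL || rows * cols == 0);
    long s = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            s += m[r * cols + c];
    return s;
}

/* Column order: the inner loop jumps `cols` elements per step, so for
   large matrices successive accesses tend to hit different cache lines. */
long sum_by_cols(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL || rows * cols == 0);
    long s = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            s += m[r * cols + c];
    return s;
}
```

On a matrix much larger than the cache, the row-order version is usually the faster of the two, but the size of the gap depends on the machine and should be measured.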

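4. Machine Optimization: machine-level optimization
            Compiler optimizations are often specific to a processor architecture. Some code that the compiler fails to optimize is in fact the result of inefficient coding.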
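
One familiar case of code written in a way the compiler may not be able to fix is a loop-invariant call left in the loop condition. The example below is my own, with hypothetical function names: because the loop writes to the string, the compiler generally cannot prove that `strlen(s)` stays the same, so it may recompute it on every iteration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* strlen() sits in the loop condition while the loop modifies s. */
void upper_slow(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < strlen(s); i++)
        s[i] = (char)toupper((unsigned char)s[i]);
}

/* The length is computed once, outside the loop. */
void upper_fast(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    for (size_t i = 0; i < len; i++)
        s[i] = (char)toupper((unsigned char)s[i]);
}
```

Both functions produce the same string; the second simply states the programmer's intent in a form that needs no help from the optimizer.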

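My own understanding is that the goals and means of optimization are: providing or increasing parallelism, in both computation and data reads; reducing the number of memory reads and writes; making full use of the cache; and, on the algorithmic side, using careful design to lower the complexity order of a computation, for example by turning it into reads and writes.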

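Because I read the book fairly quickly, I probably missed many of its best parts; the depth of the author's understanding of computer systems is not something I can sum up in a few sentences :). Here is the "About the Author" text: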

Kris Kaspersky is a technical writer and the author of articles on various aspects of hacking, disassembling, and code optimization. He has dealt with many issues relating to security and system programming including compiler development, optimization techniques, security mechanism research, real-time OS kernel creation, and writing antivirus programs.
