Notes on Computer Systems: A Programmer's Perspective
RichardYSteven
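Optimizing Program Performance--Enhancing Parallelism
In modern CPUs, most of the functional units that carry out operations are pipelined: the next instruction can start before the previous one has finished. The code we wrote earlier does not take advantage of this, because the result of the computation is kept in a single variable. Each step has to wait for the previous result before it can compute the next value, so the pipeline cannot be used to overlap operations. For some operations, such as addition, we can split the elements to be computed into ...
Original · 2010-01-17 16:31:00 · 958 views · 0 comments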
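The excerpt is cut off before it shows the split, so the following is only a minimal sketch of one common way to do it, assuming integer addition and two independent accumulators. The name sum_two_accumulators is hypothetical and does not come from the original post.

```c
#include <stddef.h>

/* Hypothetical helper: sum n integers with two independent accumulators.
 * The two additions in the loop body do not depend on each other, so a
 * pipelined adder can overlap them instead of waiting on one chain. */
long sum_two_accumulators(const long *data, size_t n)
{
    long acc0 = 0;   /* sum of even-indexed elements */
    long acc1 = 0;   /* sum of odd-indexed elements */
    size_t i = 0;

    /* Process two elements per iteration. */
    for (; i + 1 < n; i += 2) {
        acc0 += data[i];
        acc1 += data[i + 1];
    }

    /* Pick up the last element when n is odd. */
    for (; i < n; i++)
        acc0 += data[i];

    return acc0 + acc1;
}
```

Integer addition is associative, so regrouping the sum this way does not change the result; for floating-point data the regrouping can change rounding.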
Condition Codes
The most useful condition codes are:
CF: Carry Flag. The most recent operation generated a carry out of the most significant bit. Used to detect overflow for unsigned operations.
ZF: Zer...
Original · 2009-12-23 21:33:00 · 1521 views · 0 comments
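As a rough illustration of what the carry flag reports, the sketch below detects a carry out of the most significant bit for an unsigned addition in plain C. The function name carry_flag_after_add is an assumption made for this example; real code would read the flag through the processor's conditional instructions.

```c
/* Hypothetical helper: returns 1 if a + b carries out of the most
 * significant bit, which is the situation CF records for an unsigned add. */
int carry_flag_after_add(unsigned a, unsigned b)
{
    unsigned sum = a + b;   /* unsigned arithmetic wraps modulo 2^w */
    return sum < a;         /* a wrapped sum is smaller than either operand */
}
```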
Stack Frame Structure
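In the IA32 architecture there are no registers dedicated to holding function arguments or local variables; the stack is used instead. The figure below shows how the stack changes when one function calls another.
Original · 2009-12-26 14:40:00 · 989 views · 0 comments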
Register Usage Conventions
By convention, registers %eax, %edx, and %ecx are classified as caller save registers. When procedure Q is called by P, it can overwrite these registers without destroying any data required ...
Original · 2009-12-28 09:46:00 · 800 views · 0 comments
Web Site for CS:APP
http://csapp.cs.cmu.edu/
Original · 2009-12-30 09:33:00 · 771 views · 0 comments
Y86 Instruction Set Architecture (ISA)
The instructions supported by a particular processor and their byte-level encodings are known as its instruction-set architecture (ISA). Different "families" of processors, such as Intel IA32, IB...
Original · 2010-01-04 14:37:00 · 1867 views · 0 comments
Sequential Y86 Implementations Part I ISA
Every instruction involves many operations, so the task now is to unify the execution steps so that all instructions are carried out in the same order. We organize them in a particular sequence of stages, attempting to make all instructions follow a uniform sequence, even though the instructio...
Original · 2010-01-04 14:57:00 · 1169 views · 0 comments
Sequential Y86 Implementations Part III Timing
Our implementation of SEQ consists of combinational logic and two forms of memory devices: clocked registers (the program counter and condition code register), and random-access ...
Original · 2010-01-04 22:58:00 · 902 views · 0 comments
Sequential Y86 Implementations Part II Hardware Structure
Original · 2010-01-04 22:55:00 · 813 views · 0 comments
Optimizing Program Performance--Eliminating Unneeded Memory Reference
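A memory reference is a read or write made through a pointer. It is slower than operating directly on registers, and it also increases the instruction count. For example:
    /* Direct access to vector data */
    void combine3(vec_ptr v, data_t *dest)
    {
        int i;
        int length = vec_l...
Original · 2010-01-09 16:58:00 · 830 views · 0 comments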
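The difference can be sketched with plain arrays instead of the book's vector type. Both function names below are hypothetical; the first updates the destination through its pointer on every iteration, while the second keeps the running sum in a local variable (which the compiler can hold in a register) and stores it once at the end.

```c
#include <stddef.h>

/* Reads and writes *dest on every iteration. */
void sum_through_pointer(const int *data, size_t n, int *dest)
{
    *dest = 0;
    for (size_t i = 0; i < n; i++)
        *dest += data[i];
}

/* Accumulates in a local variable and writes memory once. */
void sum_in_local(const int *data, size_t n, int *dest)
{
    int acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += data[i];
    *dest = acc;
}
```

The two versions agree as long as dest does not point into data; a compiler generally cannot assume that, which is one reason it may not make this change on its own.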
Optimizing Program Performance-- Expressing Program Performance
We need a way to express program performance that can guide us in improving the code. A useful measure for many programs is Cycles Per Element (CPE). I am not entirely clear on this concept, though, so here is a figure to look at.
Original · 2010-01-08 19:51:00 · 778 views · 0 comments
Optimizing Program Performance--Reducing Procedure Calls
Function calls add overhead, and the more calls there are, the greater the performance cost. Function calls inside a loop should therefore be kept to a minimum. Example 1:
    /* Move call to vec_length out of loop */
    void combine2(vec_ptr v, data_t *dest)
    {
        int i;
        int length = vec_...
Original · 2010-01-09 11:38:00 · 905 views · 0 comments
Optimizing Program Performance-- Capability and limitation of Optimizing Compilers
A compiler cannot optimize every piece of code. In the following two cases it has to hold back:
1. Memory aliasing
2. Function calls
Example 1:
    void twiddle1(int *xp, int *yp)
    {
        *xp += *yp;
        *xp += *yp;
    }
    ...
Original · 2010-01-08 19:36:00 · 758 views · 0 comments
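To see why aliasing blocks the optimization, compare twiddle1 with a cheaper-looking variant. The excerpt is truncated before any second function appears, so twiddle2 below is an assumption added for illustration: it reads *yp once and adds twice its value.

```c
/* Adds *yp to *xp twice, re-reading *yp each time. */
void twiddle1(int *xp, int *yp)
{
    *xp += *yp;
    *xp += *yp;
}

/* Assumed "optimized" form: one read of *yp, one update of *xp. */
void twiddle2(int *xp, int *yp)
{
    *xp += 2 * *yp;
}
```

With distinct pointers the two behave the same. When xp and yp point to the same int, twiddle1 quadruples the value while twiddle2 triples it, so the compiler cannot replace one with the other unless it can prove the pointers never alias.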
Optimizing Program Performance-- Eliminating Loop Inefficiencies
Consider the following two functions:
    /* Convert string to lower case: slow */
    void lower1(char *s)
    {
        int i;

        for (i = 0; i < strlen(s); i++)
            if (s[i] >= 'A' && s[i] <= 'Z')
                s[i] -= ('A' - 'a');
    ...
Original · 2010-01-09 11:07:00 · 838 views · 0 comments
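The slow version calls strlen in the loop test, so the string is rescanned on every iteration. The excerpt is cut off before the second function, so the version below is a sketch of the usual fix rather than a copy of the post: the length is computed once before the loop. The name lower2 is assumed here.

```c
#include <stddef.h>
#include <string.h>

/* Convert string to lower case, computing the length only once. */
void lower2(char *s)
{
    size_t len = strlen(s);   /* hoisted out of the loop */

    for (size_t i = 0; i < len; i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
}
```

Hoisting is safe here because the loop never writes a terminating zero byte, so the length cannot change while it runs.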
Optimizing Program Performance--Expressing relative performance
The best way to express a performance improvement is as a ratio of the form Told/Tnew, where Told is the time required for the original version and Tnew is the time required by the modified versi...
Original · 2010-01-09 11:45:00 · 780 views · 0 comments
Optimizing Program Performance-- Reducing Loop Overhead
    /* Accumulate result in local variable */
    void combine4(vec_ptr v, data_t *dest)
    {
        int i;
        int length = vec_length(v);
        data_t *data = get_vec_start(v);
        data_t x = ID...
Original · 2010-01-12 15:22:00 · 1124 views · 0 comments
Format of the jmp Instruction
jmp transfers execution to another location. There are several different encodings for jumps, but some of the most commonly used ones are PC-relative.
1. That is, they encode the difference between the address of the targe...
Original · 2009-12-24 22:21:00 · 4956 views · 0 comments
Chapter 8 Exceptional Control Flow -- Nonlocal Jumps
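This is a fairly advanced facility: when a certain condition arises, control can jump directly to a designated location instead of returning to the point where the exception occurred.
    #include <setjmp.h>
    int setjmp(jmp_buf env);
    int sigsetjmp(sigjmp_buf env, int savesigs);

    #include <setjmp.h>
    void longjmp(jmp_buf env, int retval);
    void ...
Original · 2010-03-02 11:18:00 · 909 views · 0 comments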
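A minimal sketch of the control transfer, using only the standard setjmp and longjmp calls. The function names fail_deep and run_with_recovery and the file-scope buffer recovery_point are hypothetical names chosen for this example.

```c
#include <setjmp.h>

static jmp_buf recovery_point;   /* saved calling environment */

/* Stands in for deeply nested code that detects an error. */
static void fail_deep(void)
{
    longjmp(recovery_point, 1);  /* jump back; setjmp now returns 1 */
}

/* Returns 0 if fail_deep returned normally, 1 if control came back
 * through longjmp. */
int run_with_recovery(void)
{
    if (setjmp(recovery_point) == 0) {
        /* First pass: setjmp returned 0 after saving the environment. */
        fail_deep();
        return 0;                /* not reached in this sketch */
    }
    /* Second pass: reached directly from longjmp, skipping the
     * normal return path of fail_deep. */
    return 1;
}
```

Calling run_with_recovery() yields 1 here, because fail_deep never returns to its caller in the ordinary way.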
Optimizing Program Performance-- Summary, Performance Improvement Techniques
Although we have only considered a limited set of applications, we can draw important lessons on how to write efficient code. We have described a number of basic strategies for optimizing program per...
Original · 2010-01-22 15:57:00 · 1086 views · 0 comments
Optimizing Program Performance-- Using GPROF to Examine System Performance
Compiling a Program for Profiling
The program first has to be compiled with a special option so that a profile can be generated: add the -pg option.
    gcc -g -c myprog.c utils.c -pg
    gcc -o myprog myprog.o utils.o -pg
Executing the Program to ...
Original · 2010-01-23 17:22:00 · 1129 views · 0 comments
Chapter 6 Memory Hierarchy --- Locality
Concept: Locality is typically described as having two distinct forms: temporal locality and spatial locality. In a program with good temporal locality, a memory location that is referenced ...
Original · 2010-02-03 11:15:00 · 1124 views · 0 comments
Virtual address space for Linux process
Program code and data. Code begins at the same fixed address, followed by data locations that correspond to global C variables. The code and data areas are initialized directly from the ...
Original · 2009-12-08 20:47:00 · 3169 views · 0 comments
Speeding Up Multiplication with Shift Operations
Practice Problem 2.21: As we will see in Chapter 3, the leal instruction on an Intel-compatible processor can perform computations of the form a... The compiler often uses this instruction to perfor...
Original · 2009-12-14 15:23:00 · 1115 views · 0 comments
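The excerpt is cut off before the problem itself, so the following is only a small sketch of the general idea in the title: a multiplication by a constant can be rewritten as a shift plus an add. The function name times9 and the choice of the constant 9 are assumptions made for illustration.

```c
/* Hypothetical example: multiply by 9 without a multiply instruction,
 * using x * 9 == x * 8 + x == (x << 3) + x. */
unsigned times9(unsigned x)
{
    return (x << 3) + x;
}
```

Unsigned arithmetic is used so that overflow wraps in a well-defined way, just as the multiplication itself would.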
Chapter 6 Memory Hierarchy -- Summary
Programmers who understand the nature of the memory hierarchy can exploit this understanding to write more efficient programs, regardless of the specific memory system organization. In particular, we ...
Original · 2010-02-10 20:22:00 · 1084 views · 0 comments
Chapter 7 Linker -- How Linkers Resolve Multiply-Defined Global Symbols
At compile time, the compiler exports each global symbol to the assembler as either strong or weak, and the assembler encodes this information implicitly in the symbol table of the relocatabl...
Original · 2010-02-20 15:01:00 · 1070 views · 0 comments
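A Proof of Two's-Complement Negation
I learned long ago that a two's-complement number can be negated by complementing its bits and adding one, but I never knew why. I finally found an excellent proof. A well-known technique for performing two's complement negation at the bit level is to complement the bits and then increment the result. In C, ...
Original · 2009-12-14 11:46:00 · 2486 views · 0 comments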
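The core of the argument is short: x and ~x have opposite bits in every position, so x + ~x is all ones, which is -1 in two's complement, and therefore ~x + 1 equals -x. The sketch below checks this on unsigned values, where wraparound is well defined; the name negate_bits is hypothetical.

```c
/* Hypothetical helper: negate by complementing the bits and adding one.
 * Works modulo 2^w, the same arithmetic two's complement uses. */
unsigned negate_bits(unsigned x)
{
    return ~x + 1u;
}
```

Adding the result back to the original value gives zero modulo 2^w, which is exactly what being the negation means.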
Chapter 7 Linker -- Tools for manipulating Object Files
There are a number of tools available on Unix systems to help you understand and manipulate object files. In particular, the GNU binutils package is especially helpful and runs on every Unix platform ...
Original · 2010-02-22 13:48:00 · 1000 views · 0 comments
Chapter 7 Linking ---Static Linking
Static linking combines multiple relocatable object files into a single executable object file. In this process the linker performs two main tasks:
1. Symbol resolution. Object files define and reference symbols. The purpose of symbol resolut...
Original · 2010-02-16 17:01:00 · 997 views · 0 comments
Chapter 7 Linking --- Object files and ELF format
Object files come in three forms:
1. Relocatable object file. Contains binary code and data in a form that can be combined with other relocatable object files at compile time to create an executable ...
Original · 2010-02-16 17:15:00 · 984 views · 0 comments
Addressing Modes
The figure below shows the addressing modes used in IA32.
Original · 2009-12-20 22:08:00 · 717 views · 0 comments
word size suffix in GAS
GAS uses a suffix on the instruction to specify the operand size.
Original · 2009-12-20 22:13:00 · 839 views · 0 comments
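What the Load Effective Address Instruction Does
Instruction    Effect     Description
leal S, D      D ← &S     Load effective address
Below is an exercise. In practice, then, leal is used to do arithmetic. Normally 9(%eax, %ecx, 2) refers to the value stored at address %eax + 2*%ecx + 9, whereas leal takes the address itself rather than its contents, so the instruction simply ends up computing that value.
Original · 2009-12-21 20:52:00 · 1403 views · 0 comments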
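The arithmetic that leal performs on an operand of the form disp(base, index, scale) can be written out in C. This is only a sketch of the address calculation; the function name lea_value and its parameter names are hypothetical.

```c
/* Hypothetical model of leal disp(base, index, scale), dest:
 * computes the address expression without touching memory. */
unsigned lea_value(unsigned base, unsigned index,
                   unsigned scale, unsigned disp)
{
    return base + index * scale + disp;
}
```

For example, with %eax holding 100 and %ecx holding 3, the operand 9(%eax, %ecx, 2) corresponds to lea_value(100, 3, 2, 9), which is 100 + 3*2 + 9 = 115.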
Arithmetic and Logical Operation
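As you can see, there is no division among these operations. I do not know whether other architectures provide one here.
Special Arithmetic Operations
Division does appear here, but it relies on registers that are not named explicitly in the instruction.
Original · 2009-12-21 21:00:00 · 919 views · 0 comments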
Chapter 9 Measuring Program Execution Time-- Scale of Computer System Event
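Suppose the CPU runs at 1 GHz; one clock cycle then takes 1 nanosecond. One clock cycle can execute one (or several?) instructions. At the macroscopic scale, a computer system must respond to events on the order of milliseconds (ms), such as keystrokes (every 50 ms) or picture display (every 33 ms). Handling these events (through interrupts) takes time on the order of microseconds (us). The system timer interrupt is often set to 1 ms or 10 ms.
Original · 2010-03-29 17:22:00 · 1064 views · 0 comments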