Mastering Computer Organization and Design (COAD, Patterson and Hennessy) in 10 Days of Close Reading: Day 7, 2018/11/1

彪悍的人生不需要解释哈

Article link: https://blog.csdn.net/weixin_43314012/article/details/83672688

1. Overview of weekly plan no. 4
2. What I learned today
3. Today's timetable
4. Today's reflection

Today is day 43 of my plan to finish six Tsinghua CS master's degrees in two years.

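1. Overview of weekly plan no. 4

Over the next 10 days I am starting the first plan in my "Dominating Power" series (closely reading and thoroughly mastering 40 classic foreign computer science books in two years).

First Dominating Power plan: master Computer Organization and Design (COAD, Patterson and Hennessy, 500 pages) in 10 days of close reading
	○ Textbook page: https://book.douban.com/subject/10441748/
	○ About the authors:
		§ John L. Hennessy: former president of Stanford University
		§ David A. Patterson: professor of computer science at the University of California, Berkeley; member of the US National Academy of Engineering; Fellow of the IEEE and the ACM
	○ Schedule: 10/24 to 11/2 (days 35 to 45 of the two-year plan to finish six Tsinghua CS master's degrees; weekly plan no. 4)
	○ Study time: at least 13 hours of core study per day on average, for 130 hours of efficient study over 10 days.
	○ Caveat: I do not yet know how hard this book is. If some parts are too difficult, I may need 13 to 15 days to finish it.
	○ Supplementary video course:
	 Peking University, Computer Organization: http://www.chinesemooc.org/kvideo.php?do=kvideo_announcement&kvideoid=4392&classesid=1967

PS: I could not study on 10/27 and 10/28, so those two days do not count toward this 10-day plan.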

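2. What I learned today

Today's score: 75 (low efficiency)
Today's goal completion rate: 50%

• Stanford COAD (P&H), Section 4.5: pipelining principles
• Deep understanding: the laundry analogy for the two ways a CPU can work, single-cycle and pipelined
• Deep understanding: how the number of tasks in the pipeline affects its performance
○ A pipeline reaches peak efficiency only when it is fully loaded
• Deep understanding: both single-cycle and pipelined designs are held back by their slowest part (the "Uzumaki Naruto", dead last in the class). The single-cycle clock period is set by the slowest instruction; the pipelined clock period is set by the slowest stage.
• Master: the pipeline speedup formula (HJ)
○ Ideal case with balanced stages (Section 4.5): time between instructions (pipelined) = time between instructions (nonpipelined) / number of pipeline stages
• (Unbeatable!) Appreciate deeply how the following MIPS design decisions benefit pipeline design:
○ 1. All MIPS instructions have the same length
○ 2. MIPS has only three instruction formats
○ 3. Each MIPS instruction does one simple thing; only loads and stores (lw, sw) access memory
○ 4. Each MIPS instruction writes at most one result and does this in the last stage of the pipeline. (This greatly simplifies the bypass design used for forwarding.)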
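
The bullets above on the slowest stage, pipeline fill, and the speedup formula can be sketched numerically. This is a minimal sketch, not the book's own code: the stage latencies in `STAGE_PS` are assumed illustrative values, and the function names are hypothetical. It compares total execution time for a single-cycle design (every instruction pays for all stages) with a 5-stage pipeline (the clock is set by the slowest stage, and the pipeline needs a few cycles to fill).

```python
# Assumed illustrative stage latencies in picoseconds: IF, ID, EX, MEM, WB.
STAGE_PS = [200, 100, 200, 200, 100]


def single_cycle_time(n_instructions, stage_ps=STAGE_PS):
    """One long clock cycle per instruction, sized for an instruction
    that uses every stage (such as lw)."""
    return n_instructions * sum(stage_ps)


def pipelined_time(n_instructions, stage_ps=STAGE_PS):
    """The clock period equals the slowest stage. The first instruction
    needs len(stage_ps) cycles; each later one completes one cycle after."""
    cycle = max(stage_ps)
    return (len(stage_ps) + n_instructions - 1) * cycle


def speedup(n_instructions, stage_ps=STAGE_PS):
    """Ratio of single-cycle time to pipelined time for the same program."""
    return single_cycle_time(n_instructions, stage_ps) / pipelined_time(
        n_instructions, stage_ps
    )


# With few instructions the pipeline is mostly filling and draining;
# with many, the speedup approaches (sum of stages) / (slowest stage).
for n in (3, 1000, 1_000_000):
    print(n, single_cycle_time(n), pipelined_time(n), round(speedup(n), 2))
```

With these unbalanced stages the speedup tends toward 800/200 = 4 rather than 5. Passing five equal latencies such as `[200] * 5` shows the ideal case, where the speedup approaches the number of stages.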

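	○ Review guide: reading the material below is enough. Once understood, it does not need frequent review.
• Appreciate the design idea: the regularity of MIPS instructions lets execution be split cleanly into five steps for pipelining (a 5-stage pipeline), and extending the design to 8, 9, or 10 stages is also straightforward. x86, by contrast, has too many instruction types with irregular lengths, so its instructions share little in common and the resulting pipeline is very complex.
	○ It is like taking clothes to a laundry: the shop can wash any clothes made on Earth, because Earth materials are all much the same and the sizes are similar enough to fit in the washing machine. If your clothes were made by Martians, some of whom are 0.1 m tall and others 100 m tall, the variance is huge and their clothes come in every odd shape (like x86), so their laundry has to buy a lot of equipment to serve every Martian (like Xie Jun).
	○ Martian clothes come in so many styles that your laundry's pipeline might need 50 stages to serve them all (and we know that too many stages lowers efficiency).
• Damn, this is awesome! MIPS is deep and rich! (Remember: MIPS is an instruction set designed for pipelining.)
	○ Appreciate the design idea: MIPS lets only lw and sw access memory, which cuts the pipeline from the "seven masters of Huangshan" down to the "five masters of Huangshan" (fewer stages).
		§ Likewise, MIPS is designed so that one instruction does only one job, which also keeps the pipeline short. Some x86 instructions both access memory and compute, so x86 naturally needs more pipeline stages to handle such large instructions.
		Third, memory operands only appear in loads or stores in MIPS. This restriction
		means we can use the execute stage to calculate the memory address and then
		access memory in the following stage. If we could operate on the operands in
		memory (that is, in instructions other than lw and sw), as in the x86, stages 3 and 4 would expand to an address stage, memory
		stage, and then execute stage.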
	First, all MIPS instructions are the same length. This restriction makes it much
	easier to fetch instructions in the first pipeline stage and to decode them in the
	second stage. In an instruction set like the x86, where instructions vary from 1 byte
	to 15 bytes, pipelining is considerably more challenging. Recent implementations
	of the x86 architecture actually translate x86 instructions into simple operations
	that look like MIPS instructions and then pipeline the simple operations rather
	than the native x86 instructions! (See Section 4.10.)
	
	Second, MIPS has only a few instruction formats, with the source register fields
	being located in the same place in each instruction. This symmetry means that the
	second stage can begin reading the register file at the same time that the hardware
	is determining what type of instruction was fetched. If MIPS instruction formats
	were not symmetric, we would need to split stage 2, resulting in six pipeline stages.
	We will shortly see the downside of longer pipelines.
	
	Third, memory operands only appear in loads or stores in MIPS. This restriction
	means we can use the execute stage to calculate the memory address and then
	access memory in the following stage. If we could operate on the operands in
	memory (that is, in instructions other than lw and sw), as in the x86, stages 3 and 4 would expand to an address stage, memory
	stage, and then execute stage.
	
	Fourth, as discussed in Chapter 2, operands must be aligned in memory. Hence,
	we need not worry about a single data transfer instruction requiring two data
	memory accesses; the requested data can be transferred between processor and
	memory in a single pipeline stage.
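
The first two points in the quoted passage (fixed 32-bit length, source register fields in the same place in every format) can be illustrated with a small decoder. This is a sketch under assumptions: the helper name `fields` is hypothetical, and the two instruction words are hand-encoded from the standard MIPS R-format and I-format layouts. It shows that `rs` and `rt` can be pulled out of the word before the opcode has been examined.

```python
def fields(word):
    """Split a 32-bit MIPS instruction word into its fixed-position fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,   # first source register, same bits in R and I formats
        "rt": (word >> 16) & 0x1F,   # second register field, same bits in R and I formats
        "rd": (word >> 11) & 0x1F,   # meaningful for R-format only
        "imm": word & 0xFFFF,        # meaningful for I-format only
    }


ADD_T3_T1_T2 = 0x012A5820  # add $t3, $t1, $t2  (R-format; $t1=9, $t2=10, $t3=11)
LW_T1_0_T0 = 0x8D090000    # lw  $t1, 0($t0)    (I-format; $t0=8, $t1=9)

for word in (ADD_T3_T1_T2, LW_T1_0_T0):
    f = fields(word)
    # rs and rt are available without first deciding which format this is.
    print(f"{word:#010x}: opcode={f['opcode']:#04x} rs=${f['rs']} rt=${f['rt']}")
```

Because every word is 32 bits and the register fields never move, the same shifts and masks work for both instructions, which is what lets the second stage read the register file while the instruction type is still being determined.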
	
• The five MIPS pipeline stages, the "five masters of Huangshan" (IF, ID, EX, MEM, WB)

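• 4.5.2 Hazards
	○ T-Intuition: the three data-hazard cases in the slides, including the handling of the load-use hazard, are all the same in essence.
		§ Essence: you want to eat but do not want to queue, so you pick up your meal through the back door (slide cases 1 and 2, forwarding). If the food is not ready yet, you wait until it is cooked and then cut in line (load-use: stall, then forward).
	○ Get to know the solutions to control hazards
		§ Prediction method 1: static branch prediction (a fixed guess)
		§ Prediction method 2: dynamic hardware prediction (predicting the future from history, much like the in-car rear-view mirror automatic adjustment memory algorithm I designed for an award-winning hackathon entry)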
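
The two prediction methods above can be compared on a toy branch history. This is only a sketch: the notes do not say which dynamic scheme the slides use, so a 2-bit saturating counter is assumed here as one common history-based predictor, and the branch history itself is made up for illustration.

```python
def static_not_taken(outcomes):
    """Static scheme: always guess 'not taken'. Returns the number of correct guesses."""
    return sum(1 for taken in outcomes if not taken)


def two_bit_predictor(outcomes, state=0):
    """Dynamic scheme (assumed): a 2-bit saturating counter with states 0..3.
    States 2 and 3 predict taken; states 0 and 1 predict not taken."""
    correct = 0
    for taken in outcomes:
        prediction = state >= 2
        if prediction == taken:
            correct += 1
        # Move toward 3 on a taken branch, toward 0 otherwise, saturating at the ends.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


# Hypothetical history: a loop branch taken 9 times, then not taken once, repeated 3 times.
history = ([True] * 9 + [False]) * 3
print(static_not_taken(history), two_bit_predictor(history), len(history))
```

On this history the fixed guess is right only at each loop exit, while the counter learns the loop after two misses and then mispredicts only the exits.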
	
	○ Important worked example: master the core of forwarding
		Reordering Code to Avoid Pipeline Stalls
		Consider the following code segment in C:
		a = b + e;
		c = b + f;
		Here is the generated MIPS code for this segment, assuming all variables are in
		memory and are addressable as offsets from $t0:
		lw $t1, 0($t0)
		lw $t2, 4($t0)
		add $t3, $t1, $t2
		sw $t3, 12($t0)
		lw $t4, 8($t0)
		add $t5, $t1, $t4
		sw $t5, 16($t0)
		
		Find the hazards in the preceding code segment and reorder the instructions
		to avoid any pipeline stalls.
		Both add instructions have a hazard because of their respective dependence
		on the immediately preceding lw instruction. Notice that bypassing eliminates
		several other potential hazards, including the dependence of the first add on
		the first lw and any hazards for store instructions. Moving up the third lw
		instruction to become the third instruction eliminates both hazards:
		lw $t1, 0($t0)
		lw $t2, 4($t0)
		lw $t4, 8($t0)
		add $t3, $t1, $t2
		sw $t3, 12($t0)
		add $t5, $t1, $t4
		sw $t5, 16($t0)
		On a pipelined processor with forwarding, the reordered sequence will
		complete in two fewer cycles than the original version.
		
		Reflection on this example
		Forwarding yields another insight into the MIPS architecture, in addition to the
		four mentioned on page 277. Each MIPS instruction writes at most one result and
		does this in the last stage of the pipeline. Forwarding is harder if there are multiple
		results to forward per instruction or if there is a need to write a result early on in
		instruction execution.
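
The "two fewer cycles" claim in the worked example above can be checked with a tiny model. This is a simplified sketch, not a full pipeline simulator: it assumes a 5-stage pipeline with full forwarding, where the only remaining stall is one bubble when an instruction reads a register loaded by the immediately preceding lw. The tuple layout and function names are hypothetical.

```python
# Each instruction: (opcode, destination register or None, source registers).
ORIGINAL = [
    ("lw",  "$t1", ["$t0"]),
    ("lw",  "$t2", ["$t0"]),
    ("add", "$t3", ["$t1", "$t2"]),
    ("sw",  None,  ["$t3", "$t0"]),
    ("lw",  "$t4", ["$t0"]),
    ("add", "$t5", ["$t1", "$t4"]),
    ("sw",  None,  ["$t5", "$t0"]),
]
# Move the third lw (index 4) up to become the third instruction.
REORDERED = [ORIGINAL[i] for i in (0, 1, 4, 2, 3, 5, 6)]


def load_use_stalls(program):
    """Count one stall for each instruction that reads the register
    loaded by the lw directly before it (forwarding covers everything else)."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, cur_sources = cur
        if prev_op == "lw" and prev_dest in cur_sources:
            stalls += 1
    return stalls


def total_cycles(program, stages=5):
    """Cycles to fill the pipeline, plus one per further instruction, plus stalls."""
    return stages + len(program) - 1 + load_use_stalls(program)


print(load_use_stalls(ORIGINAL), total_cycles(ORIGINAL))
print(load_use_stalls(REORDERED), total_cycles(REORDERED))
```

Under these assumptions the original order has two load-use stalls (one before each add) and the reordered version has none, which matches the two-cycle difference stated in the example.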

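• Stanford COAD (P&H), Sections 4.10 to 4.10.1: instruction-level parallelism and the concept of speculation (to be expanded)
• Understand: what are the two ways to increase instruction-level parallelism?
○ Superpipelining (increase the pipeline depth)
○ Superscalar pipelines (multiple issue)
• Know: multiple issue can be implemented in two ways (in software or in hardware)
• Understand: which two problems must multiple issue handle?
○ How many instructions to issue per cycle
○ Data hazards and control hazards
• Initial understanding: what does static multiple issue mean?
• Initial understanding: what does dynamic multiple issue mean?
• Initial understanding: what is speculation?
• Deep understanding: why rollback is necessary
• Static speculation and dynamic speculation
• T-Understand: dynamic and static speculation differ greatly on the following two questions.
○ Q: What is the speculation recovery mechanism in software and in hardware, respectively?
○ Q: How do software and hardware each handle the extra exceptions that speculation introduces?
• Peking University: 30 exercises on single-cycle processors
• Peking University: 30 exercises on pipelined processors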
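
The two multiple-issue questions listed above (how many instructions to issue per cycle, and how to respect data hazards) can be made concrete with a toy packer. This is a heavily simplified sketch and not the scheme described in the textbook: it assumes in-order issue, ignores instruction latencies and structural limits, and closes the current issue packet whenever the next instruction reads or rewrites a register written earlier in the same packet. The program and all names are hypothetical.

```python
def pack_in_order(program, width):
    """Greedily group instructions, in program order, into issue packets
    of at most `width` instructions with no dependence inside a packet."""
    packets = []
    current, written = [], set()
    for name, dest, sources in program:
        depends = dest in written or any(s in written for s in sources)
        if current and (len(current) == width or depends):
            packets.append(current)          # close the packet and start a new cycle
            current, written = [], set()
        current.append(name)
        if dest is not None:
            written.add(dest)
    if current:
        packets.append(current)
    return packets


# Hypothetical program: (label, destination register or None, source registers).
PROGRAM = [
    ("lw $t1",   "$t1", ["$t0"]),
    ("lw $t2",   "$t2", ["$t0"]),
    ("add $t3",  "$t3", ["$t1", "$t2"]),
    ("addi $t6", "$t6", ["$t0"]),
    ("sw $t3",   None,  ["$t3", "$t0"]),
]

for width in (1, 2):
    packets = pack_in_order(PROGRAM, width)
    print(width, len(packets), packets)
```

With width 1 every instruction gets its own cycle; with width 2 the independent pairs share a cycle, so the same five instructions fit into fewer issue cycles. A real static or dynamic multiple-issue design must also handle load latencies, control hazards, and speculation recovery, which this sketch leaves out.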

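3. Today's timetable

[Image: today's timetable]

4. Today's reflection

Score: 75. Core study time: 8 hours.
The main reason for the low efficiency is that I spent 3 hours in the afternoon searching for material instead of studying, and the same happened in the evening. The way to avoid this in the future: fixed time slots are for fixed tasks only.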
