Computer Organization and Design: The Hardware/Software Interface, Reading Notes 4

Exceptions and Interrupts

Exception: An unscheduled event that disrupts program execution; used to detect undefined instructions.
Interrupt: An exception that comes from outside of the processor.

Two methods to communicate the reason for an exception:

  1. A cause register: RISC-V includes the Supervisor Exception Cause Register (SCAUSE), which records the reason for the exception. If more than one exception occurs in a clock cycle, it records the highest-priority one.
  2. Vectored interrupts: each interrupting device is assigned a unique code, typically four to eight bits long. When a device interrupts, it sends its code over the data bus to the processor, which uses it to select the interrupt service routine to execute.

The Supervisor Exception Program Counter (SEPC) saves the address of the offending instruction.

Many RISC-V computers store the exception entry address in a special register named Supervisor Trap Vector (STVEC), which the OS can load with a value of its choosing.
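The way SEPC, SCAUSE, and STVEC cooperate on exception entry can be sketched in a few lines of Python. This is only an illustrative model, not a description of real hardware: the `Cpu` class, the `take_exception` method, the cause code value, and the addresses are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Illustrative cause code, assumed for this sketch only.
CAUSE_UNDEFINED_INSTRUCTION = 2


@dataclass
class Cpu:
    pc: int = 0      # address of the instruction being executed
    sepc: int = 0    # Supervisor Exception Program Counter
    scause: int = 0  # Supervisor Exception Cause Register
    stvec: int = 0   # Supervisor Trap Vector, loaded by the OS

    def take_exception(self, cause: int) -> None:
        """Model exception entry: save state, then jump to the handler."""
        self.sepc = self.pc    # remember the offending instruction
        self.scause = cause    # record why the exception happened
        self.pc = self.stvec   # continue at the OS-chosen entry address


# Hypothetical addresses: the offending instruction is at 0x4C and the
# OS has placed its handler at 0x1000.
cpu = Cpu(pc=0x4C, stvec=0x1000)
cpu.take_exception(CAUSE_UNDEFINED_INSTRUCTION)
print(hex(cpu.sepc), cpu.scause, hex(cpu.pc))
```

After the call, the handler can read `scause` to find the reason and `sepc` to find the instruction that caused it.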

Question: What is the difference between imprecise exceptions and precise exceptions?

Parallelism via Instructions

ILP: Instruction-Level Parallelism
Multiple issue: A scheme whereby multiple instructions are launched in one clock cycle.

  • Today’s high-end microprocessors attempt to issue from three to six instructions in every clock cycle. Even moderate designs will aim at a peak IPC of 2.

static multiple issue: An approach to implementing a multiple-issue processor where many decisions are made by the compiler before execution.
dynamic multiple issue: An approach to implementing a multiple-issue processor where many decisions are made during execution by the processor.

Techniques for finding and exploiting ILP:

1. speculation

speculation: An approach whereby the compiler or processor guesses the outcome of an instruction to remove it as a dependence in executing other instructions.

The recovery mechanisms for speculation:

  • In the case of speculation in software, the compiler usually inserts additional instructions that check the accuracy of the speculation and provide a fix-up routine to use when the speculation is wrong.
  • In hardware speculation, the processor usually buffers the speculative results until it knows they are no longer speculative. If the speculation is correct, the instructions are completed by allowing the contents of the buffers to be written to the registers or memory. If the speculation is incorrect, the hardware flushes the buffers and re-executes the correct instruction sequence.
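The buffering idea behind hardware speculation can be sketched as follows. This is a toy model under stated assumptions: the class name `SpeculativeRegisterFile` and its methods are hypothetical, and the sketch only shows the commit and flush steps, not the re-execution of the correct instruction sequence.

```python
class SpeculativeRegisterFile:
    """Toy model: speculative results wait in a buffer until resolved."""

    def __init__(self, registers):
        self.registers = dict(registers)  # programmer-visible state
        self.buffer = {}                  # speculative results, not yet visible

    def write_speculative(self, reg, value):
        # Speculative results never touch the visible registers directly.
        self.buffer[reg] = value

    def resolve(self, speculation_correct):
        if speculation_correct:
            # Correct guess: let the buffered results become visible.
            self.registers.update(self.buffer)
        # Either way the buffer is emptied; on a wrong guess this is the flush.
        self.buffer.clear()


rf = SpeculativeRegisterFile({"x5": 1, "x6": 2})

rf.write_speculative("x5", 99)
rf.resolve(speculation_correct=False)   # wrong guess: x5 keeps its old value
after_wrong_guess = dict(rf.registers)

rf.write_speculative("x6", 7)
rf.resolve(speculation_correct=True)    # right guess: x6 is updated
after_right_guess = dict(rf.registers)

print(after_wrong_guess, after_right_guess)
```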

Issue packet: A set of instructions issued in a given clock cycle.
Very Long Instruction Word (VLIW): A style of instruction set architecture that launches many operations that are defined to be independent in a single wide instruction, typically with many separate opcode fields.

2. loop unrolling

Loop unrolling: A technique to get more performance from loops that access arrays, in which multiple copies of the loop body are made and instructions from different iterations are scheduled together.
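As a rough illustration of the transformation, the sketch below sums an array with a plain loop and with a loop unrolled by four. The function names are made up for this example, and Python itself gains nothing from the rewrite; the point is only to show the shape of the code a compiler would produce. The four separate accumulators play the role of distinct registers, so the four additions in each pass do not depend on one another.

```python
def sum_rolled(values):
    """One element per iteration; every add depends on the previous one."""
    total = 0
    for v in values:
        total += v
    return total


def sum_unrolled_by_4(values):
    """Four copies of the loop body, each with its own accumulator."""
    s0 = s1 = s2 = s3 = 0
    n = len(values)
    limit = n - n % 4  # largest multiple of 4 that fits in the array
    for i in range(0, limit, 4):
        # These four adds are independent and could be scheduled together.
        s0 += values[i]
        s1 += values[i + 1]
        s2 += values[i + 2]
        s3 += values[i + 3]
    total = s0 + s1 + s2 + s3
    # Clean-up loop for the elements left over when n is not a multiple of 4.
    for i in range(limit, n):
        total += values[i]
    return total


data = list(range(10))
print(sum_rolled(data), sum_unrolled_by_4(data))
```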

Dependences are a property of programs; the presence of a dependence indicates the potential for a hazard.

Data dependences set an upper bound on how much parallelism can possibly be exploited.

The difference between true data dependences and anti-dependences:

True data dependence:
[Figure omitted: example of a true data dependence]
This is a Read After Write (RAW) hazard: the instructions cannot execute simultaneously.

Anti-dependence:
1. [Figure omitted: first anti-dependence example]
2. [Figure omitted: second anti-dependence example]
Instructions involved in a name dependence can execute simultaneously if the name used in the instructions is changed so that the instructions do not conflict.
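Since the original figures are not available, here is a small sketch of how the dependence kinds can be told apart and how renaming removes a name dependence. The `Instr` type, the `classify` function, and the register choices (including `x28` as a free register) are assumptions for illustration, not the examples from the missing figures.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str     # register written
    srcs: tuple   # registers read


def classify(first, second):
    """List the dependences of `second` on the earlier instruction `first`."""
    kinds = []
    if first.dest in second.srcs:
        kinds.append("RAW")   # true data dependence
    if second.dest in first.srcs:
        kinds.append("WAR")   # anti-dependence (a name dependence)
    if first.dest == second.dest:
        kinds.append("WAW")   # output dependence (a name dependence)
    return kinds


# add x5, x6, x7 ; sub x8, x5, x9   -> sub reads the x5 that add writes
raw = classify(Instr("x5", ("x6", "x7")), Instr("x8", ("x5", "x9")))

# add x5, x6, x7 ; sub x6, x8, x9   -> sub overwrites the x6 that add reads
war = classify(Instr("x5", ("x6", "x7")), Instr("x6", ("x8", "x9")))

# Same pair after renaming sub's destination from x6 to x28:
# the conflict on the name x6 is gone, so the two can run together.
renamed = classify(Instr("x5", ("x6", "x7")), Instr("x28", ("x8", "x9")))

print(raw, war, renamed)
```

Renaming cannot help the first pair: the RAW dependence reflects a real flow of data, not a reuse of a register name.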

3. Superscalar Processor

Superscalar: An advanced pipelining technique that enables the processor to execute more than one instruction per clock cycle by selecting them during execution.

Dynamic pipeline scheduling: Hardware support for reordering the order of instruction execution to avoid stalls.
A dynamically scheduled pipeline is divided into three parts:

  1. instruction fetch and issue unit
  2. multiple functional units
  3. commit unit

Each functional unit has buffers, called reservation stations, which hold the operands and the operation.

[Figure omitted: the three primary units of a dynamically scheduled pipeline.]
out-of-order execution: A processor executes instructions in an order governed by the availability of input data and execution units, rather than by their original order in a program.

Steps of out-of-order execution:

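  1. When an instruction is issued, it is immediately copied to a reservation station together with any of its operands that are available in the register file. The instruction is buffered there until all of its operands and a functional unit are available, and then it is executed.
  2. After the instruction executes, its result is held in the commit unit until “it is safe to release the result of an operation to programmer-visible registers and memory.”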
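The two steps above can be sketched with a very simplified timing model. Everything here is an assumption for illustration: the `Instr` type and `simulate` function are hypothetical, functional units and reservation stations are unlimited, all instructions are issued at cycle 0, and only true (RAW) dependences are tracked, as if register renaming had already removed name dependences.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    name: str
    dest: str      # register written
    srcs: tuple    # registers read
    latency: int   # cycles spent in the functional unit


def simulate(program):
    last_writer = {}  # register -> index of the latest earlier writer
    finish = []       # cycle in which each instruction finishes executing
    for i, ins in enumerate(program):
        # An instruction waits in its reservation station until every
        # producer of its source operands has finished.
        producers = [last_writer[r] for r in ins.srcs if r in last_writer]
        start = max((finish[p] for p in producers), default=0)
        finish.append(start + ins.latency)
        last_writer[ins.dest] = i

    # Execution completes in data-availability order ...
    order = sorted(range(len(program)), key=lambda k: (finish[k], k))
    exec_order = [program[k].name for k in order]

    # ... but the commit unit releases results in program order.
    commit, previous = [], 0
    for f in finish:
        previous = max(previous, f)
        commit.append(previous)
    return finish, exec_order, commit


program = [
    Instr("ld", "x5", ("x10",), 3),       # slow load
    Instr("add", "x6", ("x5", "x7"), 1),  # needs the loaded x5
    Instr("sub", "x8", ("x9", "x11"), 1), # independent of both
]
finish, exec_order, commit = simulate(program)
print(finish, exec_order, commit)
```

In this run `sub` finishes first even though it is last in program order, yet its result is not committed until `ld` and `add` have committed.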

Efforts to increase ILP face two major bottlenecks: although processors exist that issue four to six instructions per clock, very few applications can sustain more than two instructions per clock.
  • Within the pipeline, the dependences are hard to alleviate.
  • Losses in the memory hierarchy limit the ability to keep the pipeline full.

The downside to the increasing exploitation of instruction-level parallelism via dynamic multiple issue and speculation is potential energy inefficiency. Now that we have collided with the power wall, we are seeing designs with multiple processors per chip where the processors are not as deeply pipelined or as aggressively speculative as their predecessors.

The belief is that while the simpler processors are not as fast as their sophisticated brethren, they deliver better performance per Joule, so that they can deliver more performance per chip when designs are constrained more by energy than they are by the number of transistors.
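A small arithmetic sketch makes the argument concrete. All numbers below are invented for illustration and do not describe any real processor; the sketch also assumes the workload scales perfectly across cores, which real programs rarely do. Performance is in arbitrary work units per second, so performance divided by watts is work per joule.

```python
def perf_per_watt(performance, watts):
    """Work per second divided by joules per second gives work per joule."""
    return performance / watts


def chip_throughput(performance, watts, power_budget):
    """Total performance when the chip is filled up to its power budget."""
    cores = power_budget // watts
    return cores * performance


# Hypothetical cores: (performance in work units per second, power in watts).
complex_core = (100, 20)  # fast but power-hungry
simple_core = (50, 5)     # half as fast at a quarter of the power

POWER_BUDGET = 40  # watts available for the whole chip (assumed)

complex_efficiency = perf_per_watt(*complex_core)
simple_efficiency = perf_per_watt(*simple_core)
complex_chip = chip_throughput(*complex_core, POWER_BUDGET)
simple_chip = chip_throughput(*simple_core, POWER_BUDGET)

print(complex_efficiency, simple_efficiency, complex_chip, simple_chip)
```

Under these assumptions the simple core is slower on its own but delivers twice the work per joule, so the energy-limited chip built from simple cores has twice the total throughput.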

Pipelines of the ARM Cortex-A53 and Intel Core i7 920

  • Intel fetches x86 instructions and translates them into internal RISC-V-like instructions, which Intel calls micro-operations. The micro-operations are then executed by a sophisticated, dynamically scheduled, speculative pipeline capable of sustaining an execution rate of up to six micro-operations per clock cycle.
  • The Intel Core i7 uses a scheme for resolving anti-dependences and incorrect speculation that uses a reorder buffer together with register renaming.

GFLOPS (gigaflops): billions of floating-point operations per second.

  • Many of the difficulties of pipelining arise because of instruction set complications.
  1. Widely variable instruction lengths and running times can lead to imbalance among pipeline stages and severely complicate hazard detection in a design pipelined at the instruction set level.
  2. Addressing modes that update registers complicate hazard detection. Other addressing modes that require multiple memory accesses substantially complicate pipeline control and make it difficult to keep the pipeline flowing smoothly.