Thought-provoking sentences
- ARM Cortex A53 and Intel Core i7, reflecting our post-PC era.
- There is now a new vein of scientific investigation, with computational scientists joining theoretical and experimental scientists in the exploration of new frontiers in astronomy, biology, chemistry, and physics, among others.
- Today’s programmers need to worry about energy efficiency of their programs running either on the PMD or in the cloud, which also requires understanding what is below your code.
- On the significance of programmers understanding computer architecture:
⇓
- The modern problem is that further lowering of the voltage appears to make the transistors too leaky. Nearly 40% of the power consumption in server chips is due to leakage. Energy efficiency has replaced die area as the most critical resource of microprocessor design.
Turn off parts of the chip that are not used in a given clock cycle.
- Today, for programmers to get significant improvement in response time, they need to rewrite their programs to take advantage of multiple processors.
- The best designs will strike the appropriate balance for a given market among all the factors (cost, performance, energy, etc.).
- Intel x86's checkered ancestry has led to an architecture that is difficult to explain and impossible to love.
Terminology
PMD: personal mobile device
SaaS: Software as a Service
interface
switch from sequential processing to parallel processing
performance bottleneck
LCD: liquid crystal display
ABI: application binary interface
CPI: clock cycles per instruction
GPR: general-purpose register
[chapter 2] The Instruction Set
synchronization mechanism
The Atomic Memory Operation :
- The key ability required to implement synchronization in a multiprocessor is a set of hardware primitives that can atomically read and modify a memory location. That is, nothing else can interpose itself between the read and the write of the memory location.
In general, architects do not expect users to employ the basic hardware primitives directly, but instead expect system programmers to use the primitives to build a synchronization library, a process that is often complex and tricky.
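- lock: a synchronization mechanism that allows only one process at a time to access a shared resource; any other process that tries to access the resource at the same time must wait.
- deadlock: occurs when a waiting process is still holding on to another resource that the first process needs before it can finish, so neither can make progress.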
An example:
Resource A and resource B are used by process X and process Y
X starts to use A.
X and Y try to start using B
Y ‘wins’ and gets B first
Now Y needs to use A.
A is locked by X, which is waiting for Y to release B, so neither process can proceed.
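The scenario above can be sketched with two threads standing in for processes X and Y and two locks standing in for resources A and B. This is only an illustration under stated assumptions: the names `lock_a`, `lock_b`, `process`, and `results` are made up for the sketch, and a timeout replaces the endless wait so the script finishes instead of hanging.

```python
import threading

lock_a = threading.Lock()  # stands in for resource A
lock_b = threading.Lock()  # stands in for resource B

# Barriers force the interleaving from the example:
# each side first takes one resource, then asks for the other.
both_holding = threading.Barrier(2)
both_tried = threading.Barrier(2)
results = {}


def process(name, first, second):
    with first:                       # take the first resource
        both_holding.wait()           # now X holds A and Y holds B
        # A real deadlock would block here forever; the timeout is an
        # assumption of this sketch so that the script terminates.
        got_second = second.acquire(timeout=0.2)
        results[name] = got_second
        if got_second:
            second.release()
        both_tried.wait()             # keep holding `first` until both have tried


x = threading.Thread(target=process, args=("X", lock_a, lock_b))
y = threading.Thread(target=process, args=("Y", lock_b, lock_a))
x.start()
y.start()
x.join()
y.join()

# Both X and Y report False: neither could get its second resource.
print(results)
```

A common way to avoid this situation is to make every process acquire the resources in the same fixed order (for example, always A before B).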
Translating and starting a program
UNIX follows a suffix convention for files: C source files are named x.c, assembly files are x.s, object files are named x.o, statically linked library routines are x.a, dynamically linked library routines are x.so, and executable files by default are called a.out. MS-DOS uses the suffixes .c, .asm, .obj, .lib, .dll, and .exe to the same effect.
- Why use a linker?
A: We don’t want to recompile and reassemble the whole program after changing a single line. We want to compile and assemble each procedure independently, so that a change to one line requires compiling and assembling only one procedure. The linker takes all the independently assembled machine language programs and “stitches” them together.
static way of linking
- A library routine is a debugged block of code (subroutine, procedure, function etc.), often designed to handle commonly occurring problems or tasks. Library routines are stored in a program library and given names. This allows them to be called into immediate use when needed, even from other programs. They are designed to be used frequently.
- The linker produces an executable file that can be run on a computer; the executable file contains no unresolved references.
Dynamically linked libraries
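- The static approach to linking has a few disadvantages: the library routines become part of the executable, so picking up a newer library version requires relinking, and the executable carries its own copy of the library code it references.
- Library routines in a DLL are not linked and loaded until the program is run.
dummy entry: a placeholder (pseudo) entry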
[chapter 3] Arithmetic for Computers
- In the floating-point representation, the tradeoff is between precision and range: increasing the size of the fraction improves precision, while increasing the size of the exponent increases the range of numbers that can be represented.
- in general, floating-point numbers are of the form
$$(-1)^S \times F \times 2^E$$
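To make the S, F, E form concrete, here is a minimal sketch that pulls the three parts out of a number. It assumes the number is stored as an IEEE 754 double (1 sign bit, 11 exponent bits with a bias of 1023, 52 fraction bits), which is what Python floats use on common platforms. The helper name `decompose` is made up for this note, and it only handles normal numbers.

```python
import struct


def decompose(x):
    """Return (S, F, E) such that x == (-1)**S * F * 2**E for a normal double."""
    # Reinterpret the 8 bytes of the double as a 64-bit unsigned integer.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63                      # 1 sign bit
    biased_exp = (bits >> 52) & 0x7FF      # 11 exponent bits
    frac_bits = bits & ((1 << 52) - 1)     # 52 fraction bits
    if biased_exp in (0, 0x7FF):
        raise ValueError("zero, subnormal, infinity and NaN are not handled in this sketch")
    significand = 1 + frac_bits / 2**52    # implicit leading 1 plus the fraction field
    exponent = biased_exp - 1023           # remove the bias
    return sign, significand, exponent


s, f, e = decompose(-6.5)
print(s, f, e)  # 1 1.625 2, i.e. -6.5 = (-1)^1 * 1.625 * 2^2
```

More fraction bits would give a finer-grained F (precision), while more exponent bits would allow a wider span of E (range), which is the tradeoff described above.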
RISC-V floating-point representation:
- RISC-V computers do not raise an exception on overflow or underflow; instead, software can read the floating-point control and status register (fcsr) to check whether overflow or underflow has occurred.
- fused multiply add: A floating-point instruction that performs both a multiply and an add, but rounds only once after the add.
$$a = a + b \times c$$
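The effect of rounding only once can be seen with a small software emulation. This is a sketch, not the hardware instruction: the helper `fused_multiply_add` is a made-up name that computes the exact result with `fractions.Fraction` and then rounds once when converting back to a float. The input values are chosen so that the intermediate product cannot be represented exactly in a double.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Emulate a + b*c with a single rounding at the end."""
    exact = Fraction(a) + Fraction(b) * Fraction(c)  # exact rational arithmetic
    return float(exact)                              # the only rounding step


a = -1.0
b = 1.0 + 2.0**-30
c = 1.0 - 2.0**-30

# Separate multiply and add: b*c = 1 - 2^-60 is rounded to 1.0 first,
# so the small term is lost before the add.
separate = a + b * c

# Fused: the exact value -2^-60 is kept until the single final rounding.
fused = fused_multiply_add(a, b, c)

print(separate)               # 0.0
print(fused == -(2.0**-60))   # True
```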