Computer Systems: A Programmer's Perspective, Chapter 2: Representing and Manipulating Information

Computer Systems: A Programmer's Perspective course videos
Base conversion (binary, octal, decimal, hexadecimal) in detail

2.1 Information Storage

bytes: Rather than accessing individual bits in memory, most computers use blocks of eight bits, or bytes, as the smallest addressable unit of memory.

memory: A machine-level program views memory as a very large array of bytes, referred to as virtual memory.

virtual address space: Every byte of memory is identified by a unique number, known as its address, and the set of all possible addresses is known as the virtual address space.
Understanding virtual address and virtual address space

program objects: program data, instructions, and control information.

2.1.1 Hexadecimal Notation

hexadecimal numbers: write bit patterns as base-16. Hexadecimal uses digits 0 through 9 along with characters A through F to represent 16 possible values. (characters A through F may be written in either upper or lower case)

In C, numeric constants starting with 0x or 0X are interpreted as being in hexadecimal.

Converting between decimal and hexadecimal:
To convert a decimal number x to hexadecimal, we can repeatedly divide x by 16, giving a quotient q and a remainder r, such that x = q * 16 + r. We then use the hexadecimal digit representing r as the least significant digit and generate the remaining digits by repeating the process on q.

Conversely, to convert a hexadecimal number to decimal, we can multiply each of the hexadecimal digits by the appropriate power of 16.

The same method applies to conversions between other bases (for example, between decimal and binary).

2.1.2 Data Size

words: Every computer has a word size, indicating the nominal size of pointer data. Since a virtual address is encoded by such a word, the most important system parameter determined by the word size is the maximum size of the virtual address space. That is, for a machine with a w-bit word size, the virtual addresses can range from 0 to $2^w - 1$, giving the program access to at most $2^w$ bytes.

[Figure: sizes of C data types; image not available in the source]

The C language allows a variety of ways to order the keywords and to include or omit optional keywords. As examples, all of the following declarations have identical meaning:

unsigned long
unsigned long int
long unsigned
long unsigned int

The above figure shows that a pointer uses the full word size of the machine.

2.1.3 Addressing and Byte Ordering

For program objects that span multiple bytes, we must establish two conventions: what the address of the object will be, and how we will order the bytes in memory.

In virtually all machines, a multi-byte object is stored as a contiguous sequence of bytes, with the address of the object given by the smallest address of the bytes used.

For ordering the bytes representing an object, there are two common conventions.

Some machines choose to store the object in memory ordered from least significant byte to most, while other machines store them from most to least.

little endian: the least significant byte comes first.

big endian: the most significant byte comes first.

2.1.7 Bit-Level Operations in C

One useful feature of C is that it supports bitwise Boolean operations. In fact, the symbols we have used for the Boolean operations are exactly those used by C:

| for or, & for and, ~ for not, and ^ for exclusive-or.

2.1.8 Logical Operations in C

C also provides a set of logical operators ||, &&, and !, which correspond to the or, and, and not operations of logic.

distinction between logical operators and bit-level operators:

  1. The logical operations treat any nonzero argument as representing true and argument 0 as representing false.
    They return either 1 or 0, indicating a result of either true or false, respectively.

  2. logical operators do not evaluate their second argument if the result of the expression can be determined by evaluating the first argument.

2.1.9 Shift Operations in C

Shift operations associate from left to right, so x << j << k is equivalent to (x << j) << k.

left shift: For an operand x having bit representation $[x_{w-1}, x_{w-2}, \ldots, x_0]$, the C expression x << k yields a value with bit representation $[x_{w-k-1}, x_{w-k-2}, \ldots, x_0, 0, \ldots, 0]$. That is, x is shifted k bits to the left, dropping off the k most significant bits and filling the right end with k zeros. The shift amount should be a value between 0 and w − 1.

right shift:

  1. Logical. A logical right shift fills the left end with k zeros, giving a result
    $[0, \ldots, 0, x_{w-1}, x_{w-2}, \ldots, x_k]$.

  2. Arithmetic. An arithmetic right shift fills the left end with k repetitions of the most significant bit, giving a result $[x_{w-1}, \ldots, x_{w-1}, x_{w-1}, x_{w-2}, \ldots, x_k]$.
    This convention might seem peculiar, but as we will see, it is useful for operating on signed integer data.

An arithmetic right shift fills the high-order bits on the left with the sign bit; two's-complement division by a power of 2 relies on it.

2.2 Integer Representations

2.2.1 Integer Data Types

[Figures: images not available in the source]

2.2.2 Unsigned Encodings

We write a bit vector as either $\vec{x}$, to denote the entire vector, or as $[x_{w-1}, x_{w-2}, \ldots, x_0]$ to denote the individual bits within the vector.

We can express this interpretation as a function $B2U_w$ (for “binary to unsigned,” length w):

For vector $\vec{x} = [x_{w-1}, x_{w-2}, \ldots, x_0]$:
$$B2U_w(\vec{x}) = \sum_{i=0}^{w-1} x_i 2^i$$

2.2.3 Two’s-Complement Encodings

The most common computer representation of signed numbers is known as two’s-complement form. This is defined by interpreting the most significant bit of the word to have negative weight. We express this interpretation as a function $B2T_w$ (for “binary to two’s complement,” length w):

For vector $\vec{x} = [x_{w-1}, x_{w-2}, \ldots, x_0]$:
$$B2T_w(\vec{x}) = -x_{w-1} 2^{w-1} + \sum_{i=0}^{w-2} x_i 2^i$$

The most significant bit $x_{w-1}$ is also called the sign bit.

$$TMin_w = -2^{w-1}$$
$$TMax_w = 2^{w-1} - 1$$

A few points are worth highlighting about these numbers.

  1. The two’s-complement range is asymmetric:
    |TMin| = |TMax| + 1;
    that is, there is no positive counterpart to TMin.

This asymmetry arises because half the bit patterns (those with the sign bit set to 1) represent negative numbers, while half (those with the sign bit set to 0) represent nonnegative numbers.

Since 0 is nonnegative, this means that it can represent one less positive number than negative.

  2. UMax = 2TMax + 1

All of the bit patterns that denote negative numbers in two’s-complement notation become positive values in an unsigned representation.

Alternative representations of signed numbers

There are two other standard representations for signed numbers:

Sign magnitude

The most significant bit is a sign bit that determines whether the remaining bits should be given negative or positive weight:

$$B2S_w(\vec{x}) = (-1)^{x_{w-1}} \cdot \sum_{i=0}^{w-2} x_i 2^i$$

Disadvantage: there are two different encodings of the number 0.

Ones’ complement

This is the same as two’s complement, except that the most significant bit has weight $-(2^{w-1} - 1)$ rather than $-2^{w-1}$:

$$B2O_w(\vec{x}) = -x_{w-1}(2^{w-1} - 1) + \sum_{i=0}^{w-2} x_i 2^i$$

Comparing the two formulas: when the sign bit is 1, the same bit pattern is worth one more under ones’ complement than under two’s complement.
In other words, the two’s-complement encoding of a negative number is its ones’-complement encoding plus 1.
For a nonnegative number, $x_{w-1}$ is 0, so its two’s-complement, sign-magnitude, and ones’-complement encodings are identical.

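The term “two’s complement” arises from the fact that for nonnegative x we compute a w-bit representation of −x as $2^w - x$ (a single two).
For example, with w = 4, the two’s-complement encoding of −6 is 1010 and that of 6 is 0110; $2^w$ is 16, and 16 − 6 = 10 (1010).
A clock is a useful model for two’s complement: one trip around the dial is 12 hours, so the modulus is 12. Positive numbers move clockwise and negative numbers move counterclockwise, so −3 means 3 hours counterclockwise, which lands on the same position as 9 hours clockwise.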

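The term “ones’ complement” comes from the property that we can compute −x in this notation as [111...1] − x (multiple ones).

In ones’ complement, [111...1] and [000...0] both represent 0: the former is −0 and the latter is +0.

Both ones’ complement and sign magnitude therefore share this drawback.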

2.2.4 Conversions between Signed and Unsigned

The effect of casting is to keep the bit values identical but change how these bits are interpreted.

Function T2U describes the conversion of a two’s-complement number to its unsigned counterpart, while U2T converts in the opposite direction.

Now define the function $T2U_w$ as $T2U_w(x) \doteq B2U_w(T2B_w(x))$.

For x such that $TMin_w \leq x \leq TMax_w$:
$$T2U_w(x) = \begin{cases} x + 2^w, & x < 0 \\ x, & x \geq 0 \end{cases}$$

Unsigned to two’s-complement conversion:

For u such that $0 \leq u \leq UMax_w$:
$$U2T_w(u) = \begin{cases} u, & u \leq TMax_w \\ u - 2^w, & u > TMax_w \end{cases}$$

Converting a two’s-complement number to unsigned leaves the bit values unchanged. For example, with w = 4, −5 has the two’s-complement encoding 1011, which is interpreted as 11 after conversion to unsigned.

2.2.5 Signed versus Unsigned in C

Generally, most numbers are signed by default.

C allows conversion between unsigned and signed. Although the C standard does not specify precisely how this conversion should be made, most systems follow the rule that the underlying bit representation does not change.

Conversions can happen as follows:

1. explicit casting

int tx, ty;
unsigned ux, uy;
tx = (int) ux; // ux is cast to int
uy = (unsigned) ty; // ty is cast to unsigned

2. assignment

The value on the right is converted to the type of the variable on the left.

int tx, ty;
unsigned ux, uy;
tx = ux; // implicitly cast to signed
uy = ty; // implicitly cast to unsigned

3. printf
When printing numeric values with printf, the directives %d, %u, and %x are used to print a number as a signed decimal, an unsigned decimal, and in hexadecimal format, respectively.

Note that printf does not make use of any type information, and so it is possible to print a value of type int with directive %u and a value of type unsigned with directive %d.

int x = -1;
unsigned u = 2147483648; /* 2 to the 31st */
printf("x = %u = %d\n", x, x);
printf("u = %u = %d\n", u, u);
x = 4294967295 = -1
u = 2147483648 = -2147483648

4. expressions containing combinations of signed and unsigned quantities

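Operands are converted from the lower-ranked type (smaller range) to the higher-ranked type (larger range). In particular, when an operation mixes signed and unsigned operands of the same size, C implicitly converts the signed operand to unsigned.

For example: -1 < 0U
The left side is signed and the right side is unsigned, so the left side is converted to unsigned before the comparison. −1 becomes UMax, so the expression evaluates to false.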

2.2.6 Expanding the Bit Representation of a Number

One common operation is to convert between integers having different word sizes while retaining the same numeric value.

Extending the number of bits must leave the numeric value unchanged.

1. Expansion of an unsigned number by zero extension

Define bit vectors $\vec{u} = [u_{w-1}, u_{w-2}, \ldots, u_0]$ of width w and $\vec{u}' = [0, \ldots, 0, u_{w-1}, u_{w-2}, \ldots, u_0]$ of width w', where w' > w.

Then $B2U_w(\vec{u}) = B2U_{w'}(\vec{u}')$.

To widen an unsigned integer, simply fill the new high-order bits with 0.

2. Expansion of a two’s-complement number by sign extension

Let w' = w + k. What we want to prove is that:

$$B2T_{w+k}([\underbrace{x_{w-1}, \ldots, x_{w-1}}_{k \text{ times}}, x_{w-1}, x_{w-2}, \ldots, x_0]) = B2T_w([x_{w-1}, x_{w-2}, \ldots, x_0])$$

By induction on k, the task reduces to proving that extending by a single bit preserves the value:

$$B2T_{w+1}([x_{w-1}, x_{w-1}, x_{w-2}, \ldots, x_0]) = B2T_w([x_{w-1}, x_{w-2}, \ldots, x_0])$$

$$\begin{aligned} B2T_{w+1}([x_{w-1}, x_{w-1}, x_{w-2}, \ldots, x_0]) &= -x_{w-1} 2^w + x_{w-1} 2^{w-1} + \sum_{i=0}^{w-2} x_i 2^i \\ &= -x_{w-1}(2^w - 2^{w-1}) + \sum_{i=0}^{w-2} x_i 2^i \\ &= -x_{w-1} 2^{w-1} + \sum_{i=0}^{w-2} x_i 2^i \\ &= B2T_w([x_{w-1}, x_{w-2}, \ldots, x_0]) \end{aligned}$$

For a signed integer, filling the new high-order bits with copies of the sign bit leaves the value unchanged.

2.2.7 Truncating Numbers

1. Truncation of an unsigned number

$$\begin{aligned} B2U_w([x_{w-1}, x_{w-2}, \ldots, x_0]) \bmod 2^k &= \Biggl[\sum_{i=0}^{w-1} x_i 2^i\Biggr] \bmod 2^k \\ &= \Biggl[\sum_{i=0}^{k-1} x_i 2^i\Biggr] \bmod 2^k \\ &= \sum_{i=0}^{k-1} x_i 2^i \\ &= B2U_k([x_{k-1}, x_{k-2}, \ldots, x_0]) \end{aligned}$$

In this derivation, we make use of the following property:

$$2^i \bmod 2^k = \begin{cases} 0, & i \geq k \\ 2^i, & i < k \end{cases}$$

The final mod can be dropped because $\sum_{i=0}^{k-1} x_i 2^i \leq 2^k - 1$.

Truncating an unsigned integer simply drops the extra high-order bits.

2. Truncation of a two’s-complement number

A similar property holds for truncating a two’s-complement number, except that it then converts the most significant bit into a sign bit.

$$B2T_k([x_{k-1}, x_{k-2}, \ldots, x_0]) = U2T_k(B2U_w([x_{w-1}, x_{w-2}, \ldots, x_0]) \bmod 2^k)$$

Truncating a two’s-complement number likewise drops the extra high-order bits; the highest remaining bit is then interpreted as the sign bit.

2.3 Integer Arithmetic

2.3.1 Unsigned Addition

Let us define the operation $+^u_w$ for arguments x and y, where $0 \leq x, y < 2^w$, as the result of truncating the integer sum x + y to be w bits long and then viewing the result as an unsigned number.

This can be characterized as a form of modular arithmetic, computing the sum modulo $2^w$ by simply discarding any bits with weight greater than $2^{w-1}$ in the bit-level representation of x + y.

For x and y such that $0 \leq x, y < 2^w$:

$$x +^u_w y = \begin{cases} x + y, & x + y < 2^w \quad \text{(Normal)} \\ x + y - 2^w, & 2^w \leq x + y < 2^{w+1} \quad \text{(Overflow)} \end{cases}$$

When the sum of two unsigned integers exceeds the maximum representable value, the result is $x + y - 2^w$.

Unsigned negation
For every value x, there must be some value $-^u_w x$ such that $-^u_w x +^u_w x = 0$.

For any number x such that $0 \leq x < 2^w$, its w-bit unsigned negation $-^u_w x$ is given by the following:

$$-^u_w x = \begin{cases} x, & x = 0 \\ 2^w - x, & x > 0 \end{cases}$$

Negation of an unsigned integer
For example, with w = 4 and unsigned x = 6, represented as 0110, $-^u_4\, 6$ is 10, represented as 1010, and the two add up to 0 (modulo 16). Negation here corresponds to moving a clock hand counterclockwise: with a modulus of 16, moving 6 hours clockwise is equivalent to moving 10 hours counterclockwise.

2.3.2 Two’s-Complement Addition

For integer values x and y in the range $-2^{w-1} \leq x, y \leq 2^{w-1} - 1$:

$$x +^t_w y = \begin{cases} x + y - 2^w, & 2^{w-1} \leq x + y \quad \text{(Positive overflow)} \\ x + y, & -2^{w-1} \leq x + y < 2^{w-1} \quad \text{(Normal)} \\ x + y + 2^w, & x + y < -2^{w-1} \quad \text{(Negative overflow)} \end{cases}$$

2.3.3 Two’s-Complement Negation

For x in the range $TMin_w \leq x \leq TMax_w$, its two’s-complement negation $-^t_w x$ is given by the formula:

$$-^t_w x = \begin{cases} TMin_w, & x = TMin_w \\ -x, & x > TMin_w \end{cases}$$

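negate a number

Complement every bit of the number and then add 1. This works for both unsigned and signed values.

For example, unsigned 6 is represented as 0110; complementing and adding 1 gives 1010, the same result as before.

For signed 6, 1010 is the two’s-complement encoding of −6, which is correct.

For signed −6, represented as 1010, complementing and adding 1 gives 0110, that is, 6, which is also correct.

Negating a number therefore amounts to taking its two’s complement: the original bit pattern and its negation add up to 0.
Likewise, a bit pattern plus its bitwise complement gives all ones, and adding 1 to that gives 0.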

2.3.4 Unsigned Multiplication

For x and y such that $0 \leq x, y \leq UMax_w$:

$$x *^u_w y = (x \cdot y) \bmod 2^w$$

2.3.5 Two’s-Complement Multiplication

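Truncating a two’s-complement number to w bits is equivalent to first computing its value modulo $2^w$ and then converting from unsigned to two’s complement, giving the following:

For x and y such that $TMin_w \leq x, y \leq TMax_w$:

$$x *^t_w y = U2T_w((x \cdot y) \bmod 2^w)$$

Two’s-complement multiplication has to sign-extend its operands before multiplying, so the full (untruncated) product bits differ from those of the unsigned product; see the table below:
[Table: image not available in the source]

For the details of the computation, see:
Confusion encountered with two’s-complement (signed) multiplication
how to do two complement multiplication and division of integers?

As described in the section on expanding the bit representation, padding a two’s-complement number with copies of its sign bit leaves its value unchanged, so the operands are sign-extended first and then multiplied.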

2.3.6 Multiplying by Constants

$$\begin{aligned} B2U_{w+k}([x_{w-1}, x_{w-2}, \ldots, x_0, 0, \ldots, 0]) &= \sum_{i=0}^{w-1} x_i 2^{i+k} \\ &= \Biggl[\sum_{i=0}^{w-1} x_i 2^i\Biggr] \cdot 2^k \\ &= x 2^k \end{aligned}$$

When shifting left by k for a fixed word size, the high-order k bits are discarded (truncating the high-order k bits), yielding:
$[x_{w-k-1}, x_{w-k-2}, \ldots, x_0, 0, \ldots, 0]$

So, for $0 \leq k < w$, the C expression x << k yields the value $x 2^k \bmod 2^w = x *^u_w 2^k$.

Two’s-complement multiplication by a power of 2

Since the bit-level operation of fixed-size two’s-complement arithmetic is equivalent to that for unsigned arithmetic, we can make a similar statement about the relationship between left shifts and multiplication by a power of 2 for two’s-complement arithmetic:

For $0 \leq k < w$, the C expression x << k yields the value $x *^t_w 2^k$.

Note that multiplying by a power of 2 can cause overflow with either unsigned or two’s-complement arithmetic.

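Integer multiplication can be carried out as a sequence of left shifts. For example, with w = 4 and x = 2, that is, 0010, compute 2 * 5.
Since $5 = 2^2 + 2^0$, first shift 0010 left by 2 bits to get 1000, then add the result of shifting 0010 left by 0 bits (that is, unchanged), giving 1010, which is 10.

Note: a left shift fills the right end with 0s and discards the high-order bits on the left; that is, it is a logical left shift.

Is two’s-complement multiplication computed the same way as unsigned multiplication? At the bit level, yes: as noted above, fixed-size two’s-complement arithmetic is equivalent to unsigned arithmetic, so the truncated w-bit results match.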

2.3.7 Dividing by Powers of 2

Dividing by a power of 2 can also be performed using shift operations, but we use a right shift rather than a left shift. The two different right shifts—logical and arithmetic—serve this purpose for unsigned and two’s-complement numbers, respectively.

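Division uses a right shift, but unlike multiplication, the kind of shift matters: unsigned integers use a logical right shift, while two’s-complement numbers use an arithmetic right shift.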

Integer division always rounds toward zero.

Notation:

  1. $\lfloor a \rfloor$

For any real number a, define $\lfloor a \rfloor$ to be the unique integer a' such that a' ≤ a < a' + 1.

As examples, $\lfloor 3.14 \rfloor = 3$, $\lfloor -3.14 \rfloor = -4$, and $\lfloor 3 \rfloor = 3$.

  2. $\lceil a \rceil$

Similarly, define $\lceil a \rceil$ to be the unique integer a' such that a' − 1 < a ≤ a'.

As examples, $\lceil 3.14 \rceil = 4$, $\lceil -3.14 \rceil = -3$, and $\lceil 3 \rceil = 3$.

For $x \geq 0$ and $y > 0$, integer division should yield $\lfloor x/y \rfloor$,

while for $x < 0$ and $y > 0$, it should yield $\lceil x/y \rceil$.

That is, it should round down a positive result but round up a negative one.

When the quotient is positive, it is rounded down (floor); when it is negative, it is rounded up (ceiling).

Unsigned division by a power of 2

For unsigned values, division by $2^k$ is performed with a logical right shift: x >> k yields $\lfloor x/2^k \rfloor$.

Two’s-complement division by a power of 2

The case for dividing by a power of 2 with two’s-complement arithmetic is slightly more complex.

  1. The shifting should be performed using an arithmetic right shift, to ensure that negative values remain negative.

    However, this causes the result to be rounded downward rather than toward zero.

  2. This improper rounding is corrected by “biasing” the value before shifting:

    This technique exploits the following property:

    $$\lceil x/y \rceil = \lfloor (x + y - 1)/y \rfloor \qquad (y > 0)$$

    To prove the above formula, suppose that $x = ky + r$, where $0 \leq r < y$, giving $(x + y - 1)/y = k + (r + y - 1)/y$, and so $\lfloor (x + y - 1)/y \rfloor = k + \lfloor (r + y - 1)/y \rfloor$.

    If $r = 0$, then $0 \leq (y - 1)/y < 1$, so the latter term equals 0, matching $\lceil x/y \rceil = k$.

    If $r > 0$, then $r \geq 1$ because r is an integer, so
    $(r + y - 1)/y = 1 + (r - 1)/y$
    with $0 \leq (r - 1)/y < 1$.
    Thus, the latter term equals 1, matching $\lceil x/y \rceil = k + 1$.

    The C expression:
    (x < 0 ? x + (1 << k) - 1 : x) >> k

    Note: 1 << k equals $2^k$.

2.4 Floating Point

2.4.2 IEEE Floating-Point Representation

The IEEE floating-point standard represents a number in a form $V = (-1)^s \times M \times 2^E$:

  • The sign s determines whether the number is negative (s = 1) or positive (s = 0), where the interpretation of the sign bit for numeric value 0 is handled as a special case.

  • The significand M is a fractional binary number that ranges either between 1 and $2 - \epsilon$ or between 0 and $1 - \epsilon$. ($\epsilon$ is usually $2^{-k}$ with $k > 0$.)

  • The exponent E weights the value by a (possibly negative) power of 2.


The bit representation of a floating-point number is divided into three fields to encode these values:

  • The single sign bit s directly encodes the sign s.
  • The k-bit exponent field $exp = e_{k-1} \cdots e_1 e_0$ encodes the exponent E.
  • The n-bit fraction field $frac = f_{n-1} \cdots f_1 f_0$ encodes the significand M, but the value encoded also depends on whether or not the exponent field equals 0.

In the single-precision floating-point format (a float in C), fields s, exp, and frac are 1, k = 8, and n = 23 bits each, yielding a 32-bit representation.

In the double-precision floating-point format (a double in C), fields s, exp, and frac are 1, k = 11, and n = 52 bits each, yielding a 64-bit representation.

Case 1: Normalized Values

This is the most common case. It occurs when the bit pattern of exp is neither all zeros (numeric value 0) nor all ones (numeric value 255 for single precision, 2047 for double).

In this case, the exponent field is interpreted as representing a signed integer in biased form.

That is, the exponent value is $E = e - Bias$, where e is the unsigned number having bit representation $e_{k-1} \cdots e_1 e_0$ and Bias is a bias value equal to $2^{k-1} - 1$ (127 for single precision and 1023 for double).

This yields exponent ranges from −126 to +127 for single precision and −1022 to +1023 for double precision.

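The bias value

Why the exponent E is stored in biased form
The exponent may be negative. To avoid introducing a sign bit into the exponent field, the biased encoding splits the range into negative and nonnegative exponents without resorting to two’s complement.

How the bias is chosen
The bias is the midpoint of the range. The normalized form excludes the all-zeros and all-ones patterns, so e ranges from 1 to 254 (single precision); the midpoint is 127, and the exponent E actually ranges from −126 to 127.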

The fraction field frac is interpreted as representing the fractional value f, where $0 \leq f < 1$, having binary representation $0.f_{n-1} \cdots f_1 f_0$, that is, with the binary point to the left of the most significant bit.

The significand is defined to be $M = 1 + f$.

This is sometimes called an implied leading 1 representation, because we can view M to be the number with binary representation $1.f_{n-1} f_{n-2} \cdots f_0$.

This representation is a trick for getting an additional bit of precision for free, since we can always adjust the exponent E so that significand M is in the range $1 \leq M < 2$.

Case 2: Denormalized Values

When the exponent field is all zeros, the represented number is in denormalized form.

In this case, the exponent value is $E = 1 - Bias$, and the significand value is $M = f$, that is, the value of the fraction field without an implied leading 1.

Purpose of denormalized numbers
  1. They provide a way to represent numeric value 0, since with a normalized number we must always have $M \geq 1$, and hence we cannot represent 0.
    In fact, the floating-point zero has two representations: +0.0 and −0.0.
    +0.0: a bit pattern of all zeros.
    −0.0: sign bit is 1, the other fields are all zeros.

  2. Representing numbers that are very close to 0.0. They provide a property known as gradual underflow in which possible numeric values are spaced evenly near 0.0.

Why set the bias this way for denormalized values?

Having the exponent value be $1 - Bias$ rather than simply $-Bias$ might seem counterintuitive. We will see shortly that it provides for smooth transition from denormalized to normalized values.

The smallest normalized value also has $E = 1 - Bias$, so using the same exponent for denormalized values makes the transition between the two smooth.

Case 3: Special Values

A final category of values occurs when the exponent field is all ones.

When the fraction field is all zeros, the resulting values represent infinity, either $+\infty$ when s = 0 or $-\infty$ when s = 1.

Infinity can represent results that overflow, as when we multiply two very large numbers, or when we divide by zero.

When the fraction field is nonzero, the resulting value is called a NaN, short for “not a number.” Such values are returned as the result of an operation where the result cannot be given as a real number or as infinity, as when computing $\sqrt{-1}$ or $\infty - \infty$.


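Example 1

Based on the formula $V = (-1)^s \times M \times 2^E$. The worked values below use two 6-bit formats: format A has k = 3 exponent bits and n = 2 fraction bits; format B has k = 2 and n = 3.

  1. The general case

    • s is the sign. It occupies the single highest bit: 0 means positive, 1 means negative.
    • M is encoded by the frac field in Figure 2.32 and lies in [0,1) or [1,2). The field is n bits wide and has value f; $M = 1 + f$ (general case).
    • For format A, n is 2. If the field is 11, then $f = 1 \cdot 2^{-1} + 1 \cdot 2^{-2} = \frac{3}{4}$ and $M = 1 + f = \frac{7}{4}$.
    • E is encoded by the exp field in Figure 2.32. The field is k bits wide, the Bias is $2^{k-1} - 1$, the field's value is e, and $E = e - Bias$ (general case).
    • For format A, k is 3, so the bias is $2^{3-1} - 1 = 3$. If the exp field is 011, then e is 3 and therefore E is 0.
  2. Special case: exp is all 0s
    Here $E = 1 - Bias$ and $M = f$.

  3. Special case: exp is all 1s

    • frac is all 0s
      The result is infinity: positive infinity if the sign is positive, negative infinity if it is negative.
    • frac is not all 0s
      The result is NaN.

  1. Representing the number 1 in format A

    • It is positive, so s is 0.
    • M is 1, so f is 0, that is, 00.
    • $2^E$ is 1, so E is 0. Because the Bias is 3, e is 3, that is, 011.
    • The final result is 0 011 00.
  2. Representing the number $\frac{1}{2}$ in format B

    • It is positive, so s is 0.
    • $\frac{1}{2}$ can be written as $1 \times 2^{-1}$, so M would be 1 and E would be −1.
    • Because the Bias is 1, e would be 0, which is the all-zeros special case above, so the general formula cannot be used.
    • In the special case, e is 0 and $E = 1 - Bias = 0$. Rewriting the value accordingly, M is $2^{-1}$, and since $M = f$, the frac field is 100.
    • The final result is 0 00 100.
  3. Representing the number $\frac{11}{8}$ in format B

    • It is positive, so s is 0.
    • $\frac{11}{8}$ can be written as $\frac{11}{8} \times 1$, so M is $\frac{11}{8}$ and E is 0.
    • f is $\frac{3}{8}$ and e is 1, so the frac field is 011 and the exp field is 01.
    • The final result is 0 01 011.
  4. Representing the number $\frac{11}{8}$ in format A

    • It is positive, so s is 0.
    • Format A has only 2 frac bits, so $\frac{11}{8}$ (binary 1.011) must be rounded. It lies halfway between 1.25 and 1.5, and round-to-even gives 1.5, that is, $\frac{3}{2}$.
    • $\frac{3}{2} \times 1$ means M is $\frac{3}{2}$, E is 0, f is $\frac{1}{2}$, and e is 3.
    • The final result is 0 011 10.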

2.4.4 Rounding


1. Round-to-even

Round-to-even (also called round-to-nearest) is the default mode. It attempts to find a closest match.

The only design decision is to determine the effect of rounding values that are halfway between two possible results. Round-to-even mode adopts the convention that it rounds the number either upward or downward such that the least significant digit of the result is even.

It will round upward about 50% of the time and round downward about 50% of the time.

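When processing a large amount of data, this ensures that roughly half of the values are rounded up and half are rounded down, so the error in the average of the data is smaller.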

2. Round-toward-zero
Round-toward-zero mode rounds positive numbers downward and negative numbers upward.

3. Round-down
Round-down mode rounds both positive and negative numbers downward.

4. Round-up
Round-up mode rounds both positive and negative numbers upward.
