CSAPP Chapter 2 Homework Problems

YIZAYAKUNQAQ

Last modified 2022-03-24 17:43:50

Tags: C

First published 2022-02-19 00:44:26

Article link: https://blog.csdn.net/YIZAYAKUNQAQ/article/details/123012776

2.55~2.59
2.60~2.64
2.65~2.69
2.70~2.74
2.75~2.79
2.80~2.84
2.85~2.89
2.90~2.94

2.55~2.59

2.55
Compile and run the sample code that uses show_bytes (file show-bytes.c) on different machines to which you have access. Determine the byte orderings used by these machines.

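12345 in hexadecimal is 0x00003039.
Converting the integer 0x3039 to floating point gives 0x4640e400.
The byte order of the output shows that my machine is little-endian.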
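
The two values quoted here can be checked without show_bytes. This is a minimal sketch, assuming a 32-bit int, a 32-bit unsigned, and IEEE 754 single precision; `float_bits_of` and `first_byte_of_int` are hypothetical helper names, not part of show-bytes.c.

```c
#include <string.h>

/* Return the raw bit pattern of f (assumes float and unsigned are both 32 bits). */
unsigned float_bits_of(float f)
{
    unsigned u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Return the byte stored at the lowest address of x.
   For x = 12345 (0x00003039) this is 0x39 on a little-endian machine
   and 0x00 on a big-endian machine. */
unsigned char first_byte_of_int(int x)
{
    unsigned char b;
    memcpy(&b, &x, 1);
    return b;
}
```

12345 is 1.1000000111001 (binary) times 2^13, so the exponent field is 13 + 127 = 140 = 0x8C, which is where the leading 0x46 of 0x4640e400 comes from.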

2.56
Try running the code for show_bytes for different sample values


2.57
Write procedures show_short, show_long, and show_double that print the byte representations of C objects of types short, long, and double, respectively. Try these out on several machines.
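
One possible shape for these procedures, sketched on top of the book's show_bytes. The show_bytes body is reproduced from memory so that the block compiles on its own and may differ in detail from the distributed show-bytes.c.

```c
#include <stdio.h>

typedef unsigned char *byte_pointer;

/* Print len bytes starting at start, lowest address first. */
void show_bytes(byte_pointer start, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        printf(" %.2x", start[i]);
    printf("\n");
}

/* Each wrapper passes the address and size of its own argument. */
void show_short(short x)   { show_bytes((byte_pointer) &x, sizeof(short)); }
void show_long(long x)     { show_bytes((byte_pointer) &x, sizeof(long)); }
void show_double(double x) { show_bytes((byte_pointer) &x, sizeof(double)); }
```

The number of bytes printed by show_long depends on the platform (4 on 32-bit and 64-bit Windows targets, 8 on most 64-bit Unix targets), which is part of what the exercise is meant to expose.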


2.58
Write a procedure is_little_endian that will return 1 when compiled and run on a little-endian machine, and will return 0 when compiled and run on a big-endian machine. This program should run on any machine, regardless of its word size.

int is_little_endian()
{
	int x = 1;
	/* Inspect the byte stored at the lowest address of x */
	unsigned char* c = (unsigned char*)&x;
	return *c == 1;
}

2.59
Write a C expression that will yield a word consisting of the least significant byte of x and the remaining bytes of y. For operands x = 0x89ABCDEF and y = 0x76543210, this would give 0x765432EF.

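#include <stdio.h>

int main(void)
{
	unsigned x, y;
	printf("x:");
	scanf("%x", &x);
	printf("y:");
	scanf("%x", &y);
	/* Low byte of x, remaining bytes of y; does not depend on byte order */
	unsigned result = (x & 0xFFu) | (y & ~0xFFu);
	printf("\noutput:%x\n", result);
	return 0;
}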

2.60~2.64

2.60
Suppose we number the bytes in a w-bit word from 0 (least significant) to w/8 − 1 (most significant). Write code for the following C function, which will return an unsigned value in which byte i of argument x has been replaced by byte b:
unsigned replace_byte (unsigned x, int i, unsigned char b);
Here are some examples showing how the function should work:
replace_byte(0x12345678, 2, 0xAB) --> 0x12AB5678
replace_byte(0x12345678, 0, 0xAB) --> 0x123456AB

unsigned replace_byte(unsigned x, int i, unsigned char b)
{
	unsigned offset = i << 3;
	unsigned mask = 0xFF;
	mask = mask << offset;
	return (x & ~mask) | ((unsigned)b << offset);
}

2.61
Write C expressions that evaluate to 1 when the following conditions are true and to 0 when they are false. Assume x is of type int.
A. Any bit of x equals 1.
B. Any bit of x equals 0.
C. Any bit in the least significant byte of x equals 1.
D. Any bit in the most significant byte of x equals 0.
Your code should follow the bit-level integer coding rules (page 164), with the additional restriction that you may not use equality (==) or inequality (!=) tests.

int A(int x){return !!x;}
int B(int x){return !!~x;}
int C(int x){return !!(x & 0xFF);}
int D(int x){return !!(~x & (0xFFu << ((sizeof(int) - 1) << 3)));}

2.62
Write a function int_shifts_are_arithmetic() that yields 1 when run on a machine that uses arithmetic right shifts for data type int and yields 0 otherwise.
Your code should work on a machine with any word size. Test your code on several machines.

int int_shifts_are_arithmetic()
{
	int x = -1;
	x = x >> 3;
	int offset = (sizeof(int) - 1) << 3;
	return !(~x & (0xFFu << offset));
}

2.63
Fill in code for the following C functions. Function srl performs a logical right shift using an arithmetic right shift (given by value xsra), followed by other operations not including right shifts or division. Function sra performs an arithmetic right shift using a logical right shift (given by value xsrl), followed by other operations not including right shifts or division. You may use the computation 8*sizeof(int) to determine w, the number of bits in data type int. The shift amount k can range from 0 to w − 1.
unsigned srl(unsigned x, int k) {
	/* Perform shift arithmetically */
	unsigned xsra = (int) x >> k;
	.
	.
}
int sra(int x, int k) {
	/* Perform shift logically */
	int xsrl = (unsigned) x >> k;
	.
	.
}

unsigned srl(unsigned x, int k) 
{ /* Perform shift arithmetically */ 
	unsigned xsra = (int)x >> k;
	int w = sizeof(int) << 3;
	/* Top k bits set; the split shift keeps k = 0 from shifting by w */
	unsigned mask = ~0u << (w - 1 - k) << 1;
	return xsra & ~mask;
}

int sra(int x, int k)
{ /* Perform shift logically */
	int xsrl = (unsigned)x >> k;
	int w = sizeof(int) << 3;
	/* All ones when x is negative, zero otherwise */
	unsigned sign = !(x & (1u << (w - 1))) - 1;

	/* Top k bits set; the split shift keeps k = 0 from shifting by w */
	unsigned mask = ~0u << (w - 1 - k) << 1;
	return xsrl | (mask & sign);
}

2.64
Write code to implement the following function:
/* Return 1 when any odd bit of x equals 1; 0 otherwise.
Assume w=32 */
int any_odd_one(unsigned x);
Your function should follow the bit-level integer coding rules (page 164), except that you may assume that data type int has w = 32 bits.

int any_odd_one(unsigned x)
{
	return !!(x & 0xAAAAAAAA);
}

2.65~2.69

2.65
Write code to implement the following function:
/* Return 1 when x contains an odd number of 1s; 0 otherwise. Assume w=32 */
int odd_ones(unsigned x);
Your function should follow the bit-level integer coding rules (page 164), except that you may assume that data type int has w = 32 bits.
Your code should contain a total of at most 12 arithmetic, bitwise, and logical operations.

int odd_ones(unsigned x)
 {
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 0x1;
}

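How it works: 1 ^ 1 = 0 ^ 0 = 0 and 1 ^ 0 = 0 ^ 1 = 1.
XORing the upper and lower halves of the word at widths 16, 8, 4, 2, and 1 in turn cancels every pair of 1s, so the lowest bit ends up holding the parity of the whole word.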
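
The folding argument can be checked against a straightforward bit counter. This is only a test sketch: `odd_ones_reference` is a hypothetical helper that uses a loop, which the coding rules forbid in the solution itself but which is fine in a checker.

```c
/* Fold the word in half repeatedly; bit 0 ends up as the XOR of all 32 bits. */
int odd_ones(unsigned x)
{
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 0x1;
}

/* Reference version: count the 1 bits one at a time and return the parity. */
int odd_ones_reference(unsigned x)
{
    int count = 0;
    while (x) {
        count += x & 1;
        x >>= 1;
    }
    return count & 1;
}
```

XOR is addition modulo 2, so after each fold the low half holds the bitwise parity of the two halves it replaced; five folds reduce 32 bits to one.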

2.66
Write code to implement the following function:
/*
 * Generate mask indicating leftmost 1 in x. Assume w=32.
 * For example, 0xFF00 -> 0x8000, and 0x6600 --> 0x4000.
 * If x = 0, then return 0.
 */ 
 int leftmost_one(unsigned x); 
Your function should follow the bit-level integer coding rules (page 164), except that you may assume that data type int has w = 32 bits.
Your code should contain a total of at most 15 arithmetic, bitwise, and logical operations.
Hint: First transform x into a bit vector of the form [0. . .011. . .1]

 int leftmost_one(unsigned x)
{
     x |= x >> 1;
     x |= x >> 2;
     x |= x >> 4;
     x |= x >> 8;
     x |= x >> 16;
     x &= ~(x >> 1);
     return x;
}


2.67
You are given the task of writing a procedure int_size_is_32() that yields 1 when run on a machine for which an int is 32 bits, and yields 0 otherwise. You are not allowed to use the sizeof operator. Here is a first attempt:
/* The following code does not run properly on some machines */
int bad_int_size_is_32() {
/* Set most significant bit (msb) of 32-bit machine */ 
int set_msb = 1 << 31; 
/* Shift past msb of 32-bit word */ 
int beyond_msb = 1 << 32;

/* set_msb is nonzero when word size >= 32 
beyond_msb is zero when word size <= 32 */ 
return set_msb && !beyond_msb; 
} 
When compiled and run on a 32-bit SUN SPARC, however, this procedure returns 0. The following compiler message gives us an indication of the problem:
warning: left shift count >= width of type
A. In what way does our code fail to comply with the C standard?
B. Modify the code to run properly on any machine for which data type int is at least 32 bits.
C. Modify the code to run properly on any machine for which data type int is at least 16 bits

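A: On many machines, the shift instruction for a w-bit value only looks at the low log2(w) bits of the shift amount, so the effective shift is k mod w. The C standard makes no such guarantee: shifting by an amount greater than or equal to the width of the type is undefined behavior, so 1 << 32 is not valid when int is 32 bits.
B: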

int int_size_is_32() 
{
    int set_msb = 1 << 31;
    int beyond_msb = set_msb << 1;
    return set_msb && !beyond_msb;
}

C:

int int_size_is_32() 
{
    int set_msb = 1 << 15 << 15 << 1;
    int beyond_msb = set_msb << 1;
    return set_msb && !beyond_msb;
}

2.68
Write code for a function with the following prototype:
/* 
* Mask with least signficant n bits set to 1 
* Examples: n = 6 --> 0x3F, n = 17 --> 0x1FFFF 
* Assume 1 <= n <= w 
*/ 
int lower_one_mask(int n);
Your function should follow the bit-level integer coding rules (page 164). Be careful of the case n = w.

int lower_one_mask(int n)
{
	/* Bits n..w-1 set; shifting in two steps avoids a shift by w when n = w */
	unsigned high = ~0u << (n - 1) << 1;
	return ~high;
}

2.69
Write code for a function with the following prototype:
/* 
* Do rotating left shift. Assume 0 <= n < w 
* Examples when x = 0x12345678 and w = 32: 
* n=4 -> 0x23456781, n=20 -> 0x67812345 
*/ 
unsigned rotate_left(unsigned x, int n);
Your function should follow the bit-level integer coding rules (page 164). Be careful of the case n = 0.

unsigned rotate_left(unsigned x, int n)
 {
    int w = sizeof(int) <<3;
    return (x << n) | (x >> (w - n -1) >> 1);
}

2.70~2.74

2.70
Write code for the function with the following prototype:
/*
* Return 1 when x can be represented as an n-bit, 2’s-complement 
* number; 0 otherwise 
* Assume 1 <= n <= w 
*/ 
int fits_bits(int x, int n);
Your function should follow the bit-level integer coding rules (page 164).

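int fits_bits(int x, int n)
{
	/* x fits in n bits when bits n-1 and above are all 0 or all 1,
	   i.e. when the arithmetic shift leaves 0 or -1 */
	int high = x >> (n - 1);
	return !high | !~high;
}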

2.71
You just started working for a company that is implementing a set of procedures to operate on a data structure where 4 signed bytes are packed into a 32-bit unsigned. Bytes within the word are numbered from 0 (least significant) to 3(most significant). You have been assigned the task of implementing a function for a machine using two’s-complement arithmetic and arithmetic right shifts with the following prototype:
/* Declaration of data type where 4 bytes are packed 
into an unsigned */ 
typedef unsigned packed_t; 
/* Extract byte from word.Return as signed integer */
int xbyte(packed_t word, int bytenum); 
That is, the function will extract the designated byte and sign extend it to be a 32-bit int.
Your predecessor (who was fired for incompetence) wrote the following code:
/* Failed attempt at xbyte */ 
int xbyte(packed_t word, int bytenum) 
{ 
	return (word >> (bytenum << 3)) & 0xFF; 
}
A. What is wrong with this code?
B. Give a correct implementation of the function that uses only left and right shifts, along with one subtraction.

A: It does not sign-extend negative bytes. Masking with 0xFF always yields a value between 0 and 255.
B:

 int xbyte(packed_t word, int bytenum)
{
	/* Move the selected byte to the top of the word, then shift it
	   back down arithmetically so that its sign bit is replicated */
	int leftshifts = (3 - bytenum) << 3;
	return (int)(word << leftshifts) >> 24;
}

2.72
You are given the task of writing a function that will copy an integer val into a buffer buf, but it should do so only if enough space is available in the buffer.
Here is the code you write:
/* Copy integer into buffer if space is available */
/* WARNING: The following code is buggy */ 
void copy_int(int val, void *buf, int maxbytes) { 
	if (maxbytes-sizeof(val) >= 0) 
		memcpy(buf, (void *) &val, sizeof(val));
} 
This code makes use of the library function memcpy. Although its use is a bit artificial here, where we simply want to copy an int, it illustrates an approach commonly used to copy larger data structures.
You carefully test the code and discover that it always copies the value to the buffer, even when maxbytes is too small.
A. Explain why the conditional test in the code always succeeds. Hint: The sizeof operator returns a value of type size_t.
B. Show how you can rewrite the conditional test to make it work properly.

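A: maxbytes is a signed int, while sizeof returns a size_t, which is unsigned. In the mixed expression maxbytes is converted to unsigned, so the difference is unsigned as well and is always >= 0.
B: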

void copy_int(int val, void *buf, int maxbytes)
{ 
	/* Compare as signed values; no subtraction, so nothing can wrap */
	if (maxbytes >= (int)sizeof(val))
		memcpy(buf, (void *) &val, sizeof(val));
}

2.73
Write code for a function with the following prototype:
/* Addition that saturates to TMin or TMax */
 int saturating_add(int x, int y);
Instead of overflowing the way normal two’s-complement addition does, saturating addition returns TMax when there would be positive overflow, and TMin when there would be negative overflow. Saturating arithmetic is commonly used in programs that perform digital signal processing.
Your function should follow the bit-level integer coding rules (page 164).

I could not work this one out.
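
No solution is given for this problem in the post. The following is one possible sketch, not the author's answer. It assumes two's-complement int and that converting an out-of-range unsigned value back to int wraps (implementation-defined in C, but the behavior of gcc and clang). The addition is done on unsigned values so that the wraparound itself is well defined.

```c
#include <limits.h>

int saturating_add(int x, int y)
{
    unsigned ux = (unsigned) x;
    unsigned uy = (unsigned) y;
    unsigned usum = ux + uy;              /* wraps modulo 2^w */
    int w = sizeof(int) << 3;

    /* 1 when x and y share a sign but the sum's sign differs, else 0 */
    unsigned overflow = (~(ux ^ uy) & (ux ^ usum)) >> (w - 1);
    /* All ones on overflow, all zeros otherwise */
    unsigned mask = 0u - overflow;
    /* TMax pattern for positive overflow, TMax + 1 (the TMin pattern)
       for negative overflow */
    unsigned saturated = (unsigned) INT_MAX + (ux >> (w - 1));

    return (int) ((usum & ~mask) | (saturated & mask));
}
```

The mask selects between the ordinary sum and the saturated value without a conditional, which is what the bit-level coding rules ask for.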

2.74
Write a function with the following prototype:
/* Determine whether arguments can be subtracted without overflow */ 
int tsub_ok(int x, int y); 
This function should return 1 if the computation x-y does not overflow.

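int tsub_ok(int x, int y)
{
	unsigned ux = x, uy = y;
	unsigned diff = ux - uy;
	/* x - y overflows only when x and y have different signs and the
	   sign of the result differs from the sign of x */
	unsigned overflow = (ux ^ uy) & (ux ^ diff);
	return !(overflow >> ((sizeof(int) << 3) - 1));
}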

2.75~2.79

2.75
Suppose we want to compute the complete 2w-bit representation of x · y, where both x and y are unsigned, on a machine for which data type unsigned is w bits. The low-order w bits of the product can be computed with the expression x*y, so we only require a procedure with prototype
unsigned unsigned_high_prod(unsigned x, unsigned y);
that computes the high-order w bits of x · y for unsigned variables.
We have access to a library function with prototype
int signed_high_prod(int x, int y)
that computes the high-order w bits of x · y for the case where x and y are in two’s-complement form. Write code calling this procedure to implement the function for unsigned arguments. Justify the correctness of your solution.
Hint: Look at the relationship between the signed product x · y and the unsigned product x′ · y′ in the derivation of Equation 2.18.

unsigned unsigned_high_prod(unsigned x, unsigned y)
{
	int w = sizeof(x) << 3;
	unsigned x_sign = x >> (w - 1);
	unsigned y_sign = y >> (w - 1);
	return (signed_high_prod(x, y) + x_sign * y + y_sign * x);
}
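
The problem also asks for a justification. Writing x′ = x + x_{w-1}·2^w and y′ = y + y_{w-1}·2^w for the unsigned values, the product is x′·y′ = x·y + (x_{w-1}·y + y_{w-1}·x)·2^w + x_{w-1}·y_{w-1}·2^{2w}. Modulo 2^{2w} the last term vanishes, so the high word of the unsigned product is the high word of the signed product plus x_{w-1}·y + y_{w-1}·x, taken modulo 2^w. The sketch below checks this numerically; the `signed_high_prod` here is a stand-in written for the test (assuming 32-bit int and an arithmetic right shift on int64_t), and `high_prod_reference` is a hypothetical helper.

```c
#include <stdint.h>

/* Stand-in for the library routine: high 32 bits of the signed product. */
int signed_high_prod(int x, int y)
{
    int64_t p = (int64_t) x * y;
    return (int) (p >> 32);
}

unsigned unsigned_high_prod(unsigned x, unsigned y)
{
    int w = sizeof(x) << 3;
    unsigned x_sign = x >> (w - 1);
    unsigned y_sign = y >> (w - 1);
    return (unsigned) signed_high_prod((int) x, (int) y)
           + x_sign * y + y_sign * x;
}

/* Reference result computed with a 64-bit unsigned product. */
unsigned high_prod_reference(unsigned x, unsigned y)
{
    return (unsigned) (((uint64_t) x * y) >> 32);
}
```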

2.76
The library function calloc has the following declaration:
void *calloc(size_t nmemb, size_t size);
According to the library documentation, “The calloc function allocates memory for an array of nmemb elements of size bytes each. The memory is set to zero. If nmemb or size is zero, then calloc returns NULL.”
Write an implementation of calloc that performs the allocation by a call to malloc and sets the memory to zero via memset. Your code should not have any vulnerabilities due to arithmetic overflow, and it should work correctly regardless of the number of bits used to represent data of type size_t.
As a reference, functions malloc and memset have the following declarations:
void *malloc(size_t size);
void *memset(void *s, int c, size_t n);

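#include <stdlib.h>
#include <string.h>

void* calloc(size_t nmemb, size_t size)
{
	if (!nmemb || !size) return NULL;

	/* If the product wrapped around, dividing it back does not give nmemb */
	size_t space = nmemb * size;
	if (space / size != nmemb) return NULL;

	void* p = malloc(space);
	/* Zero the block only when the allocation succeeded */
	if (p) memset(p, 0, space);
	return p;
}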

2.77
Suppose we are given the task of generating code to multiply integer variable x by various different constant factors K. To be efficient, we want to use only the operations +, -, and <<. For the following values of K, write C expressions to perform the multiplication using at most three operations per expression.
A. K = 17
B. K = −7
C. K = 60
D. K = −112

A: return (x << 4) + x;
B: return x - (x << 3);
C: return (x<<6) - (x<<2);
D: return (x<<4) - (x<<7);

2.78
Write code for a function with the following prototype:
/* Divide by power of 2. Assume 0 <= k < w-1 */
int divide_power2(int x, int k);
The function should compute x/2^k with correct rounding, and it should follow the bit-level integer coding rules (page 164).

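int divide_power2(int x, int k)
{
	/* add is all ones when x is negative, zero otherwise */
	int add = x >> ((sizeof(x) << 3) - 1);
	/* Keep only the low k bits: a bias of 2^k - 1 for negative x */
	int mask = add << k;
	add = add & ~mask;
	return (x + add) >> k;
}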

2.79
Write code for a function mul3div4 that, for integer argument x, computes 3*x/4 but follows the bit-level integer coding rules (page 164). Your code should replicate the fact that the computation 3*x can cause overflow.

int mul3div4(int x)
{
	/* 3*x; + binds tighter than <<, so the shift needs parentheses */
	x = (x << 1) + x;
	int k = 2;
	int add = x >> ((sizeof(x) << 3) - 1);
	int mask = add << k;
	add = add & ~mask;
	return (x + add) >> k;
}

2.80~2.84

2.80
Write code for a function threefourths that, for integer argument x, computes the value of (3/4)x, rounded toward zero. It should not overflow. Your function should follow the bit-level integer coding rules (page 164).

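int threefourths(int x)
{
	int sign = x >> ((sizeof(x) << 3) - 1);	/* all ones when x < 0 */
	/* Split x into a multiple of 4 and its low two bits, so that the
	   division of the high part is exact and nothing can overflow */
	int high = (x & ~0x3) >> 2;
	int low = x & 0x3;
	int high34 = high + (high << 1);	/* 3 * (high part / 4) */
	int low3 = low + (low << 1);		/* 3 * low, between 0 and 9 */
	/* Bias of 3 makes the shift round toward zero for negative x */
	int low34 = (low3 + (sign & 0x3)) >> 2;
	return high34 + low34;
}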

2.81
Write C expressions to generate the bit patterns that follow, where a^k represents k repetitions of symbol a. Assume a w-bit data type. Your code may contain references to parameters j and k, representing the values of j and k, but not a parameter representing w.
A. 1^(w−k) 0^k
B. 0^(w−k−j) 1^k 0^j

A: ~0 << k;
B: ~ (~0 << k) << j
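
The two expressions can be wrapped in functions and spot-checked. This sketch uses `~0u` instead of `~0` so that the shifts operate on an unsigned value; `mask_a` and `mask_b` are hypothetical names, and the sample values assume a 32-bit unsigned.

```c
/* A: w-k ones followed by k zeros. Assumes 0 <= k < w. */
unsigned mask_a(int k)
{
    return ~0u << k;
}

/* B: w-k-j zeros, then k ones, then j zeros. Assumes 0 <= k < w and 0 <= j < w. */
unsigned mask_b(int j, int k)
{
    return ~(~0u << k) << j;
}
```

For example, mask_a(8) leaves the low byte clear, and mask_b(4, 8) places a run of eight 1s starting at bit 4.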

2.82
We are running programs where values of type int are 32 bits. They are represented in two’s complement, and they are right shifted arithmetically. Values of type unsigned are also 32 bits.
We generate arbitrary values x and y, and convert them to unsigned values as follows:
/* Create some arbitrary values */
int x = random(); 
int y = random();
/* Convert to unsigned */
unsigned ux = (unsigned) x;
unsigned uy = (unsigned) y; 
For each of the following C expressions, you are to indicate whether or not the expression always yields 1. If it always yields 1, describe the underlying mathematical principles. Otherwise, give an example of arguments that make it yield 0.
A. (x<y) == (-x>-y)
B. ((x+y)<<4) + y-x == 17y+15x
C. ~x + ~y+1 == ~(x+y)
D. (ux-uy) == -(unsigned)(y-x)
E. ((x >> 2) << 2) <= x

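A: False. With x = TMin and y = 0, x < y is 1, but -x is still TMin, so -x > -y is 0.
B: True. I first thought that overflow in the shift would break the equality, but other people's answers convinced me otherwise. Arithmetic on int here is arithmetic modulo 2^32, where addition, subtraction, and multiplication keep their usual algebraic identities: ((x+y)<<4) + y - x = 16x + 16y + y - x = 17y + 15x. Any carry that is truncated on one side is truncated on the other side too.
C: True. ~x = -x - 1, so ~x + ~y + 1 = -x - y - 1 = ~(x+y).
D: True. Unsigned and two's-complement subtraction produce the same bit pattern, and ux - uy = -(uy - ux) modulo 2^32.
E: True. (x >> 2) << 2 clears the two low-order bits of x, which either leaves the value unchanged or makes it smaller.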

2.83
Consider numbers having a binary representation consisting of an infinite string of the form 0.yyyyyy. . ., where y is a k-bit sequence. For example, the binary representation of 1/3 is 0.01010101. . .(y = 01), while the representation of 1/5 is 0.001100110011. . .(y = 0011).
A. Let Y = B2Uk(y), that is, the number having binary representation y. Give a formula in terms of Y and k for the value represented by the infinite string.
Hint: Consider the effect of shifting the binary point k positions to the right.
B. What is the numeric value of the string for the following values of y? (a) 101 (b) 0110 (c) 010011
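
The post gives no answer for this problem. Following the hint, a short derivation, where V is a symbol introduced here for the value of the infinite string:

```latex
\begin{align*}
V &= 0.yyy\ldots_2 \\
2^k V &= y.yyy\ldots_2 = Y + V \\
V &= \frac{Y}{2^k - 1}
\end{align*}
% Part B:
% (a) y = 101,    Y = 5,  k = 3:  V = 5/7
% (b) y = 0110,   Y = 6,  k = 4:  V = 6/15 = 2/5
% (c) y = 010011, Y = 19, k = 6:  V = 19/63
```

The formula agrees with the examples in the problem statement: y = 01 gives 1/3 and y = 0011 gives 3/15 = 1/5.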

2.84
Fill in the return value for the following procedure, which tests whether its first argument is less than or equal to its second. Assume the function f2u returns an unsigned 32-bit number having the same bit representation as its floating-point argument. You can assume that neither argument is NaN. The two flavors of zero, +0 and −0, are considered equal.
int float_le(float x, float y) {
	unsigned ux = f2u(x); 
	unsigned uy = f2u(y);
	/* Get the sign bits */
	unsigned sx = ux >> 31;
	unsigned sy = uy >> 31; 
	/* Give an expression using only ux, uy, sx, and sy */
	return ____; 
}

(ux<<1 == 0 && uy<<1 == 0) || (sx && !sy) || (!sx && !sy && ux <= uy) || (sx && sy && ux >= uy)
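
One way to write and test the return expression is sketched below. The `f2u` here is a stand-in built on memcpy, assuming float and unsigned are both 32 bits wide; the book only specifies its behavior, not its implementation.

```c
#include <string.h>

/* Stand-in for the book's f2u: copy the bits of a float into an unsigned. */
unsigned f2u(float x)
{
    unsigned u;
    memcpy(&u, &x, sizeof u);
    return u;
}

int float_le(float x, float y)
{
    unsigned ux = f2u(x);
    unsigned uy = f2u(y);
    /* Get the sign bits */
    unsigned sx = ux >> 31;
    unsigned sy = uy >> 31;

    return (ux << 1 == 0 && uy << 1 == 0) ||  /* both are +0 or -0 */
           (sx && !sy) ||                      /* x negative, y non-negative */
           (!sx && !sy && ux <= uy) ||         /* both non-negative */
           (sx && sy && ux >= uy);             /* both negative */
}
```

For non-negative floats the bit patterns order the same way as the values; for negative floats the order is reversed, which is why the last two cases use opposite comparisons.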

2.85~2.89

2.85
Given a floating-point format with a k-bit exponent and an n-bit fraction, write formulas for the exponent E, the significand M, the fraction f , and the value V for the quantities that follow. In addition, describe the bit representation.
A. The number 7.0
B. The largest odd integer that can be represented exactly
C. The reciprocal of the smallest positive normalized value

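A: E = 2; M = 1.11₂ = 7/4; f = 0.11₂ = 3/4; V = 7.0; bit representation: 0 | 1 0^(k-2) 1 | 1 1 0^(n-2)
B: E = n; M = 1.1ⁿ₂ = 2 - 2^-n; f = 0.1ⁿ₂ = 1 - 2^-n; V = 2^(n+1) - 1; bit representation: 0 | binary representation of n + 2^(k-1) - 1 | 1ⁿ
C: The smallest positive normalized value is 2^(2 - 2^(k-1)), so its reciprocal has E = 2^(k-1) - 2; M = 1.0; f = 0.0; V = 2^(2^(k-1) - 2); bit representation: 0 | 1^(k-2) 0 1 | 0ⁿ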

2.86
Intel-compatible processors also support an “extended-precision” floating-point format with an 80-bit word divided into a sign bit, k = 15 exponent bits, a single integer bit, and n = 63 fraction bits. The integer bit is an explicit copy of the implied bit in the IEEE floating-point representation. That is, it equals 1 for normalized values and 0 for denormalized values. Fill in the following table giving the approximate values of some “interesting” numbers in this format:
|Description|Value|Decimal|
|Smallest positive denormalized|–|–|
|Smallest positive normalized|–|–|
|Largest normalized|–|–|
This format can be used in C programs compiled for Intel-compatible machines by declaring the data to be of type long double. However, it forces the compiler to generate code based on the legacy 8087 floating-point instructions.
The resulting program will most likely run much slower than would be the case for data type float or double.

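With k = 15 the bias is 2^14 - 1 = 16383.
Smallest positive denormalized: Bits: 0 0¹⁵ 0 0⁶²1; Value: 2^-16445; Decimal: about 3.65 x 10^-4951
Smallest positive normalized: Bits: 0 0¹⁴1 1 0⁶³; Value: 2^-16382; Decimal: about 3.36 x 10^-4932
Largest normalized: Bits: 0 1¹⁴0 1 1⁶³; Value: 2¹⁶³⁸⁴(1-2^-64); Decimal: about 1.19 x 10^4932
I have also seen answers elsewhere that use exponents such as -4145 and -4082, but those do not match a bias of 16383.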

2.87
The 2008 version of the IEEE floating-point standard, named IEEE 754-2008, includes a 16-bit “half-precision” floating-point format. It was originally devised by computer graphics companies for storing data in which a higher dynamic range is required than can be achieved with 16-bit integers. This format has 1 sign bit, 5 exponent bits (k = 5), and 10 fraction bits (n = 10). The exponent bias is 2^(5−1) − 1 = 15.
Fill in the table that follows for each of the numbers given, with the following instructions for each column:
Hex: The four hexadecimal digits describing the encoded form.
M: The value of the significand. This should be a number of the form x or x/y, where x is an integer and y is an integral power of 2. Examples include 0, 67/64, and 1/256.
E: The integer value of the exponent.
V: The numeric value represented. Use the notation x or x × 2^z, where x and z are integers.
D: The (possibly approximate) numerical value, as is printed using the %f formatting specification of printf.
As an example, to represent the number 7/8, we would have s = 0, M = 7/4, and E = −1. Our number would therefore have an exponent field of 01110₂ (decimal value 15 − 1 = 14) and a significand field of 1100000000₂, giving a hex representation 3B00. The numerical value is 0.875.
You need not fill in entries marked —.

| Description | Hex | M | E | V | D |
| -0 | 8000 | 0 | -14 | -0 | -0.0 |
| Smallest value > 2 | 4001 | 1025/1024 | 1 | 1025x2^-9 | 2.001953 |
| 512 | 6000 | 1 | 9 | 512 | 512.000000 |
| Largest denormalized | 03FF | 1023/1024 | -14 | 1023x2^-24 | 0.000061 |
| negative infinite | FC00 | - | - | -inf | -inf |
| 3BB0 | 3BB0 | 123/64 | -1 | 123/128 | 0.960938 |

2.88
Consider the following two 9-bit floating-point representations based on the IEEE floating-point format.

Format A There is 1 sign bit.
There are k = 5 exponent bits. The exponent bias is 15.
There are n = 3 fraction bits.
Format B There is 1 sign bit.
There are k = 4 exponent bits. The exponent bias is 7.
There are n = 4 fraction bits.

In the following table, you are given some bit patterns in format A, and your task is to convert them to the closest value in format B. If rounding is necessary you should round toward +∞. In addition, give the values of numbers given by the format A and format B bit patterns. Give these as whole numbers (e.g., 17) or as fractions (e.g., 17/64 or 17/2^6).

| Format A | Format A | Format B | Format B |
| Bits | Value | Bits | Value |
| 1 01111 001 | -9/8 | 1 0111 0010 | -9/8 |
| 0 10110 011 | 11x2⁴ | 0 1110 0110 | 11x2⁴ |
| 1 00111 010 | -5x2^-10 | 1 0000 0101 | -5x2^-10 |
| 0 00000 111 | 7x2^-17 | 0 0000 0001 | 2^-10 |
| 1 11100 000 | -2¹³ | 1 1110 1111 | -31x2³ |
| 0 10111 100 | 3x2⁷ | 0 1111 0000 | +∞ |

2.89
We are running programs on a machine where values of type int have a 32-bit two’s-complement representation. Values of type float use the 32-bit IEEE format, and values of type double use the 64-bit IEEE format.
We generate arbitrary integer values x, y, and z, and convert them to values of type double as follows:
/* Create some arbitrary values */
int x = random(); 
int y = random(); 
int z = random(); 
/* Convert to double */ 
double dx = (double) x; 
double dy = (double) y; 
double dz = (double) z; 
For each of the following C expressions, you are to indicate whether or not the expression always yields 1. If it always yields 1, describe the underlying mathematical principles. Otherwise, give an example of arguments that make it yield 0. Note that you cannot use an IA32 machine running gcc to test your answers, since it would use the 80-bit extended-precision representation for both float and double.
A. (float) x == (float) dx
B. dx - dy == (double) (x-y)
C. (dx + dy) + dz == dx + (dy + dz)
D. (dx * dy) * dz == dx * (dy * dz)
E. dx / dx == dz / dz

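A: True. A double represents every int exactly, so both sides round the same value to float.
B: False. It fails when x-y overflows, e.g. x = 0 and y = TMin: the int difference wraps around while dx - dy does not.
C: True. dx, dy, and dz come from int values, so every intermediate sum has magnitude below 2^33 and is exactly representable in a double; no rounding takes place.
D: False. The exact product of three int values can need far more than the 53 bits of precision that a double has, so the two groupings may round differently. One example is x = 2^30 + 1, y = 2^23 + 1, z = 2^24 + 1.
E: False. If exactly one of x and z is 0, one side is 0/0 = NaN and the other is 1.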
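
A counterexample for the multiplication case can be checked with a small test. This sketch assumes IEEE 754 double arithmetic with round-to-nearest-even and no extended-precision intermediates (for example SSE2 on x86-64); `mul_is_associative` is a hypothetical helper, and the volatile qualifiers are there to force each intermediate product to be stored as a double.

```c
/* Return 1 when (dx*dy)*dz == dx*(dy*dz) for the given ints, 0 otherwise. */
int mul_is_associative(int x, int y, int z)
{
    volatile double dx = x, dy = y, dz = z;
    volatile double xy = dx * dy;       /* rounded to double */
    volatile double yz = dy * dz;       /* rounded to double */
    volatile double left = xy * dz;
    volatile double right = dx * yz;
    return left == right;
}
```

With x = 2^30 + 1 = 1073741825, y = 2^23 + 1 = 8388609, and z = 2^24 + 1 = 16777217, the product dx*dy needs 54 bits and is rounded, while dy*dz is exact, so the two final results end up one unit in the last place apart.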

2.90~2.94

2.90
You have been assigned the task of writing a C function to compute a floating-point representation of 2^x. You decide that the best way to do this is to directly construct the IEEE single-precision representation of the result. When x is too small, your routine will return 0.0. When x is too large, it will return +∞. Fill in the blank portions of the code that follows to compute the correct result. Assume the function u2f returns a floating-point value having an identical bit representation as its unsigned argument.
float fpwr2(int x)
{
	/* Result exponent and fraction */
	unsigned exp, frac; 
	unsigned u;
	if (x < ___) { 
		/* Too small.Return 0.0 */ 
		exp = ___; 
		frac = ___; 
	} else if (x < ___) { 
		/* Denormalized result */ 
		exp = ___; 
		frac = ___;
	} else if (x < ___) { 
		/* Normalized result. */ 
		exp = ___; 
		frac = ___; 
	} else { /* Too big.Return +oo */ 
		exp = ___; 
		frac = ___; 
	} 
	/* Pack exp and frac into 32 bits */ 
	u = exp << 23 | frac; 
	/* Return as float */ 
	return u2f(u); 
}

float fpwr2(int x) {
    /* Result exponent and fraction */
    unsigned exp, frac;
    unsigned u;

    /* E=1-bias=1-127=-126 n=23 */
    if (x < -149) {
        /* Too small. Return 0.0 */
        exp = 0;
        frac = 0;
    } else if (x < -126) {
        /* Denormalized result */
        exp = 0;
        frac = 1 << (unsigned) (x+149);
    } else if(x < 128){
        /* Normalized result */
        exp = x + 127;
        frac = 0;
    } else {
        /* Too big. Return +oo */
        exp = 0xFF;
        frac = 0;
    }

    /* Pack exp and frac into 32 bits */
    u = exp << 23 | frac;
    /* Return as float */
    return u2f(u);
}

2.91
Around 250 B.C., the Greek mathematician Archimedes proved that 223/71 < π < 22/7. Had he had access to a computer and the standard library <math.h>, he would have been able to determine that the single-precision floating-point approximation of π has the hexadecimal representation 0x40490FDB. Of course, all of these are just approximations, since π is not rational.
A. What is the fractional binary number denoted by this floating-point value?
B. What is the fractional binary representation of 22/7? Hint: See Problem 2.83.
C. At what bit position (relative to the binary point) do these two approximations to π diverge?

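A. 11.0010010000111111011011₂ (sign 0, exponent field 10000000 so E = 1, fraction field 100 1001 0000 1111 1101 1011)
B. 11.001001001...₂, that is, 11 followed by 001 repeating, since 22/7 = 3 + 1/7 and 1/7 = 0.001001...₂
C. At the 9th bit to the right of the binary point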

2.92
Following the bit-level floating-point coding rules, implement the function with the following prototype:
/* Compute -f. If f is NaN, then return f. */
float_bits float_negate(float_bits f);
For floating-point number f, this function computes −f. If f is NaN, your function should simply return f.
Test your function by evaluating it for all 2³² values of argument f and comparing the result to what would be obtained using your machine’s floating-point operations.

float_bits float_negate(float_bits f)
{
	unsigned sign = f >> 31; 
	unsigned exp = f >> 23 & 0xFF; 
	unsigned frac = f & 0x7FFFFF; 
	if (exp == 0xFF && frac != 0) 
		return f;
	
	/* Reassemble bits */
	return ((unsigned)!sign << 31) | (exp << 23) | frac;
}

2.93
Following the bit-level floating-point coding rules, implement the function with the following prototype:
/* Compute |f|. If f is NaN, then return f. */
float_bits float_absval(float_bits f);
For floating-point number f, this function computes |f|. If f is NaN, your function should simply return f.
Test your function by evaluating it for all 2³² values of argument f and comparing the result to what would be obtained using your machine’s floating-point operations.

float_bits float_absval(float_bits f)
{
	//unsigned sign = f >> 31; 
	unsigned exp = f >> 23 & 0xFF; 
	unsigned frac = f & 0x7FFFFF; 
	if (exp == 0xFF && frac != 0) 
		return f;
	return f & 0x7fffffff;
}

2.94
Following the bit-level floating-point coding rules, implement the function with the following prototype:
/* Compute 2*f. If f is NaN, then return f. */
float_bits float_twice(float_bits f);
For floating-point number f, this function computes 2.0 · f. If f is NaN, your function should simply return f.
Test your function by evaluating it for all 2³² values of argument f and comparing the result to what would be obtained using your machine’s floating-point operations.

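float_bits float_twice(float_bits f)
{
	unsigned sign = f >> 31; 
	unsigned exp = f >> 23 & 0xFF; 
	unsigned frac = f & 0x7FFFFF; 
	// NaN or infinity: return unchanged
	if (exp == 0xFF)
		return f;
	// Denormalized: double the fraction; a carry out of bit 22
	// turns the value into a normalized one with exp = 1
	if (exp == 0) { 
		frac <<= 1;
		if (frac & 0x800000) {
			exp = 1;
			frac &= 0x7FFFFF;
		}
	}
	// Normalized: bump the exponent; overflow becomes infinity
	else {
		exp++;
		if (exp == 0xFF)
			frac = 0;
	}
	/* Reassemble bits */ 
	return (sign << 31) | (exp << 23) | frac;
}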

2.95
Following the bit-level floating-point coding rules, implement the function with the following prototype:
/* Compute 0.5*f. If f is NaN, then return f. */
float_bits float_half(float_bits f);
For floating-point number f, this function computes 0.5 · f. If f is NaN, your function should simply return f.
Test your function by evaluating it for all 2³² values of argument f and comparing the result to what would be obtained using your machine’s floating-point operations.

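float_bits float_half(float_bits f)
{
	unsigned sign = f >> 31; 
	unsigned exp = f >> 23 & 0xFF; 
	unsigned frac = f & 0x7FFFFF; 
	// NaN or infinity: return unchanged
	if (exp == 0xFF)
		return f;
	// Round to even: round up only when the two low bits are 11
	unsigned round = (frac & 0x3) == 0x3;
	// Denormalized input, or a normalized input whose half is denormalized:
	// shift the exponent bit and fraction together, so that a rounding
	// carry lands in the exponent field
	if (exp <= 1) { 
		unsigned rest = ((exp << 23) | frac) >> 1;
		return (sign << 31) | (rest + round);
	}
	// Normalized result: only the exponent changes
	exp--;
	/* Reassemble bits */ 
	return (sign << 31) | (exp << 23) | frac;
}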

2.97
Following the bit-level floating-point coding rules, implement the function with the following prototype:
/* Compute (float) i */
float_bits float_i2f(int i);
For argument i, this function computes the bit-level representation of (float) i.
Test your function by evaluating it for all 2³² values of argument f and comparing the result to what would be obtained using your machine’s floating-point operations.

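float_bits float_i2f(int i)
{
	if (!i)
		return 0;
	//sign: work on the magnitude as an unsigned value
	unsigned sign = (unsigned)i >> 31;
	unsigned u = i;
	if (sign)
		u = ~u + 1;
	//exp: j is the index of the leading 1
	unsigned bias = 0x7F;
	int j = 0;
	unsigned first_1 = u;
	while (first_1) {
		first_1 = first_1 >> 1;
		j++;
	}
	j--;
	unsigned exp = j + bias;

	//frac: left-align the bits below the leading 1
	//(two shifts keep the j = 0 case from shifting by 32)
	unsigned frac = u << (31 - j) << 1;
	unsigned add = 0;
	if (j > 23) {
		unsigned half = 1u << 31;
		//bits that do not fit into the 23-bit fraction, left-aligned
		unsigned rounding = u << (55 - j);
		//round to nearest, ties to even (0x200 is the fraction's low bit)
		if (rounding > half || (rounding == half && (frac & 0x200)))
			add = 1;
	}
	frac = (frac >> 9) + add;
	//use + so that a rounding carry out of the fraction bumps the exponent
	return (sign << 31) + (exp << 23) + frac;
}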