Data Representation

JUAN425

#include <iostream>
using namespace std;

int main() {
   cout << ~0 << endl; // bitwise NOT of 0 sets every bit; prints -1
   return 0;
}

1. Number Systems

1.1 Decimal (Base 10) Number System

The decimal number system has ten symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, called digits.

735 = 7×10^2 + 3×10^1 + 5×10^0

We shall denote a decimal number with an optional suffix D if ambiguity arises.

1.2 Binary (Base 2) Number System

The binary number system has two symbols: 0 and 1, called bits.

10110B = 1×2^4 + 0×2^3 + 1×2^2 + 1×2^1 + 0×2^0

We shall denote a binary number with a suffix B.

A binary digit is called a bit. Eight bits make up a byte.

1.3 Hexadecimal (Base 16) Number System

The hexadecimal number system uses 16 symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F, called hex digits.

A3EH = 10×16^2 + 3×16^1 + 14×16^0

We shall denote a hexadecimal number (in short, hex) with a suffix H. Some programming languages denote hex numbers with the prefix 0x.

Each hex digit is equivalent to 4 binary bits, i.e., it is shorthand for 4 bits, as follows:

0H (0000B) (0D)	1H (0001B) (1D)	2H (0010B) (2D)	3H (0011B) (3D)
4H (0100B) (4D)	5H (0101B) (5D)	6H (0110B) (6D)	7H (0111B) (7D)
8H (1000B) (8D)	9H (1001B) (9D)	AH (1010B) (10D)	BH (1011B) (11D)
CH (1100B) (12D)	DH (1101B) (13D)	EH (1110B) (14D)	FH (1111B) (15D)

1.4 Conversion from Hexadecimal to Binary

Replace each hex digit by its 4 equivalent bits, for example:

A3C5H = 1010 0011 1100 0101B
102AH = 0001 0000 0010 1010B

1.5 Conversion from Binary to Hexadecimal

Starting from the right-most bit (least-significant bit), replace each group of 4 bits by the equivalent hex digit (pad the left-most group with zeros if necessary), for example:

1001001010B = 0010 0100 1010B = 24AH
10001011001011B = 0010 0010 1100 1011B = 22CBH

1.6 Conversion from Base r to Decimal (Base 10)

Given an n-digit base r number: d_(n-1) d_(n-2) d_(n-3) ... d_3 d_2 d_1 d_0 (base r), the decimal equivalent is given by:

d_(n-1) × r^(n-1) + d_(n-2) × r^(n-2) + ... + d_1 × r^1 + d_0 × r^0

For example,

A1C2H = 10×16^3 + 1×16^2 + 12×16^1 + 2 = 41410 (base 10)
10110B = 1×2^4 + 1×2^2 + 1×2^1 = 22 (base 10)

1.7 Conversion from Decimal (Base 10) to Base r

Use repeated division/remainder. For example,

To convert 261D to hexadecimal:
  261/16 => quotient=16 remainder=5
  16/16  => quotient=1  remainder=0
  1/16   => quotient=0  remainder=1 (quotient=0 stop)
  Hence, 261D = 105H

The above procedure is actually applicable to conversion between any 2 base systems. For example,

To convert 1023(base 4) to base 3:
  1023(base 4)/3 => quotient=25D remainder=0
  25D/3          => quotient=8D  remainder=1
  8D/3           => quotient=2D  remainder=2
  2D/3           => quotient=0   remainder=2 (quotient=0 stop)
  Hence, 1023(base 4) = 2210(base 3)

1.8 General Conversion between 2 Base Systems with Fractional Part

Separate the integral and the fractional parts.
For the integral part, divide by the target radix repeatedly, and collect the remainders in reverse order.
For the fractional part, multiply by the target radix repeatedly, and collect the integral parts in the same order.

Example 1:

Convert 18.6875D to binary
Integral Part = 18D
  18/2 => quotient=9 remainder=0
  9/2  => quotient=4 remainder=1
  4/2  => quotient=2 remainder=0
  2/2  => quotient=1 remainder=0
  1/2  => quotient=0 remainder=1 (quotient=0 stop)
  Hence, 18D = 10010B
Fractional Part = .6875D
  .6875*2=1.375 => whole number is 1
  .375*2=0.75   => whole number is 0
  .75*2=1.5     => whole number is 1
  .5*2=1.0      => whole number is 1
  Hence .6875D = .1011B
Therefore, 18.6875D = 10010.1011B

Example 2:

Convert 18.6875D to hexadecimal
Integral Part = 18D
  18/16 => quotient=1 remainder=2
  1/16  => quotient=0 remainder=1 (quotient=0 stop)
  Hence, 18D = 12H
Fractional Part = .6875D
  .6875*16=11.0 => whole number is 11D (BH)
  Hence .6875D = .BH
Therefore, 18.6875D = 12.BH

2. Computer Memory & Data Representation

A computer uses a fixed number of bits to represent a piece of data, which could be a number, a character, or something else. An n-bit storage location can represent up to 2^n distinct entities. For example, a 3-bit memory location can hold one of these eight binary patterns: 000, 001, 010, 011, 100, 101, 110, or 111. Hence, it can represent at most 8 distinct entities. You could use them to represent numbers 0 to 7, numbers 8881 to 8888, characters 'A' to 'H', or up to 8 kinds of fruits like apple, orange, banana; or up to 8 kinds of animals like lion, tiger, etc.

Integers, for example, can be represented in 8, 16, 32 or 64 bits. You, as the programmer, choose an appropriate bit-length for your integers. Your choice constrains the range of integers that can be represented. Besides the bit-length, an integer can be represented in various representation schemes, e.g., unsigned vs. signed integers. An 8-bit unsigned integer has a range of 0 to 255, while an 8-bit signed integer has a range of -128 to 127, both representing 256 distinct numbers.

It is important to note that a computer memory location merely stores a binary pattern. It is entirely up to you, as the programmer, to decide how these patterns are to be interpreted. For example, the 8-bit binary pattern "0100 0001B" can be interpreted as an unsigned integer 65, or an ASCII character 'A', or some secret information known only to you. In other words, you have to first decide how to represent a piece of data in a binary pattern before the binary patterns make sense. The interpretation of a binary pattern is called data representation or encoding. Furthermore, it is important that the data representation schemes are agreed upon by all the parties, i.e., industrial standards need to be formulated and strictly followed.

3. Integer Representation

Computers use a fixed number of bits to represent an integer. The commonly-used bit-lengths for integers are 8-bit, 16-bit, 32-bit or 64-bit. Besides bit-lengths, there are two representation schemes for integers:

Unsigned Integers: can represent zero and positive integers.
Signed Integers: can represent zero, positive and negative integers. Three representation schemes have been proposed for signed integers:
1. Sign-Magnitude representation
2. 1's Complement representation
3. 2's Complement representation

You, as the programmer, need to decide on the bit-length and representation scheme for your integers, depending on your application's requirements.

3.1 n-bit Unsigned Integers

Unsigned integers can represent zero and positive integers, but not negative integers. The value of an unsigned integer is interpreted as "the magnitude of its underlying binary pattern".

Example 1: Suppose that n=8 and the binary pattern is 0100 0001B, the value of this unsigned integer is 1×2^0 + 1×2^6 = 65D.

Example 2: Suppose that n=16 and the binary pattern is 0001 0000 0000 1000B, the value of this unsigned integer is 1×2^3 + 1×2^12 = 4104D.

Example 3: Suppose that n=16 and the binary pattern is 0000 0000 0000 0000B, the value of this unsigned integer is 0.

An n-bit pattern can represent 2^n distinct integers. An n-bit unsigned integer can represent integers from 0 to (2^n)-1, as tabulated below:

n	Minimum	Maximum
8	0	(2^8)-1 (=255)
16	0	(2^16)-1 (=65,535)
32	0	(2^32)-1 (=4,294,967,295) (9+ digits)
64	0	(2^64)-1 (=18,446,744,073,709,551,615) (19+ digits)

3.2 Signed Integers

Signed integers can represent zero, positive integers, as well as negative integers. Three representation schemes are available for signed integers:

Sign-Magnitude representation
1's Complement representation
2's Complement representation

In all the above three schemes, the most-significant bit (msb) is called the sign bit. The sign bit is used to represent the sign of the integer - with 0 for positive integers and 1 for negative integers. The magnitude of the integer, however, is interpreted differently in different schemes.

3.3 n-bit Signed Integers in Sign-Magnitude Representation

In sign-magnitude representation:

The most-significant bit (msb) is the sign bit, with value of 0 representing positive integer and 1 representing negative integer.
The remaining n-1 bits represent the magnitude (absolute value) of the integer. The absolute value of the integer is interpreted as "the magnitude of the (n-1)-bit binary pattern".

Example 1: Suppose that n=8 and the binary representation is 0 100 0001B.
   Sign bit is 0 ⇒ positive
   Absolute value is 100 0001B = 65D
   Hence, the integer is +65D

Example 2: Suppose that n=8 and the binary representation is 1 000 0001B.
   Sign bit is 1 ⇒ negative
   Absolute value is 000 0001B = 1D
   Hence, the integer is -1D

Example 3: Suppose that n=8 and the binary representation is 0 000 0000B.
   Sign bit is 0 ⇒ positive
   Absolute value is 000 0000B = 0D
   Hence, the integer is +0D

Example 4: Suppose that n=8 and the binary representation is 1 000 0000B.
   Sign bit is 1 ⇒ negative
   Absolute value is 000 0000B = 0D
   Hence, the integer is -0D

The drawbacks of sign-magnitude representation are:

There are two representations (0000 0000B and 1000 0000B) for the number zero, which could lead to inefficiency and confusion.
Positive and negative integers need to be processed separately.

3.4 n-bit Signed Integers in 1's Complement Representation

In 1's complement representation:

Again, the most significant bit (msb) is the sign bit, with value of 0 representing positive integers and 1 representing negative integers.
The remaining n-1 bits represent the magnitude of the integer, as follows:
- for positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit binary pattern".
- for negative integers, the absolute value of the integer is equal to "the magnitude of the complement (inverse) of the (n-1)-bit binary pattern" (hence called 1's complement).

Example 1: Suppose that n=8 and the binary representation 0 100 0001B.
   Sign bit is 0 ⇒ positive
   Absolute value is 100 0001B = 65D
   Hence, the integer is +65D

Example 2: Suppose that n=8 and the binary representation 1 000 0001B.
   Sign bit is 1 ⇒ negative
   Absolute value is the complement of 000 0001B, i.e., 111 1110B = 126D
   Hence, the integer is -126D

Example 3: Suppose that n=8 and the binary representation 0 000 0000B.
   Sign bit is 0 ⇒ positive
   Absolute value is 000 0000B = 0D
   Hence, the integer is +0D

Example 4: Suppose that n=8 and the binary representation 1 111 1111B.
   Sign bit is 1 ⇒ negative
   Absolute value is the complement of 111 1111B, i.e., 000 0000B = 0D
   Hence, the integer is -0D

Again, the drawbacks are:

There are two representations (0000 0000B and 1111 1111B) for zero.
The positive integers and negative integers need to be processed separately.

3.5 n-bit Signed Integers in 2's Complement Representation

In 2's complement representation:

Again, the most significant bit (msb) is the sign bit, with value of 0 representing positive integers and 1 representing negative integers.
The remaining n-1 bits represent the magnitude of the integer, as follows:
- for positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit binary pattern".
- for negative integers, the absolute value of the integer is equal to "the magnitude of the complement of the (n-1)-bit binary pattern plus one" (hence called 2's complement).

Example 1: Suppose that n=8 and the binary representation 0 100 0001B.
   Sign bit is 0 ⇒ positive
   Absolute value is 100 0001B = 65D
   Hence, the integer is +65D

Example 2: Suppose that n=8 and the binary representation 1 000 0001B.
   Sign bit is 1 ⇒ negative
   Absolute value is the complement of 000 0001B plus 1, i.e., 111 1110B + 1B = 127D
   Hence, the integer is -127D

Example 3: Suppose that n=8 and the binary representation 0 000 0000B.
   Sign bit is 0 ⇒ positive
   Absolute value is 000 0000B = 0D
   Hence, the integer is +0D

Example 4: Suppose that n=8 and the binary representation 1 111 1111B.
   Sign bit is 1 ⇒ negative
   Absolute value is the complement of 111 1111B plus 1, i.e., 000 0000B + 1B = 1D
   Hence, the integer is -1D

3.6 Computers use 2's Complement Representation for Signed Integers

We have discussed three representations for signed integers: sign-magnitude, 1's complement and 2's complement. Computers use 2's complement to represent signed integers. This is because:

There is only one representation for the number zero in 2's complement, instead of two representations in sign-magnitude and 1's complement.
Positive and negative integers can be treated together in addition and subtraction. Subtraction can be carried out using the "addition logic".

Example 1: Addition of Two Positive Integers: Suppose that n=8, 65D + 5D = 70D

65D →    0100 0001B
 5D →    0000 0101B(+
          0100 0110B    → 70D (OK)

Example 2: Subtraction is treated as Addition of a Positive and a Negative Integer: Suppose that n=8, 65D - 5D = 65D + (-5D) = 60D

65D →    0100 0001B
-5D →    1111 1011B(+
          0011 1100B    → 60D (discard carry - OK)

Example 3: Addition of Two Negative Integers: Suppose that n=8, -65D - 5D = (-65D) + (-5D) = -70D

-65D →    1011 1111B
 -5D →    1111 1011B(+
           1011 1010B    → -70D (discard carry - OK)

Because of the fixed precision (i.e., fixed number of bits), an n-bit 2's complement signed integer has a certain range. For example, for n=8, the range of 2's complement signed integers is -128 to +127. During addition (and subtraction), it is important to check whether the result exceeds this range, in other words, whether overflow or underflow has occurred.

Example 4: Overflow: Suppose that n=8, 127D + 2D = 129D (overflow - beyond the range)

127D →    0111 1111B
  2D →    0000 0010B(+
           1000 0001B    → -127D (wrong)

Example 5: Underflow: Suppose that n=8, -125D - 5D = -130D (underflow - below the range)

-125D →    1000 0011B
  -5D →    1111 1011B(+
            0111 1110B    → +126D (wrong)

3.7 Range of n-bit 2's Complement Signed Integers

An n-bit 2's complement signed integer can represent integers from -2^(n-1) to +2^(n-1)-1, as tabulated below. Note that the scheme can represent all the integers within the range, without any gaps. In other words, there are no missing integers within the supported range.

n	minimum	maximum
8	-(2^7) (=-128)	+(2^7)-1 (=+127)
16	-(2^15) (=-32,768)	+(2^15)-1 (=+32,767)
32	-(2^31) (=-2,147,483,648)	+(2^31)-1 (=+2,147,483,647)(9+ digits)
64	-(2^63) (=-9,223,372,036,854,775,808)	+(2^63)-1 (=+9,223,372,036,854,775,807)(18+ digits)