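Background: Fractional binary numbers
Representation
- Bits to the right of the “binary point” represent fractional (negative) powers of 2
- Represents the rational number $\sum_{k=-j}^{i} b_k \times 2^k$ for the bit string $b_i \ldots b_1 b_0 . b_{-1} \ldots b_{-j}$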
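The weighting of bits on either side of the binary point can be sketched in C. This is a minimal illustration, not part of the original notes: the function name `parse_binary_fraction` and the `-1.0` error value are assumptions made for the example, and only non-negative inputs are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Evaluate a string such as "101.11" as a fractional binary number.
   Hypothetical helper: returns -1.0 on malformed input. */
static double parse_binary_fraction(const char *s) {
    double value = 0.0;
    double weight = 0.5;      /* 2^-1: weight of the first bit after the point */
    int seen_point = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == '.') {
            if (seen_point) return -1.0;      /* second binary point */
            seen_point = 1;
        } else if (*s == '0' || *s == '1') {
            int bit = *s - '0';
            if (!seen_point) {
                value = value * 2.0 + bit;    /* integer part: shift left, add bit */
            } else {
                value += bit * weight;        /* fractional part: 2^-1, 2^-2, ... */
                weight /= 2.0;
            }
        } else {
            return -1.0;                      /* not a binary digit */
        }
    }
    return value;
}
```

For instance, `parse_binary_fraction("101.11")` evaluates 4 + 1 + 1/2 + 1/4 = 5.75.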
IEEE floating point standard: Definition
Floating Point Representation
Numerical Form
$(-1)^s \times M \times 2^E$
- Sign bit s determines whether the number is negative or positive
- Significand M is normally a fractional value in the range [1.0, 2.0)
- Exponent E weights the value by a power of two
Encoding
- MSB is the sign bit s
- exp field encodes E (but is not equal to E)
- frac field encodes M (but is not equal to M)
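Precision options
- Single precision: 32 bits (s: 1 bit, exp: 8 bits, frac: 23 bits)
- Double precision: 64 bits (s: 1 bit, exp: 11 bits, frac: 52 bits)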
“Normalized” Values
- When: exp ≠ 000…0 and exp ≠ 111…1
- Exponent coded as a biased value: E = Exp – Bias
– Exp: unsigned value of the exp field
– Bias = $2^{k-1} - 1$, where k is the number of exponent bits
- Single precision: 127 (Exp: 1…254, E: -126…127)
- Double precision: 1023 (Exp: 1…2046, E: -1022…1023)
- Significand coded with implied leading 1: $M = 1.xxx\ldots x_2$
– xxx…x: bits of the frac field
– Minimum when frac = 000…0 (M = 1.0)
– Maximum when frac = 111…1 (M = 2.0 – ε)
– Get extra leading bit for “free”
Example
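Explanation:
The number is positive, so the sign bit s is 0. Adding the bias to the exponent gives Exp = 140 (in single precision, E = 140 – 127 = 13), which is $10001100_2$ in binary. For the frac field, only the bits after the binary point are stored; the leading 1 before the point is dropped.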
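The field values in an example like this can be checked by splitting a float's bit pattern in C. This is a minimal sketch: `fp_fields` and `split_float` are hypothetical names, `float` is assumed to be 32-bit IEEE single precision, and the input 15213.0f is an assumption picked for illustration because its exp field works out to 140 (the original example's value is not stated in the text).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

typedef struct {
    unsigned s;      /* sign bit */
    unsigned exp;    /* 8-bit exp field (biased exponent) */
    uint32_t frac;   /* 23-bit frac field (bits after the binary point) */
} fp_fields;

/* Split a float into its three encoded fields. */
static fp_fields split_float(float f) {
    uint32_t bits;
    fp_fields r;
    memcpy(&bits, &f, sizeof bits);   /* copy the bytes; avoids pointer-cast aliasing issues */
    r.s    = bits >> 31;
    r.exp  = (bits >> 23) & 0xFF;
    r.frac = bits & 0x7FFFFF;
    return r;
}
```

For 15213.0f, which is $1.1101101101101_2 \times 2^{13}$, the fields come out as s = 0, exp = 140, and frac = 0x6DB400 (the thirteen bits after the point followed by ten zeros).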
Denormalized Values
- Condition: exp = 000…0
- Exponent value: E = 1 – Bias (instead of E = 0 – Bias)
- Significand coded with implied leading 0: $M = 0.xxx\ldots x_2$
– xxx…x: bits of frac
- Cases
– exp = 000…0, frac = 000…0
— Represents zero value
— Note distinct values: +0 and –0
– exp = 000…0, frac ≠ 000…0
— Numbers closest to 0.0
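— Equispaced
Denormalized values serve two purposes:
1) They provide a way to represent the value 0.
2) They represent numbers very close to 0.0, which supports the “gradual underflow” property.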
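The denormalized rules (E = 1 – Bias, no implied leading 1) can be sketched in C for single precision. The helper names `from_bits` and `denorm_value` are hypothetical, and the sketch assumes a 32-bit IEEE `float` on a platform that does not flush denormals to zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Reinterpret a 32-bit pattern as a float. */
static float from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Value of the positive denormalized number with the given frac field. */
static double denorm_value(uint32_t frac) {
    assert(frac < (1u << 23));
    double v = (double)frac / (1u << 23);     /* M = 0.xxx...x, no implied 1 */
    for (int i = 0; i < 126; i++) v /= 2.0;   /* E = 1 - Bias = -126 */
    return v;
}
```

Under these assumptions, consecutive bit patterns 1, 2, 3 differ by the same step, and adding one step to the largest denormalized value (`0x007FFFFF`) lands exactly on the smallest normalized value (`0x00800000`), which is what gradual underflow refers to.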
Special Values
- Condition: exp = 111…1
- Case: exp = 111…1, frac = 000…0
– Represents value ∞ (infinity)
– Operation that overflows
– Both positive and negative
– E.g., 1.0/0.0 = −1.0/−0.0 = +∞, 1.0/−0.0 = −∞
- Case: exp = 111…1, frac ≠ 000…0
– Not-a-Number (NaN)
– Represents case when no numeric value can be determined
– E.g., sqrt(–1), ∞ − ∞, ∞ × 0
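The three encoding categories can be told apart from the exp and frac fields alone. The sketch below is illustrative: the `fp_kind` enum and `classify` function are hypothetical names (the standard library offers `fpclassify`, `isinf`, and `isnan` in `<math.h>` for real use), and a 32-bit IEEE `float` is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

typedef enum { KIND_ZERO, KIND_DENORM, KIND_NORMAL, KIND_INF, KIND_NAN } fp_kind;

/* Classify a float by inspecting its exp and frac fields. */
static fp_kind classify(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    uint32_t exp  = (bits >> 23) & 0xFF;
    uint32_t frac = bits & 0x7FFFFF;

    if (exp == 0xFF)                         /* exp = 111...1: special values */
        return frac == 0 ? KIND_INF : KIND_NAN;
    if (exp == 0)                            /* exp = 000...0: zero or denormalized */
        return frac == 0 ? KIND_ZERO : KIND_DENORM;
    return KIND_NORMAL;
}
```

Feeding it the results of 1.0f/0.0f or ∞ − ∞ (computed at run time, for example through a `volatile` zero so the compiler does not fold the expression) should yield `KIND_INF` and `KIND_NAN` respectively.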
Example and properties
Dynamic Range (Positive Only)
Distribution of Values
Rounding, addition, multiplication
Rounding
Rounding Modes
Example
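- In the exact halfway (1/2) case, round so that the last kept bit is 0; the example rounds to the nearest 1/4 (2 bits to the right of the binary point)
- “Even” when least significant bit is 0
- “Half way” when bits to right of rounding position = $100\ldots_2$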
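Round-to-even on binary digits can be sketched with integer arithmetic. The function name `round_shift_even` is hypothetical; it drops the low n bits of an unsigned value, so a number held in units of 1/32 (5 fractional bits) is rounded to units of 1/4 (2 fractional bits) with n = 3.

```c
#include <assert.h>
#include <stdint.h>

/* Drop the low n bits of x, rounding to nearest with ties going to even. */
static uint32_t round_shift_even(uint32_t x, unsigned n) {
    assert(n >= 1 && n <= 31);
    uint32_t kept = x >> n;                  /* bits that survive */
    uint32_t rest = x & ((1u << n) - 1);     /* bits to the right of the rounding position */
    uint32_t half = 1u << (n - 1);           /* the pattern 100...0 */

    if (rest > half || (rest == half && (kept & 1u)))
        kept += 1;                           /* above half, or a tie with an odd last bit */
    return kept;
}
```

As a usage example, 2 7/8 is 92/32, and `round_shift_even(92, 3)` gives 12, that is 12/4 = 3 (a tie rounded up to the even neighbour); 2 5/8 is 84/32, and `round_shift_even(84, 3)` gives 10, that is 10/4 = 2 1/2 (a tie rounded down).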
FP Multiplication
- $(-1)^{s_1} M_1 2^{E_1} \times (-1)^{s_2} M_2 2^{E_2}$
- Exact Result: $(-1)^s M 2^E$
– Sign s: s1 ^ s2
– Significand M: M1 × M2
– Exponent E: E1 + E2
- Fixing
– If M ≥ 2, shift M right, increment E
– If E out of range, overflow
– Round M to fit frac precision
- Implementation
– Biggest chore is multiplying significands
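The multiplication steps can be sketched in software for single precision. This is a simplified illustration, not how any particular hardware does it: the function name `fp_mul_normalized` is hypothetical, a 32-bit IEEE `float` is assumed, rounding is round-to-even, and zero, denormalized, infinite, and NaN inputs or results are simply rejected with `false`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

static uint32_t float_bits(float f) { uint32_t b; memcpy(&b, &f, sizeof b); return b; }
static float bits_float(uint32_t b) { float f; memcpy(&f, &b, sizeof f); return f; }

/* Multiply two normalized floats; false if an input or the result is not normalized. */
static bool fp_mul_normalized(float a, float b, float *out) {
    uint32_t x = float_bits(a), y = float_bits(b);
    uint32_t exp1 = (x >> 23) & 0xFF, exp2 = (y >> 23) & 0xFF;
    if (exp1 == 0 || exp1 == 0xFF || exp2 == 0 || exp2 == 0xFF) return false;

    uint32_t sign = (x ^ y) & 0x80000000u;              /* s = s1 ^ s2 */
    uint64_t m1 = (x & 0x7FFFFF) | 0x800000;            /* M1 with implied 1, scaled by 2^23 */
    uint64_t m2 = (y & 0x7FFFFF) | 0x800000;            /* M2 with implied 1, scaled by 2^23 */
    uint64_t m  = m1 * m2;                              /* M1 * M2, scaled by 2^46 */
    int e = ((int)exp1 - 127) + ((int)exp2 - 127);      /* E = E1 + E2 */

    unsigned shift = 23;
    if (m >> 47) { shift = 24; e++; }                   /* M >= 2: shift right, increment E */

    uint64_t kept = m >> shift;                         /* 24-bit significand */
    uint64_t rest = m & ((1ull << shift) - 1);
    uint64_t half = 1ull << (shift - 1);
    if (rest > half || (rest == half && (kept & 1))) kept++;   /* round to even */
    if (kept >> 24) { kept >>= 1; e++; }                /* rounding carried up to 2.0 */

    if (e < -126 || e > 127) return false;              /* E out of normalized range */
    *out = bits_float(sign | ((uint32_t)(e + 127) << 23) | ((uint32_t)kept & 0x7FFFFF));
    return true;
}
```

The 24 × 24-bit significand product is the only wide operation here, which matches the note that multiplying significands is the biggest chore; everything else is a handful of adds, shifts, and comparisons.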
Floating point in C
- C Guarantees Two Levels
– float single precision
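– double double precision
- Conversions/Casting
– Casting between int, float, and double changes bit representation
– double/float → int: truncates the fractional part (rounds toward zero); not defined when the value is out of range or NaN
– int → double: exact conversion, as long as int has a word size of at most 53 bits
– int → float: rounds according to the rounding mode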
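The three conversions can be probed with small C helpers. This is a sketch under stated assumptions: the function names are hypothetical, `int` is assumed to be 32 bits, `float` and `double` are assumed to be IEEE single and double precision, and the default round-to-nearest mode is assumed to be in effect.

```c
#include <assert.h>
#include <limits.h>

static_assert(sizeof(int) * CHAR_BIT == 32, "sketch assumes a 32-bit int");

/* double -> int: the fractional part is discarded. */
static int double_to_int(double d) { return (int)d; }

/* int -> double -> int: a 32-bit int fits in double's 53-bit significand. */
static int via_double(int x) { return (int)(double)x; }

/* int -> float -> int: float keeps only 24 significant bits, so large values may round.
   Caution: inputs near INT_MAX can round up to 2^31, which is out of range for int. */
static int via_float(int x) { return (int)(float)x; }
```

For example, `double_to_int(-3.99)` gives -3, `via_double(16777217)` returns its argument unchanged, and `via_float(16777217)` returns 16777216, because 2^24 + 1 needs 25 significant bits.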
Summary
IEEE floating point has clear mathematical properties
It represents numbers of the form $M \times 2^E$