Decimal to Floating-Point Conversions

The Conversion Procedure

The rules for converting a decimal number into floating point are as follows:
  1. Convert the absolute value of the number to binary, perhaps with a fractional part after the binary point. This can be done by converting the integral and fractional parts separately. The integral part is converted with the techniques examined previously. The fractional part can be converted by multiplication. This is basically the inverse of the division method: we repeatedly multiply by 2, and harvest each bit (0 or 1) as it appears to the left of the binary point.
  2. Append × 2^0 to the end of the binary number (which does not change its value).
  3. Normalize the number. Move the binary point so that it is one bit from the left. Adjust the exponent of two so that the value does not change.
  4. Place the mantissa into the mantissa field of the number. Omit the leading one, and fill with zeros on the right.
  5. Add the bias to the exponent of two, and place it in the exponent field. The bias is 2^(k−1) − 1, where k is the number of bits in the exponent field. For the eight-bit format, k = 3, so the bias is 2^(3−1) − 1 = 3. For IEEE 32-bit, k = 8, so the bias is 2^(8−1) − 1 = 127.
  6. Set the sign bit, 1 for negative, 0 for positive, according to the sign of the original number.
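
The six steps above can be sketched in Python. This is a minimal illustration, not a full IEEE implementation: the function name `encode` and its parameters `exp_bits` and `man_bits` are made up for this example, it normalizes the exact value before harvesting the mantissa bits (which yields the same bits as converting first and shifting afterwards), it truncates extra mantissa bits rather than rounding, and it does not handle zero, denormalized numbers, infinities, NaN, or exponents that do not fit the field.

```python
from fractions import Fraction

def encode(value, exp_bits, man_bits):
    """Pack a nonzero number into sign | exponent | mantissa fields.

    Sketch only: truncates extra mantissa bits and performs no range checks.
    """
    x = Fraction(value)            # exact; pass a string such as "1.7"
    sign = 1 if x < 0 else 0       # step 6: sign bit
    x = abs(x)                     # step 1 works on the absolute value

    exponent = 0                   # step 2: x is x * 2**0
    while x >= 2:                  # step 3: normalize to the form 1.xxx
        x /= 2
        exponent += 1
    while x < 1:
        x *= 2
        exponent -= 1

    frac = x - 1                   # step 4: omit the leading one
    mantissa = 0
    for _ in range(man_bits):      # multiply by 2 and harvest each bit
        frac *= 2
        bit = int(frac)            # 0 or 1, the digit left of the point
        mantissa = (mantissa << 1) | bit
        frac -= bit

    bias = 2 ** (exp_bits - 1) - 1 # step 5: biased exponent
    biased = exponent + bias
    return (sign << (exp_bits + man_bits)) | (biased << man_bits) | mantissa

print(format(encode("2.625", 3, 4), "08b"))        # 01000101
print(format(encode("-1313.3125", 8, 23), "08x"))  # c4a42a00
```

Passing the value as a string keeps the arithmetic exact, so a number such as 1.7 is cut off at the mantissa width instead of inheriting the rounding already present in a Python float.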

Using The Conversion Procedure

  • Convert 2.625 to our 8-bit floating point format.
    1. The integral part is easy, 2₁₀ = 10₂. For the fractional part:
      0.625 × 2 = 1.25    1    Generate 1 and continue with the rest.
      0.25 × 2 = 0.5      0    Generate 0 and continue.
      0.5 × 2 = 1.0       1    Generate 1 and nothing remains.
      So 0.625₁₀ = 0.101₂, and 2.625₁₀ = 10.101₂.
    2. Add an exponent part: 10.101₂ = 10.101₂ × 2^0.
    3. Normalize: 10.101₂ × 2^0 = 1.0101₂ × 2^1.
    4. Mantissa: 0101
    5. Exponent: 1 + 3 = 4 = 100₂.
    6. Sign bit is 0.
    The result is 01000101. Represented as hex, that is 45₁₆.
  • Convert -4.75 to our 8-bit floating point format.
    1. The integral part is 4₁₀ = 100₂. The fractional:
      0.75 × 2 = 1.5    1    Generate 1 and continue with the rest.
      0.5 × 2 = 1.0     1    Generate 1 and nothing remains.
      So 4.75₁₀ = 100.11₂.
    2. Normalize: 100.11₂ = 1.0011₂ × 2^2.
    3. Mantissa is 0011, exponent is 2 + 3 = 5 = 101₂, sign bit is 1.
    So -4.75 is 11010011 = d3₁₆
  • Convert 0.40625 to our 8-bit floating point format.
    1. Converting:
      0.40625 × 2 = 0.8125    0    Generate 0 and continue.
      0.8125 × 2 = 1.625      1    Generate 1 and continue with the rest.
      0.625 × 2 = 1.25        1    Generate 1 and continue with the rest.
      0.25 × 2 = 0.5          0    Generate 0 and continue.
      0.5 × 2 = 1.0           1    Generate 1 and nothing remains.
      So 0.40625₁₀ = 0.01101₂.
    2. Normalize: 0.01101₂ = 1.101₂ × 2^-2.
    3. Mantissa is 1010, exponent is -2 + 3 = 1 = 001₂, sign bit is 0.
    So 0.40625 is 00011010 = 1a₁₆
  • Convert -12.0 to our 8-bit floating point format.
    1. 12₁₀ = 1100₂.
    2. Normalize: 1100.0₂ = 1.1₂ × 2^3.
    3. Mantissa is 1000, exponent is 3 + 3 = 6 = 110₂, sign bit is 1.
    So -12.0 is 11101000 = e8₁₆
  • Convert decimal 1.7 to our 8-bit floating point format.
    1. The integral part is easy, 1₁₀ = 1₂. For the fractional part:
      0.7 × 2 = 1.4    1    Generate 1 and continue with the rest.
      0.4 × 2 = 0.8    0    Generate 0 and continue.
      0.8 × 2 = 1.6    1    Generate 1 and continue with the rest.
      0.6 × 2 = 1.2    1    Generate 1 and continue with the rest.
      0.2 × 2 = 0.4    0    Generate 0 and continue.
      0.4 × 2 = 0.8    0    Generate 0 and continue.
      0.8 × 2 = 1.6    1    Generate 1 and continue with the rest.
      0.6 × 2 = 1.2    1    Generate 1 and continue with the rest.
      The reason why the process seems to continue endlessly is that it does. The number 7/10, which makes a perfectly reasonable decimal fraction, is a repeating fraction in binary, just as the fraction 1/3 is a repeating fraction in decimal. (It repeats in binary as well.) We cannot represent this exactly as a floating point number. The closest we can come in four bits is .1011₂. Since we already have a leading 1, the best eight-bit number we can make is 1.1011₂.
    2. Already normalized: 1.1011₂ = 1.1011₂ × 2^0.
    3. Mantissa is 1011, exponent is 0 + 3 = 3 = 011₂, sign bit is 0.
    The result is 00111011 = 3b₁₆. This is not exact, of course. If you convert it back to decimal, you get 1.6875.
  • Convert -1313.3125 to IEEE 32-bit floating point format.
    1. The integral part is 1313₁₀ = 10100100001₂. The fractional:
      0.3125 × 2 = 0.625    0    Generate 0 and continue.
      0.625 × 2 = 1.25      1    Generate 1 and continue with the rest.
      0.25 × 2 = 0.5        0    Generate 0 and continue.
      0.5 × 2 = 1.0         1    Generate 1 and nothing remains.
      So 1313.3125₁₀ = 10100100001.0101₂.
    2. Normalize: 10100100001.0101₂ = 1.01001000010101₂ × 2^10.
    3. Mantissa is 01001000010101000000000, exponent is 10 + 127 = 137 = 10001001₂, sign bit is 1.
    So -1313.3125 is 11000100101001000010101000000000 = c4a42a00₁₆
  • Convert 0.1015625 to IEEE 32-bit floating point format.
    1. Converting:
      0.1015625 × 2 = 0.203125    0    Generate 0 and continue.
      0.203125 × 2 = 0.40625      0    Generate 0 and continue.
      0.40625 × 2 = 0.8125        0    Generate 0 and continue.
      0.8125 × 2 = 1.625          1    Generate 1 and continue with the rest.
      0.625 × 2 = 1.25            1    Generate 1 and continue with the rest.
      0.25 × 2 = 0.5              0    Generate 0 and continue.
      0.5 × 2 = 1.0               1    Generate 1 and nothing remains.
      So 0.1015625₁₀ = 0.0001101₂.
    2. Normalize: 0.0001101₂ = 1.101₂ × 2^-4.
    3. Mantissa is 10100000000000000000000, exponent is -4 + 127 = 123 = 01111011₂, sign bit is 0.
    So 0.1015625 is 00111101110100000000000000000000 = 3dd00000₁₆
  • Convert 39887.5625 to IEEE 32-bit floating point format.
    1. The integral part is 39887₁₀ = 1001101111001111₂. The fractional:
      0.5625 × 2 = 1.125    1    Generate 1 and continue with the rest.
      0.125 × 2 = 0.25      0    Generate 0 and continue.
      0.25 × 2 = 0.5        0    Generate 0 and continue.
      0.5 × 2 = 1.0         1    Generate 1 and nothing remains.
      So 39887.5625₁₀ = 1001101111001111.1001₂.
    2. Normalize: 1001101111001111.1001₂ = 1.0011011110011111001₂ × 2^15.
    3. Mantissa is 00110111100111110010000, exponent is 15 + 127 = 142 = 10001110₂, sign bit is 0.
    So 39887.5625 is 01000111000110111100111110010000 = 471bcf90₁₆
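
The worked results above can be cross-checked mechanically. The sketch below uses Python's standard `struct` module to print the IEEE 32-bit patterns, and a small hypothetical helper, `decode8`, to turn an 8-bit pattern back into a number. `decode8` assumes the 1-3-4 field layout and bias of 3 used in the examples, and treats every pattern as a normalized number; any special exponent values the 8-bit format may reserve are not handled.

```python
import struct

def float32_hex(x):
    """Return the IEEE 754 single-precision bit pattern of x as 8 hex digits."""
    return struct.pack(">f", x).hex()

def decode8(bits):
    """Decode the 8-bit format: 1 sign bit, 3 exponent bits, 4 mantissa bits.

    Assumes a normalized value (implicit leading one) and a bias of 3.
    """
    sign = -1 if bits >> 7 else 1
    exponent = ((bits >> 4) & 0b111) - 3   # remove the bias
    mantissa = 1 + (bits & 0b1111) / 16    # restore the leading one
    return sign * mantissa * 2.0 ** exponent

print(float32_hex(-1313.3125))  # c4a42a00
print(float32_hex(0.1015625))   # 3dd00000
print(float32_hex(39887.5625))  # 471bcf90
print(decode8(0x3b))            # 1.6875
```

Decoding 3b₁₆ gives 1.6875 rather than 1.7, which is the precision loss described in the 1.7 example.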

Source: http://sandbox.mc.edu/~bennet/cs110/flt/dtof.html

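Storage bits and bias values by type:

Type                                  Sign (s)  Exponent (exp)  Fraction (frac)  Total bits  Bias (hex)  Bias (decimal)
Short float (single, float)           1 bit     8 bits          23 bits          32 bits     7FH         +127
Long float (double)                   1 bit     11 bits         52 bits          64 bits     3FFH        +1023
Temporary float (extended precision)  1 bit     15 bits         64 bits          80 bits     3FFFH       +16383

Note: in the 80-bit extended format the 64-bit field includes an explicitly stored leading integer bit, unlike the single and double formats, where the leading one is implied.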
Source: http://baike.baidu.com/view/1352525.htm?fr=aladdin
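
The bias values listed above follow from the formula 2^(k−1) − 1 given in step 5 of the conversion procedure, where k is the width of the exponent field. A short sketch (the function name `bias` is chosen for this example only):

```python
def bias(exp_bits):
    """Exponent bias for an exponent field that is exp_bits wide."""
    return 2 ** (exp_bits - 1) - 1

for name, k in [("single", 8), ("double", 11), ("extended", 15)]:
    # Print the bias in decimal and in hexadecimal
    print(f"{name:9s} k={k:2d}  bias={bias(k):5d}  hex={bias(k):X}")
```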

To convert 1.75 to IEEE 754 single-precision (32-bit) floating point representation, follow these steps:

Step 1: Convert the decimal number 1.75 to binary.
  1. Convert the integer part 1 to binary: 1
  2. Convert the fraction part 0.75 to binary:
     0.75 × 2 = 1.5 (1 goes to the first bit after the binary point)
     0.5 × 2 = 1.0 (1 goes to the next bit, and nothing remains)
  So 1.75 in binary is 1.11₂.

Step 2: Normalize the binary number. The binary point is moved so that exactly one non-zero digit precedes it. In this case the number is already normalized: 1.11₂ = 1.11₂ × 2^0.

Step 3: Determine the sign, exponent, and mantissa.
  Sign: the number is positive, so the sign bit is 0.
  Exponent: the stored exponent is the actual exponent plus the bias. For IEEE 754 single precision the exponent field has 8 bits and the bias is 127, so the stored exponent is 0 + 127 = 127 = 01111111₂.
  Mantissa: the stored mantissa is the part of the normalized number after the binary point, with the leading 1 omitted and zeros filled in on the right. Here .11 becomes 11000000000000000000000.

Putting it all together, 1.75 in IEEE 754 single precision is:
  0 01111111 11000000000000000000000 = 3fe00000₁₆
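
As a cross-check of the 1.75 example, the three fields can be read straight out of the bit pattern Python produces for a single-precision float. This is a small sketch using the standard `struct` module; the helper name `float32_fields` is made up for this example.

```python
import struct

def float32_fields(x):
    """Return the sign, exponent, and fraction fields of x as binary strings."""
    # Reinterpret the 4 bytes of the float as an unsigned 32-bit integer
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, still biased by 127
    fraction = bits & 0x7FFFFF       # 23 bits, leading one not stored
    return f"{sign:01b} {exponent:08b} {fraction:023b}"

print(float32_fields(1.75))  # 0 01111111 11000000000000000000000
```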
