=== Converting from decimal representation to binary32 format ===

For the strict conversion of a real number into its equivalent binary32 format, including the rounding behaviour, refer to the IEEE 754 standard itself.

 

Here we show how to convert a base-10 real number into IEEE 754 binary32 format using the following outline:


* Consider a real number with an integer part and a fraction part, such as 12.375.

* Convert and [[normalized number|normalize]] the integer part into [[binary numeral system|binary]].

* Convert the fraction part using the technique shown below.

* Add the two results and adjust them to produce the final conversion.

 

'''Conversion of the fractional part:'''


Consider 0.375, the fractional part of 12.375. To convert it into a binary fraction, multiply the fraction by 2, take the integer part as the next binary digit, and multiply the remaining fraction by 2 again. Repeat until the fraction becomes zero or until the precision limit is reached, which is 23 fraction digits for IEEE 754 binary32 format.


0.375 × 2 = 0.750 = 0 + 0.750 ⇒ b<sub>1</sub> = 0 (the integer part is the binary fraction digit; multiply 0.750 by 2 to proceed)


0.750 × 2 = 1.500 = 1 + 0.500 ⇒ b<sub>2</sub> = 1


0.500 × 2 = 1.000 = 1 + 0.000 ⇒ b<sub>3</sub> = 1 (the fraction is 0.000, so the process terminates)


We see that (0.375)<sub>10</sub> can be exactly represented in binary as (0.011)<sub>2</sub>. Not every decimal fraction has a finite binary expansion. For example, decimal 0.1 cannot be represented exactly in binary, so it is only approximated.
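
The repeated-doubling step above can be sketched in Java as follows. This is only an illustrative sketch: the class name <code>FractionToBinary</code> and the method <code>fractionBits</code> are hypothetical names chosen here, and the method truncates at the digit limit rather than rounding.

```java
class FractionToBinary {
    // Converts a fraction in [0, 1) into at most maxBits binary fraction
    // digits by repeated doubling. Truncates at the limit (no rounding).
    static String fractionBits(double fraction, int maxBits) {
        StringBuilder bits = new StringBuilder();
        while (fraction != 0.0 && bits.length() < maxBits) {
            fraction *= 2;
            if (fraction >= 1.0) {
                bits.append('1');   // integer part is 1: emit 1, keep the remainder
                fraction -= 1.0;
            } else {
                bits.append('0');   // integer part is 0
            }
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(fractionBits(0.375, 23)); // terminates after 3 digits
        System.out.println(fractionBits(0.1, 23));   // uses all 23 digits, still inexact
    }
}
```

For 0.375 the loop stops after three digits, while for 0.1 it runs until the 23-digit limit because the binary expansion repeats.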


Therefore (12.375)<sub>10</sub> = (12)<sub>10</sub> + (0.375)<sub>10</sub> = (1100)<sub>2</sub> + (0.011)<sub>2</sub> = (1100.011)<sub>2</sub>


IEEE 754 binary32 format also requires real values to be represented in <math> (1.x_1x_2...x_{23})_2 \times 2^{e}</math> format (see [[Normalized number]], [[Denormalized number]]), so the binary point of 1100.011 is moved 3 places to the left, giving <math> (1.100011)_2 \times 2^{3} </math>


Finally we can see that:  <math> (12.375)_{10} =(1.100011)_2 \times 2^{3} </math>
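
The normalization step can be sketched on the binary digit string itself. The class <code>NormalizeBinary</code> and the method <code>normalize</code> below are hypothetical helpers written for this illustration, and they assume a positive, non-zero input written with the digits 0 and 1 and at most one binary point.

```java
class NormalizeBinary {
    // Rewrites a positive binary string such as "1100.011" in the form
    // "1.xxx x 2^e" by locating the leading 1 relative to the binary point.
    static String normalize(String binary) {
        int point = binary.indexOf('.');
        if (point < 0) {
            point = binary.length();              // no point: treat as an integer
        }
        String digits = binary.replace(".", "");
        int firstOne = digits.indexOf('1');
        if (firstOne < 0) {
            throw new IllegalArgumentException("zero has no normalized form");
        }
        int exponent = point - 1 - firstOne;      // places the point must move
        String rest = digits.substring(firstOne + 1).replaceAll("0+$", "");
        return "1." + (rest.isEmpty() ? "0" : rest) + " x 2^" + exponent;
    }

    public static void main(String[] args) {
        System.out.println(normalize("1100.011")); // 12.375
        System.out.println(normalize("0.011"));    // 0.375
    }
}
```

The exponent is the number of places the binary point moves: positive when it moves left, negative when it moves right.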


From which we deduce:

*  The exponent is 3 (and in the biased form it is therefore 3 + 127 = 130 = 1000 0010)

*  The fraction is 100011 (looking to the right of the binary point)


From these we can form the resulting 32-bit IEEE 754 binary32 format representation of 12.375 as: 0-10000010-10001100000000000000000 = 41460000<sub>H</sub>
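
As a cross-check, the three fields can be packed into a 32-bit integer and compared with what the Java runtime stores for <code>12.375f</code>. The class <code>AssembleBinary32</code> and the method <code>assemble</code> are hypothetical names for this sketch, which assumes a normal number and at most 23 fraction digits.

```java
class AssembleBinary32 {
    // Packs sign (0 or 1), unbiased exponent and fraction digits into the
    // 32-bit binary32 layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
    static int assemble(int sign, int exponent, String fractionBits) {
        StringBuilder f = new StringBuilder(fractionBits);
        while (f.length() < 23) {
            f.append('0');                         // pad on the right to 23 bits
        }
        int fraction = Integer.parseInt(f.toString(), 2);
        int biased = exponent + 127;               // exponent bias of binary32
        return (sign << 31) | (biased << 23) | fraction;
    }

    public static void main(String[] args) {
        int bits = assemble(0, 3, "100011");
        System.out.printf("%08X%n", bits);
        // Compare with the bit pattern the runtime uses for the same value.
        System.out.println(bits == Float.floatToIntBits(12.375f));
    }
}
```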


'''Note:''' consider converting 68.123 into IEEE 754 binary32 format. Using the above procedure, which simply stops after 23 fraction digits, you expect to get 42883EF9<sub>H</sub>, with the last 4 bits being 1001. However, due to the default rounding behaviour of IEEE 754 format (round to nearest, ties to even), what you get is 42883EFA<sub>H</sub>, whose last 4 bits are 1010.
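
The difference between truncating and rounding can be reproduced with a short sketch. The class <code>RoundingDemo</code> and the method <code>truncatedBits</code> are hypothetical; the method assumes a positive value in the normal range and works in double precision so that the discarded digits are visible.

```java
class RoundingDemo {
    // Builds a binary32 bit pattern by truncating the fraction after 23 bits.
    // Assumes a positive value within the normal binary32 range.
    static int truncatedBits(double value) {
        int e = Math.getExponent(value);               // unbiased exponent
        double significand = Math.scalb(value, -e);    // scaled into [1, 2)
        int fraction = (int) Math.floor((significand - 1.0) * (1 << 23));
        return ((e + 127) << 23) | fraction;
    }

    public static void main(String[] args) {
        // Truncation, as in the manual procedure.
        System.out.printf("%08X%n", truncatedBits(68.123));
        // Round to nearest, as performed when the float literal is converted.
        System.out.printf("%08X%n", Float.floatToIntBits(68.123f));
    }
}
```

The two printed patterns differ only in the last bits, matching the two hexadecimal values quoted in the note.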


'''Ex 1:'''

Consider decimal 1.

We can see that:  <math> (1)_{10} =(1.0)_2 \times 2^{0} </math>


From which we deduce: 

*  The exponent is 0 (and in the biased form it is therefore 127 = 0111 1111 )

*  The fraction is 0 (all digits to the right of the binary point in 1.0 are zero: 000...0)


From these we can form the resulting 32-bit IEEE 754 binary32 format representation of the real number 1 as: 0-01111111-00000000000000000000000 = 3f800000<sub>H</sub>


'''Ex 2:'''

Consider the value 0.25.

We can see that: <math> (0.25)_{10} =(1.0)_2 \times 2^{-2} </math>


From which we deduce:

* The exponent is −2 (and in the biased form it is 127 + (−2) = 125 = 0111 1101)

* The fraction is 0 (all digits to the right of the binary point in 1.0 are zero)


From these we can form the resulting 32-bit IEEE 754 binary32 format representation of the real number 0.25 as: 0-01111101-00000000000000000000000 = 3e800000<sub>H</sub>


'''Ex 3:'''

Consider the value 0.375. We saw that <math> 0.375 = (0.011)_2 = {(1.1)_2}\times 2^{-2} </math>


Having determined this representation, we can proceed as above:


* The exponent is −2 (and in the biased form it is 127 + (−2) = 125 = 0111 1101)

* The fraction is 1 (the only digit to the right of the binary point in 1.1 is a single 1 = x<sub>1</sub>)


From these we can form the resulting 32-bit IEEE 754 binary32 format representation of the real number 0.375 as: 0-01111101-10000000000000000000000 = 3ec00000<sub>H</sub>
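
The three worked examples can be checked against the bit patterns that Java stores for the same values. The class <code>ExampleCheck</code> and the method <code>fields</code> are hypothetical names for this sketch; <code>Float.floatToIntBits</code> is the standard library call that exposes the stored pattern.

```java
class ExampleCheck {
    // Formats the stored binary32 pattern of a float as sign-exponent-fraction.
    static String fields(float value) {
        int bits = Float.floatToIntBits(value);
        // Left-pad the binary string to the full 32 bits.
        String s = String.format("%32s", Integer.toBinaryString(bits)).replace(' ', '0');
        return s.substring(0, 1) + "-" + s.substring(1, 9) + "-" + s.substring(9);
    }

    public static void main(String[] args) {
        float[] examples = { 1.0f, 0.25f, 0.375f };
        for (float v : examples) {
            String hex = Integer.toHexString(Float.floatToIntBits(v));
            System.out.println(v + " = " + fields(v) + " = " + hex);
        }
    }
}
```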


A Java implementation of the procedure follows:


public class I3E745_100p2 {

	// Prints sign-exponent-fraction for a normal, non-zero float with |value| < 2^31.
	public static void main(String[] args) {
		float value = 100.2f; // or 12.375f
		// The sign bit is determined separately; the rest works on the absolute value.
		String s = (value < 0) ? "1" : "0";
		float magnitude = Math.abs(value);
		int integral = (int) magnitude;
		float temp = magnitude - integral; // fraction part, about 0.2 here

		// Fraction part: multiply by 2; the integer part is the next binary digit.
		// The loop ends because a float fraction has a finite binary expansion.
		StringBuilder fractionBits = new StringBuilder();
		while (temp != 0.0f) {
			temp *= 2;
			if (temp >= 1.0f) {
				fractionBits.append('1');
				temp -= 1.0f;
			} else {
				fractionBits.append('0');
			}
		}

		// Therefore (12.375)10 = (12)10 + (0.375)10 = (1100)2 + (0.011)2 = (1100.011)2
		String integralBits = Integer.toBinaryString(integral); // "0" if integral == 0

		// Normalize to (1.x_1x_2...x_{23})_2 \times 2^{e} format,
		// e.g. (12.375)_{10} = (1.100011)_2 \times 2^{3}
		int exponent;
		String significand; // digits from the leading 1 onward, binary point removed
		if (integral > 0) {
			exponent = integralBits.length() - 1;
			significand = integralBits + fractionBits;
		} else {
			int firstOne = fractionBits.indexOf("1");
			if (firstOne < 0) {
				throw new IllegalArgumentException("zero has no normalized form");
			}
			exponent = -(firstOne + 1);
			significand = fractionBits.substring(firstOne);
		}

		// Exponent, biased by 127
		String e = toBinaryString8(exponent + 127);
		// Fraction: drop the implicit leading 1, then pad with zeros to 23 bits
		StringBuilder m = new StringBuilder(significand.substring(1));
		while (m.length() < 23) {
			m.append('0');
		}
		m.setLength(23); // any digits beyond 23 are zeros for a float input
		System.out.println(s + "-" + e + "-" + m);
	}

	// Returns the low 8 bits of t as a binary string, most significant bit first.
	private static String toBinaryString8(int t) {
		StringBuilder binary = new StringBuilder();
		for (int i = 0; i < 8; i++) {
			binary.append((t >>> (7 - i)) & 1);
		}
		return binary.toString();
	}
}