Floating point variables

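Integers are great for counting whole numbers, but sometimes we need to store very large numbers, or numbers with a fractional component. A floating point variable is a variable that can hold a real number, such as 4.0, 2.5, 3.33, or 0.1226. There are three different floating point data types: float, double, and long double. A float is usually 4 bytes and a double 8 bytes, but these are not strict requirements, so sizes may vary. Long double was added to the language after its release, for architectures that support even larger floating point numbers. Often, however, a long double is also 8 bytes, equivalent to a double. Floating point data types are always signed (they can hold positive and negative values).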

Here are some declarations of floating point variables:

float fValue;
double dValue;
long double dValue2;
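
Because the language does not fix these sizes, it can be worth checking them on your own compiler. The following is a minimal sketch using the sizeof operator. The function name printFloatingPointSizes is only an illustrative choice, and the numbers printed depend on your platform.

```cpp
#include <iostream>

// Prints the size in bytes of each floating point type on this compiler.
// The results are platform-dependent: 4 and 8 are common for float and
// double, while long double may be 8, 12, or 16.
void printFloatingPointSizes()
{
     using namespace std;
     cout << "float: " << sizeof(float) << " bytes" << endl;
     cout << "double: " << sizeof(double) << " bytes" << endl;
     cout << "long double: " << sizeof(long double) << " bytes" << endl;
}
```

Calling printFloatingPointSizes() from main() shows what your compiler actually uses.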

The floating part of the name floating point refers to the fact that a floating point number can have a variable number of decimal places. For example, 2.5 has 1 decimal place, whereas 0.1226 has 4 decimal places.

When we assign numbers to floating point numbers, it is convention to use at least one decimal place. This helps distinguish floating point values from integer values.

int nValue = 5; // 5 means integer
float fValue = 5.0f; // 5.0 means floating point (the f suffix makes it a float, not a double)

How floating point variables store information is beyond the scope of this tutorial, but it is very similar to how numbers are written in scientific notation. Scientific notation is a useful shorthand for writing lengthy numbers in a concise manner. In scientific notation, a number has two parts: the significand, and a power of 10 called an exponent. The letter ‘e’ or ‘E’ is used to separate the two parts. Thus, a number such as 5e2 is equivalent to 5 * 10^2, or 500. The number 5e-2 is equivalent to 5 * 10^-2, or 0.05.

In fact, we can use scientific notation to assign values to floating point variables.

double dValue1 = 500.0;
double dValue2 = 5e2; // another way to assign 500
 
double dValue3 = 0.05;
double dValue4 = 5e-2; // another way to assign 0.05

Furthermore, if we output a number that is large enough, or has enough decimal places, it will be printed in scientific notation:

#include <iostream>
int main()
{
     using namespace std;
 
     double dValue = 1000000.0;
     cout << dValue << endl;
     dValue = 0.00001;
     cout << dValue << endl;
     return 0;
}

Outputs (the number of exponent digits varies by compiler; some print 1e+06 and 1e-05 instead):

1e+006
1e-005

Precision

Consider the fraction 1/3. The decimal representation of this number is 0.33333333333333… with 3′s going out to infinity. An infinite length number would require infinite memory, and we typically only have 4 or 8 bytes. Floating point numbers can only store a certain number of digits, and the rest are lost. The precision of a floating point number is how many digits it can represent without information loss.

When outputting floating point numbers, cout has a default precision of 6. That is, it assumes all variables are only significant to 6 digits, and rounds the output to that many digits.

The following program shows cout limiting its output to 6 digits:

#include <iostream>
int main()
{
     using namespace std;
     float fValue;
     fValue = 1.222222222222222f;
     cout << fValue << endl;
     fValue = 111.22222222222222f;
     cout << fValue << endl;
     fValue = 111111.222222222222f;
     cout << fValue << endl;
}

This program outputs:

1.22222
111.222
111111

Note that each of these is only 6 digits.

However, we can override the default precision that cout shows by using the setprecision() function that is defined in a header file called iomanip.

#include <iostream>
#include <iomanip> // for setprecision()
int main()
{
     using namespace std;
 
     cout << setprecision(16); // show 16 digits
     float fValue = 3.33333333333333333333333333333333333333f;
     cout << fValue << endl;
     double dValue = 3.3333333333333333333333333333333333333;
     cout << dValue << endl;
     return 0;
}

Outputs:

3.333333253860474
3.333333333333334

Because we set the precision to 16 digits, each of the above numbers has 16 digits. But, as you can see, the numbers certainly aren’t precise to 16 digits!

Variables of type float typically have a precision of about 7 significant digits (which is why everything after that many digits in our answer above is junk). Variables of type double typically have a precision of about 16 significant digits. Variables of type double are named so because they offer approximately double the precision of a float.
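
If you would rather ask the compiler than rely on rules of thumb, the standard <limits> header reports these figures. The sketch below prints numeric_limits<T>::digits10, which is the number of decimal digits a type is guaranteed to preserve. It is a conservative figure, so on typical IEEE 754 platforms it is a digit lower than the "about 7" and "about 16" quoted above. The function name printGuaranteedDigits is just an illustrative choice.

```cpp
#include <iostream>
#include <limits>

// Prints how many decimal digits each type is guaranteed to preserve.
// The values are platform-dependent.
void printGuaranteedDigits()
{
     using namespace std;
     cout << "float: " << numeric_limits<float>::digits10 << endl;
     cout << "double: " << numeric_limits<double>::digits10 << endl;
     cout << "long double: " << numeric_limits<long double>::digits10 << endl;
}
```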

Now let’s consider a really big number:

#include <iostream>
 
int main()
{
     using namespace std;
     float fValue = 123456789.0f;
     cout << fValue << endl;
     return 0;
}

Output:

1.23457e+008

1.23457e+008 is 1.23457 * 10^8, which is 123457000. Note that we have lost precision here too!

Consequently, one has to be careful when using floating point numbers that require more precision than the variables can hold.

Rounding errors

One of the reasons floating point numbers can be tricky is the non-obvious difference between binary and decimal (base 10) numbers. In decimal, the fraction 1/3 is the infinite sequence 0.333333333… Similarly, consider the fraction 1/10. In decimal, this is easily represented as 0.1, and we are used to thinking of 0.1 as an easily representable number. However, in binary, 0.1 is represented by the infinite sequence: 0.00011001100110011…

You can see the effects of this in the following program:

#include <iostream>
#include <iomanip> // for setprecision()
int main()
{
     using namespace std;
     cout << setprecision(17);
     double dValue = 0.1;
     cout << dValue << endl;
}

This outputs:

0.10000000000000001

Not quite 0.1! This is because the double had to cut off the infinite binary sequence to fit its limited memory, which resulted in a number that is not exactly 0.1. This is called a rounding error.

Rounding errors can play havoc with math-intense programs, as mathematical operations can compound the error. In the following program, we use 9 addition operations.

#include <iostream>
#include <iomanip>
int main()
{
     using namespace std;
     cout << setprecision(17);
     double dValue;
     dValue = 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1;
     cout << dValue << endl;
}

This program should output 1, but it actually outputs:

0.99999999999999989

Note that the error is no longer in the last column like in the previous example! It has propagated to the second to last column. As you continue to do mathematical operations, this error can propagate further, causing the actual number to drift farther and farther from the number the user would expect.
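
The same effect appears when the additions happen one at a time, for example inside a loop. The following is a minimal sketch; the function name sumOfTenths is hypothetical, and the exact digits you see may vary with your compiler and hardware.

```cpp
// Adds 0.1 to a running total `count` times and returns the result.
// Each addition can introduce a small rounding error, so the total
// may drift away from the mathematically exact answer.
double sumOfTenths(int count)
{
     double dTotal = 0.0;
     for (int i = 0; i < count; ++i)
          dTotal += 0.1;
     return dTotal;
}
```

Printing sumOfTenths(10) with setprecision(17) lets you compare the result against 1 on your own machine, and larger counts give the error more opportunities to grow.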

Comparison of floating point numbers

One of the things that programmers like to do with numbers and variables is see whether two numbers or variables are equal to each other. C++ provides an operator called the equality operator (==) precisely for this purpose. For example, we could write a code snippet like this:

int x = 5; // integers have no precision issues
if (x==5)
     cout << "x is 5" << endl;
else
     cout << "x is not 5" << endl;

This program would print "x is 5".

However, when using floating point numbers, you can get some unexpected results if the two numbers being compared are very close. Consider:

float fValue1 = 1.345f;
float fValue2 = 1.123f;
float fTotal = fValue1 + fValue2; // should be 2.468
 
if (fTotal == 2.468)
     cout << "fTotal is 2.468" ;
else
     cout << "fTotal is not 2.468" ;

This program prints:

fTotal is not 2.468

This result is due to rounding error. fTotal is actually being stored as 2.4679999, which is not 2.468!

For the same reason, the comparison operators >, >=, <, and <= may produce the wrong result when comparing two floating point numbers that are very close.
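
A common workaround is to test whether two values are close enough rather than exactly equal. The following is a minimal sketch of that idea, not a definitive solution: the helper name approximatelyEqual and its epsilon parameter are hypothetical, and choosing a suitable tolerance is left to the caller.

```cpp
#include <cmath>

// Hypothetical helper: returns true if a and b differ by no more than
// epsilon. A fixed absolute tolerance like this works for values of
// moderate size, but is a poor fit for very large or very small numbers.
bool approximatelyEqual(double a, double b, double epsilon)
{
     return std::fabs(a - b) <= epsilon;
}
```

With this helper, the test above could be written as if (approximatelyEqual(fTotal, 2.468, 0.00001)), which tolerates the small rounding error in fTotal.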

Conclusion

To summarize, the two things you should remember about floating point numbers:

1) Floating point numbers offer limited precision. Floats typically offer about 7 significant digits worth of precision, and doubles offer about 16 significant digits. Trying to use more significant digits will result in a loss of precision. (Note: placeholder zeros do not count as significant digits, so a number like 22,000,000,000, or 0.00000033 only counts for 2 digits).

2) Floating point numbers often have small rounding errors. Many times these go unnoticed because they are so small, and because the numbers are truncated for output before the error propagates into the part that is not truncated. Regardless, comparisons on floating point numbers may not give the expected results when two numbers are close.

The section on relational operators has more detail on comparing floating point numbers.
