The golden rule for floating-point number comparison

How to compare floating-point numbers is an ooold, yet frequently asked, question. I am not going to repeat the old answers here, since, as you know, there are tons of good links on the web that explain well how floating-point numbers work in computers and how comparisons should in principle be carried out. Instead, I am going to give you straight away a few pieces of code that work universally for IEEE 754 floating-point numbers, regardless of their magnitudes and of the rigor of the comparison. They are based on the golden rule that I am going to show you. At the end, I will discuss performance optimization for a few special cases.

OK, here we go. But... wait a second, let me first be super clear about a few things:
1. I am talking about IEEE 754 floating-point numbers. This almost goes without saying, because this standard is adopted by most, if not all, modern computers. But just to be super clear, I am assuming your computer follows it too.
2. I assume that you, the reader, know some basics about floating-point numbers, such as the fact that a computer can represent only a finite, discrete set of real values.
3. I assume that you, the reader, can read C++. This is important since all code and examples here are given in C++.
If any of these three assumptions does not hold in your case, then you should not read this article.

Congrats, you passed the "configure" stage. :-) We now set off.

//  Compares equality
inline  bool  is_equal( Real x, Real y )
{
   Real x1  = abs_( x );
   Real y1  = abs_( y );
   Real z1  = (x1 > y1 ) ? x1                : y1;             //  the larger magnitude
   Real eps = (z1 > 1.0) ? z1 * REAL_EPSILON : REAL_EPSILON;   //  the range of "sameness"

   return abs_( x - y ) <= eps;
}

This piece of code reveals the golden rule; everything else here is derived from it. In this code, Real is a floating-point type, be it float, double, or even long double; abs_ is a function that returns the absolute value of a given number, be it a standard function or your own; REAL_EPSILON is an adjustable parameter representing the rigor of the comparison, which I will elaborate upon later. What this code does is first calculate the proper range within which the two real numbers are considered the same. This is the real thing; in fact, all lines before the return statement are dedicated to this task. The range is calculated automatically from the preset rigor as represented by, again, REAL_EPSILON. Once the range is obtained, the rest of the comparison becomes trivial: calculate the difference of the two numbers and compare its absolute value with the range; if it does not exceed the range, the two numbers are considered the same; otherwise, they are different. You may ask why the range is calculated as described. Well, to explain that, you would need to learn some basics about floating-point numbers, which I have to skip, as I said.
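To try the rule out, here is a minimal self-contained sketch. It assumes Real is double, uses std::fabs for abs_, and picks 7 * DBL_EPSILON for REAL_EPSILON; all three are choices made for illustration only, and demo_is_equal is a hypothetical helper.

```cpp
#include <cassert>
#include <cfloat>   // DBL_EPSILON
#include <cmath>    // std::fabs

// Assumptions for this sketch: Real is double, and the rigor is
// 7 * DBL_EPSILON. Neither choice is fixed by the rule itself.
typedef double Real;
const Real REAL_EPSILON = 7 * DBL_EPSILON;

inline bool is_equal(Real x, Real y)
{
    Real x1  = std::fabs(x);
    Real y1  = std::fabs(y);
    Real z1  = (x1 > y1) ? x1 : y1;                            // larger magnitude
    Real eps = (z1 > 1.0) ? z1 * REAL_EPSILON : REAL_EPSILON;  // range of "sameness"

    return std::fabs(x - y) <= eps;
}

// Hypothetical helper, only here to exercise the function.
inline void demo_is_equal()
{
    assert(0.1 + 0.2 != 0.3);             // direct comparison says "different"
    assert(is_equal(0.1 + 0.2, 0.3));     // tolerant comparison says "same"
    assert(is_equal(1e10 + 1e-5, 1e10));  // the range grows with the magnitude
    assert(!is_equal(1.0, 1.000001));     // clearly different values stay different
}
```

Note how the third check passes only because the range is scaled by the larger magnitude; with a fixed range of REAL_EPSILON it would fail.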

What really needs explaining here is the value of the rigor. What value should it take? You might immediately think of the *_EPSILON macros defined in the <cfloat> header (FLT_EPSILON, DBL_EPSILON, LDBL_EPSILON). That is, however, not a very good choice. The reason is that, for a given value x, probably only 2 or 3 representable numbers are considered equal to it at this rigor, which is too strict and not much different from a direct comparison. I recommend a value 5 to 10 times the *_EPSILON for general applications, but your mileage may vary.
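The "only 2 or 3 numbers" claim can be checked by brute force. The sketch below assumes double and C++11 (for std::nextafter); count_equal_values and demo_count are hypothetical helpers that walk outwards from x one representable value at a time and count how many values the rule treats as equal to x.

```cpp
#include <cassert>
#include <cfloat>   // DBL_EPSILON
#include <cmath>    // std::fabs, std::nextafter (C++11)
#include <limits>

// The golden rule with the rigor passed in explicitly.
inline bool is_equal(double x, double y, double real_epsilon)
{
    double x1  = std::fabs(x);
    double y1  = std::fabs(y);
    double z1  = (x1 > y1) ? x1 : y1;
    double eps = (z1 > 1.0) ? z1 * real_epsilon : real_epsilon;
    return std::fabs(x - y) <= eps;
}

// Hypothetical helper: counts the representable doubles (x included)
// that is_equal treats as equal to x at the given rigor.
inline int count_equal_values(double x, double real_epsilon)
{
    const double inf = std::numeric_limits<double>::infinity();
    int count = 1;  // x itself
    for (double y = std::nextafter(x, inf); is_equal(x, y, real_epsilon);
         y = std::nextafter(y, inf))
        ++count;    // neighbours above x
    for (double y = std::nextafter(x, -inf); is_equal(x, y, real_epsilon);
         y = std::nextafter(y, -inf))
        ++count;    // neighbours below x
    return count;
}

// Hypothetical helper, only here to exercise the counter.
inline void demo_count()
{
    assert(count_equal_values(1.5, DBL_EPSILON) == 3);       // x and one neighbour on each side
    assert(count_equal_values(1.5, 7 * DBL_EPSILON) == 21);  // x and ten neighbours on each side
}
```

For x = 1.5, plain DBL_EPSILON accepts only the value itself and its two immediate neighbours, while a factor of 7 widens that to 21 values.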

//  Compares equality with adjustable rigor.
//  A slight meta-trick to figure out REAL_EPSILON for different Reals. Yes, the meta thing is kinda verbose if you don't have a meta library.
template < typename T >
struct what_real;
template <>
struct what_real< float >       { static const int value = 0; };
template <>
struct what_real< double >      { static const int value = 1; };
template <>
struct what_real< long double > { static const int value = 2; };
template <>
struct what_real< my_real >     { static const int value = 3; };
extern  Real REAL_EPSILON;   //  Its value has to be set outside the header, in a .cc file.

namespace {
const int GAUGE = 7;
}

inline
bool  is_equal( Real x, Real y, Real real_epsilon = GAUGE * REAL_EPSILON )
{
   ...
   Real eps = (z1 > 1.0) ? z1 * real_epsilon : real_epsilon;

   return abs_( x - y ) <= eps;
}

The meta-thing is just added for fun. The main point is that you can now call the function like is_equal( x, y, 1E-6 ), where 1E-6 specifies the rigor on the spot, at the call site. More convenient, huh!?
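If you would rather not maintain a per-type REAL_EPSILON by hand, one alternative (a swapped-in technique, not the what_real trick above) is to make the function a template and take the default rigor from std::numeric_limits. This is only a sketch; the factor 7 mirrors GAUGE, and std::fabs stands in for abs_.

```cpp
#include <cassert>
#include <cmath>    // std::fabs
#include <limits>   // std::numeric_limits

// Sketch: the default rigor is 7 times the machine epsilon of T.
// The factor 7 is an assumption carried over from GAUGE.
template <typename T>
inline bool is_equal(T x, T y,
                     T real_epsilon = 7 * std::numeric_limits<T>::epsilon())
{
    T x1  = std::fabs(x);
    T y1  = std::fabs(y);
    T z1  = (x1 > y1) ? x1 : y1;
    T eps = (z1 > T(1)) ? z1 * real_epsilon : real_epsilon;

    return std::fabs(x - y) <= eps;
}

// Hypothetical helper, only here to exercise the function.
inline void demo_adjustable()
{
    assert(!is_equal(100.0, 100.001));       // default rigor: different
    assert(is_equal(100.0, 100.001, 1e-3));  // relaxed rigor: same
    assert(is_equal(1.0f, 1.0f));            // float works too
}
```

One caveat of this form: all three arguments must have the same type, so with float operands the rigor has to be written as a float literal such as 1e-6f.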

With the above code understood, greater-than and less-than functions are easy to define, so they are omitted here.
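For completeness, one possible way to derive them is sketched below: a value is greater (or less) only if it compares so directly and the two values are not considered the same. The names is_greater and is_less are hypothetical, and Real is assumed to be double with a rigor of 7 * DBL_EPSILON.

```cpp
#include <cassert>
#include <cfloat>   // DBL_EPSILON
#include <cmath>    // std::fabs

typedef double Real;                        // assumption for this sketch
const Real REAL_EPSILON = 7 * DBL_EPSILON;  // assumption for this sketch

inline bool is_equal(Real x, Real y)
{
    Real x1  = std::fabs(x);
    Real y1  = std::fabs(y);
    Real z1  = (x1 > y1) ? x1 : y1;
    Real eps = (z1 > 1.0) ? z1 * REAL_EPSILON : REAL_EPSILON;
    return std::fabs(x - y) <= eps;
}

// x is greater than y only if it lies outside the range of "sameness".
inline bool is_greater(Real x, Real y)
{
    return x > y && !is_equal(x, y);
}

// x is less than y only if it lies outside the range of "sameness".
inline bool is_less(Real x, Real y)
{
    return x < y && !is_equal(x, y);
}

// Hypothetical helper, only here to exercise the functions.
inline void demo_ordering()
{
    assert(is_greater(1.0, 0.5));
    assert(is_less(-2.0, -1.0));
    assert(!is_greater(0.1 + 0.2, 0.3));  // same within the range, so not greater
    assert(!is_less(0.3, 0.1 + 0.2));     // and not less either
}
```

With these definitions, exactly one of is_less, is_equal, and is_greater holds for any pair of finite numbers.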

Now let's deal with performance in some special cases. A common case is comparing a floating-point number against zero. It is better to dedicate a function to this case. Substituting zero for the argument y, we reduce the code to the following:

inline  bool  is_zero( Real x, Real real_epsilon = GAUGE * REAL_EPSILON )
{
   return abs_( x ) <= real_epsilon;
}

That's simple and good. How about 1, 2, 8, or an arbitrary integer? Since we cannot define a separate function for each of these values, what can we do? Templates are the answer. We can write code like the following:


namespace {

template <bool C, int A, int B>
struct select_value
{
    static const int value = A;
};

template <int A, int B>
struct select_value<false, A, B>
{
    static const int value = B;
};

template <int X>
struct meta_abs
{
    static const int value = select_value<(X>0), X, -X>::value;
};

}


template < int Y >
inline  bool  is_equal_to( Real x, Real real_epsilon = GAUGE * REAL_EPSILON )
{
   //  The parentheses matter: without them, <= and > bind tighter than ?:
   return abs_( x - Y ) <= ( (meta_abs<Y>::value > 1) ? real_epsilon * meta_abs<Y>::value : real_epsilon );
}


Here, in the anonymous namespace, we define a meta-function that calculates the absolute value of a given constant integer Y. This moves part of the calculation to compile time, which optimizes the overall performance of the comparison.
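To see that the absolute value really is settled at compile time, the sketch below checks it with static_assert (which needs C++11) and then exercises the comparison. Real as double, REAL_EPSILON as DBL_EPSILON, and std::fabs for abs_ are assumptions made for illustration; demo_is_equal_to is a hypothetical helper.

```cpp
#include <cassert>
#include <cfloat>   // DBL_EPSILON
#include <cmath>    // std::fabs

typedef double Real;                    // assumption for this sketch
const Real REAL_EPSILON = DBL_EPSILON;  // assumption for this sketch
const int  GAUGE        = 7;

template <bool C, int A, int B>
struct select_value { static const int value = A; };

template <int A, int B>
struct select_value<false, A, B> { static const int value = B; };

template <int X>
struct meta_abs { static const int value = select_value<(X > 0), X, -X>::value; };

// These are evaluated by the compiler, not at run time.
static_assert(meta_abs<-8>::value == 8, "meta_abs of a negative constant");
static_assert(meta_abs<8>::value == 8,  "meta_abs of a positive constant");
static_assert(meta_abs<0>::value == 0,  "meta_abs of zero");

template <int Y>
inline bool is_equal_to(Real x, Real real_epsilon = GAUGE * REAL_EPSILON)
{
    return std::fabs(x - Y) <=
           ((meta_abs<Y>::value > 1) ? real_epsilon * meta_abs<Y>::value
                                     : real_epsilon);
}

// Hypothetical helper, only here to exercise the function.
inline void demo_is_equal_to()
{
    assert(is_equal_to<8>(8.0 + 1e-15));  // within the range around 8
    assert(is_equal_to<-3>(-3.0));
    assert(!is_equal_to<2>(2.001));
    assert(is_equal_to<0>(1e-16));        // falls back to the is_zero case
}
```

One design note: unlike is_equal, the range here is scaled by |Y| alone rather than by the larger of |x| and |Y|, which is what makes the compile-time shortcut possible.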

That is all I'd like to say regarding floating-point number comparison. The rest is yours.

BTW, not all of the code here was tested with a compiler.