Floating Point in C --- CS:APP

All versions of C provide two different floating-point data types: float and double. On machines that support IEEE floating point, these data types correspond to single- and double-precision floating point. In addition, the machines use the round-to-even rounding mode. Unfortunately, since the C standards do not require the machine to use IEEE floating point, there are no standard methods to change the rounding mode or to get special values such as -0, +∞, -∞, or NaN.

Most systems provide a combination of include ('.h') files and procedure libraries to provide access to these features, but the details vary from one system to another. For example, the GNU compiler GCC defines program constants INFINITY and NAN when the following sequence occurs in the program file:

#define _GNU_SOURCE 1

#include <math.h>

More recent versions of C, including ISO C99, include a third floating-point data type, long double. For many machines and compilers, this data type is equivalent to the double data type. For Intel-compatible machines, however, GCC implements this data type using an 80-bit "extended precision" format, providing a much larger range and precision than does the standard 64-bit format.

When casting values between int, float, and double formats, the program changes the numeric values and the bit representations as follows (assuming a 32-bit int):

  • From int to float, the number cannot overflow, but it may be rounded.
  • From int or float to double, the exact numeric value can be preserved because double has both greater range (i.e., the range of representable values) and greater precision (i.e., the number of significant bits).
  • From double to float, the value can overflow to +∞ or -∞, since the range is smaller. Otherwise, it may be rounded, because the precision is smaller.
  • From float or double to int, the value will be rounded toward zero. For example, 1.999 will be converted to 1, while -1.999 will be converted to -1. Furthermore, the value may overflow. The C standards do not specify a fixed result for this case. Intel-compatible microprocessors designate the bit pattern [10...00] as an integer indefinite value. Any conversion from floating point to integer that cannot assign a reasonable integer approximation yields this value. Thus, the expression (int) +1e10 yields -2147483648, generating a negative value from a positive one.

Reposted from: https://my.oschina.net/u/566401/blog/151406
