Floating-Point Numbers (Bit Manipulation)

Latest recommended article published on 2021-04-15 22:47:31

weixin_30339969


Original link: http://www.cnblogs.com/renxiaomiao/p/9642657.html


8378: Floating-Point Numbers

Time limit: 1 Sec, Memory limit: 128 MB
Submissions: 15, Solved: 6
Problem setter: admin

Problem Description

In this problem, we consider floating-point number formats, data representation formats to approximate real numbers on computers.
Scientific notation is a method to express a number, frequently used for numbers too large or too small to be written tersely in usual decimal form. In scientific notation, all numbers are written in the form $m \times 10^{e}$. Here, $m$ (called significand) is a number greater than or equal to 1 and less than 10, and $e$ (called exponent) is an integer. For example, a number 13.5 is equal to $1.35 \times 10^{1}$, so we can express it in scientific notation with significand 1.35 and exponent 1.
As binary number representation is convenient on computers, let's consider binary scientific notation with base two, instead of ten. In binary scientific notation, all numbers are written in the form $m \times 2^{e}$. Since the base is two, $m$ is limited to be less than 2. For example, 13.5 is equal to $1.6875 \times 2^{3}$, so we can express it in binary scientific notation with significand 1.6875 and exponent 3. The significand 1.6875 is equal to 1 + 1/2 + 1/8 + 1/16, which is $1.1011_2$ in binary notation. Similarly, the exponent 3 can be expressed as $11_2$ in binary notation.
A floating-point number expresses a number in binary scientific notation in finite number of bits. Although the accuracy of the significand and the range of the exponent are limited by the number of bits, we can express numbers in a wide range with reasonably high accuracy.
In this problem, we consider a 64-bit floating-point number format, simplified from one actually used widely, in which only those numbers greater than or equal to 1 can be expressed. Here, the first 12 bits are used for the exponent and the remaining 52 bits for the significand. Let's denote the 64 bits of a floating-point number by $b_{64} \ldots b_1$. With $e$ an unsigned binary integer $(b_{64} \ldots b_{53})_2$, and with $m$ a binary fraction represented by the remaining 52 bits plus one $(1.b_{52} \ldots b_1)_2$, the floating-point number represents the number $m \times 2^{e}$.
We show below the bit string of the representation of 13.5 in the format described above.
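The layout just described can be sketched in a few lines of C++. This is only an illustration under the format stated above: `encode_bits` and `encode_13_5` are hypothetical helper names, not part of the problem statement. Since 13.5 is $1.1011_2 \times 2^{3}$, the exponent field holds 3 and the fraction field begins with 1011.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the problem statement): builds the
// 64-character bit string b64...b1 from a 12-bit exponent and the 52
// fraction bits that follow the implicit leading "1.".
std::string encode_bits(std::uint64_t exponent, std::uint64_t fraction) {
    assert(exponent < (1ULL << 12));
    assert(fraction < (1ULL << 52));
    std::string bits;
    for (int i = 11; i >= 0; --i) bits += ((exponent >> i) & 1) ? '1' : '0';
    for (int i = 51; i >= 0; --i) bits += ((fraction >> i) & 1) ? '1' : '0';
    return bits;
}

// 13.5 = 1.1011_2 x 2^3: exponent 3, fraction bits 1011 (0xB) in the
// top four positions of the 52-bit field, zeros everywhere else.
std::string encode_13_5() {
    return encode_bits(3, 0xBULL << 48);
}
```

Calling `encode_13_5()` yields the 12 exponent bits `000000000011` followed by `1011` and 48 zeros.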

In floating-point addition operations, the results have to be approximated by numbers representable in floating-point format. Here, we assume that the approximation is by truncation. When the sum of two floating-point numbers $a$ and $b$ is expressed in binary scientific notation as $a + b = m \times 2^{e}$ ($1 \le m < 2$, $0 \le e < 2^{12}$), the result of addition operation on them will be a floating-point number with its first 12 bits representing $e$ as an unsigned integer and the remaining 52 bits representing the first 52 bits of the binary fraction of $m$.
A disadvantage of this approximation method is that the approximation error accumulates easily. To verify this, let's make an experiment of adding a floating-point number many times, as in the pseudocode shown below. Here, s and a are floating-point numbers, and the results of individual addition are approximated as described above.
s := a
for n times {
s := s + a
}
For a given floating-point number a and a number of repetitions n, compute the bits of the floating-point number s when the above pseudocode finishes.
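For small n, the pseudocode can be run literally as a reference to check a faster solution against. The sketch below is an assumption-laden illustration: `simulate_naive` is a hypothetical name, and the significand is held as an integer scaled by $2^{52}$, so truncation becomes a right shift. It is far too slow for n up to $10^{18}$.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical brute-force reference: performs one truncated addition
// per loop iteration, exactly as the pseudocode does.
std::string simulate_naive(std::uint64_t fraction, std::uint64_t n) {
    assert(fraction < (1ULL << 52));
    const std::uint64_t a = (1ULL << 52) | fraction;  // a scaled by 2^52
    std::uint64_t mant = a;  // significand of s, scaled by 2^52
    unsigned exp = 0;        // exponent of s
    for (std::uint64_t k = 0; k < n; ++k) {
        // Bits of a below the last kept bit of s are truncated away.
        mant += (exp < 53) ? (a >> exp) : 0;
        if (mant >> 53) {    // significand reached 2: renormalize
            mant >>= 1;
            ++exp;
        }
    }
    std::string bits;
    for (int i = 11; i >= 0; --i) bits += ((exp >> i) & 1) ? '1' : '0';
    for (int i = 51; i >= 0; --i) bits += ((mant >> i) & 1) ? '1' : '0';
    return bits;
}
```

With an all-zero fraction and n = 1, this returns exponent 1 and a zero fraction, matching the first sample output.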

Input

The input consists of at most 1000 datasets, each in the following format.
n
$b_{52} \ldots b_1$
$n$ is the number of repetitions ($1 \le n \le 10^{18}$). For each $i$, $b_i$ is either 0 or 1. As for the floating-point number $a$ in the pseudocode, the exponent is 0 and the significand is $b_{52} \ldots b_1$.

The end of the input is indicated by a line containing a zero.
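Reading the 52-character line into an integer is the first step of any solution. The following is a minimal sketch, with `parse_significand` as a hypothetical helper name: it places the implicit leading 1 at bit 52, so the returned value equals $a \times 2^{52}$.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: converts the 52-character line b52...b1 into the
// 53-bit integer a * 2^52, i.e. the fraction bits with the implicit
// leading 1 placed at bit 52.
std::uint64_t parse_significand(const std::string& line) {
    assert(line.size() == 52);
    std::uint64_t value = 1;  // implicit leading 1
    for (char c : line) {
        assert(c == '0' || c == '1');
        value = (value << 1) | static_cast<std::uint64_t>(c - '0');
    }
    return value;
}
```

The result always lies in $[2^{52}, 2^{53})$, so it fits comfortably in a 64-bit integer.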

Output

For each dataset, the 64 bits of the floating-point number s after finishing the pseudocode should be output as a sequence of 64 digits, each being 0 or 1 in one line.

Sample Input

1
0000000000000000000000000000000000000000000000000000
2
0000000000000000000000000000000000000000000000000000
3
0000000000000000000000000000000000000000000000000000
4
0000000000000000000000000000000000000000000000000000
7
1101000000000000000000000000000000000000000000000000
100
1100011010100001100111100101000111001001111100101011
123456789
1010101010101010101010101010101010101010101010101010
1000000000000000000
1111111111111111111111111111111111111111111111111111
0

Sample Output

0000000000010000000000000000000000000000000000000000000000000000
0000000000011000000000000000000000000000000000000000000000000000
0000000000100000000000000000000000000000000000000000000000000000
0000000000100100000000000000000000000000000000000000000000000000
0000000000111101000000000000000000000000000000000000000000000000
0000000001110110011010111011100001101110110010001001010101111111
0000000110111000100001110101011001000111100001010011110101011000
0000001101010000000000000000000000000000000000000000000000000000

Source/Category

ICPC Japan IISF 2018


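The hardest part of this problem is probably reading the statement: it is long, tedious, and hard to follow. During the contest we never fully understood it, and when a stronger contestant explained the problem afterwards and asked "won't it overflow?", I was completely lost.

Oh, so we had not even understood the problem correctly, and had no idea what all that thinking had been for. Depressing.

So here is a proper explanation of the problem, in memory of the pain of not understanding the statement.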

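The problem roughly simulates how a computer represents floating-point numbers, in binary with 64 bits: the first 12 bits are the exponent (a power of 2), and the last 52 bits are the fraction, i.e. the value is 1.(last 52 bits).

The input gives the last 52 bits; add the number n times and output all 64 bits of the result.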

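Take the decimal number 44 with only two digits of precision: adding it once gives 88 and adding it twice gives 132. Only two digits can be kept, so one digit is dropped and the result becomes $13 \times 10^{1}$. From then on 44 also loses a digit, because the 3 in 13 now sits in the tens place, so the ones digit of the addend has to be dropped.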

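Now in binary, suppose there are only two bits, 01: adding it once gives 10, twice gives 11, and four times gives 101. Only two bits can be represented, so the last bit is dropped and the value is written as $10 \times 2^{1}$. Each time the last bit is dropped, the exponent in front increases by 1, which is why it is the first power of 2 here. The 52-bit case works the same way.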

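First convert the bit string to an integer stored in a long long. Then repeatedly compute how many additions are needed before a bit has to be dropped and compare that count with the remaining number of additions n. If n is smaller, just add n times directly; otherwise perform the additions and drop a bit. Dropping a bit is a right shift by one: the result is shifted right by one bit, and so is the number being added.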
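The count of additions before the next dropped bit is a ceiling division. As a small sketch (the name `additions_until_carry` is hypothetical, and the significand is assumed to be stored as an integer in $[2^{52}, 2^{53})$), the smallest $k$ with $\mathrm{mant} + k \cdot \mathrm{addend} \ge 2^{53}$ is $\lceil (2^{53} - \mathrm{mant}) / \mathrm{addend} \rceil$:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: with the running significand `mant` in
// [2^52, 2^53) and a non-zero truncated addend, returns the smallest k
// such that mant + k * addend >= 2^53.
std::uint64_t additions_until_carry(std::uint64_t mant, std::uint64_t addend) {
    const std::uint64_t limit = 1ULL << 53;
    assert(mant >= (1ULL << 52) && mant < limit);
    assert(addend > 0);
    const std::uint64_t gap = limit - mant;
    return (gap + addend - 1) / addend;  // ceiling division, no overflow here
}
```

Because the gap is at most $2^{52}$ and the addend is below $2^{53}$, the intermediate sum stays well within 64 bits. Each phase at least roughly doubles the number of additions it absorbs, so at most 53 phases are needed even for n near $10^{18}$.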

Code:

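#include <iostream>
#include <string>
using namespace std;
typedef long long ll;

int main()
{
    ll n;
    string s;
    while (cin >> n && n != 0) {
        cin >> s;
        s = "1" + s;  // prepend the implicit leading 1: treat 1.b52...b1 as a 53-bit integer, the binary point is restored on output
        ll x = 0;
        for (int i = 0; i < 53; i++) {
            if (s[i] == '1')
                x += (1ll << (52 - i));  // convert the bit string to an integer
        }
        ll ans = x;  // running significand; its low 52 bits are the output fraction
        ll e = 0;    // exponent (power of 2), the first 12 bits of the output
        while (n) {
            ll timee = ((1ll << 53) - ans) / x;  // additions needed before a bit must be dropped
            if (((1ll << 53) - ans) % x != 0)    // division rounds down, so a remainder means one more addition
                timee++;
            if (n < timee) {         // fewer additions left than needed for a carry: add them all
                ans += x * n;
                break;
            }
            ans += x * timee;        // otherwise add up to the carry
            ans >>= 1;               // drop the last bit of the result
            x >>= 1;                 // the addend loses its last bit too
            e++;                     // the exponent increases by one
            n -= timee;              // subtract the additions already performed
            if (x == 0)  break;      // addend fully shifted out: further additions change nothing
        }
        for (int i = 11; i >= 0; i--) {   // 12 exponent bits
            if (e & (1ll << i))  cout << "1";
            else  cout << "0";
        }
        for (int i = 51; i >= 0; i--) {   // 52 fraction bits (the leading 1 at bit 52 is not printed)
            if (ans & (1ll << i))  cout << "1";
            else  cout << "0";
        }
        cout << '\n';
    }
    return 0;
}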

Reposted from: https://www.cnblogs.com/renxiaomiao/p/9642657.html
