Solution Report: ZOJ 3827 Information Entropy



Description

Information Theory is one of the most popular courses in Marjar University. In this course, there is an important chapter about information entropy.

Entropy is the average amount of information contained in each message received. Here, a message stands for an event, or a sample or a character drawn from a distribution or a data stream. Entropy thus characterizes our uncertainty about our source of information. The source is also characterized by the probability distribution of the samples drawn from it. The idea here is that the less likely an event is, the more information it provides when it occurs.

Generally, "entropy" stands for "disorder" or uncertainty. The entropy we talk about here was introduced by Claude E. Shannon in his 1948 paper "A Mathematical Theory of Communication". We also call it Shannon entropy or information entropy to distinguish from other occurrences of the term, which appears in various parts of physics in different forms.

Named after Boltzmann's H-theorem, Shannon defined the entropy Η (Greek letter Η, η) of a discrete random variable X with possible values {x1, x2, ..., xn} and probability mass function P(X) as:

$$\Large H(X)=E(-\ln(P(x)))$$

Here E is the expected value operator. When taken from a finite sample, the entropy can explicitly be written as

$$\Large H(X)=-\sum_{i=1}^{n}P(x_i)\log_b(P(x_i))$$

Where b is the base of the logarithm used. Common values of b are 2, Euler's number e, and 10. The unit of entropy is bit for b = 2, nat for b = e, and dit (or digit) for b = 10, respectively.
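As a quick illustration of the formula (my own worked example, though it matches the first sample case further below), the distribution {0.25, 0.25, 0.5} measured in bits gives

$$\Large H = -\left(0.25\log_2 0.25 + 0.25\log_2 0.25 + 0.5\log_2 0.5\right) = 0.5 + 0.5 + 0.5 = 1.5\ \text{bits}$$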

In the case of P(xi) = 0 for some i, the value of the corresponding summand 0 logb(0) is taken to be a well-known limit:

$$\Large 0 \log_b(0) = \lim_{p \to 0^+} p \log_b(p)$$
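This limit is 0, which is why values with zero probability contribute nothing to the sum. The statement does not spell this out, but one way to see it is L'Hôpital's rule:

$$\Large \lim_{p \to 0^+} p \log_b(p) = \lim_{p \to 0^+} \frac{\ln p}{(\ln b)/p} = \lim_{p \to 0^+} \frac{1/p}{-(\ln b)/p^2} = \lim_{p \to 0^+} \frac{-p}{\ln b} = 0$$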

Your task is to calculate the entropy of a finite sample with N values.

Input

There are multiple test cases. The first line of input contains an integer T indicating the number of test cases. For each test case:

The first line contains an integer N (1 <= N <= 100) and a string S. The string S is one of "bit", "nat" or "dit", indicating the unit of entropy.

In the next line, there are N non-negative integers P1, P2, ..., PN. Pi means the probability of the i-th value in percentage, and the sum of Pi will be 100.

Output

For each test case, output the entropy in the corresponding unit.

Any solution with a relative or absolute error of at most 10^-8 will be accepted.

Sample Input

3
3 bit
25 25 50
7 nat
1 2 4 8 16 32 37
10 dit
10 10 10 10 10 10 10 10 10 10

Sample Output

1.500000000000
1.480810832465
1.000000000000

Problem summary: The statement defines information entropy as $H(X) = -\sum_{i=1}^{n} P(x_i)\log_b(P(x_i))$, where b = 2 for "bit", b = e for "nat", and b = 10 for "dit". Given the probability of each of the n characters appearing in a message, compute the average information entropy of that message.

Analysis: This is really just a reading-comprehension exercise; plug straight into the formula. What makes it worth writing up is that several small details pile up. First, how do we obtain ln(P)? Use the change-of-base property of logarithms to rewrite it as log(P)/log(e). And where does e come from? The function exp(1) in <cmath>. Next, what happens to the entropy term when a probability approaches 0? As someone weak at math, I had no idea how to take the limit, so I simply evaluated the trend at P = 1e-10, 1e-15 and 1e-20, and it is clear that the contribution tends to 0. After that, substitute into the formula and compute (a quick verification of both tricks is sketched below).
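A minimal standalone sketch of those two checks (my own verification snippet, not part of the accepted solution): it prints e obtained from exp(1), confirms the change-of-base trick against the library's natural logarithm, and shows p*log2(p) shrinking toward 0 for tiny p.

#include <cstdio>
#include <cmath>

int main()
{
    // Euler's number from <cmath>, later used as the logarithm base for "nat".
    printf("e = %.12f\n", exp(1.0));

    // Change of base: log_e(0.5) computed as log10(0.5) / log10(e)
    // should agree with the library's log(0.5).
    printf("log_e(0.5): %.12f vs %.12f\n", log10(0.5) / log10(exp(1.0)), log(0.5));

    // The summand p * log2(p) clearly tends to 0 as p -> 0+.
    const double ps[] = {1e-10, 1e-15, 1e-20};
    for (int i = 0; i < 3; i++)
        printf("p = %g, p*log2(p) = %g\n", ps[i], ps[i] * log2(ps[i]));
    return 0;
}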

The code:
#include <iostream>
#include <cstdio>
#include <cmath>
#include <string>

using namespace std;

int main()
{
    int T;
    cin >> T;
    while (T--)
    {
        int n;
        string base;
        scanf("%d", &n);
        cin >> base;

        // Pick the logarithm base according to the requested unit.
        double b;
        if (base == "bit") b = 2.0;
        else if (base == "nat") b = exp(1.0);   // Euler's number e
        else b = 10.0;                          // "dit"

        double ans = 0;
        for (int i = 1; i <= n; i++)
        {
            double per;
            scanf("%lf", &per);
            per /= 100.0;              // percentage -> probability
            if (per == 0) continue;    // 0 * log_b(0) contributes 0 (the limit above)
            // per * log_b(1/per) = -per * log_b(per), computed via change of base.
            ans += per * (log10(1.0 / per) / log10(b));
        }
        printf("%.12f\n", ans);
    }
    return 0;
}
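For reference: compiled with a standard C++ compiler (I used g++ with default options, nothing compiler-specific is involved) and fed the sample input above, this reproduces the sample output 1.500000000000, 1.480810832465 and 1.000000000000 within the stated 10^-8 tolerance.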

Winning a prize at the regional contest is going to rely entirely on this uncle, much respect~