1, Information content
https://en.wikipedia.org/wiki/Information_content
The information content (self-information) of an event can be interpreted as the length of the message needed to transmit that event, given the distribution of the random variable.
Claude Shannon's definition of self-information was chosen to meet several axioms:
- An event with probability 100% is perfectly unsurprising and yields no information.
- The less probable an event is, the more surprising it is and the more information it yields.
- If two independent events are measured separately, the total amount of information is the sum of the self-informations of the individual events.
It can be shown that there is a unique function of probability that meets these three axioms, up to a multiplicative scaling factor. Broadly, given an event x with probability P(x), the information content is defined as:
I(x) = −log P(x) = log(1 / P(x))
The base of the log is left unspecified, which corresponds to the scaling factor above. Different choices of base correspond to different units of information: if the logarithmic base is 2, the unit is named the "bit" or "shannon"; if the logarithm is the natural logarithm (corresponding to base Euler's number e ≈ 2.7182818284), the unit is called the "nat", short for "natural"; and if the base is 10, the units are called "hartleys", decimal "digits", or occasionally "dits".
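As a quick sketch (assuming Python and its standard math module; self_information is just an illustrative helper name), the definition can be evaluated directly, with the log base selecting the unit:

```python
import math

def self_information(p, base=2):
    """Self-information I(x) = -log_b(p) of an event with probability p.
    base=2 gives bits/shannons, base=math.e gives nats, base=10 gives hartleys."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p, base)

# An event with probability 100% yields no information.
print(self_information(1.0))             # 0.0 bits
# A fair coin flip yields exactly 1 bit.
print(self_information(0.5))             # 1.0 bit
# The same event measured in nats and hartleys (only the scale changes).
print(self_information(0.5, math.e))     # ~0.693 nats
print(self_information(0.5, 10))         # ~0.301 hartleys
# Two independent fair coin flips: the self-informations add.
print(self_information(0.5) + self_information(0.5))  # 2.0 bits == self_information(0.25)
```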
2, Information entropy
https://en.wikipedia.org/wiki/Entropy_(information_theory)
An equivalent definition of entropy is the expected value of the self-information of a variable.
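Spelled out (the standard formula, using base 2 so the result is in bits): for a discrete random variable X with probability mass function P,
H(X) = E[I(X)] = Σ_x P(x)·I(x) = −Σ_x P(x) log₂ P(x).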
Assume we have the binary sequence 001011010010010.
The sequence consists of 15 symbols, with 9 zeroes and 6 ones, so the empirical probabilities are P(0) = 9/15 = 0.6 and P(1) = 6/15 = 0.4. The Shannon entropy of this model comes out to about 0.97 bits per symbol.
Multiplying the entropy by the number of symbols (15) in the input gives a minimum compressed size of about 14.6 bits, i.e. 15 bits when rounded up to an integer.
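A small sketch reproducing this calculation (assuming Python and only the standard library; shannon_entropy is an illustrative helper name, and the sequence is the one from the example above):

```python
import math
from collections import Counter

def shannon_entropy(sequence, base=2):
    """Entropy in bits per symbol: H = -sum(p * log_b(p)) over symbol frequencies."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((c / total) * math.log(c / total, base) for c in counts.values())

seq = "001011010010010"          # 15 symbols: 9 zeroes, 6 ones
h = shannon_entropy(seq)
print(round(h, 2))               # ~0.97 bits per symbol
print(h * len(seq))              # ~14.56 bits total
print(math.ceil(h * len(seq)))   # 15 bits: the integer minimum compressed size
```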
Supplementary reading:
https://blog.csdn.net/king52113141314/article/details/108473360