5.6 PREDICTING PROBABILITIES

DataMining2013

Category: DataMining

Two popular criteria used to evaluate probabilistic predictions:

(1) Quadratic [kwɒ'drætɪk] Loss Function: ∑_j (p_j − a_j)²

Suppose for a single instance there are k possible outcomes, or classes.

1) The prediction is a probability vector p_1, p_2, …, p_k for the classes (these probabilities sum to 1).

2) The actual outcome is a vector a_1, a_2, …, a_k whose ith component, where i is the actual class, is 1 and all other components are 0.

3) Because exactly one a_j equals 1, the loss expands to ∑_j (p_j − a_j)² = 1 − 2p_i + ∑_j p_j².

Note that this is for a single instance: the summation is over possible outputs, not over different instances. Just one of the a's will be 1 and the rest 0, so the sum contains contributions of p_j² for the incorrect predictions and (1 − p_i)² for the correct one.
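The definition and its expanded form can be checked numerically. The following is a minimal sketch, not part of the original notes; the function names and the four-class probability vector are made up for illustration.

```python
def quadratic_loss(p, actual):
    """Quadratic loss for one instance: sum over classes of (p_j - a_j)^2."""
    # Build the one-hot vector a: 1 for the actual class, 0 elsewhere.
    a = [1.0 if j == actual else 0.0 for j in range(len(p))]
    return sum((pj - aj) ** 2 for pj, aj in zip(p, a))


def quadratic_loss_expanded(p, actual):
    """The same quantity via the expanded form 1 - 2*p_i + sum_j p_j^2."""
    return 1.0 - 2.0 * p[actual] + sum(pj ** 2 for pj in p)


# Hypothetical prediction for k = 4 classes; class 0 is the actual outcome.
p = [0.4, 0.3, 0.2, 0.1]
print(quadratic_loss(p, 0))           # direct definition
print(quadratic_loss_expanded(p, 0))  # expanded form, equal up to rounding
```

Both calls give about 0.5 here: (1 − 0.4)² = 0.36 for the correct class plus 0.09 + 0.04 + 0.01 for the incorrect ones.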

(2) Informational Loss Function: −log₂ p_i

Here the ith prediction is the correct one, that is, i is the actual class. Because probabilities are never greater than 1, their logarithms are never positive, and the minus sign makes the outcome non-negative.
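As a small illustration, the informational loss can be computed as below. This is a sketch under stated assumptions: the function name is hypothetical, and returning `math.inf` for a zero probability is one possible way to represent the limit of −log₂ p as p approaches 0.

```python
import math


def informational_loss(p, actual):
    """Informational loss for one instance: -log2 of the probability
    assigned to the class that actually occurred."""
    if p[actual] == 0.0:
        # Assumption: represent the unbounded penalty as infinity
        # instead of letting math.log2(0) raise ValueError.
        return math.inf
    return -math.log2(p[actual])


# Hypothetical three-class prediction; class 0 is the actual outcome.
print(informational_loss([0.5, 0.25, 0.25], 0))  # 1.0, i.e. one bit
print(informational_loss([0.5, 0.25, 0.25], 1))  # 2.0, i.e. two bits
```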

There are some objective differences between the two loss functions that may help you choose between them.

The quadratic loss function takes into account not only the probability assigned to the event that actually occurred but also the other probabilities. For example, in a four-class situation, suppose you assigned 40% to the class that actually came up and distributed the remainder among the other three classes. The quadratic loss will depend on how you distributed it, because of the sum of the p_j² that occurs in the expression given earlier for the quadratic loss function. The loss will be smallest if the 60% was distributed evenly among the three classes: an uneven distribution will increase the sum of the squares.

The informational loss function, on the other hand, depends solely on the probability assigned to the class that actually occurred. If you're gambling on a particular event coming up, and it does, who cares about potential winnings from other events?

If you assign a very small probability to the class that actually occurs, the informational loss function will penalize you massively. The maximum penalty, for a zero probability, is infinite. The quadratic loss function, on the other hand, is milder, being bounded by 1 + ∑_j p_j², which can never exceed 2.
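The four-class example can be tried out directly. The sketch below is illustrative only: the function names and the particular ways of splitting the remaining 60% are assumptions, chosen to show that the quadratic loss changes with the split while the informational loss does not.

```python
import math


def quadratic_loss(p, actual):
    """Expanded form of the quadratic loss: 1 - 2*p_i + sum_j p_j^2."""
    return 1.0 - 2.0 * p[actual] + sum(pj ** 2 for pj in p)


def informational_loss(p, actual):
    """-log2 of the probability assigned to the actual class."""
    return math.inf if p[actual] == 0.0 else -math.log2(p[actual])


# Class 0 actually occurred and received 40% in both predictions.
even = [0.4, 0.2, 0.2, 0.2]      # remaining 60% spread evenly
uneven = [0.4, 0.5, 0.05, 0.05]  # remaining 60% spread unevenly

for name, p in [("even", even), ("uneven", uneven)]:
    print(name, quadratic_loss(p, 0), informational_loss(p, 0))

# Worst case: all probability on a wrong class.
worst = [0.0, 1.0, 0.0, 0.0]
print("worst", quadratic_loss(worst, 0), informational_loss(worst, 0))
```

The even split gives a quadratic loss of about 0.48 and the uneven split about 0.615, while the informational loss is the same for both (about 1.32 bits). In the worst case the quadratic loss reaches its bound of 2 and the informational loss is infinite.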
