Two popular criteria used to evaluate probabilistic predictions:
(1) Quadratic [kwɒ'drætɪk] Loss Function: $\sum_j (p_j - a_j)^2$
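Suppose that for a single instance there are $k$ possible outcomes, or classes. Then:
1) The prediction is a probability vector $p_1, p_2, \ldots, p_k$ for the classes (where these probabilities sum to 1).
2) The actual outcome is expressed as a vector $a_1, a_2, \ldots, a_k$ whose $i$th component, where $i$ is the actual class, is 1 and all other components are 0.
3) The quadratic loss then expands to $\sum_j (p_j - a_j)^2 = 1 - 2p_i + \sum_j p_j^2$.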
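The expansion in item 3 can be checked by multiplying out the square. A short derivation, using only the facts that each $a_j$ is 0 or 1 (so $a_j^2 = a_j$) and that exactly one of them, $a_i$, equals 1:

$$
\sum_j (p_j - a_j)^2 = \sum_j p_j^2 - 2\sum_j p_j a_j + \sum_j a_j^2 = \sum_j p_j^2 - 2p_i + 1
$$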
Note that this is for a single instance: the summation is over possible outputs, not over different instances. Just one of the $a_j$ will be 1 and the rest 0, so the sum contains contributions of $p_j^2$ for the incorrect predictions and $(1 - p_i)^2$ for the correct one.
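As a quick numerical check, the following minimal Python sketch computes the quadratic loss for one instance in both forms, the direct sum of squares and the expanded expression. The function names, the probability vector and the class index are hypothetical, chosen only for illustration.

```python
def quadratic_loss(p, actual):
    """Quadratic loss for one instance: sum over classes of (p_j - a_j)^2."""
    # Build the 0/1 vector a: 1 for the actual class, 0 for every other class.
    a = [1.0 if j == actual else 0.0 for j in range(len(p))]
    return sum((pj - aj) ** 2 for pj, aj in zip(p, a))


def quadratic_loss_expanded(p, actual):
    """Equivalent expanded form: 1 - 2*p_i + sum_j p_j^2."""
    return 1.0 - 2.0 * p[actual] + sum(pj ** 2 for pj in p)


p = [0.4, 0.3, 0.2, 0.1]  # hypothetical predicted probabilities (sum to 1)
actual = 0                # hypothetical index of the class that occurred

print(quadratic_loss(p, actual))
print(quadratic_loss_expanded(p, actual))
```

Both calls should agree up to floating-point rounding (about 0.5 for this example): $(1 - 0.4)^2$ from the correct class plus $0.3^2 + 0.2^2 + 0.1^2$ from the incorrect ones.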
(2) Informational Loss Function: $-\log_2 p_i$
1) Here the $i$th prediction is the correct one, that is, $i$ is the class that actually occurred. Because probabilities never exceed 1, their logarithms are never positive, and the minus sign makes the loss non-negative.
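A minimal Python sketch of the informational loss, measured in bits, might look as follows. The function name and the example vector are hypothetical; treating a zero probability as an infinite loss is a choice made here to match the limiting behaviour of the logarithm.

```python
import math


def informational_loss(p, actual):
    """Informational loss in bits: -log2 of the probability given to the actual class."""
    pi = p[actual]
    if pi == 0.0:
        # log2(0) is undefined; the loss grows without bound as p_i approaches 0.
        return math.inf
    return -math.log2(pi)


p = [0.5, 0.25, 0.25]  # hypothetical predicted probabilities

print(informational_loss(p, 0))  # 1.0 bit: the actual class was given probability 1/2
print(informational_loss(p, 1))  # 2.0 bits: the actual class was given probability 1/4
```

Only `p[actual]` is ever read, which makes explicit that the other probabilities play no part in this loss.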
However, there are some objective differences between the two loss functions that may help you form an opinion.
The quadratic loss function takes into account not only the probability assigned to the event that actually occurred but also the other probabilities. For example, in a four-class situation, suppose you assigned 40% to the class that actually came up and distributed the remainder among the other three classes. The quadratic loss will depend on how you distributed it, because of the sum of the $p_j^2$ that occurs in the expression given earlier for the quadratic loss function. The loss will be smallest if the 60% was distributed evenly among the three classes: an uneven distribution will increase the sum of the squares.
The informational loss function, on the other hand, depends solely on the probability assigned to the class that actually occurred. If you're gambling on a particular event coming up, and it does, who cares about potential winnings from other events?
If you assign a very small probability to the class that actually occurs, the informational loss function will penalize you massively. The maximum penalty, for a zero probability, is infinite. The quadratic loss function, on the other hand, is milder, being bounded by $1 + \sum_j p_j^2$ (its value when $p_i = 0$), which can never exceed 2.
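The contrast can be illustrated with a small Python sketch of the four-class example. The two probability vectors are hypothetical: both assign 40% to the actual class, but one spreads the remaining 60% evenly and the other does not. The function names are also illustrative.

```python
import math


def quadratic_loss(p, actual):
    """Quadratic loss in expanded form: 1 - 2*p_i + sum_j p_j^2."""
    return 1.0 - 2.0 * p[actual] + sum(pj ** 2 for pj in p)


def informational_loss(p, actual):
    """Informational loss in bits; infinite when the actual class got probability 0."""
    return math.inf if p[actual] == 0.0 else -math.log2(p[actual])


actual = 0                        # the class that actually came up
even = [0.4, 0.2, 0.2, 0.2]       # remaining 60% spread evenly
uneven = [0.4, 0.5, 0.05, 0.05]   # remaining 60% spread unevenly

# Quadratic loss is lower for the even split than for the uneven one.
print(quadratic_loss(even, actual), quadratic_loss(uneven, actual))

# Informational loss is identical for both, since only p[actual] matters.
print(informational_loss(even, actual), informational_loss(uneven, actual))

# Worst case: zero probability on the actual class, all mass on one wrong class.
worst = [0.0, 1.0, 0.0, 0.0]
print(quadratic_loss(worst, actual))      # reaches the upper bound of 2
print(informational_loss(worst, actual))  # infinite
```

The quadratic loss comes out at roughly 0.48 for the even split and roughly 0.615 for the uneven one, while the informational loss is about 1.32 bits in both cases.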