Reference:
https://en.wikipedia.org/wiki/Cross_entropy
https://d2l.ai/chapter_linear-networks/softmax-regression.html#loss-function
Definition: Cross-Entropy
The cross-entropy of the distribution $q$ relative to a distribution $p$ over a given set is defined as follows:
$$H(p,q)=-E_p[\log q]\tag{1}$$
where $E_p[\cdot]$ is the expected value operator with respect to the distribution $p$.
For discrete probability distributions $p$ and $q$ with the same support $\mathcal X$ this means:
$$H(p,q)=-\sum_{x\in\mathcal X}p(x)\log q(x)\tag{2}$$
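As a quick sanity check of Eq. (2), here is a minimal NumPy sketch; the distributions `p` and `q` below are made-up illustrative values, not taken from the text:

```python
# Minimal sketch of Eq. (2): cross-entropy of two discrete distributions
# with the same support. p and q are illustrative values (assumptions).
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_x p(x) * log q(x); eps guards against log(0)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q + eps))

p = np.array([0.7, 0.2, 0.1])   # "true" distribution p
q = np.array([0.5, 0.3, 0.2])   # estimated distribution q
print(cross_entropy(p, q))       # H(p, q)
print(cross_entropy(p, p))       # reduces to the entropy H(p) when q = p
```

Note that $H(p,q)\ge H(p,p)=H(p)$, with equality only when $q=p$ (Gibbs' inequality).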
The situation for continuous distributions is analogous:
$$H(p,q)=-\int_{\mathcal X}P(x)\log Q(x)\,dr(x)\tag{3}$$
where $P$ and $Q$ are the densities of $p$ and $q$ with respect to a reference measure $r$ on $\mathcal X$.
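Eq. (3) can also be checked numerically. The sketch below assumes two normal densities (chosen purely for illustration) and the Lebesgue measure as the reference measure $r$:

```python
# Rough numerical check of Eq. (3), with the Lebesgue measure as r and
# two normal densities chosen purely for illustration (assumptions).
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

p = norm(loc=0.0, scale=1.0)   # density P(x)
q = norm(loc=1.0, scale=2.0)   # density Q(x)

# H(p, q) = -integral of P(x) * log Q(x) dx, truncated to a wide interval
h_pq, _ = quad(lambda x: -p.pdf(x) * q.logpdf(x), -20, 20)
print(h_pq)
```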
N.B.: The notation $H(p,q)$ is also used for the joint entropy of $p$ and $q$.
Relation to Log-likelihood
In classification problems we want to estimate the probability of different outcomes. Suppose that the entire dataset $\{\mathbf X,\mathbf y\}$ has $N$ samples, where the sample indexed by $i$ consists of a feature vector $\mathbf x^{(i)}$ and a label $y^{(i)}$. Let the estimated probability of outcome $k\in\mathcal K$ be $\hat p(y=k\mid\mathbf x;\mathbf w)$ and let the frequency (empirical probability) of outcome $k$ in the training set be $q(y=k\mid\mathbf x)$. The likelihood of the parameters $\mathbf w$ is
$$L(\mathbf w)=\prod_{k\in\mathcal K}(\text{est. prob. of }k)^{\text{num. of occurrences of }k}\tag{4}$$
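To make the connection concrete, the sketch below fixes a single $\mathbf x$ (so $\hat p$ depends only on the class $k$) and checks numerically that the per-sample negative log-likelihood equals the cross-entropy $H(q,\hat p)$; the labels and probabilities are illustrative assumptions, not values from the text:

```python
# Sketch of the likelihood/cross-entropy link for a single fixed x,
# so \hat p depends only on the class k. Data below are assumptions.
import numpy as np

labels = np.array([0, 0, 1, 2, 0, 1])          # y^{(i)}, K = 3 classes
p_hat  = np.array([0.6, 0.3, 0.1])             # estimated \hat p(y = k)

N = len(labels)
counts = np.bincount(labels, minlength=3)      # num. of occurrences of each k
q = counts / N                                 # empirical frequencies q(y = k)

# Log of Eq. (4): sum_k (num. of occurrences of k) * log(est. prob. of k)
log_likelihood = np.sum(counts * np.log(p_hat))

# Per-sample negative log-likelihood equals the cross-entropy H(q, \hat p)
nll_per_sample = -log_likelihood / N
cross_entropy  = -np.sum(q * np.log(p_hat))
print(nll_per_sample, cross_entropy)           # the two values match
```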