TensorFlow in Practice (1): Definition of Cross Entropy

五道口纳什

Article link: https://blog.csdn.net/lanchunhui/article/details/61413557

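For multi-class classification problems, cross-entropy is the usual choice of loss function. Cross entropy originated in information theory, where it was derived from information entropy (which is related to compression ratio), and it has since been applied in many fields, including communications, error-correcting codes, game theory, and machine learning. For the relationship between cross entropy and information entropy, see "Machine Learning Basics (6): the cross-entropy cost function (cross-entropy error)": http://blog.csdn.net/lanchunhui/article/details/50970625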

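In the definition of the loss function, $y$ is the predicted probability distribution and $y^{'}$ is the true probability distribution (a one-hot encoding in multi-class problems). The cross entropy measures how accurately the model estimates the true distribution.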

$H(y,y')=H_{y'}(y)=-\sum_{i}y'_i\log y_i$

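Here $i$ indexes the classes, that is, the components of the two distributions, so the formula gives the loss for a single sample. Cross entropy can also be used to measure the dissimilarity between two vectors of the same dimension. For binary classification, with true label $y$ and predicted probability $a$, it expands to: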

$H(y,a)=H_y(a)=-\left(y\log a+(1-y)\log (1-a)\right)$
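To make the two formulas concrete, here is a minimal sketch in plain Python (standard library only, not TensorFlow). The function names `cross_entropy` and `binary_cross_entropy` and the sample numbers are illustrative assumptions, not part of the original post. It checks that the binary form is the general form applied to the two-class distributions $[y, 1-y]$ and $[a, 1-a]$.

```python
import math

def cross_entropy(y_true, y_pred):
    """-sum_i y'_i * log(y_i) for one example; i runs over the classes."""
    # Terms with y'_i == 0 contribute nothing, so skip them (also avoids log(0)).
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

def binary_cross_entropy(y, a):
    """-(y*log(a) + (1-y)*log(1-a)) for true label y and predicted probability a."""
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

# Multi-class: one-hot label on the last class, predicted distribution over 3 classes.
multi = cross_entropy([0, 0, 1], [0.1, 0.2, 0.7])  # reduces to -log(0.7)

# Binary: the general formula gives the same value when the two outcomes
# are written as the distributions [y, 1-y] and [a, 1-a].
y, a = 1, 0.8
binary = binary_cross_entropy(y, a)
general = cross_entropy([y, 1 - y], [a, 1 - a])

print(multi, binary, general)
```

With a one-hot label, only the term of the true class survives, so the loss is simply the negative log of the probability assigned to that class.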

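import tensorflow as tf  # TensorFlow 1.x API

# y_: true labels (one-hot); y: predicted probabilities produced by the model
y_ = tf.placeholder(tf.float32, [None, 10])
cross_ent = -tf.reduce_mean(tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))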
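The reduction order in the expression above (sum over the class axis, then mean over the batch) can be sketched without TensorFlow. The snippet below is an illustrative plain-Python equivalent under that reading; `batch_cross_entropy` and the two-example batch are hypothetical, chosen only so the result is easy to check by hand.

```python
import math

def batch_cross_entropy(labels, probs):
    """Mean over examples of the per-example sum -sum_i y'_i * log(y_i)."""
    per_example = [
        -sum(t * math.log(p) for t, p in zip(row_t, row_p) if t > 0)
        for row_t, row_p in zip(labels, probs)
    ]
    return sum(per_example) / len(per_example)

# Hypothetical batch: two examples, three classes.
labels = [[1, 0, 0], [0, 1, 0]]                 # one-hot true labels (y_)
probs = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]]  # predicted distributions (y)

loss = batch_cross_entropy(labels, probs)
print(loss)  # both examples give 0.5 to the true class, so loss = -log(0.5)
```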

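The prediction y is produced by the model (a machine learning or deep learning pipeline), while y_ is a placeholder that is fed with the true labels at run time. With the loss defined, the optimizer can be set up: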

lr = 1e-4
train_step = tf.train.AdamOptimizer(lr).minimize(cross_ent)

1. softmax_cross_entropy_with_logits

https://blog.csdn.net/mao_xiao_feng/article/details/53382790

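softmax_cross_entropy_with_logits applies softmax to its input, then computes the negated sum of y_*tf.log(y) over the class axis for each example. It returns one loss value per example and does not reduce over the batch, so the caller still has to sum or average the result.

Its logits argument is the raw model output before softmax is applied.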

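import tensorflow as tf  # TensorFlow 1.x API

logits = tf.constant([[1, 2, 3], [2, 1, 3], [3, 1, 2]], tf.float64)
y = tf.nn.softmax(logits)  # predicted probabilities
y_ = tf.constant([[0, 0, 1], [0, 1, 0], [0, 0, 1]], tf.float64)  # one-hot labels

# Manual definition: softmax first, then the cross entropy summed over everything
cross_ent = -tf.reduce_sum(y_ * tf.log(y))
# Built-in op: per-example losses from the raw logits, summed over the batch here
cross_ent2 = tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_))

# Both lines print the same value (about 4.2228)
with tf.Session() as sess:
	print('cross_ent: ', sess.run(cross_ent))
	print('cross_ent2: ', sess.run(cross_ent2))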
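
As a cross-check that does not need TensorFlow, the same numbers can be reproduced in plain Python. This is only a sketch of the underlying arithmetic, not TensorFlow's actual implementation: the helper `softmax_cross_entropy` is a hypothetical name, and the log-sum-exp step is a common way to compute log-softmax stably, assumed here for illustration.

```python
import math

def softmax_cross_entropy(logits_row, labels_row):
    """Per-example loss: -sum_i labels_i * log(softmax(logits)_i)."""
    # log-softmax via log-sum-exp: subtract the max before exponentiating.
    m = max(logits_row)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits_row))
    return -sum(t * (z - log_sum_exp) for t, z in zip(labels_row, logits_row))

logits = [[1, 2, 3], [2, 1, 3], [3, 1, 2]]
labels = [[0, 0, 1], [0, 1, 0], [0, 0, 1]]

per_example = [softmax_cross_entropy(z, t) for z, t in zip(logits, labels)]
total = sum(per_example)

print(per_example)  # one loss per example, not yet reduced over the batch
print(total)        # summing over the batch gives the scalar loss
```

The list of per-example losses corresponds to what the built-in op returns before the outer reduce_sum, and the total matches the scalar printed by the TensorFlow snippet.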