Computing cross_entropy in TensorFlow

This article mainly covers
tf.losses.softmax_cross_entropy(),
tf.nn.softmax_cross_entropy_with_logits_v2(),
tf.nn.sparse_softmax_cross_entropy_with_logits(),
and implementing cross_entropy directly from the cross-entropy formula.
 

'''
tf.losses.softmax_cross_entropy(
    onehot_labels,
    logits,
    weights=1.0,
    label_smoothing=0,
    scope=None,
    loss_collection=tf.GraphKeys.LOSSES,
    reduction=Reduction.SUM_BY_NONZERO_WEIGHTS
)

Args:
onehot_labels: One-hot-encoded labels.
logits: Logits outputs of the network.
weights: Optional Tensor that is broadcastable to loss.
label_smoothing: If greater than 0 then smooth the labels.
scope: the scope for the operations performed in computing the loss.
loss_collection: collection to which the loss will be added.
reduction: Type of reduction to apply to loss.  # Note: to pass a non-default value you must import the Reduction class (see the example below).
Returns:
Weighted loss Tensor of the same type as logits. If reduction is NONE, this has shape [batch_size]; otherwise, it is scalar.
# By default this function returns a scalar. To get a [batch_size] list of per-example cross-entropy values instead, pass an explicit reduction argument.


tf.nn.softmax_cross_entropy_with_logits_v2(
    _sentinel=None,
    labels=None,
    logits=None,
    dim=-1,
    name=None
)

Args:
_sentinel: Used to prevent positional parameters. Internal, do not use.
labels: Each vector along the class dimension should hold a valid probability distribution e.g. for the case in which labels 
        are of shape [batch_size, num_classes], each row of labels[i] must be a valid probability distribution.
logits: Unscaled log probabilities.
dim: The class dimension. Defaulted to -1 which is the last dimension.
name: A name for the operation (optional).

Returns:
A Tensor that contains the softmax cross entropy loss. Its type is the same as logits and its shape is the same as labels except 
that it does not have the last dimension of labels.
# By default this function returns a list of per-example cross-entropy values with shape [batch_size]; it does not reduce them to a single scalar for you.
'''


'''
Jumping to the definition in PyCharm lands in the generated gen_nn_ops.py file, which cannot be found on GitHub (it is produced at build time).
tf.losses.softmax_cross_entropy() internally calls softmax_cross_entropy(), which first applies label smoothing and then delegates to
nn.softmax_cross_entropy_with_logits_v2() (a rough sketch of the smoothing step is given right after this block). That function calls
gen_nn_ops.softmax_cross_entropy_with_logits(), which dispatches to the compiled SoftmaxCrossEntropyWithLogits op, whose implementation
is not visible from Python.

So once label smoothing is taken out of the picture, tf.losses.softmax_cross_entropy() and tf.nn.softmax_cross_entropy_with_logits_v2()
process the data identically and differ only in what they return: tf.losses.softmax_cross_entropy() returns a scalar by default,
while tf.nn.softmax_cross_entropy_with_logits_v2() returns a per-example list, i.e. no tf.reduce_mean is applied.

'''
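
As a rough illustration of the call chain described above, the sketch below mimics the label-smoothing step before delegating to the v2 function. The helper name smoothed_softmax_cross_entropy is my own, and the smoothing formula follows the TF 1.x losses implementation as I understand it; exact internals may differ between versions.

import tensorflow as tf

def smoothed_softmax_cross_entropy(onehot_labels, logits, label_smoothing=0.1):
    # Pull the one-hot targets toward a uniform distribution,
    # roughly what tf.losses.softmax_cross_entropy does before delegating.
    num_classes = tf.cast(tf.shape(onehot_labels)[-1], logits.dtype)
    smoothed = onehot_labels * (1.0 - label_smoothing) + label_smoothing / num_classes
    per_example = tf.nn.softmax_cross_entropy_with_logits_v2(labels=smoothed, logits=logits)
    # With unit weights, the default reduction amounts to a mean over the batch.
    return tf.reduce_mean(per_example)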

import tensorflow as tf
from tensorflow.python.ops.losses.losses_impl import Reduction


y_ = tf.constant([i+1 for i in range(6)], shape=[2, 3], dtype=tf.float32)
y = tf.constant([i+2 for i in range(6)], shape=[2, 3], dtype=tf.float32)
z = tf.nn.softmax(tf.clip_by_value(y, 1.5, 4.5))

sess = tf.Session()

#loss function
cross_entropy = -tf.reduce_mean(tf.reduce_sum(y_ * tf.log(z), axis=1))  # note the axis argument here; it is discussed in detail below
print("cross entropy: ", sess.run(cross_entropy))

#cross entropy
cross_entropy1 = tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=tf.clip_by_value(y, 1.5, 4.5))
print("cross entropy1: ", sess.run(cross_entropy1))

#cross entropy
cross_entropy2 = tf.losses.softmax_cross_entropy(onehot_labels=y_, logits=tf.clip_by_value(y, 1.5, 4.5), reduction=Reduction.NONE)
print("cross entropy2: ", sess.run(cross_entropy2))

#loss function
cross_entropy3 = tf.reduce_mean(cross_entropy1)
print("cross entropy3: ", sess.run(cross_entropy3))

#loss function, after tf.reduce_mean()
cross_entropy4 = tf.losses.softmax_cross_entropy(onehot_labels=y_, logits=tf.clip_by_value(y, 1.5, 4.5))
print("cross entropy4: ", sess.run(cross_entropy3))

sess.close()

output:

cross entropy:  11.46241
cross entropy1:  [ 6.4456353 16.479183 ]
cross entropy2:  [ 6.4456353 16.479183 ]
cross entropy3:  11.462409
cross entropy4:  11.462409
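
The next snippet shows how the choice of axis changes the shape of the result:
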
import tensorflow as tf

y_ = tf.constant([i+1 for i in range(6)], shape=[2, 3], dtype=tf.float32)
y = tf.constant([i+2 for i in range(6)], shape=[2, 3], dtype=tf.float32)
z = tf.nn.softmax(tf.clip_by_value(y, 1.5, 4.5))

sess = tf.Session()

#loss function
cross_entropy = -tf.reduce_sum(y_ * tf.log(z), axis=1)
print("cross entropy: ", sess.run(cross_entropy), sess.run(tf.shape(cross_entropy)))

cross_entropy1 = -tf.reduce_sum(y_ * tf.log(z), axis=0)
print("cross entropy1: ", sess.run(cross_entropy1), sess.run(tf.shape(cross_entropy1)))

output:
    cross entropy:  [ 6.445636 16.479183] [2]
    cross entropy1:  [6.8020554 8.308273  7.8144917] [3]

As you can see, with axis=0 the resulting 1-D tensor has three elements (one per column),
while with axis=1 it has two elements (one per row).

In other words, the axis passed to a reduction selects the dimension that gets collapsed: axis=0 sums down the rows, leaving one value per class, while axis=1 sums across each row, leaving one value per example, which is the form a per-example cross entropy needs.
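
A minimal sketch of the same rule, using the 2x3 values from the snippet above (the variable names here are my own):

import tensorflow as tf

vals = tf.constant([i + 1 for i in range(6)], shape=[2, 3], dtype=tf.float32)  # [[1. 2. 3.], [4. 5. 6.]]

sum0 = tf.reduce_sum(vals, axis=0)  # collapses the row dimension -> shape [3]
sum1 = tf.reduce_sum(vals, axis=1)  # collapses the column dimension -> shape [2]

with tf.Session() as sess:
    print(sess.run(sum0))  # [5. 7. 9.]
    print(sess.run(sum1))  # [ 6. 15.]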

'''
In tf.nn.softmax_cross_entropy_with_logits_v2()
and tf.losses.softmax_cross_entropy(),
logits and labels have the same shape, [batch_size, num_classes].

In tf.nn.sparse_softmax_cross_entropy_with_logits(),
logits has shape [batch_size, num_classes]
while labels has shape [batch_size] (integer class indices).
'''
import tensorflow as tf

logits = tf.constant([i+1 for i in range(9)], shape=[3, 3], dtype=tf.float32)

y = tf.nn.softmax(logits)
y = tf.log(y)

y_ = tf.constant([
    [0, 0, 1.0],
    [0, 0, 1.0],
    [0, 0, 1.0]
])

# the following two computations are equivalent
cross_entropy = -tf.reduce_sum(y_ * y, axis=1)

dense_y_ = tf.argmax(y_, 1)  # class indices, shape [batch_size]
cross_entropy1 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=dense_y_, logits=logits)

sess = tf.Session()
print(sess.run(cross_entropy))
print(sess.run(cross_entropy1))

output:

[0.407606   0.407606   0.40760598]
[0.40760595 0.40760595 0.40760595]

Think of logits as raw, unscaled scores. When a function name contains with_logits, its result usually keeps a shape tied to the input logits, e.g. [batch_size], rather than being reduced to a plain scalar.
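
The usual training pattern that follows from this is to reduce the per-example with_logits loss explicitly before handing it to an optimizer. A minimal sketch, with illustrative tensors that are not from the original post:

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.5, 2.5, 0.2]])       # [batch_size, num_classes]
labels = tf.constant([0, 1], dtype=tf.int64)  # [batch_size] class indices

per_example = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)  # shape [batch_size]
loss = tf.reduce_mean(per_example)            # scalar that is actually minimized during training

with tf.Session() as sess:
    print(sess.run(per_example))
    print(sess.run(loss))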

 
