TensorFlow: softmax vs. sigmoid

Softmax activation for multi-class classification & sigmoid activation for binary classification

(1) Multi-class: the probability that a sample belongs to class $k$ (out of $K$ classes) is
$$S_k=\frac{e^{x_k}}{\sum\limits_{i=1}^K e^{x_i}}$$
where $x_k$ is the result of the hidden layer's linear combination for class $k$.
(2) Binary: the probability that a sample belongs to the positive class 1 (classes: positive 1, negative 0) is
$$P(Y=1\mid x)=\frac{1}{1+e^{-h(x)}}=\frac{1}{1+e^{-(\theta^T x+b)}}$$
where $h(x)$ is the result of the hidden layer's linear combination, i.e. $\theta$ are the hidden-layer weights and $b$ is the bias term.
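The two activations are directly related: for $K=2$ with logits $[x, 0]$, the softmax probability of the first class reduces exactly to the sigmoid of $x$. A quick numpy check (numpy used purely for illustration here):

```python
import numpy as np

x = 1.5  # an arbitrary logit
# Two-class softmax over logits [x, 0]
probs = np.exp([x, 0.0]) / np.sum(np.exp([x, 0.0]))
# Sigmoid of the same logit
sig = 1.0 / (1.0 + np.exp(-x))
# e^x / (e^x + 1) == 1 / (1 + e^{-x}), so the two values coincide
print(probs[0], sig)
```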

Usage of a few basic functions

(1) Verifying softmax

import tensorflow as tf
hidden_layer = tf.Variable([1.0,2.0,3.0])
softmax_active_predict = tf.nn.softmax(hidden_layer)
init_op=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    print(sess.run(softmax_active_predict))
#Output: probabilities of one sample belonging to each of the three classes
#[0.09003057 0.24472848 0.66524094]

import numpy as np
aa=np.array([1.0,2.0,3.0])
aa_e=np.exp(aa)
fm=sum(aa_e)
print(aa_e, fm)
for i in aa:
    print(np.exp(i)/fm)
#Output:
#[ 2.71828183  7.3890561  20.08553692] #30.19287485057736
#0.09003057317038046
#0.24472847105479767
#0.6652409557748219
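A side note not in the derivation above: computing $e^{x_k}$ directly can overflow for large logits, so a common trick subtracts the maximum logit first, which leaves the probabilities unchanged. A numpy sketch (the function name is mine):

```python
import numpy as np

def softmax_stable(x):
    # Subtracting the max does not change the ratio e^{x_k} / sum_i e^{x_i},
    # but keeps np.exp from overflowing for large logits.
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

print(softmax_stable(np.array([1.0, 2.0, 3.0])))           # same result as above
print(softmax_stable(np.array([1000.0, 1001.0, 1002.0])))  # no overflow, same probabilities
```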

(2) Verifying sigmoid

import tensorflow as tf
hidden_layer = tf.Variable([1.0,2.0,3.0]) 
sigmoid_active_predict = tf.nn.sigmoid(hidden_layer)
init_op=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    print(sess.run(sigmoid_active_predict))
#Output: probability of each of the three samples belonging to positive class 1
#[0.7310586  0.880797   0.95257413]

import numpy as np
aa=np.array([1.0,2.0,3.0])
for i in aa:
    print(1/(1+np.exp(-i)))
#Output:
#0.7310585786300049
#0.8807970779778823
#0.9525741268224334

(3) Verifying tf.nn.sigmoid_cross_entropy_with_logits
This API fuses the sigmoid and the cross-entropy computation into one op; labels and logits must have the same shape and dtype.

Case 1: binary classification [positive class 1, negative class -1]
Note that this API officially expects labels in $[0,1]$; a label of $-1$ is simply substituted into the usual per-element formula, so with $\sigma$ denoting the sigmoid the reported value is
$$logloss=\frac{1}{N}\sum\limits_{i}\Big[-y^{(i)}_{true}\log\sigma\big(y^{(i)}_{predict}\big)-\big(1-y^{(i)}_{true}\big)\log\Big(1-\sigma\big(y^{(i)}_{predict}\big)\Big)\Big]$$
which is what the numpy check below computes.

import tensorflow as tf
hidden_layer = tf.Variable([[1.0,2.0,3.0]])
y_true = tf.Variable([[-1.0,1.0,1.0]])
diff = tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true, logits=hidden_layer)
cross_entropy = tf.reduce_mean(diff)
#Note: here the positive class is 1 and the negative class is -1
init_op=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    print('real&pred diff:',sess.run(diff))
    print('logloss/cross_entropy:',sess.run(cross_entropy))


import numpy as np
aa=np.array([1.0,2.0,3.0])
label=np.array([-1,1,1])
sig=[]
for i in range(len(aa)):
    sig.append(label[i] * -np.log(1/(1+np.exp(-aa[i]))) + 
               (1-label[i]) * -np.log(1 - 1/(1+np.exp(-aa[i]))))
logloss=np.mean(np.array(sig))
print('sigmoid:',sig)
print('logloss:',logloss)

#real&pred diff: [[2.3132617  0.126928   0.04858735]]
#logloss/cross_entropy: 0.8295924
#sigmoid: [2.3132616875182226, 0.12692801104297263, 0.04858735157374191]
#logloss: 0.8295923500449791
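For reference, TensorFlow's documentation gives an equivalent, numerically stable formulation of this loss, $\max(x,0)-x\cdot z+\log(1+e^{-|x|})$ for logits $x$ and labels $z$. A numpy sketch (the helper name is mine) reproduces the per-element values above:

```python
import numpy as np

def sigmoid_ce_with_logits(labels, logits):
    # Numerically stable form used by TensorFlow:
    # max(x, 0) - x*z + log(1 + exp(-|x|))
    x, z = logits, labels
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

logits = np.array([1.0, 2.0, 3.0])
labels = np.array([-1.0, 1.0, 1.0])
print(sigmoid_ce_with_logits(labels, logits))        # matches [2.3132617 0.126928 0.04858735]
print(sigmoid_ce_with_logits(labels, logits).mean()) # matches 0.8295924
```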

Case 2: binary classification [positive class 1, negative class 0]
$$logloss=-\sum\limits_{i}\Big[y^{(i)}_{true}\log y^{(i)}_{predict}+\big(1-y^{(i)}_{true}\big)\log\big(1-y^{(i)}_{predict}\big)\Big]$$

import tensorflow as tf
hidden_layer = tf.Variable([[1.0,2.0,3.0]])
y_true = tf.Variable([[1.0,0.0,1.0]])
cross_entropy = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true, logits=hidden_layer))
#Note: here the positive class is 1 and the negative class is 0
init_op=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    print('logloss/cross_entropy:',sess.run(cross_entropy))
#Output:
#logloss/cross_entropy: 2.4887772

import numpy as np
h=np.array([1.0,2.0,3.0])
label=np.array([1,0,1])
pred=1/(1+np.exp(-h))
print(pred)
sig=[]
for i in range(len(h)):
    sig.append(label[i]*np.log(pred[i])+(1-label[i])*(np.log(1-pred[i])))
print('logloss:',-sum(sig))
#Output:
#[0.73105858 0.88079708 0.95257413]
#logloss: 2.488777050134936

(4) Verifying tf.nn.softmax_cross_entropy_with_logits
Softmax loss: [a] the output of sample $i$ for class $k$ is $a^{(k)}_i=\frac{e^{h_k}}{\sum\limits_{j=1}^{K} e^{h_j}}$;
[b] the loss over all samples is $logloss=-\sum\limits_{i}\sum\limits_{k} y^{(k)}_i \log\big(a^{(k)}_i\big)$

import tensorflow as tf
hidden_layer = tf.Variable([[1.0,2.0,3.0],[1.0,2.0,3.0],[1.0,2.0,3.0]])   #hidden-layer outputs for three samples, one row per sample, one column per class
y_true = tf.Variable([[0.0,0.0,1.0],[0.0,0.0,1.0],[0.0,0.0,1.0]])
cross_entropy = tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=hidden_layer))
init_op=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    print('logloss/cross_entropy:',sess.run(cross_entropy))
#Output:
#logloss/cross_entropy: 1.2228179

import numpy as np
h=np.array([[1.0,2.0,3.0],[1.0,2.0,3.0],[1.0,2.0,3.0]])
label=np.array([[0.0,0.0,1.0],[0.0,0.0,1.0],[0.0,0.0,1.0]])
soft_max=[]
for i in h:
    temp_sum=sum(np.exp(i))
    soft_max.append(list(np.exp(i)/temp_sum))
print('softmax_matrix:',soft_max)

pred=np.log(np.array(soft_max))
print(-sum(sum(np.multiply(label,pred))))   #element-wise product of the label and log-probability matrices
#Output:
#softmax_matrix: [[0.09003057317038046, 0.24472847105479767, 0.6652409557748219], 
#                 [0.09003057317038046, 0.24472847105479767, 0.6652409557748219], 
#                 [0.09003057317038046, 0.24472847105479767, 0.6652409557748219]]
#1.2228178933331408

(5) tf.placeholder(dtype, shape=None, name=None)
dtype: data type. shape: tensor shape. name: node name.
A placeholder has no initial value; TensorFlow allocates the necessary memory for it. Inside a session, data is passed to a placeholder through feed_dict, a dictionary that maps every placeholder used to its value. When training a neural network on a large dataset, samples must be fed in batch by batch; if each batch were a constant, TensorFlow would add a new node to the graph on every iteration and the graph would grow very large. With a placeholder, every batch fed in reuses the same node.
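A minimal sketch of feeding batches through one placeholder (requires TensorFlow 1.x, matching the sessions used above; the variable names are mine):

```python
import numpy as np
import tensorflow as tf  # TensorFlow 1.x

# One placeholder node is reused for every batch; None allows a variable batch size.
x = tf.placeholder(tf.float32, shape=[None, 3], name='x')
probs = tf.nn.softmax(x)

data = np.random.rand(6, 3).astype(np.float32)
with tf.Session() as sess:
    for batch in np.split(data, 2):
        # feed_dict maps each placeholder needed by the fetched op to its value.
        print(sess.run(probs, feed_dict={x: batch}))
```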
