TensorFlow Study Notes (2): Neural Networks and Fully Connected Layers

红烧黄辣丁

Article link: https://blog.csdn.net/qq_44789094/article/details/105172710

Loading Datasets

(1) keras.datasets

Commonly used datasets: Boston Housing, MNIST / Fashion-MNIST, CIFAR-10/100, IMDB

MNIST

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import  datasets
(x, y), (x_test, y_test) = keras.datasets.mnist.load_data()
x.shape	#(60000, 28, 28)
y.shape	#(60000,)
x.min(), x.max(), x.mean()	#(0, 255, 33.318421449829934)
y[:4]	#array([5, 0, 4, 1], dtype=uint8)
y_onehot = tf.one_hot(y, depth=10)
y_onehot[:2]

#<tf.Tensor: id=646, shape=(2, 10), dtype=float32, numpy=
#array([[0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
#       [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)>

CIFAR10/100

(x, y), (x_test, y_test) = keras.datasets.cifar10.load_data()
x.shape, y.shape, x_test.shape, y_test.shape
#((50000, 32, 32, 3), (50000, 1), (10000, 32, 32, 3), (10000, 1))

x.min(), x.max()	#(0, 255)
y[:4]	
#array([[6],
#       [9],
#       [9],
#       [4]], dtype=uint8)

(2) tf.data.Dataset.from_tensor_slices

Slices a tensor along its first dimension and returns a dataset of N samples.

(x, y), (x_test, y_test) = keras.datasets.cifar10.load_data()
db = tf.data.Dataset.from_tensor_slices(x_test)
next(iter(db)).shape	#TensorShape([32, 32, 3])

# .shuffle: randomize the sample order
db = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db = db.shuffle(10000)

# .map: preprocessing
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import  datasets
(x, y), (x_test, y_test) = keras.datasets.cifar10.load_data()
db = tf.data.Dataset.from_tensor_slices((x_test, y_test))
def preprocess(x, y):
    x = tf.cast(x, dtype=tf.float32)/255
    y = tf.cast(y, dtype=tf.int32)
    y = tf.one_hot(y, depth=10)
    return x,y
db2 = db.map(preprocess)
res = next(iter(db2))
res[0].shape, res[1].shape
#(TensorShape([32, 32, 3]), TensorShape([1, 10]))

# .batch
db3 = db2.batch(32)
res = next(iter(db3))
res[0].shape, res[1].shape
#(TensorShape([32, 32, 32, 3]), TensorShape([32, 1, 10]))

# .repeat: restart the dataset once all of its data has been read
db4 = db3.repeat()     # repeat indefinitely
db4 = db3.repeat(2)    # repeat for 2 passes in total
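
The `.map` output above has label shape [1, 10] because each CIFAR-10 label arrives with shape [1]. If a flat [10] one-hot vector per sample is preferred, one option is to squeeze the label before encoding. The sketch below is only an illustration: it uses small synthetic tensors in place of the real dataset (an assumption, to avoid the download), and the name `preprocess_flat` is made up for this example.

```python
import tensorflow as tf

# Synthetic stand-in for CIFAR-10 (assumption): 8 images with labels of shape [8, 1]
x = tf.zeros([8, 32, 32, 3], dtype=tf.uint8)
y = tf.constant([[6], [9], [9], [4], [1], [1], [2], [7]], dtype=tf.uint8)

def preprocess_flat(x, y):
    x = tf.cast(x, dtype=tf.float32) / 255.0
    y = tf.cast(y, dtype=tf.int32)
    y = tf.squeeze(y, axis=-1)     # per-sample label: shape [1] -> scalar
    y = tf.one_hot(y, depth=10)    # scalar -> [10]
    return x, y

db = tf.data.Dataset.from_tensor_slices((x, y)).map(preprocess_flat).batch(4)
xb, yb = next(iter(db))
print(xb.shape, yb.shape)  # (4, 32, 32, 3) (4, 10)
```

With the squeeze in place, a batch of labels has shape [batch, 10] rather than [batch, 1, 10].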

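Note 1:

batchsize: the batch size. Deep learning models are usually trained with (mini-batch) SGD, where each training step draws batchsize samples from the training set;
iteration: one iteration is one training step on batchsize samples;
epoch: one epoch is one full pass over every sample in the training set. Informally, the number of epochs is how many times the whole dataset is cycled through.

(For example, with 500 training samples and batchsize = 10, one pass over the whole training set takes iteration = 50 and epoch = 1.)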
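
The arithmetic in that example can be written out as a small sketch. The value of `epochs` is a hypothetical choice for illustration; the other two numbers come from the example above.

```python
import math

num_samples = 500   # size of the training set (from the example above)
batch_size = 10
epochs = 3          # hypothetical number of passes, chosen for illustration

# ceil accounts for a final, smaller batch when batch_size does not divide num_samples
iterations_per_epoch = math.ceil(num_samples / batch_size)
total_iterations = iterations_per_epoch * epochs

print(iterations_per_epoch)  # 50
print(total_iterations)      # 150
```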

Note 2: about tf.one_hot

tf.one_hot(
    indices,
    depth,
    on_value=None,
    off_value=None,
    axis=None,
    dtype=None,
    name=None
)
Returns a one-hot tensor.

The locations represented by indices in indices take value on_value, while all other locations take value off_value.

on_value and off_value must have matching data types. If dtype is also provided, they must be the same data type as specified by dtype.

If on_value is not provided, it will default to the value 1 with type dtype.

If off_value is not provided, it will default to the value 0 with type dtype.

If the input indices is rank N, the output will have rank N+1. The new axis is created at dimension axis (default: the new axis is appended at the end).

If indices is a scalar the output shape will be a vector of length depth.

If indices is a vector of length features, the output shape will be:

  features x depth if axis == -1
  depth x features if axis == 0

If indices is a matrix (batch) with shape [batch, features], the output shape will be:

  batch x features x depth if axis == -1
  batch x depth x features if axis == 1
  depth x batch x features if axis == 0
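
To make the shape rule for a vector of indices concrete, here is a pure-Python imitation. It is not the TensorFlow implementation; `one_hot_vector` is a made-up helper that only handles rank-1 input and the two axis values listed above.

```python
def one_hot_vector(indices, depth, axis=-1, on_value=1.0, off_value=0.0):
    """Pure-Python imitation of the shape rule for a rank-1 list of indices."""
    # rows[i][d] is on_value where d == indices[i]; the layout is features x depth
    rows = [[on_value if d == idx else off_value for d in range(depth)]
            for idx in indices]
    if axis == -1:
        return rows
    if axis == 0:
        # transpose to depth x features
        return [list(col) for col in zip(*rows)]
    raise ValueError("this sketch only handles axis == -1 or axis == 0")

last = one_hot_vector([0, 2], depth=3)            # features x depth = 2 x 3
first = one_hot_vector([0, 2], depth=3, axis=0)   # depth x features = 3 x 2
print(last)   # [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(first)  # [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
```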

Fully Connected Layers

Single layer

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import  datasets

x = tf.random.normal([4,784])
net = tf.keras.layers.Dense(512)    #[x,784]->[x,512]
out = net(x)
out.shape, net.kernel.shape, net.bias.shape
#(TensorShape([4, 512]), TensorShape([784, 512]), TensorShape([512]))

net = tf.keras.layers.Dense(10)
net.build(input_shape=(None, 4))
net.kernel.shape, net.bias.shape
#(TensorShape([4, 10]), TensorShape([10]))

net.build(input_shape=(None, 20))
net.kernel.shape, net.bias.shape
#(TensorShape([20, 10]), TensorShape([10]))

net.build(input_shape=(2, 4))
net.kernel.shape, net.bias
#(TensorShape([4, 10]),
# <tf.Variable 'bias:0' shape=(10,) dtype=float32, numpy=array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>)
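
The kernel and bias shapes printed above follow from what a Dense layer with default arguments computes: the input times the kernel, plus the bias, with no activation. A pure-Python sketch of that computation, using a made-up helper `dense_forward` and made-up numbers, shows why the kernel is [in_features, units] and the bias is [units].

```python
def dense_forward(x, kernel, bias):
    """out[i][j] = sum_k x[i][k] * kernel[k][j] + bias[j] (no activation)."""
    return [[sum(xi[k] * kernel[k][j] for k in range(len(kernel))) + bias[j]
             for j in range(len(bias))]
            for xi in x]

x = [[1.0, 2.0, 3.0, 4.0],
     [0.0, 1.0, 0.0, 1.0]]          # batch of 2 samples, 4 features each
kernel = [[0.1, 0.2, 0.3]] * 4      # shape [4, 3]: in_features x units
bias = [0.0, 0.0, 1.0]              # shape [3]: one value per unit

out = dense_forward(x, kernel, bias)
print(len(out), len(out[0]))  # 2 3
```

The batch dimension passes through unchanged, which is why `build` accepts `None` in that position.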

keras.Sequential([layer1, layer2, layer3]): a container for stacking layers

#Sequential
x = tf.random.normal([2, 3])
model = keras.Sequential([
    keras.layers.Dense(2, activation='relu'),
    keras.layers.Dense(2, activation='relu'),
    keras.layers.Dense(2)
])
model.build(input_shape=[None, 3])
model.summary()

for p in model.trainable_variables:
    print(p.name, p.shape)
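
As a rough cross-check on what `model.summary()` should report for this model, the parameter count can be worked out by hand from the kernel and bias shapes. This sketch only does the arithmetic and does not call Keras.

```python
layer_sizes = [3, 2, 2, 2]  # input width followed by the units of each Dense layer

total = 0
for fan_in, units in zip(layer_sizes[:-1], layer_sizes[1:]):
    kernel_params = fan_in * units   # kernel shape [fan_in, units]
    bias_params = units              # bias shape [units]
    total += kernel_params + bias_params
    print(f"kernel [{fan_in}, {units}] + bias [{units}] -> {kernel_params + bias_params} params")

print(total)  # 20
```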

Output Types

(1) $y \in R^d$

- linear regression
- naive classification with MSE
- other general prediction
- $out = relu(X@W + b)$
- logits: the output without an activation function

(2) $y_i \in [0,1]$

- binary classification
  - $y > 0.5 \rightarrow 1$
  - otherwise $\rightarrow 0$
- image generation
  - RGB
- sigmoid function
  - tf.sigmoid
  - $f(x)=\frac{1}{1+e^{-x}}$

a = tf.linspace(-6.,6,10)   # 10 evenly spaced samples over [-6, 6]
tf.sigmoid(a)
x = tf.random.normal([1,28,28])*5
x = tf.sigmoid(x)
tf.reduce_min(x), tf.reduce_max(x)
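
For reference, the sigmoid formula can be evaluated directly in plain Python. This is a minimal sketch for moderate inputs, not a replacement for `tf.sigmoid`.

```python
import math

def sigmoid(x):
    """f(x) = 1 / (1 + e^{-x}); fine for moderate x, as used here."""
    return 1.0 / (1.0 + math.exp(-x))

values = [sigmoid(x) for x in (-6.0, 0.0, 6.0)]
print(values)  # close to 0, exactly 0.5, close to 1
```

Every output lies strictly between 0 and 1, which matches the min and max check in the listing above.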

(3) $y_i \in [0,1], \sum_i y_i = 1$

Note: sigmoid does not make the outputs sum to 1.

softmax

$\displaystyle\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, j=1,...,K.$

a =tf.linspace(-2.,2,5)
tf.nn.softmax(a)

Classification example

logits = tf.random.uniform([1,10], minval=-2, maxval=2)
prob = tf.nn.softmax(logits, axis=1)
tf.reduce_sum(prob, axis=1)
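
The contrast between softmax and sigmoid can be checked with a few lines of plain Python. The sketch subtracts the largest logit before exponentiating, a common stability trick that does not change the result; the helper names are made up for this example.

```python
import math

def softmax(z):
    """sigma(z)_j = e^{z_j} / sum_k e^{z_k}, shifted by max(z) for numerical stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

logits = [-2.0, -1.0, 0.0, 1.0, 2.0]
probs = softmax(logits)
softmax_sum = sum(probs)
sigmoid_sum = sum(sigmoid(v) for v in logits)

print(softmax_sum)   # 1.0 up to rounding
print(sigmoid_sum)   # about 2.5, so not a probability distribution
```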

(4) $y_i \in [-1,1]$

Tanh

$\tanh(x) = \sinh(x)/\cosh(x) = (e^x-e^{-x})/(e^x+e^{-x})$

a =tf.linspace(-2.,2,5)
tf.tanh(a)
#<tf.Tensor: id=520, shape=(5,), dtype=float32, numpy=
#array([-0.9640276, -0.7615942,  0.       ,  0.7615942,  0.9640276],
#      dtype=float32)>

Loss Functions

MSE
- $\frac{1}{N}\sum(y-out)^2$
- $L_{2-norm} = \sqrt{\sum(y-out)^2}$

y = tf.constant([1,2,3,0,2])
y = tf.one_hot(y, depth=4)
y = tf.cast(y , dtype=tf.float32)
out = tf.random.normal([5, 4])
loss1 = tf.reduce_mean(tf.square(y-out))
loss2 = tf.square(tf.norm(y-out))/(5*4)
loss3 = tf.reduce_mean(tf.losses.MSE(y, out))
loss1, loss2, loss3

#(<tf.Tensor: id=593, shape=(), dtype=float32, numpy=1.2126634>,
# <tf.Tensor: id=602, shape=(), dtype=float32, numpy=1.2126634>,
# <tf.Tensor: id=607, shape=(), dtype=float32, numpy=1.2126634>)
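
The three losses agree because they are the same quantity computed three ways: the mean of the squared errors, the squared L2 norm divided by the number of elements, and per-sample means averaged over the batch. A plain-Python sketch of the first two, on made-up numbers:

```python
import math

y = [[0.0, 1.0], [1.0, 0.0]]     # made-up one-hot targets
out = [[0.5, 0.5], [1.0, 0.0]]   # made-up predictions

sq_errors = [(a - b) ** 2
             for row_y, row_o in zip(y, out)
             for a, b in zip(row_y, row_o)]
n = len(sq_errors)

mse = sum(sq_errors) / n               # mean of squared errors
l2_norm = math.sqrt(sum(sq_errors))    # square root of the sum of squared errors
mse_from_norm = l2_norm ** 2 / n       # squaring the norm undoes the square root

print(mse, mse_from_norm)
```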

Cross Entropy
- Entropy
  - uncertainty
  - measure of surprise
  - lower entropy $\rightarrow$ more certainty

$-\sum_{i}P(i)\log P(i)$

a = tf.fill([4], 0.25)
a*tf.math.log(a)/tf.math.log(2.)
-tf.reduce_sum(a*tf.math.log(a)/tf.math.log(2.))	#<tf.Tensor: id=631, shape=(), dtype=float32, numpy=2.0>

a = tf.constant([0.1, 0.1, 0.1, 0.7])
-tf.reduce_sum(a*tf.math.log(a)/tf.math.log(2.))	#<tf.Tensor: id=640, shape=(), dtype=float32, numpy=1.3567796>

a = tf.constant([0.01, 0.01, 0.01, 0.97])
-tf.reduce_sum(a*tf.math.log(a)/tf.math.log(2.))	#<tf.Tensor: id=649, shape=(), dtype=float32, numpy=0.24194068>
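
The same three entropies can be reproduced without TensorFlow. The sketch below uses base-2 logarithms, matching the division by `tf.math.log(2.)` above; `entropy_bits` is a made-up helper name.

```python
import math

def entropy_bits(p):
    """H(P) = -sum_i P(i) * log2 P(i); terms with P(i) == 0 contribute 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

h_uniform = entropy_bits([0.25, 0.25, 0.25, 0.25])
h_skewed = entropy_bits([0.1, 0.1, 0.1, 0.7])
h_peaked = entropy_bits([0.01, 0.01, 0.01, 0.97])

print(h_uniform)  # 2.0
print(h_skewed)   # about 1.357
print(h_peaked)   # about 0.242
```

The more the probability mass concentrates on one outcome, the lower the entropy.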

Cross Entropy

$\displaystyle H(p,q) = -\sum_x p(x)\log q(x)$

$\displaystyle H(p,q) = H(p)+D_{KL}(p\|q)$

* for $p = q$

Minimum: $H(p,q) = H(p)$, since $D_{KL}(p\|q) = 0$

* for $p$: one-hot encoding

$H(p) = -1\log 1 = 0$

$H([0,1,0],[q_0,q_1,q_2]) = 0 + D_{KL}(p\|q) = -1 \log {q_1}$
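
The decomposition into entropy plus KL divergence, and the one-hot special case, can be checked numerically. The distributions below are made up for illustration, and natural logarithms are used throughout.

```python
import math

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.2, 0.5, 0.3]   # made-up distributions
q = [0.1, 0.7, 0.2]
lhs = cross_entropy(p, q)
rhs = entropy(p) + kl_divergence(p, q)      # H(p) + D_KL(p || q)

one_hot = [0.0, 1.0, 0.0]
ce_one_hot = cross_entropy(one_hot, q)      # H(one_hot) = 0, so only -log(q[1]) remains
expected_one_hot = -math.log(q[1])
ce_self = cross_entropy(p, p)               # p == q gives the minimum, H(p)
h_p = entropy(p)

print(lhs, rhs)
print(ce_one_hot, expected_one_hot)
```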

Binary Classification

Two cases; with a single output:
$-P(cat)\log Q(cat) - (1-P(cat))\log (1-Q(cat))$
where $P(dog) = 1 - P(cat)$
$-\sum_{i \in \{cat,dog\}} P(i)\log Q(i)$
$= -P(cat)\log Q(cat)-P(dog)\log Q(dog)$
$= -[y\log (p) + (1-y)\log(1-p)]$, with $y = P(cat)$ and $p = Q(cat)$
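
A short numeric check shows that the single-output form and the two-class sum give the same value. The label and probability are made up, and both helper names are hypothetical.

```python
import math

def binary_cross_entropy(y, p):
    """-[y*log(p) + (1-y)*log(1-p)] for a label y in {0, 1} and 0 < p < 1."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def two_class_cross_entropy(P, Q):
    """-sum_i P(i) log Q(i) over the two classes (cat, dog)."""
    return -sum(pi * math.log(qi) for pi, qi in zip(P, Q) if pi > 0)

y, p = 1, 0.9   # made-up example: true class is cat, predicted Q(cat) = 0.9
single = binary_cross_entropy(y, p)
double = two_class_cross_entropy([y, 1 - y], [p, 1 - p])

print(single, double)
```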

tf.losses.categorical_crossentropy([0,1,0,0], [0.25,0.25,0.25,0.25])
#<tf.Tensor: id=666, shape=(), dtype=float32, numpy=1.3862944>

tf.losses.categorical_crossentropy([0,1,0,0], [0.1,0.1,0.7,0.1])
#<tf.Tensor: id=700, shape=(), dtype=float32, numpy=2.3025851>

tf.losses.categorical_crossentropy([0,1,0,0], [0.1,0.7,0.1,0.1])
#<tf.Tensor: id=751, shape=(), dtype=float32, numpy=0.35667497>

tf.losses.categorical_crossentropy([0,1,0,0], [0.01,0.97,0.01,0.01])
#<tf.Tensor: id=819, shape=(), dtype=float32, numpy=0.030459179>
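
With a one-hot target, each of these losses reduces to the negative log of the probability assigned to the true class. The sketch below recomputes the four values in plain Python; small differences from the TensorFlow outputs in the last digits are expected because of float32 precision.

```python
import math

y_true = [0, 1, 0, 0]
predictions = [
    [0.25, 0.25, 0.25, 0.25],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.01, 0.97, 0.01, 0.01],
]

true_class = y_true.index(1)
losses = [-math.log(q[true_class]) for q in predictions]

for q, loss in zip(predictions, losses):
    print(q, round(loss, 7))
```

The loss is largest when the model puts most of its mass on a wrong class and approaches 0 as the true-class probability approaches 1.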

Hinge Loss

$\sum_{i}\max(0,1-y_i \cdot h_\theta(x_i))$
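
A minimal sketch of the hinge loss formula follows. It assumes labels in {-1, +1}, the usual convention for this loss, which the notes do not state explicitly. The labels and scores are made up.

```python
def hinge_loss(labels, scores):
    """sum_i max(0, 1 - y_i * h(x_i)), assuming labels y_i in {-1, +1}."""
    return sum(max(0.0, 1.0 - y * s) for y, s in zip(labels, scores))

labels = [+1, -1, +1]          # made-up labels
scores = [2.0, -0.5, -1.0]     # made-up classifier outputs h_theta(x_i)
loss = hinge_loss(labels, scores)
print(loss)  # 2.5
```

The first sample is classified with a margin above 1 and contributes nothing; the second is correct but inside the margin, and the third is misclassified.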
