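Earlier posts built a simple neural network using only Python and NumPy: forward propagation, backpropagation, chain-rule differentiation, and so on. That exercise helps in understanding the underlying ideas, but it is not how networks are built in engineering practice, where a framework such as TensorFlow is used instead. The code in this post uses the TensorFlow 1.x graph API (tf.Session, tf.placeholder).
Some basic concepts
1. Constants: tf.constant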
```python
import tensorflow as tf

a1 = tf.constant([1, 2, 3], tf.int32, name="a1")
print(a1)        # prints: Tensor("a1:0", shape=(3,), dtype=int32)
print(type(a1))  # prints: <class 'tensorflow.python.framework.ops.Tensor'>

# A constant tensor filled with zeros
tf.zeros([3, 2], tf.float32)  # shape 3 x 2
```
Once a constant is defined, its value cannot be changed.
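2. Variables: tf.Variable and tf.Session()
```python
# Continues from the snippet above (uses tf and a1).
# The initial value is cast to float32 so that the variable has dtype float32.
v1 = tf.Variable(tf.cast(a1 ** 2, tf.float32), name="v1")
init = tf.global_variables_initializer()
with tf.Session() as session:
    session.run(init)       # variables must be initialized before they are read
    print(session.run(v1))  # evaluate the variable and print its value
```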
3. Placeholders: tf.placeholder()
X = tf.placeholder(dtype=tf.float32, shape=[144, 10], name='X')
```python
sess = tf.Session()
x = tf.placeholder(tf.int64, name='x')  # define a placeholder
# Supply the placeholder's value through the feed_dict dictionary
print(sess.run(2 * x, feed_dict={x: 3}))  # prints 6
sess.close()
```
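4. Differences between variables and placeholders
1. A tf.Variable must be given an initial value when it is declared.
A tf.placeholder needs no initial value; its value is supplied through the dictionary passed to Session.run. A placeholder can be thought of as a channel, or a formal parameter, through which data is passed in.
2. In practice, placeholders usually carry the training samples, while variables usually hold values that persist across runs, such as weights and biases.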
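Other commonly used functions
TensorFlow has a rich feature set; consult its API documentation for the details. Below is a short list of commonly used functions.
1. Matrix multiplication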
tf.matmul(W,X)
2. Addition
tf.add(x, y)
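To make the shapes concrete, here is a minimal NumPy sketch of the linear step that tf.add(tf.matmul(W, X), b) expresses. The shapes (4 units, 3 features, 5 samples) are made up for illustration, and samples are assumed to be stored one per column, as in the network code later in this post.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))  # weights: 4 units, 3 input features
X = rng.standard_normal((3, 5))  # inputs: 3 features, 5 samples (one per column)
b = np.ones((4, 1))              # bias: one value per unit

# np.matmul plays the role of tf.matmul(W, X); the addition broadcasts
# the (4, 1) bias across all 5 columns, as tf.add does.
Z = np.matmul(W, X) + b
print(Z.shape)  # (4, 5)
```

The bias is kept as a column vector so that broadcasting adds the same bias to every sample.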
3. Activation functions and related tf.nn operations
- tf.nn.relu(features, name=None)
- tf.nn.relu6(features, name=None)
- tf.nn.softplus(features, name=None)
- tf.nn.dropout(x, keep_prob, noise_shape=None, seed=None, name=None)
- tf.nn.bias_add(value, bias, name=None)
- tf.sigmoid(x, name=None)
- tf.tanh(x, name=None)
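As a rough guide to what the activation functions in this list compute, the following NumPy sketch writes out their standard mathematical definitions. It is an illustration only, not TensorFlow's implementation; tf.nn.dropout and tf.nn.bias_add are left out because they are not activations.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def relu6(x):
    # ReLU capped at 6
    return np.minimum(np.maximum(x, 0.0), 6.0)

def softplus(x):
    # log(1 + exp(x)), computed with logaddexp for numerical stability
    return np.logaddexp(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 3.0, 8.0])
print(relu(x))      # [0. 0. 3. 8.]
print(relu6(x))     # [0. 0. 3. 6.]
print(softplus(x))
print(sigmoid(x))
print(np.tanh(x))   # tanh is available directly in NumPy
```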
4. Cross-entropy loss function
tf.nn.sigmoid_cross_entropy_with_logits(_sentinel=None, labels=None, logits=None, name=None)
For each label y^(i) and logit z^(i), it computes:
$$ -\left( y^{(i)} \log \sigma(z^{(i)}) + (1-y^{(i)}) \log\left(1-\sigma(z^{(i)})\right) \right) \tag{2} $$
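The loss in equation (2) can be checked numerically with a short NumPy sketch. The first function transcribes the formula directly; the second uses the algebraically equivalent form max(z, 0) - z*y + log(1 + exp(-|z|)), which the TensorFlow documentation gives as a numerically stable rearrangement. The function names and sample values are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_cross_entropy(labels, logits):
    # Direct transcription: -(y*log(s) + (1-y)*log(1-s)), with s = sigmoid(z)
    s = sigmoid(logits)
    return -(labels * np.log(s) + (1.0 - labels) * np.log(1.0 - s))

def sigmoid_cross_entropy_stable(labels, logits):
    # Equivalent form that avoids overflow in exp for large |z|
    return (np.maximum(logits, 0.0) - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))

y = np.array([1.0, 0.0, 1.0, 0.0])
z = np.array([2.0, -1.0, -3.0, 0.5])
print(sigmoid_cross_entropy(y, z))
print(sigmoid_cross_entropy_stable(y, z))
```

Both functions return one loss value per element; averaging them (for example with np.mean) gives a scalar cost.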
For other loss functions, see the API documentation.
5. One-hot encoding
tf.one_hot(labels, depth, axis)
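The effect of axis=0 is easiest to see on a small example. The NumPy sketch below builds the same layout that tf.one_hot(labels, depth, axis=0) produces, with one row per class and one column per sample; one_hot_columns is a hypothetical helper name used only for this illustration.

```python
import numpy as np

def one_hot_columns(labels, depth):
    labels = np.asarray(labels)
    # Compare a column of class ids (depth x 1) with a row of labels (1 x m)
    return (np.arange(depth)[:, None] == labels[None, :]).astype(np.float32)

encoded = one_hot_columns([1, 3, 0, 2, 1], depth=4)
print(encoded.shape)  # (4, 5)
print(encoded)
```

Each column contains a single 1, placed in the row whose index equals that sample's label.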
Code example
Defining a simple neural network with TensorFlow
```python
import math
import numpy as np
import h5py
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.python.framework import ops
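
# Cost function
def compute_cost(Z3, Y):
    """
    Computes the cost.

    Z3 -- output of the output layer's linear unit, one column per sample
    Y -- labels (one-hot encoded), same shape as Z3

    Returns:
    cost -- Tensor of the cost function
    """
    # softmax_cross_entropy_with_logits expects one sample per row, hence the transpose
    logits = tf.transpose(Z3)
    labels = tf.transpose(Y)
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
    return cost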

# One-hot encode the sample labels
def one_hot_matrix(labels, num_classes):
    """
    labels: 1 x m, e.g. [1,3,0,2,1] with num_classes = 4
    Encoded: 0 0 1 0 0
             1 0 0 0 1
             0 0 0 1 0
             0 1 0 0 0
    The layout of the result is determined by axis in tf.one_hot(labels, depth, axis).
    """
    num_classes = tf.constant(num_classes, name="num_classes")
    one_hot_op = tf.one_hot(labels, num_classes, axis=0)
    sess = tf.Session()
    # Evaluate the one-hot encoding
    one_hot = sess.run(one_hot_op)
    sess.close()
    return one_hot
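
def create_placeholders(n_x, n_y):
    """
    Create the input and output placeholders of the network.
    The second dimension is None so that any number of samples can be fed.
    """
    X = tf.placeholder(tf.float32, shape=(n_x, None), name="X")
    Y = tf.placeholder(tf.float32, shape=(n_y, None), name="Y")
    return X, Y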
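
# Forward propagation
def forward_propagation(X, parameters):
    """
    A network with two hidden layers and one output layer.
    The hidden layers use the ReLU activation: tf.nn.relu().
    The output layer feeds a softmax classifier; this function returns the
    linear output Z3, and the softmax is applied inside compute_cost.
    """
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']
    W3 = parameters['W3']
    b3 = parameters['b3']
    Z1 = tf.add(tf.matmul(W1, X), b1)
    A1 = tf.nn.relu(Z1)
    Z2 = tf.add(tf.matmul(W2, A1), b2)
    A2 = tf.nn.relu(Z2)
    Z3 = tf.add(tf.matmul(W3, A2), b3)
    return Z3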

# Initialize the parameters
def initialize_parameters():
    """
    Initialize the parameters. Their shapes depend on the network structure
    and on the dimensions of the data; the values here are only an example.
    """
    tf.set_random_seed(1)  # seed
    W1 = tf.get_variable("W1", [25, 12288], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    b1 = tf.get_variable("b1", [25, 1], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    W2 = tf.get_variable("W2", [12, 25], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    b2 = tf.get_variable("b2", [12, 1], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    W3 = tf.get_variable("W3", [6, 12], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    b3 = tf.get_variable("b3", [6, 1], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2,
                  "W3": W3,
                  "b3": b3}
    return parameters

def model(X_train, Y_train, X_test, Y_test, learning_rate=0.0001,
          num_epochs=3500, minibatch_size=32, print_cost=True):
    """
    Build and train the model.
    """
    ops.reset_default_graph()
    tf.set_random_seed(1)
    seed = 2
    (n_x, m) = X_train.shape  # n_x: feature dimension, m: number of training samples
    n_y = Y_train.shape[0]
    costs = []  # cost history, used for plotting

    # Define the placeholders
    X, Y = create_placeholders(n_x, n_y)
    parameters = initialize_parameters()  # initialize the parameters
    Z3 = forward_propagation(X, parameters)
    # Cost function node in the computation graph
    cost = compute_cost(Z3, Y)
    # Define the optimizer
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
    # Op that initializes all variables
    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)
        # Training loop
        for epoch in range(num_epochs):
            epoch_cost = 0.  # cost accumulated over this epoch
            num_minibatches = int(m / minibatch_size)  # number of mini-batches in the training set
            seed = seed + 1
            # random_mini_batches is a helper assumed to be available; it returns
            # a list of (minibatch_X, minibatch_Y) tuples.
            minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)
            for minibatch in minibatches:
                (minibatch_X, minibatch_Y) = minibatch
                _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})
                epoch_cost += minibatch_cost / num_minibatches
            if print_cost and epoch % 100 == 0:
                print("Cost after epoch %i: %f" % (epoch, epoch_cost))
            if print_cost and epoch % 5 == 0:
                costs.append(epoch_cost)

        # Plot the cost curve (one point every 5 epochs)
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('iterations (per fives)')
        plt.title("Learning rate =" + str(learning_rate))
        plt.show()

        parameters = sess.run(parameters)  # fetch the trained parameters as numpy.ndarray
        print("Parameters have been trained!")
        # tf.argmax uses axis 0 by default, which is the class axis here
        correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        print("Train Accuracy:", accuracy.eval({X: X_train, Y: Y_train}))
        print("Test Accuracy:", accuracy.eval({X: X_test, Y: Y_test}))
        return parameters
```
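The training loop calls random_mini_batches, whose definition does not appear in the code above. The following is one possible NumPy sketch of such a helper, not the original implementation. It assumes samples are stored one per column, that the seed argument controls the shuffle, and that the last mini-batch may be smaller than the others.

```python
import numpy as np

def random_mini_batches(X, Y, mini_batch_size=64, seed=0):
    """Shuffle the columns of X and Y together, then split them into mini-batches."""
    m = X.shape[1]                       # number of samples (columns)
    rng = np.random.RandomState(seed)
    permutation = rng.permutation(m)     # the same permutation is applied to X and Y
    shuffled_X = X[:, permutation]
    shuffled_Y = Y[:, permutation]

    mini_batches = []
    for start in range(0, m, mini_batch_size):
        end = start + mini_batch_size    # slicing clips the last, possibly smaller, batch
        mini_batches.append((shuffled_X[:, start:end], shuffled_Y[:, start:end]))
    return mini_batches

X_demo = np.arange(20, dtype=np.float32).reshape(2, 10)
Y_demo = np.arange(10, dtype=np.float32).reshape(1, 10)
batches = random_mini_batches(X_demo, Y_demo, mini_batch_size=4, seed=3)
print([bx.shape for bx, _ in batches])  # [(2, 4), (2, 4), (2, 2)]
```

With this sketch, a sample count that is not a multiple of the batch size yields one more batch than int(m / minibatch_size) counts, so the averaged epoch cost in the training loop would be slightly overstated in that case.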