TensorFlow Beginner Notes

Latest recommended article published on 2020-12-29 10:05:42

Initializer_nj


Tags: TensorFlow python

Article link: https://blog.csdn.net/baidu_29604413/article/details/79398411


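This is a tutorial from Coursera. I found it a very friendly way to start learning TensorFlow, so I worked through it again myself and translated it along the way. I am only a beginner posting this to learn and exchange ideas, so experts, please go easy on me.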

1. TensorFlow Library

First we import the TensorFlow libraries. This section covers:

Initializing data
Starting a Session
Training algorithms
Building a neural network

import math
import numpy as np
import h5py
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.python.framework import ops
from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot, predict

%matplotlib inline
np.random.seed(1)

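The basic steps for writing a TensorFlow program are:

Create variables (tensors)
Write the operations on those variables
Initialize the variables
Create a Session
Run the Session

For example, how would we implement a loss function from deep learning?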

y_hat = tf.constant(36, name='y_hat')            # set y_hat to 36
y = tf.constant(39, name='y')                    # set y to 39

loss = tf.Variable((y - y_hat)**2, name='loss')  # loss function

init = tf.global_variables_initializer()         # add a node that initializes all variables

with tf.Session() as session:                    # create a Session and print the result
    session.run(init)                            # initialize the variables
    print(session.run(loss))                     # print the loss

The result is 9, since (39 - 36)**2 = 9.

To better understand what a Session does, let's look at another example:

a = tf.constant(2)
b = tf.constant(10)
c = tf.multiply(a,b)
print(c)

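Is the output 20? Unfortunately not. The output is

Tensor("Mul:0", shape=(), dtype=int32)

At this point the program has only built the computation graph; nothing has been run yet.

Now we add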

sess = tf.Session()
print(sess.run(c))

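Result: 20

Of course, before running a Session, never forget to initialize your variables (anything created with tf.Variable). This is important.

TensorFlow offers two ways to create a Session:

Method 1: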

sess = tf.Session()
result = sess.run(..., feed_dict = {...})
sess.close() # Close the session

Method 2:

with tf.Session() as sess: 
    result = sess.run(..., feed_dict = {...})
    # no need to close the session here
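
The difference between the two methods comes down to Python's context-manager protocol. The sketch below is not TensorFlow code: `ToySession` is a hypothetical stand-in class, used only to illustrate why the `with` form needs no explicit close().

```python
class ToySession:
    """Hypothetical stand-in for a session; not part of TensorFlow."""

    def __init__(self):
        self.closed = False

    def run(self, value):
        if self.closed:
            raise RuntimeError("session is already closed")
        return value

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called automatically when the with-block ends, even on an exception
        self.close()
        return False  # do not swallow exceptions


# Method 1: close explicitly
sess1 = ToySession()
result1 = sess1.run(20)
sess1.close()

# Method 2: the with-block closes the session for us
with ToySession() as sess2:
    result2 = sess2.run(20)
```

Method 2 is usually the safer habit, because the session is released even if the code inside the block raises.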

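Next comes an important concept: the placeholder, which is used to pass in training examples. As the name suggests, it holds a place for a value: it needs no initial value, and the value is supplied at run time through the feed_dict argument of Session.run(). See tf.placeholder().

Here is an example: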

x = tf.placeholder(tf.int64, name = 'x')
print(sess.run(2 * x, feed_dict = {x: 3}))
sess.close()

Result: 6
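
To make the "build first, run later" idea and the role of feed_dict more concrete, here is a toy pure-Python sketch. It is not how TensorFlow is implemented; `Node`, `Const`, `Placeholder`, `Mul`, `as_node`, and `run` are all hypothetical names invented for this illustration.

```python
class Node:
    """Hypothetical graph node; building one computes nothing."""

    def __mul__(self, other):
        return Mul(self, as_node(other))

    def __rmul__(self, other):
        return Mul(as_node(other), self)


class Const(Node):
    def __init__(self, value):
        self.value = value


class Placeholder(Node):
    def __init__(self, name):
        self.name = name


class Mul(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right


def as_node(x):
    """Wrap plain Python numbers so they can take part in the graph."""
    return x if isinstance(x, Node) else Const(x)


def run(node, feed_dict=None):
    """Evaluate a node, reading placeholder values from feed_dict."""
    feed_dict = feed_dict or {}
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise ValueError("no value fed for placeholder " + node.name)
        return feed_dict[node]
    if isinstance(node, Mul):
        return run(node.left, feed_dict) * run(node.right, feed_dict)
    raise TypeError("unknown node type")


c = Const(2) * Const(10)   # builds a Mul node; no multiplication happens yet
x = Placeholder("x")
doubled = 2 * x            # falls back to Node.__rmul__, again only building a node

product = run(c)                       # the value is computed only here
fed = run(doubled, feed_dict={x: 3})   # the placeholder gets its value at run time
print(product, fed)
```

Printing `c` itself would show only an object description, much like the Tensor("Mul:0", ...) output earlier; the numbers appear only once `run` is called.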

1.1 Linear Function

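Let's write a small program using the linear function Y = WX + b.

Note that the weight W has shape (4, 3), X has shape (3, 1), and b has shape (4, 1). For details on tf.matmul(), tf.add(), and np.random.randn(), see their documentation.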

def linear_function():    
    np.random.seed(1)
 
    X = tf.constant(np.random.randn(3, 1), name = 'X')
    W = tf.constant(np.random.randn(4, 3), name = 'W')
    b = tf.constant(np.random.randn(4, 1), name = 'b')
    Y = tf.add(tf.matmul(W,X), b)
   
    sess = tf.Session()
    result = sess.run(Y)

    sess.close()

    return result

print( "result = " + str(linear_function()))

Running it gives:

result = [[-2.15657382]
          [ 2.95891446]
          [-1.08926781]
          [-0.84538042]]
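
As a cross-check, the same computation can be sketched in plain NumPy without a Session. This assumes the arrays are drawn in the same order as in linear_function() above (X, then W, then b) with the same seed, in which case the result should agree with the printed values up to floating-point rounding.

```python
import numpy as np

np.random.seed(1)

# Draw the arrays in the same order as linear_function()
X = np.random.randn(3, 1)
W = np.random.randn(4, 3)
b = np.random.randn(4, 1)

Y = np.dot(W, X) + b   # (4, 3) x (3, 1) + (4, 1) -> (4, 1)
print("result = " + str(Y))
```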

Next, we look at a few more functions commonly used in deep learning to get more familiar with TensorFlow.

1.2 Sigmoid

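In short, the sigmoid function is used in neural networks as an activation function that maps a variable into the interval (0, 1).

Note that TensorFlow already provides functions such as tf.sigmoid() and tf.nn.softmax(); we wrap one here only to practice using TensorFlow.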

def sigmoid(z):

    x = tf.placeholder(tf.float32, name = 'x')

    sigmoid = tf.sigmoid(x)

    with tf.Session() as sess:
        result = sess.run(sigmoid, feed_dict = {x: z})
   
    return result
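
For comparison, here is a minimal NumPy sketch of the same function, 1 / (1 + exp(-z)). The name `sigmoid_np` is made up for this example and is not part of the original tutorial.

```python
import numpy as np

def sigmoid_np(z):
    """Logistic sigmoid 1 / (1 + exp(-z)), computed with NumPy."""
    z = np.asarray(z, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid_np(0))                    # the midpoint of the curve
print(sigmoid_np([-5.0, 0.0, 12.0]))    # every output lies strictly between 0 and 1
```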

1.3 Cost Function

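The cost function for logistic regression is

$$J = -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\log a^{(i)} + (1-y^{(i)})\log\left(1-a^{(i)}\right)\right)$$

where a = sigmoid(z), so it can be written as

$$J = -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\log \sigma(z^{(i)}) + (1-y^{(i)})\log\left(1-\sigma(z^{(i)})\right)\right)$$

If the meaning of the cost function is unclear, review an introduction to logistic regression first.

The code is as follows: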

def cost(logits, labels):
    """
    logits -- vector z, the output of the last linear unit (before the final sigmoid)
    labels -- vector of labels y
    """
   
    z = tf.placeholder(tf.float32, name = 'z')
    y = tf.placeholder(tf.float32, name = 'y')
    cost = tf.nn.sigmoid_cross_entropy_with_logits(logits = z, labels = y)

    sess = tf.Session()
    cost = sess.run(cost, feed_dict = {z: logits, y: labels})
    sess.close()

    
    return cost
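
Note that tf.nn.sigmoid_cross_entropy_with_logits() returns one loss per element and does not average them. The NumPy sketch below reproduces that element-wise loss using the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)); the function name `sigmoid_cross_entropy_np` and the sample inputs are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid_cross_entropy_np(logits, labels):
    """Element-wise sigmoid cross-entropy computed directly from the logits z."""
    z = np.asarray(logits, dtype=np.float64)
    y = np.asarray(labels, dtype=np.float64)
    # Stable rewrite of -(y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z)))
    return np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z)))

logits = np.array([0.2, 0.4, 0.7, 0.9])
labels = np.array([0.0, 0.0, 1.0, 1.0])

losses = sigmoid_cross_entropy_np(logits, labels)
print(losses)          # one loss per example
print(losses.mean())   # averaging gives the logistic-regression cost
```

Working from the logits rather than from sigmoid(z) avoids taking log of a value that has rounded to 0 for large |z|.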

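2. Building a Neural Network with TensorFlow

2.0 Goal

We train a neural network that recognizes hand signs. The 6 classes, the digits 0 to 5, are shown in the figure below:

The training set contains 1080 images of 64 x 64 pixels, 180 images per digit

The test set contains 120 images of 64 x 64 pixels, 20 images per digit

Note that every image in the dataset already carries a label, which is the digit the image represents

First, load the dataset: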

X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()

Change the value of index to view other images in the dataset

index = 0
plt.imshow(X_train_orig[index])
print ("y = " + str(np.squeeze(Y_train_orig[:, index])))

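Next, we flatten the image dataset, normalize it, and convert each label into a one-hot vector:

A one-hot vector has a 1 at the index of the class and 0 everywhere else.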

# Flatten the training and test sets
X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T
X_test_flatten = X_test_orig.reshape(X_test_orig.shape[0], -1).T
# Normalize, i.e. divide by 255
X_train = X_train_flatten/255.
X_test = X_test_flatten/255.
# Convert each label into a one-hot vector
Y_train = convert_to_one_hot(Y_train_orig, 6)
Y_test = convert_to_one_hot(Y_test_orig, 6)
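
To see what these three steps do to the array shapes, here is a small NumPy sketch on made-up data. The random "images" stand in for the real dataset, and `one_hot` is a hypothetical helper written for this example; the actual convert_to_one_hot in tf_utils may be implemented differently.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical helper: (1, m) integer labels -> (num_classes, m) one-hot matrix."""
    return np.eye(num_classes)[labels.reshape(-1)].T

# Toy stand-in for the real dataset: 5 random "images" of 64 x 64 x 3
rng = np.random.RandomState(0)
images = rng.randint(0, 256, size=(5, 64, 64, 3))
labels = np.array([[1, 2, 3, 0, 5]])

flat = images.reshape(images.shape[0], -1).T   # (12288, 5): one column per image
scaled = flat / 255.                           # pixel values now in [0, 1]
Y = one_hot(labels, 6)                         # (6, 5): one column per label

print(flat.shape, scaled.min(), scaled.max(), Y.shape)
```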

Before going further, note that the layers of the network are:

LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX

Taking a single layer as an example, one neuron can be pictured as:

2.1 placeholders

Same idea as before.

def create_placeholders(n_x, n_y):

    # n_x -- size of a flattened image, 64 * 64 * 3 = 12288
    # n_y -- number of classes (0, 1, 2, 3, 4, 5), so the size is 6

    # Y -- placeholder for the labels, shape [n_y, None], dtype "float"
    # X -- placeholder for the input data, shape [n_x, None], dtype "float"
    # None leaves the number of examples flexible

    Y = tf.placeholder(tf.float32, shape = (n_y, None), name = 'PlaceHolder_2')
    X = tf.placeholder(tf.float32, shape = (n_x, None), name = 'PlaceHolder_1')

    return X, Y
 
2.2 Initializing the Parameters

def initialize_parameters():

    tf.set_random_seed(1)  # only makes the random numbers reproducible across runs; optional

    W1 = tf.get_variable('W1', [25, 12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b1 = tf.get_variable('b1', [25, 1], initializer = tf.zeros_initializer())
    W2 = tf.get_variable('W2', [12, 25], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b2 = tf.get_variable('b2', [12, 1], initializer = tf.zeros_initializer())
    W3 = tf.get_variable('W3', [6, 12], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b3 = tf.get_variable('b3', [6, 1], initializer = tf.zeros_initializer())

    parameters = {"W1": W1, "b1": b1, "W2": W2, "b2": b2, "W3": W3, "b3": b3}

    return parameters
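
For intuition about what the Xavier initializer does, here is a NumPy sketch assuming its uniform variant, which draws weights from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)). The helper `xavier_uniform` is a made-up name, and the values it produces will not match TensorFlow's for the same seed; only the shapes and the value range are comparable.

```python
import numpy as np

def xavier_uniform(shape, rng):
    """Glorot/Xavier uniform: U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out))."""
    fan_a, fan_b = shape
    limit = np.sqrt(6.0 / (fan_a + fan_b))
    return rng.uniform(-limit, limit, size=shape)

rng = np.random.RandomState(1)
layer_shapes = {"1": (25, 12288), "2": (12, 25), "3": (6, 12)}

params = {}
for idx, shape in layer_shapes.items():
    params["W" + idx] = xavier_uniform(shape, rng)   # small random weights
    params["b" + idx] = np.zeros((shape[0], 1))      # biases start at zero

for name in sorted(params):
    print(name, params[name].shape)
```

The limit shrinks as layers get wider, which keeps the scale of activations roughly stable from layer to layer.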

2.3 Forward Propagation in TensorFlow

def forward_propagation(X, parameters):
    """
    LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX
    X -- input dataset, shape (size of one example, number of examples)
    parameters -- dictionary containing "W1", "b1", "W2", "b2", "W3", "b3"
    Z3 -- output of the last LINEAR unit
    """
    
    # Read the parameters from the dictionary
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']
    W3 = parameters['W3']
    b3 = parameters['b3']

    Z1 = tf.add(tf.matmul(W1, X), b1)   # Z1 = np.dot(W1, X) + b1
    A1 = tf.nn.relu(Z1)                 # A1 = relu(Z1)
    Z2 = tf.add(tf.matmul(W2, A1), b2)  # Z2 = np.dot(W2, A1) + b2
    A2 = tf.nn.relu(Z2)                 # A2 = relu(Z2)
    Z3 = tf.add(tf.matmul(W3, A2), b3)  # Z3 = np.dot(W3, A2) + b3

    return Z3
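
The same forward pass can be sketched in NumPy to check the shapes at each layer. The random toy parameters and the input `X_toy` below are assumptions for illustration only; they are not trained weights or real images.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

def forward_propagation_np(X, parameters):
    """LINEAR -> RELU -> LINEAR -> RELU -> LINEAR, returning Z3 before softmax."""
    Z1 = np.dot(parameters["W1"], X) + parameters["b1"]   # (25, m)
    A1 = relu(Z1)
    Z2 = np.dot(parameters["W2"], A1) + parameters["b2"]  # (12, m)
    A2 = relu(Z2)
    Z3 = np.dot(parameters["W3"], A2) + parameters["b3"]  # (6, m)
    return Z3

rng = np.random.RandomState(1)
toy_parameters = {
    "W1": rng.randn(25, 12288) * 0.01, "b1": np.zeros((25, 1)),
    "W2": rng.randn(12, 25) * 0.01,    "b2": np.zeros((12, 1)),
    "W3": rng.randn(6, 12) * 0.01,     "b3": np.zeros((6, 1)),
}
X_toy = rng.rand(12288, 4)   # 4 fake flattened images, one per column

Z3 = forward_propagation_np(X_toy, toy_parameters)
print(Z3.shape)
```

The function stops at Z3 because the softmax is applied later, inside the cost computation.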

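2.4 Computing the Cost

Here we use tf.reduce_mean() to compute the cost function. tf.transpose() is needed because tf.nn.softmax_cross_entropy_with_logits() expects logits and labels of shape (number of examples, number of classes):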

def compute_cost(Z3, Y):
    """
    Arguments:
    Z3 -- output of forward propagation, shape (6, number of examples)
    Y -- vector of "true" labels, same shape as Z3
    """
    
    logits = tf.transpose(Z3)
    labels = tf.transpose(Y)

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = labels))

    return cost
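
As a sanity check on what this cost means, here is a NumPy sketch of softmax cross-entropy averaged over examples, with the same transposition as above. The function `compute_cost_np` and the toy inputs are assumptions for illustration; with all-zero logits the softmax is uniform over 6 classes, so the cost equals log(6).

```python
import numpy as np

def compute_cost_np(Z3, Y):
    """Mean softmax cross-entropy; Z3 and Y have shape (num_classes, num_examples)."""
    logits = Z3.T     # (num_examples, num_classes)
    labels = Y.T
    # Subtract the row maximum before exponentiating, for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_example = -(labels * log_probs).sum(axis=1)
    return per_example.mean()

Z3 = np.zeros((6, 2))    # two examples, all logits equal
Y = np.eye(6)[:, :2]     # one-hot labels for classes 0 and 1
cost = compute_cost_np(Z3, Y)
print(cost, np.log(6))   # both values are log(6)
```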