(Continuously updated) TensorFlow Study Notes

TensorFlow notes

Reference courses: MOOC "Artificial Intelligence Practice" (Peking University); NetEase Cloud Classroom: Andrew Ng's Machine Learning


I. Overview

(1) Concepts

  1. Turing test: the questioner and the responder are kept apart and the questioner asks the machine random questions; if more than 30% of the questioners judge the responder to be a human rather than a machine, the algorithm passes the Turing test
  2. Perceptron: a single-layer neural network; it cannot compute the XOR (exclusive-or) function
  3. BP: the backpropagation algorithm
  4. SVM: support vector machine
    • avoids the difficulty of choosing neural-network parameters
    • avoids local optima
  5. DBN: deep belief network
  6. CNN: convolutional neural network
  7. Artificial intelligence: machines simulating human consciousness and thinking
  8. Machine learning: if a program's performance P on a task T improves as experience E increases, the program is said to learn from experience
    • Three elements:
      • data
      • algorithms
      • computing power
  9. The machine-learning process:

[Diagram: the machine-learning process — historical data is fed as input for training to produce a model; new data is fed to the model for prediction, producing results.]
  10. The single-neuron model:
[Diagram: inputs 1–3 (values value1–value3) are summed and passed through a nonlinear function to produce the output.]

II. Python syntax review

(1) Some Linux commands

  1. pwd: print the current directory
    (pwd prints the absolute path starting from the root directory)
  2. ls: list the files and directories under the current path
  3. mkdir newName: create a folder called newName under the current path
  4. cd name: enter the folder name
  5. sudo rm -r filename: recursively delete a folder (prompting before removal)
  6. sudo rm -rf filename: force-delete a folder recursively without prompting

Ubuntu vim:

  1. vim filename.py opens (or creates) a text file named filename
  2. python filename.py runs the Python file named filename
  3. [esc] :q quit vim
  4. [esc] :wq save and quit
  5. [esc] :q! quit without saving

(2) Python basic syntax

1. Basics

\: escape character, e.g. \t stands for a tab
%: placeholder; the value listed after % is substituted at the corresponding position (see the sketch below)
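A minimal sketch of the escape character and the % placeholder (the variable names and values are made up for illustration):

#coding:utf-8
name = "Mike"
height = 178
print "name:\t%s" % name                          # \t inserts a tab, %s is filled by name
print "name: %s, height: %d cm" % (name, height)  # several placeholders take a tuple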

2. Lists  list[num]
  1. list[start:stop] is a half-open interval: start is included, stop is excluded
  2. list[:] accesses all elements
  3. list[start:stop:step] takes one element every step elements starting from start; note that step has a direction; stop may be omitted
  4. list[index] = new_value modifies an element
  5. del list[index] deletes an element
  6. list.insert(index, new_element) inserts an element
3. Tuples  tuple(num)
  1. once defined, a tuple cannot be changed
4. Dictionaries  dic{key1:value1, key2:value2, ...}
  1. dic[key_x] indexes value_x
  • e.g.: dic = {1:"123", "name":"Mike", "height":178}
  • indexing: dic["name"] gives "Mike"
  2. dic[key_i] = new_value_i modifies an entry
  3. del dic[key_i] deletes an entry
  4. dic[key_i] = new_value_i inserts a new entry (a short sketch of these list and dictionary operations follows below)
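A minimal sketch of the list and dictionary operations above (the variable names and values are made up for illustration):

#coding:utf-8
fruits = ["apple", "pear", "peach", "plum"]
print fruits[0:2]             # ['apple', 'pear'] -- start included, stop excluded
print fruits[::2]             # ['apple', 'peach'] -- every 2nd element
fruits[1] = "banana"          # modify
fruits.insert(1, "grape")     # insert at index 1
del fruits[0]                 # delete

dic = {1:"123", "name":"Mike", "height":178}
print dic["name"]             # Mike
dic["name"] = "Tom"           # modify
dic["weight"] = 70            # insert a new key
del dic["height"]             # delete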
5. Conditionals

(1)

if condition :
    do_something

(2)

    if condition_1 :
        do_task_1
    else :
        do_task_2

(3)

    if condition_1 :
        do_task_1
    elif condition_2 :
        do_task_2
    .
    .
    .
    else :
        do_task_n
  • Notes:
    1. Python uses indentation (left alignment) to express code structure
    2. Error: SyntaxError: Non-ASCII character '\xe8' in file a.py on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for detail
      Cause and fix: Chinese characters cannot be encoded by default; add #coding:utf-8 as the first line of the .py file
6. Loops

(1)

 for variable in range(start, end) :
     do_something

(2)

 for variable in list_name :
     do_something

(3)

 while condition :
     do_something

(4) Use break to terminate a loop

  • An example:
code:
 for i in range(0,5) :
    print "i am counting %s" %i
Result:
i am counting 0
i am counting 1
i am counting 2
i am counting 3
i am counting 4
7. Functions

(1) Defining a function:

def function_name (parameter_list) :
    function_body

(2) Calling a function:

function_name (parameter_list)

(3) Built-in functions: functions that come with the Python interpreter

e.g. abs(num) # absolute value (a short sketch follows below)
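A minimal sketch of defining and calling a function, together with a built-in (the function name is made up for illustration):

#coding:utf-8
def difference(a, b):      # define a function
    return abs(a - b)      # abs() is a built-in function

print difference(3, 8)     # call it: prints 5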
8. Modules

A module is a collection of functions: import it first, then use it

import time
time.asctime() # prints the current time
9. Packages

A package contains multiple modules

from PIL import Image # import the Image module from the PIL package
10. Classes, objects, instantiation
  • Class: a collection of functions; a template from which objects are instantiated
  • Instantiation: object = Class()
  • Object: an instantiated individual that actually performs the concrete work
  • Object-oriented: the programmer repeatedly revises and optimizes the class, instantiates objects from it, and the objects call the class's functions to carry out concrete operations

Defining a class:

class ClassName (ParentClass) :
    member functions

(1) When defining a function inside a class, the syntax requires that its first parameter be self
For example:

class Animal:
    def breath(self):
        print "breathing"

(2) The __init__ function runs automatically when a new object is instantiated and is used to give the new object its initial values
For example:

class Cats(Animal):
    def __init__(self, spots):
        self.spots = spots
    def catch_mouse(self):
        print "catch mouse"

Instantiation:

kitty = Cats(10)
print kitty.spots # 10
kitty.catch_mouse() # catch mouse

(3) An object calls a class function with object_name.function_name()
An object accesses a class variable with object_name.variable_name
(4) When a function defined inside a class calls a function or variable of its own class or a parent class, it must be prefixed with self., i.e. self.function_name() or self.variable_name

  • A complete example
    animal.py
class Animals():
    def breath(self):
        print "breathing"
    def move(self):
        print "moving"
    def eat(self):
        print "eating food"
class Mammals(Animals):
    def breastfeed(self):
        print "feeding young"
class Cats(Mammals):
    def __init__(self, spots):
        self.spots = spots
    def catch_mouse(self):
        print "catching mouse"
    def left_foot_forward(self):
        print "left foot forward"
    def left_foot_backward(self):
        print "left foot backward"
    def dance(self):
        self.left_foot_forward()
        self.left_foot_backward()
        self.left_foot_backward()
        self.left_foot_forward()
kitty = Cats(10)
print kitty.spots
kitty.dance()
kitty.breastfeed()
kitty.move()

Output:

10    
left foot forward    
left foot backward      
left foot backward    
left foot forward    
feeding young
moving
11. Files

(1) Writing: open -> dump -> close

import pickle # import the pickle module
file_variable = open("file_path_and_name (e.g. save.dat)", "wb") # open
pickle.dump(variable_to_write, file_variable) # dump
file_variable.close() # close

For example:

# the data to write
game_data = {
    "position":"N2 E3",
    "pocket":["key","knife"],
    "money":160
}
# write it to the file save.dat:
save_file = open("save.dat","wb")
pickle.dump(game_data, save_file)
save_file.close()

(2) Reading: open -> load -> close

import pickle
file_variable = open("file_path_and_name","rb") # open
content_variable = pickle.load(file_variable) # load
file_variable.close() # close

For example:

load_file = open("save.dat","rb")
load_game_data = pickle.load(load_file)
load_file.close()
Supplement
  1. Although Python has no access-control keywords (such as C++'s private), naming conventions provide a degree of access control (a short sketch follows below)
    (1) single leading underscore = protected, intended for access only by the class itself and its subclasses, e.g. _foo
    (2) double leading underscore = private, e.g. __foo
    (3) leading and trailing double underscores = special method, e.g. __init__()
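A minimal sketch of these naming conventions (the class and attribute names are made up for illustration):

#coding:utf-8
class Account:
    def __init__(self, owner):     # special method, runs on instantiation
        self.owner = owner
        self._balance = 0          # "protected" by convention
        self.__pin = "1234"        # "private": name-mangled to _Account__pin

acc = Account("Mike")
print acc._balance                 # accessible, but by convention outside code should not touch it
print acc._Account__pin            # the mangled name shows __pin is only hidden, not truly private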

III. Some modules

(1) The turtle module

1. Some problems and fixes
  • No module named _tkinter
    • python2.7 : sudo apt-get install python-tk
    • python3 : sudo apt-get install python3-tk
2. Basic operations (see the sketch below)

import turtle imports the turtle module
t = turtle.Pen() instantiates an object called t from the Pen class
t.forward(n) moves t forward by n pixels
t.backward(n) moves t backward by n pixels
t.left(n) turns t left by n degrees
t.right(n) turns t right by n degrees
t.reset() resets t
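A minimal sketch that draws a square with the calls above (the side length of 100 pixels is arbitrary):

#coding:utf-8
import turtle

t = turtle.Pen()          # instantiate a Pen object called t
for i in range(4):
    t.forward(100)        # move 100 pixels forward
    t.left(90)            # turn 90 degrees to the left
turtle.mainloop()         # keep the window open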

(2) The matplotlib module

1. Installation
  • sudo pip install matplotlib
2. Purpose
  • graphical visualization
3. Operations (a self-contained sketch follows below)
# import the module
import matplotlib.pyplot as plt
# visualize data points
plt.scatter(x_coords, y_coords, c="color")
plt.show()
# visualize the axes as a grid of coordinate points
xx, yy = np.mgrid[start:stop:step, start:stop:step]
# flatten the x and y coordinates (each into one row) and stack them into a matrix,
# collecting every grid point in the region
grid = np.c_[xx.ravel(), yy.ravel()]
# feed the collected points to the network and assign the result to probs
# (a value quantifying how red or blue each point is)
probs = sess.run(y, feed_dict={x:grid})
probs = probs.reshape(xx.shape)
# color the points and draw the contour
plt.contour(x_axis_values, y_axis_values, height_of_each_point, levels=[contour_height])
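A minimal self-contained sketch of scatter() and contour() without the neural-network part (the circle of radius 1 is an arbitrary choice, used only to have something to contour):

#coding:utf-8
import numpy as np
import matplotlib.pyplot as plt

X = np.random.randn(100, 2)                      # 100 random 2-D points
colors = ['red' if x0*x0 + x1*x1 < 1 else 'blue' for (x0, x1) in X]
plt.scatter(X[:, 0], X[:, 1], c=colors)          # draw the points

xx, yy = np.mgrid[-3:3:.1, -3:3:.1]              # grid covering the region
zz = xx*xx + yy*yy                               # "height" of every grid point
plt.contour(xx, yy, zz, levels=[1])              # contour line where the height equals 1
plt.show()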

IV. The TensorFlow framework

(1) Tensors, computation graphs, sessions

  1. A TensorFlow-based NN: tensors represent the data, a computation graph describes the neural network, a session executes the graph, and the weights (parameters) on the edges are optimized to obtain the model
  2. Tensor: a multidimensional array (list)
    • a tensor can represent data of rank 0 to n (the number of nested square brackets is the rank)
  3. Rank: the number of dimensions of a tensor
  4. Data types:
    • tf.float32: 32-bit floating point
    • tf.int32: 32-bit integer
    • tf.constant(value): defines a constant
  5. Computation graph: describes the computation of the neural network; it only builds the graph, it does not compute. It holds one or more computation nodes (neurons)
[Diagram: a computation-graph node — inputs x1, x2 connect to the output y through weights w1, w2.]

$ Y = XW = x_1 w_1 + x_2 w_2 $

  • An example: multiplying two constant matrices

Create the file tf3_1.py

import tensorflow as tf
a = tf.constant([[1.0,2.0]])
b = tf.constant([[3.0],[4.0]])

y = tf.matmul(a, b)
print y

Output

Tensor("matmul:0", shape=(1,1), dtype=float32)

Analysis:

  • matmul:0: the result is a tensor named matmul:0 (node name, output index 0)
  • shape(x1, x2, x3, ...): the dimensions of the tensor; the number of entries xn is the rank, and the value of each xi is the length of the corresponding dimension
  • dtype: the data type
  6. Session: executes the node computations in the computation graph
    • with tf.Session() as sess:
        print sess.run(nodes_to_evaluate)

  • An example

Create the file tf3_2.py

import tensorflow as tf
x = tf.constant([[1.0, 2.0]])
w = tf.constant([[3.0],[4.0]])
y = tf.matmul(x, w)
print y 
with tf.Session() as sess:
  print sess.run(y)

Output

Tensor("Matmul:0", shape=(1, 1), dtype=float32)
[[11.]]
Note: warnings like the following may appear:
Tensor("MatMul:0", shape=(1, 1), dtype=float32)
2019-07-13 17:57:09.100298: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2019-07-13 17:57:09.100369: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2019-07-13 17:57:09.100393: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2019-07-13 17:57:09.100412: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2019-07-13 17:57:09.100443: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
[[11.]]

This happens because the CPU supports some acceleration instructions that the installed TensorFlow build does not use. The messages can be silenced as follows:
(1) xxx@aaa:~/tf$ vim ~/.bashrc to open the bashrc file in the home directory
(2) add export TF_CPP_MIN_LOG_LEVEL=2 as the last line, lowering TensorFlow's logging level, then save and quit
(3) source ~/.bashrc to make the new configuration take effect
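A minimal sketch of an alternative: the same environment variable can also be set from inside the script, as long as it happens before TensorFlow is imported:

#coding:utf-8
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'   # suppress the instruction-set warnings
import tensorflow as tf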

(2) Forward propagation

  1. Parameters: the weights $w_i$, represented by variables and usually given random initial values
w = tf.Variable(tf.random_normal([2,3], stddev=2, mean=0, seed=1))
  • tf.random_normal() generates normally distributed random numbers
  • tf.truncated_normal() generates normally distributed random numbers with points too far from the mean removed
  • tf.random_uniform() generates uniformly distributed random numbers
  • tf.random_normal([2,3]) produces a 2x3 matrix
  • stddev = 2: standard deviation 2
  • mean = 0: mean 0
  • seed = 1: random seed, so the same numbers are generated on every run
  • standard deviation, mean and seed may be omitted if there is no special requirement
  • tf.zeros generates an all-zero array, e.g. tf.zeros([3,2], tf.int32)
  • tf.ones generates an all-one array, e.g. tf.ones([3,2], tf.int32)
  • tf.fill fills an array with a given value, e.g. tf.fill([3,2], 6)
  • tf.constant gives the values directly, e.g. tf.constant([3,2,1]) generates [3,2,1]
  (a short sketch of these generators follows below)
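A minimal sketch that evaluates these generators in a session (the shapes and values are arbitrary):

#coding:utf-8
import tensorflow as tf

w = tf.Variable(tf.random_normal([2,3], stddev=2, mean=0, seed=1))  # random normal weights
z = tf.zeros([3,2], tf.int32)      # all zeros
o = tf.ones([3,2], tf.int32)       # all ones
f = tf.fill([3,2], 6)              # filled with 6
c = tf.constant([3,2,1])           # the given values

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print sess.run(w)
    print sess.run([z, o, f, c])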
  2. How a neural network is implemented:

Training process:
  1. Prepare the data set, extract features, and feed them to the neural network as input
  2. Forward propagation: build the NN structure from input to output (build the computation graph first, then execute it with a session); the NN forward-propagation algorithm computes the output
  3. Backpropagation: feed large amounts of feature data to the NN and iteratively optimize the NN parameters; the NN backpropagation algorithm optimizes the parameters and trains the model

Using the model:
  4. Use the trained model for prediction and classification

  3. Forward propagation: builds the model and performs inference
  • An example with a fully connected network

A batch of parts is produced; the volume $x_1$ and the weight $x_2$ are fed to the NN as input features, and after passing through the NN a single value is output

[Diagram: a 2-3-1 fully connected network; the inputs x1 (volume) and x2 (weight) connect to the hidden nodes a11, a12, a13 through the weights w11...w23, which connect to the output y through the weights w11', w21', w31'.]
  • Describing the computation in TensorFlow:
    (1) X is the 1x2 input matrix; $W^{(\text{layer})}_{\text{from-node},\text{to-node}}$ are the parameters to be optimized
    (2) $W^{(1)} = \begin{bmatrix} w^{(1)}_{1,1} & w^{(1)}_{1,2} & w^{(1)}_{1,3} \\ w^{(1)}_{2,1} & w^{(1)}_{2,2} & w^{(1)}_{2,3} \end{bmatrix}$ is a 2x3 matrix
[Diagram: the first layer — inputs x1 (volume) and x2 (weight) connect to a11, a12, a13 through the weights w11...w23.]
  • (3) $a^{(1)} = [a_{11}, a_{12}, a_{13}] = XW^{(1)}$ is a 1x3 matrix

    (4) $W^{(2)} = \begin{bmatrix} w^{(2)}_{1,1} \\ w^{(2)}_{2,1} \\ w^{(2)}_{3,1} \end{bmatrix}$ is a 3x1 matrix

[Diagram: the second layer — a11, a12, a13 connect to the output y through the weights w11', w21', w31'.]
  • (5) $y = a^{(1)} W^{(2)}$

Expressed as two statements:

a = tf.matmul(X, W1)
y = tf.matmul(a, W2)
a: the first computation layer (the first layer of the network)
$W^{(1)}$: the parameters of the first layer

  4. Variable initialization and graph-node computation are both carried out with a session:
with tf.Session() as sess:
    sess.run()
  5. Variable initialization: run tf.global_variables_initializer() inside sess.run()
init_op=tf.global_variables_initializer()
sess.run(init_op)
  6. Graph-node computation: pass the nodes to be evaluated to sess.run()
sess.run(y)
  7. tf.placeholder reserves a place for the input; data is fed through feed_dict in sess.run()
# feed one group of data:
x = tf.placeholder(tf.float32, shape = (1,2))
sess.run(y,feed_dict={x:[[0.5,0.6]]})
# feed several groups of data:
x = tf.placeholder(tf.float32, shape = (None,2))
sess.run(y,feed_dict={x:[[0.1,0.2],[0.3,0.4],[0.4,0.5]]})

Note: in shape = (x,y), x is the number of data groups fed to the network (None means the number is unknown) and y is the number of features per group

  • An example
#coding:utf-8
# a simple two-layer (fully connected) neural network
import tensorflow as tf

# define the input and the parameters
# use placeholder to define the input
x = tf.placeholder(tf.float32, shape=(None, 2))
w1= tf.Variable(tf.random_normal([2,3], stddev=1, seed=1))
w2= tf.Variable(tf.random_normal([3,1], stddev=1, seed=1))

# define the forward-propagation process
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)

# run the computation in a session
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print "the result of y is:\n", sess.run(y, feed_dict={x: [[0.7,0.5],[0.2,0.3],[0.3,0.4],[0.4,0.5]]})
    print "w1:\n", sess.run(w1)
    print "w2:\n", sess.run(w2)

Output:

the result of y is:
[[3.0904665]
 [1.2236414]
 [1.7270732]
 [2.2305048]]
w1:
[[-0.8113182   1.4845988   0.06532937]
 [-2.4427042   0.0992484   0.5912243 ]]
w2:
[[-0.8113182 ]
 [ 1.4845988 ]
 [ 0.06532937]]

(3) Backpropagation

  1. Backpropagation: trains the model parameters by running gradient descent on all parameters, so that the loss of the NN model on the training data is minimized.
  2. Loss function (loss): the gap between the prediction (y) and the known answer (y_)
  3. Mean squared error (MSE): $MSE(y_\_, y) = \frac{1}{n}\sum_{i=1}^n (y - y_\_)^2$    loss = tf.reduce_mean(tf.square(y_ - y))
  4. Backpropagation training methods: all take minimizing loss as the optimization objective
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss) # gradient descent
train_step = tf.train.MomentumOptimizer(learning_rate, momentum).minimize(loss) # Momentum optimizer
train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss) # Adam optimizer
  5. Learning rate: decides how much the parameters are updated at each step (usually a small value such as 0.001; it depends on the problem)
  • An example:
    Description: a batch of parts has two features, volume and weight, and a label: qualified or not. Use a neural network to predict and classify the parts.
#coding:utf-8
# 0. import modules and generate a simulated data set
import tensorflow as tf
import numpy as np # scientific-computing module
BATCH_SIZE = 8 # feed 8 groups of data at a time
SEED = 23455

# generate random numbers based on the seed
rdm = np.random.RandomState(SEED)
# the random numbers form a 32x2 matrix: 32 groups of (volume, weight) as the input data set
X = rdm.rand(32,2)
# take each row of X; if the sum of its two entries is less than 1, label Y_ as 1 (qualified), otherwise 0
# this serves as the labels (correct answers) of the input data set; since there is no real data set, samples and labels are fabricated
Y_ = [[int(x0 + x1 < 1)] for (x0, x1) in X]
print "X:\n",X
print "Y_:\n",Y_

# 1. define the NN's input, parameters and output, and the forward-propagation process
x = tf.placeholder(tf.float32, shape=(None, 2))
y_= tf.placeholder(tf.float32, shape=(None, 1))

w1= tf.Variable(tf.random_normal([2, 3], stddev=1, seed=1))
w2= tf.Variable(tf.random_normal([3, 1], stddev=1, seed=1))

a = tf.matmul(x, w1)
y = tf.matmul(a, w2)

# 2. define the loss function and the backpropagation method
loss_mse = tf.reduce_mean(tf.square(y-y_))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss_mse)
#train_step = tf.train.MomentumOptimizer(0.001,0.9).minimize(loss_mse)
#train_step = tf.train.AdamOptimizer(0.001).minimize(loss_mse)

# 3. create a session and train for STEPS rounds
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    # print the current (untrained) parameter values
    print "w1:\n", sess.run(w1)
    print "w2:\n", sess.run(w2)
    print "\n"

    # train the model
    STEPS = 3000
    for i in range(STEPS):
        start = (i*BATCH_SIZE) % 32
        end = start + BATCH_SIZE
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
        if i % 500 == 0:
            total_loss = sess.run(loss_mse, feed_dict={x: X, y_: Y_})
            print("After %d training step(s), loss_mse on all data is %g" % (i, total_loss))

    # print the trained parameter values
    print "\n"
    print "w1:\n", sess.run(w1)
    print "w2:\n", sess.run(w2)

    • Output:
X:
[[0.83494319 0.11482951]
 [0.66899751 0.46594987]
 [0.60181666 0.58838408]
 [0.31836656 0.20502072]
 [0.87043944 0.02679395]
 [0.41539811 0.43938369]
 [0.68635684 0.24833404]
 [0.97315228 0.68541849]
 [0.03081617 0.89479913]
 [0.24665715 0.28584862]
 [0.31375667 0.47718349]
 [0.56689254 0.77079148]
 [0.7321604  0.35828963]
 [0.15724842 0.94294584]
 [0.34933722 0.84634483]
 [0.50304053 0.81299619]
 [0.23869886 0.9895604 ]
 [0.4636501  0.32531094]
 [0.36510487 0.97365522]
 [0.73350238 0.83833013]
 [0.61810158 0.12580353]
 [0.59274817 0.18779828]
 [0.87150299 0.34679501]
 [0.25883219 0.50002932]
 [0.75690948 0.83429824]
 [0.29316649 0.05646578]
 [0.10409134 0.88235166]
 [0.06727785 0.57784761]
 [0.38492705 0.48384792]
 [0.69234428 0.19687348]
 [0.42783492 0.73416985]
 [0.09696069 0.04883936]]
Y_:
[[1], [0], [0], [1], [1], [1], [1], [0], [1], [1], [1], [0], [0], [0], [0], [0], [0], [1], [0], [0], [1], [1], [0], [1], [0], [1], [1], [1], [1], [1], [0], [1]]
w1:
[[-0.8113182   1.4845988   0.06532937]
 [-2.4427042   0.0992484   0.5912243 ]]
w2:
[[-0.8113182 ]
 [ 1.4845988 ]
 [ 0.06532937]]


After 0 training step(s), loss_mse on all data is 5.13118
After 500 training step(s), loss_mse on all data is 0.429111
After 1000 training step(s), loss_mse on all data is 0.409789
After 1500 training step(s), loss_mse on all data is 0.399923
After 2000 training step(s), loss_mse on all data is 0.394146
After 2500 training step(s), loss_mse on all data is 0.390597


w1:
[[-0.7000663   0.9136318   0.08953571]
 [-2.3402493  -0.14641267  0.58823055]]
w2:
[[-0.06024267]
 [ 0.91956186]
 [-0.0682071 ]]

(4) The standard recipe for building a neural network: prepare, forward propagation, backpropagation, iterate

  1. Prepare:
import
constant definitions
generate the data set
  2. Forward propagation:
x  =
y_ = 
w1 = 
w2 = 
a  =
y  =
  3. Backpropagation: define the loss function and the backpropagation method
loss = 
train_step =
  4. Create a session and train for STEPS rounds
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)

    STEPS = 3000 # number of iterations
    for i in range(STEPS):
        start = 
        end = 
        sess.run(train_step, feed_dict)
  • Note: since real problems involve large amounts of data, print is commonly used to show how the parameters change during the iterations

V. Optimizing neural networks

(1) Loss functions

  1. The McCulloch-Pitts neuron model (1943)
[Diagram: inputs x1, x2, x3 with weights w1, w2, w3 and a bias b are summed and passed through f, giving f(XW+b).]
  • $f(\sum_i x_i w_i + b)$
  • $f$ is the activation function
  • $b$ is the bias
  2. Activation functions
  • relu: $f(x)=\max(x,0)=\begin{cases} 0 & x \le 0\\ x & x > 0 \end{cases}$  tf.nn.relu()
  • sigmoid: $f(x)=\frac{1}{1+e^{-x}}$  tf.nn.sigmoid()
  • tanh: $f(x)=\frac{1-e^{-2x}}{1+e^{-2x}}$  tf.nn.tanh()
  3. NN complexity: usually measured by the number of NN layers and the number of NN parameters
    • number of layers = number of hidden layers + 1 output layer
    • total parameters = all W + all b
  4. Loss function (loss), learning rate (learning_rate), exponential moving average (ema), regularization
    • Loss function (loss): the gap between the prediction (y) and the known answer (y_)
      • The NN's optimization objective is to minimize loss; common choices are mean squared error (MSE), a custom loss, or cross entropy (CE)
  • Mean squared error: $MSE(y_\_, y)=\frac{\sum_{i=1}^n (y-y_\_)^2}{n}$  loss_mse=tf.reduce_mean(tf.square(y_-y))
    An example: predict the daily sales volume y of yogurt. x1 and x2 are the factors that influence daily sales. (Before modeling, the data to collect would be each day's x1, x2 and sales y_, i.e. the known answer; the ideal case is production = sales. Since there is no real data set, one is fabricated as y_ = x1 + x2 plus noise in -0.05~0.05.) Fit a function that can predict sales.
    The code is as follows:
#coding:utf-8
# over-prediction and under-prediction cost the same here
# 0. import modules and generate the data set
import tensorflow as tf
import numpy as np
BATCH_SIZE = 8
SEED = 23455

rdm = np.random.RandomState(SEED)
X = rdm.rand(32,2)
Y_ = [[x1 + x2 + rdm.rand()/10.0-0.05] for (x1, x2) in X]

# 1. define the NN's input, parameters and output, and the forward-propagation process
x = tf.placeholder(tf.float32, shape=(None, 2))
y_= tf.placeholder(tf.float32, shape=(None, 1))
w1= tf.Variable(tf.random_normal([2,1], stddev = 1, seed = 1))
y = tf.matmul(x, w1)

# 2. define the loss function and the backpropagation method
# the loss function is MSE and the backpropagation method is gradient descent
loss_mse = tf.reduce_mean(tf.square(y_ - y))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss_mse)

# 3. create a session and train for STEPS rounds
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 20000
    for i in range(STEPS):
        start = (i*BATCH_SIZE) % 32
        end = (i*BATCH_SIZE) % 32 + BATCH_SIZE
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
        if i % 500 == 0:
            print "After %d traning steps, w1 is: " % (i)
            print sess.run(w1), "\n"
    print "Final w1 is :\n", sess.run(w1)

Output:

After 0 traning steps, w1 is: 
[[-0.80974597]
 [ 1.4852903 ]] 

After 500 traning steps, w1 is: 
[[-0.46074435]
 [ 1.641878  ]] 

After 1000 traning steps, w1 is: 
[[-0.21939856]
 [ 1.6984766 ]] 

After 1500 traning steps, w1 is: 
[[-0.04415595]
 [ 1.7003176 ]] 

After 2000 traning steps, w1 is: 
[[0.08942621]
 [1.673328  ]] 

After 2500 traning steps, w1 is: 
[[0.19583555]
 [1.6322677 ]] 

After 3000 traning steps, w1 is: 
[[0.28375748]
 [1.5854434 ]] 

After 3500 traning steps, w1 is: 
[[0.35848638]
 [1.5374472 ]] 

After 4000 traning steps, w1 is: 
[[0.42332518]
 [1.4907393 ]] 

After 4500 traning steps, w1 is: 
[[0.48040026]
 [1.4465574 ]] 

After 5000 traning steps, w1 is: 
[[0.53113604]
 [1.4054536 ]] 

After 5500 traning steps, w1 is: 
[[0.5765325]
 [1.3675941]] 

...

After 16000 traning steps, w1 is: 
[[0.95107025]
 [1.0415728 ]] 

After 16500 traning steps, w1 is: 
[[0.9560928]
 [1.037164 ]] 

After 17000 traning steps, w1 is: 
[[0.96064115]
 [1.0331714 ]] 

After 17500 traning steps, w1 is: 
[[0.96476096]
 [1.0295546 ]] 

After 18000 traning steps, w1 is: 
[[0.9684917]
 [1.0262802]] 

After 18500 traning steps, w1 is: 
[[0.9718707]
 [1.0233142]] 

After 19000 traning steps, w1 is: 
[[0.974931 ]
 [1.0206276]] 

After 19500 traning steps, w1 is: 
[[0.9777026]
 [1.0181949]] 

Final w1 is :
[[0.98019385]
 [1.0159807 ]]

As the number of iterations grows, the two parameters approach 1.
Fitted result: $y = 0.98x_1 + 1.02x_2$
(In reality, however, over-predicting sales wastes cost while under-predicting loses profit, so a custom loss function is used next.)

  • Custom loss function
    If profit $\neq$ cost, the loss produced by MSE cannot maximize profit.
    Custom loss: $loss(y_\_, y)=\sum_n f(y_\_, y)$
    where y_ is the standard answer from the data set and y is the computed prediction
    $f(y_\_,y)=\begin{cases} PROFIT*(y_\_-y) & y < y_\_ & \text{y is under-predicted: profit is lost}\\ COST*(y-y_\_) & y \ge y_\_ & \text{y is over-predicted: cost is lost} \end{cases}$
    loss = tf.reduce_sum(tf.where(tf.greater(y, y_), COST*(y - y_), PROFIT*(y_ - y)))
    For example: predicting yogurt sales with cost (COST) 1 yuan and profit (PROFIT) 9 yuan per unit.
    Under-predicting loses 9 yuan of profit; over-predicting loses 1 yuan of cost.
    Under-predicting loses more, so we want the function to predict on the high side.
    The code is as follows:
    代码如下:
#coding:utf-8
import tensorflow as tf
import numpy as np
BATCH_SIZE = 8
SEED = 23455
COST = 1
PROFIT = 9

rdm = np.random.RandomState(SEED)
X = rdm.rand(32,2)
Y_= [[x1 + x2 + rdm.rand()/10.0-0.05] for (x1,x2) in X]

x = tf.placeholder(tf.float32, shape = (None,2))
y_= tf.placeholder(tf.float32, shape = (None,1))
w1= tf.Variable(tf.random_normal([2,1],stddev = 1, seed = 1))
y = tf.matmul(x, w1)

loss = tf.reduce_sum(tf.where(tf.greater(y, y_), (y - y_)*COST, (y_ - y)*PROFIT))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 20000
    for i in range (STEPS):
        start = (i*BATCH_SIZE) % 32
        end = (i*BATCH_SIZE) % 32 + BATCH_SIZE
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
        if i % 500 == 0:
            print "After %d training steps, w1 is: " % (i)
            print sess.run(w1), "\n"
    print "Final w1 is :\n", sess.run(w1)

The fitted result:

Final w1 is :
[[1.020171 ]
 [1.0425103]]

Both parameters are greater than 1, so the model predicts on the high side

  • Cross entropy (CE): measures the distance between two probability distributions: $H(y_\_, y)=-\sum y_\_ * \log y$
    • ce = -tf.reduce_mean(y_ * tf.log(tf.clip_by_value(y, 1e-12, 1.0)))
    • values of y below $10^{-12}$ are clipped to $10^{-12}$, and values above 1.0 are clipped to 1.0
    • softmax():
      1. When the n outputs $(y_1, y_2, ..., y_n)$ of an n-class classifier are passed through softmax(), they satisfy the requirements of a probability distribution: $\forall x, P(X=x)\in [0,1]$ and $\sum_x P(X=x) = 1$
      2. $softmax(y_i) = \frac{e^{y_i}}{\sum_{j=1}^n e^{y_j}}$
      3. ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))
        cem = tf.reduce_mean(ce)
      (a short sketch follows below)
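A minimal sketch of computing the softmax cross-entropy loss with these two lines (the logits and labels are made-up values for illustration):

#coding:utf-8
import tensorflow as tf

# two samples, three classes; rows are raw network outputs (logits)
y  = tf.constant([[2.0, 1.0, 0.1],
                  [0.5, 2.5, 0.3]])
# one-hot labels: sample 0 is class 0, sample 1 is class 1
y_ = tf.constant([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])

ce  = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))
cem = tf.reduce_mean(ce)   # average cross entropy over the batch

with tf.Session() as sess:
    print sess.run(cem)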

(2) Learning rate

  • Learning rate (learning_rate): how much the parameters are updated each step
    $w_{n+1} = w_n - learning\_rate \cdot \nabla$
    • $w_{n+1}$: the updated parameter
    • $w_n$: the current parameter
    • learning_rate: the learning rate
    • $\nabla$: the gradient (derivative) of the loss function

An example

#coding:utf-8
# let the loss function be loss=(w+1)^2 with the initial value of w set to the constant 5.
# backpropagation means finding the optimal w, i.e. the w with the smallest loss
import tensorflow as tf
# define the parameter w to be optimized, with initial value 5
w = tf.Variable(tf.constant(5, dtype=tf.float32))
# define the loss function
loss = tf.square(w+1)
# define the backpropagation method
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)
# create a session and train for 40 rounds
with tf.Session() as sess:
    init_op=tf.global_variables_initializer()
    sess.run(init_op)
    for i in range(40):
        sess.run(train_step)
        w_val = sess.run(w)
        loss_val = sess.run(loss)
        print "After %s steps: w is %f,  loss is %f.\n" %(i, w_val, loss_val)

Result:

After 0 steps: w is 2.600000,  loss is 12.959999.
After 1 steps: w is 1.160000,  loss is 4.665599.
After 2 steps: w is 0.296000,  loss is 1.679616.
After 3 steps: w is -0.222400,  loss is 0.604662.
After 4 steps: w is -0.533440,  loss is 0.217678.
After 5 steps: w is -0.720064,  loss is 0.078364.
After 6 steps: w is -0.832038,  loss is 0.028211.
After 7 steps: w is -0.899223,  loss is 0.010156.
After 8 steps: w is -0.939534,  loss is 0.003656.
After 9 steps: w is -0.963720,  loss is 0.001316.
After 10 steps: w is -0.978232,  loss is 0.000474.
After 11 steps: w is -0.986939,  loss is 0.000171.
After 12 steps: w is -0.992164,  loss is 0.000061.
After 13 steps: w is -0.995298,  loss is 0.000022.
After 14 steps: w is -0.997179,  loss is 0.000008.
After 15 steps: w is -0.998307,  loss is 0.000003.
After 16 steps: w is -0.998984,  loss is 0.000001.
After 17 steps: w is -0.999391,  loss is 0.000000.
After 18 steps: w is -0.999634,  loss is 0.000000.
After 19 steps: w is -0.999781,  loss is 0.000000.
After 20 steps: w is -0.999868,  loss is 0.000000.
After 21 steps: w is -0.999921,  loss is 0.000000.
After 22 steps: w is -0.999953,  loss is 0.000000.
After 23 steps: w is -0.999972,  loss is 0.000000.
After 24 steps: w is -0.999983,  loss is 0.000000.
After 25 steps: w is -0.999990,  loss is 0.000000.
After 26 steps: w is -0.999994,  loss is 0.000000.
After 27 steps: w is -0.999996,  loss is 0.000000.
After 28 steps: w is -0.999998,  loss is 0.000000.
After 29 steps: w is -0.999999,  loss is 0.000000.
After 30 steps: w is -0.999999,  loss is 0.000000.
After 31 steps: w is -1.000000,  loss is 0.000000.
After 32 steps: w is -1.000000,  loss is 0.000000.
After 33 steps: w is -1.000000,  loss is 0.000000.
After 34 steps: w is -1.000000,  loss is 0.000000.
After 35 steps: w is -1.000000,  loss is 0.000000.
After 36 steps: w is -1.000000,  loss is 0.000000.
After 37 steps: w is -1.000000,  loss is 0.000000.
After 38 steps: w is -1.000000,  loss is 0.000000.
After 39 steps: w is -1.000000,  loss is 0.000000.

Running again with different learning rates shows that a learning rate that is too large oscillates and fails to converge, while one that is too small converges slowly.

  • Exponentially decaying learning rate:
    • the learning rate is updated dynamically according to how many rounds of BATCH_SIZE have been run

    • $learning\_rate = LEARNING\_RATE\_BASE \times LEARNING\_RATE\_DECAY^{\frac{global\_step}{LEARNING\_RATE\_STEP}}$
      where learning_rate is the current learning rate; LEARNING_RATE_BASE is the initial learning rate; LEARNING_RATE_DECAY is the decay rate, in (0,1); global_step counts how many rounds of BATCH_SIZE have been run; and the learning rate is updated every LEARNING_RATE_STEP rounds, usually $\frac{\text{total number of samples}}{BATCH\_SIZE}$

    global_step = tf.Variable(0, trainable=False) # counter of the current training round
    learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE, global_step, LEARNING_RATE_STEP, LEARNING_RATE_DECAY, staircase=True) # staircase=True makes the learning rate decay in steps; otherwise it decays as a smooth curve


An example

#coding:utf-8
# let the loss function be loss=(w+1)^2 with the initial value of w set to the constant 5
# an exponentially decaying learning rate gives a fast descent early on, so the model converges faster within a small number of training rounds
import tensorflow as tf

LEARNING_RATE_BASE = 0.1 # initial learning rate
LEARNING_RATE_DECAY = 0.99 # learning-rate decay rate
LEARNING_RATE_STEP = 1 # update the learning rate after this many rounds of BATCH_SIZE, usually total samples / BATCH_SIZE

# counter of how many rounds of BATCH_SIZE have been run; initial value 0, not trainable
global_step = tf.Variable(0, trainable=False)
# define the exponentially decaying learning rate
learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE, global_step, LEARNING_RATE_STEP, LEARNING_RATE_DECAY, staircase=True)
# define the parameter to be optimized, with initial value 5
w = tf.Variable(tf.constant(5, dtype=tf.float32))
# define the loss function
loss = tf.square(w+1)
# define the backpropagation method
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
# create a session and train for 40 rounds
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    for i in range(40):
        sess.run(train_step)
        learning_rate_val = sess.run(learning_rate)
        global_step_val = sess.run(global_step)
        w_val = sess.run(w)
        loss_val = sess.run(loss)
        print "\nAfter %s steps: global_step is %f, w is %f\n learning_rate is %f\n loss is %f" % (i, global_step_val, w_val, learning_rate_val, loss_val)
  • Output
After 0 steps: global_step is 1.000000, w is 3.800000
 learning_rate is 0.099000
 loss is 23.040001

After 1 steps: global_step is 2.000000, w is 2.849600
 learning_rate is 0.098010
 loss is 14.819419

After 2 steps: global_step is 3.000000, w is 2.095001
 learning_rate is 0.097030
 loss is 9.579033

After 3 steps: global_step is 4.000000, w is 1.494386
 learning_rate is 0.096060
 loss is 6.221961

After 4 steps: global_step is 5.000000, w is 1.015167
 learning_rate is 0.095099
 loss is 4.060896

After 5 steps: global_step is 6.000000, w is 0.631886
 learning_rate is 0.094148
 loss is 2.663051

After 6 steps: global_step is 7.000000, w is 0.324608
 learning_rate is 0.093207
 loss is 1.754587

After 7 steps: global_step is 8.000000, w is 0.077684
 learning_rate is 0.092274
 loss is 1.161403
...

After 33 steps: global_step is 34.000000, w is -0.989550
 learning_rate is 0.071055
 loss is 0.000109

After 34 steps: global_step is 35.000000, w is -0.991035
 learning_rate is 0.070345
 loss is 0.000080

After 35 steps: global_step is 36.000000, w is -0.992297
 learning_rate is 0.069641
 loss is 0.000059

After 36 steps: global_step is 37.000000, w is -0.993369
 learning_rate is 0.068945
 loss is 0.000044

After 37 steps: global_step is 38.000000, w is -0.994284
 learning_rate is 0.068255
 loss is 0.000033

After 38 steps: global_step is 39.000000, w is -0.995064
 learning_rate is 0.067573
 loss is 0.000024

After 39 steps: global_step is 40.000000, w is -0.995731
 learning_rate is 0.066897
 loss is 0.000018

(3) Exponential moving average (ema)

  1. Moving average (the "shadow" value): records the average of every parameter (both w and b) over a period of time, which improves the model's generalization
    • when a parameter changes, its shadow follows it slowly
    • shadow = decay_rate x shadow + (1 - decay_rate) x parameter
      • initial shadow value = initial parameter value
      • decay_rate = min{MOVING_AVERAGE_DECAY, (1 + rounds) / (10 + rounds)}
    ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step) # create the moving-average object
    ema_op = ema.apply([]) # list the parameters whose moving average is tracked inside []; tf.trainable_variables() can be used instead to list all trainable parameters automatically
    with tf.control_dependencies([train_step, ema_op]):
        train_op = tf.no_op(name='train')

    The moving average of a parameter can also be returned with:
    ema.average(parameter_name)


An example

#coding:utf-8
import tensorflow as tf

w1 = tf.Variable(0, dtype=tf.float32)
global_step = tf.Variable(0, trainable=False)
MOVING_AVERAGE_DECAY = 0.99
ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)

ema_op = ema.apply(tf.trainable_variables())

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)

    print sess.run([w1,ema.average(w1)])

    sess.run(tf.assign(w1,1))
    sess.run(ema_op)
    print sess.run([w1, ema.average(w1)])

    sess.run(tf.assign(global_step, 100))
    sess.run(tf.assign(w1, 10))
    sess.run(ema_op)
    print sess.run([w1, ema.average(w1)])

    for i in range(5):
        sess.run(ema_op)
        print sess.run([w1, ema.average(w1)])

Output

[0.0, 0.0]
[1.0, 0.9]
[10.0, 1.6445453]
[10.0, 2.3281732]
[10.0, 2.955868]
[10.0, 3.532206]
[10.0, 4.061389]
[10.0, 4.547275]    
  • At first the parameter w1 is 0 and its moving average is 0; after w1 is set to 1 the moving average becomes 0.9; once the iteration count is set to 100 and the parameter to 10, the moving average keeps approaching 10

(4) Regularization

  1. Overfitting: the model achieves very high accuracy on the training data but cannot respond correctly to new, unseen data
  2. Regularization alleviates overfitting: a model-complexity term is added to the loss function, weighting W so that noise in the training data is weakened (b is usually not regularized): $loss = loss(y, y_\_) + REGULARIZER * loss(w)$
    • loss(y, y_): the loss function over all parameters of the model
    • REGULARIZER: a hyperparameter giving the weight of loss(w) in the total loss, i.e. the regularization weight
    • loss(w): the loss of the parameters to be regularized
loss(w)=tf.contrib.layers.l1_regularizer(REGULARIZER)(w) # sum of the absolute values of w
loss(w)=tf.contrib.layers.l2_regularizer(REGULARIZER)(w) # sum of the squares of w
# choose one of the two regularizers above
tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w)) # add the term to the 'losses' collection so it can be summed later
loss = cem + tf.add_n(tf.get_collection('losses'))

An example:
Randomly generate some points in the plane; a point is red if the sum of the squares of its x and y coordinates is less than 2, otherwise blue. Fit a curve that separates the red and blue points.

#coding:utf-8
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
BATCH_SIZE = 30
seed = 2

rdm = np.random.RandomState(seed)

X = rdm.randn(300,2)
Y_ = [int(x0*x0 + x1*x1 < 2) for (x0,x1) in X]
Y_c = [['red' if y else 'blue'] for y in Y_]

X = np.vstack(X).reshape(-1,2) # n rows, 2 columns
Y_ = np.vstack(Y_).reshape(-1,1)
print X
print Y_
print Y_c

plt.scatter(X[:,0], X[:,1], c = np.squeeze(Y_c)) # X[:,0] takes the first column of X
plt.show()

def get_weight(shape, regularizer):
    w = tf.Variable(tf.random_normal(shape), dtype=tf.float32)
    tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w

def get_bias(shape):
    b = tf.Variable(tf.constant(0.01, shape=shape))
    return b

x = tf.placeholder(tf.float32, shape=(None, 2))
y_ = tf.placeholder(tf.float32, shape=(None, 1))

w1 = get_weight([2,11], 0.01)
b1 = get_bias([11])
y1 = tf.nn.relu(tf.matmul(x, w1)+b1)

w2 = get_weight([11,1], 0.01)
b2 = get_bias([1])
y = tf.matmul(y1, w2)+b2 # the output layer is not activated

loss_mse = tf.reduce_mean(tf.square(y-y_))
loss_total = loss_mse + tf.add_n(tf.get_collection('losses'))

# define the backpropagation method: without regularization
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss_mse)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 40000
    for i in range(STEPS):
        start = (i*BATCH_SIZE) % 300
        end = start + BATCH_SIZE
        sess.run(train_step, feed_dict={x:X[start:end], y_:Y_[start:end]})
        if i % 2000 == 0:
            loss_mse_v = sess.run(loss_mse, feed_dict={x:X, y_:Y_})
            print("After %d steps, loss is:%f" %(i, loss_mse_v))

    xx, yy = np.mgrid[-3:3:.01, -3:3:.01]
    grid = np.c_[xx.ravel(), yy.ravel()]
    probs = sess.run(y, feed_dict={x:grid})
    probs = probs.reshape(xx.shape)

    print "w1:\n",sess.run(w1)
    print "b1:\n",sess.run(b1)
    print "w2:\n",sess.run(w2)
    print "b2:\n",sess.run(b2)

plt.scatter(X[:,0], X[:,1], c=np.squeeze(Y_c))
plt.contour(xx, yy, probs, levels=[.5])
plt.show()

# define the backpropagation method: with regularization
train_step = tf.train.AdamOptimizer(0.0001).minimize(loss_total)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 40000
    for i in range(STEPS):
        start = (i*BATCH_SIZE) % 300
        end = start + BATCH_SIZE
        sess.run(train_step, feed_dict={x: X[start:end], y_:Y_[start:end]})
        if i % 2000 == 0:
            loss_v = sess.run(loss_total, feed_dict={x:X, y_:Y_})
            print("After %d steps, loss is:%f" %(i, loss_v))

    xx, yy = np.mgrid[-3:3:.01, -3:3:.01]
    grid = np.c_[xx.ravel(), yy.ravel()]
    probs = sess.run(y, feed_dict={x:grid})
    probs = probs.reshape(xx.shape)
    print "w1:\n",sess.run(w1)
    print "b1:\n",sess.run(b1)
    print "w2:\n",sess.run(w2)
    print "b2:\n",sess.run(b2)

plt.scatter(X[:,0], X[:,1], c=np.squeeze(Y_c))
plt.contour(xx, yy, probs, levels=[.5])
plt.show()

  • The model with regularization generalizes better

(5) The recipe for building a neural network in modules

  1. forward.py
    Forward propagation: builds the network and defines its structure
def forward(x, regularizer):
    w=
    b=
    y=
    return y

def get_weight(shape, regularizer):
    w=tf.Variable()
    tf.add_to_collection('losses',tf.contrib.layers.l2_regularizer(regularizer)(w)) # add each w's regularization loss to the total loss
    return w

def get_bias(shape): # argument: the number of b's in this layer
    b=tf.Variable()
    return b
  2. backward.py
    Backpropagation: trains the network and optimizes its parameters
def backward():
    x=tf.placeholder()
    y_=tf.placeholder()
    y=forward.forward(x,REGULARIZER)
    global_step=tf.Variable(0,trainable=False)
    loss=

# regularization
loss_mse = tf.reduce_mean(tf.square(y-y_)) # mean squared error as the gap between y and y_
# or
ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1)) # cross entropy
cem = tf.reduce_mean(ce) # the gap between y and y_

# after adding regularization
loss = (gap between y and y_) + tf.add_n(tf.get_collection('losses'))

# exponentially decaying learning rate
learning_rate = tf.train.exponential_decay(
    LEARNING_RATE_BASE,
    global_step,
    total_samples/BATCH_SIZE,
    LEARNING_RATE_DECAY,
    staircase=True)

# moving average
# see the code in the previous section

An example:

  • generateds.py
#coding:utf-8
import numpy as np
import matplotlib.pyplot as plt

seed = 2

def generateds():
    rdm = np.random.RandomState(seed)
    X = rdm.randn(300,2)
    Y_ = [int(x0*x0 + x1*x1 < 2) for (x0, x1) in X]
    Y_c = [['red' if y else 'blue'] for y in Y_]
    X = np.vstack(X).reshape(-1,2)
    Y_ = np.vstack(Y_).reshape(-1,1)

    return X, Y_, Y_c
  • forward.py
#coding:utf-8
import tensorflow as tf

def get_weight(shape, regularizer):
    w = tf.Variable(tf.random_normal(shape), dtype=tf.float32)
    tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w

def get_bias(shape):
    b = tf.Variable(tf.constant(0.01, shape=shape))
    return b

def forward(x, regularizer):
    w1 = get_weight([2,11], regularizer)
    b1 = get_bias([11])
    y1 = tf.nn.relu(tf.matmul(x, w1) + b1)

    w2 = get_weight([11,1], regularizer)
    b2 = get_bias([1])
    y = tf.matmul(y1, w2) + b2

    return y
  • backward.py
#coding:utf-8
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import opt4_8_generateds
import opt4_8_forward

STEPS = 40000
BATCH_SIZE = 30
LEARNING_RATE_BASE = 0.001
LEARNING_RATE_DECAY = 0.999
REGULARIZER = 0.01

def backward():
    x = tf.placeholder(tf.float32, shape=(None, 2))
    y_ = tf.placeholder(tf.float32, shape=(None, 1))

    X, Y_, Y_c = opt4_8_generateds.generateds()

    y = opt4_8_forward.forward(x, REGULARIZER)

    global_step = tf.Variable(0, trainable=False)

    learning_rate = tf.train.exponential_decay(
            LEARNING_RATE_BASE,
            global_step,
            300/BATCH_SIZE,
            LEARNING_RATE_DECAY,
            staircase=True)

    loss_mse = tf.reduce_mean(tf.square(y-y_))
    loss_total = loss_mse + tf.add_n(tf.get_collection('losses'))

    train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss_total)

    with tf.Session() as sess:
        init_op = tf.global_variables_initializer()
        sess.run(init_op)
        for i in range(STEPS):
            start = (i*BATCH_SIZE) % 300
            end = start + BATCH_SIZE
            sess.run(train_step, feed_dict={x: X[start:end], y_:Y_[start:end]})
            if i % 2000 == 0:
                loss_v = sess.run(loss_total, feed_dict={x:X, y_:Y_})
                print("After %d steps, loss is: %f" % (i, loss_v))

        xx, yy = np.mgrid[-3:3:.01, -3:3:.01]
        grid = np.c_[xx.ravel(), yy.ravel()]
        probs = sess.run(y, feed_dict={x:grid})
        probs = probs.reshape(xx.shape)

    plt.scatter(X[:,0], X[:,1], c=np.squeeze(Y_c))
    plt.contour(xx, yy, probs, levels=[.5])
    plt.show()

if __name__ == '__main__':
    backward()

VI. Fully connected network basics

(1) The MNIST data set

  • The MNIST data set contains 70k images: 60k for training and 10k for testing. Each image is 28*28 pixels (= 784, i.e. a one-dimensional array of length 784)
  • In the array, every value lies between 0 and 1: black background is 0, white is 1, and the closer to 1, the whiter the pixel
  • Each image has a label (a one-dimensional array of length 10) whose entries give the probability of each digit 0~9
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('(save path)./data/',one_hot=True) # stored in one-hot form

# return the number of samples in each subset
print "train data size:", mnist.train.num_examples
print "validation data size:", mnist.validation.num_examples
print "test data size:", mnist.test.num_examples

# return labels and data
mnist.train.labels[0] # returns the label (or image) with the given index in the training set
mnist.train.images[0]

# take a small batch of data, ready to feed to the network for training
BATCH_SIZE = 200 # feed 200 images at a time
xs, ys = mnist.train.next_batch(BATCH_SIZE) # randomly draw BATCH_SIZE images and labels from the training set and assign them to xs and ys
print "xs shape:", xs.shape # (200,784): 200 rows, 784 pixels per row
print "ys shape:", ys.shape # (200,10): 200 rows, 10 label entries per row
A few functions (a short sketch follows this list)
  1. tf.get_collection("") takes all variables in a collection and returns them as a list
  2. tf.add_n([]) adds the corresponding elements of the list
  3. tf.cast(x,dtype) converts x to type dtype
  4. tf.argmax(x,axis) returns the index of the maximum value, e.g. tf.argmax([[1,0,0]],1) looks for the maximum along dimension 1 and returns 0
  5. os.path.join("home","name") is an os-module function that joins paths (home/name)
  6. string.split() slices a string at the given separator and returns the list of parts
  7. with tf.Graph().as_default() as g: the nodes defined inside are placed in computation graph g
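A minimal sketch of a few of these helpers (the checkpoint-style file name is made up for illustration; slicing the step number out of it with split() is the same trick used in test.py below):

#coding:utf-8
import os
import tensorflow as tf

print os.path.join("home", "name")                # home/name
print "model-3000".split('-')[-1]                 # '3000': split the string and take the last part

v  = tf.constant([1, 0, 0])
v2 = tf.constant([[1, 0, 0]])
with tf.Session() as sess:
    print sess.run(tf.argmax(v2, 1))              # [0]: index of the maximum along dimension 1
    print sess.run(tf.cast(v, tf.float32))        # [1. 0. 0.]
    print sess.run(tf.add_n([v, v]))              # [2 0 0]: element-wise sum over the list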
Saving and loading a model (a complete save/restore sketch follows these snippets)
# save
saver=tf.train.Saver() # instantiate a saver object
with tf.Session() as sess:
    for i in range(STEPS):
        if i % rounds == 0:
            saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME), global_step=global_step)

# load
with tf.Session() as sess:
    ckpt=tf.train.get_checkpoint_state(save_path)
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)

# instantiate a saver that can restore the moving averages
ema=tf.train.ExponentialMovingAverage(moving_average_decay)
ema_restore=ema.variables_to_restore()
saver=tf.train.Saver(ema_restore)

# compute the accuracy
correct_prediction=tf.equal(tf.argmax(y,1),tf.argmax(y_,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
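A minimal end-to-end sketch of the save-then-load pattern with tf.train.Saver (the variable, the ./model directory and the step number 100 are made up for illustration):

#coding:utf-8
import os
import tensorflow as tf

if not os.path.exists("./model"):
    os.makedirs("./model")                        # make sure the save directory exists

w = tf.Variable(3.0, name="w")
saver = tf.train.Saver()                          # instantiate the saver object

with tf.Session() as sess:                        # save
    sess.run(tf.global_variables_initializer())
    saver.save(sess, os.path.join("./model", "demo"), global_step=100)

with tf.Session() as sess:                        # load
    ckpt = tf.train.get_checkpoint_state("./model")
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)
        print sess.run(w)                         # 3.0, restored from the checkpoint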

(2) Building a fully connected neural network in modules

A problem encountered
  1. TensorFlow IOError: [Errno socket error] [Errno 104] Connection reset by peer
    Fix: this is a network problem; check whether http://yann.lecun.com/exdb/mnist/ is reachable, adjust the network settings or get past the firewall, and the error disappears once the site can be reached. (The original note linked to another writer's solution here.)
backward.py
def backward(mnist):
    x=
    y_=
    y=
    global_step=
    loss=
    # <regularization, exponentially decaying learning rate, moving average>
    train_step=
    instantiate the saver
    with tf.Session() as sess:
        initialize
        for i in range(STEPS):
            sess.run(train_step,feed_dict={x: ,y_: })
            if i % rounds ==0:
                print
                saver.save( )

# loss function with regularization
# add to backward.py:
ce=tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y,labels=tf.argmax(y_,1))
cem=tf.reduce_mean(ce)
loss=cem+tf.add_n(tf.get_collection('losses'))
# add to forward.py:
if regularizer != None: tf.add_to_collection('losses',tf.contrib.layers.l2_regularizer(regularizer)(w))

# learning rate learning_rate
# add to backward.py:
learning_rate=tf.train.exponential_decay(
    LEARNING_RATE_BASE,
    global_step,
    LEARNING_RATE_STEP,
    LEARNING_RATE_DECAY,
    staircase=True)

# moving average ema
# add to backward.py:
ema=tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY,global_step)
ema_op=ema.apply(tf.trainable_variables())
with tf.control_dependencies([train_step,ema_op]):
    train_op=tf.no_op(name='train')
test.py
def test(mnist):
    with tf.Graph().as_default() as g:
        define x, y_, y
        instantiate a saver that can restore the moving averages
        compute the accuracy
        while True:
            with tf.Session() as sess:
                ckpt=tf.train.get_checkpoint_state(save_path) # load the ckpt model
                if ckpt and ckpt.model_checkpoint_path: # if a ckpt model exists, restore it
                    saver.restore(sess,ckpt.model_checkpoint_path) # restore the session
                    global_step=ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1] # recover the round number
                    accuracy_score=sess.run(accuracy,feed_dict={x:mnist.test.images,y_:mnist.test.labels}) # compute the accuracy
                    print the result
                else: # if there is no model
                    print a message
                    return

def main():
    mnist=input_data.read_data_sets("./data/",one_hot=True)
    test(mnist)
if __name__ == '__main__':
    main()
