cs224n_Lecture7_TensorFlow Framework and Linear Regression Implementation


不一样的雅兰酱

Category: NLP. Tags: tensorflow, linear regression

Article link: https://blog.csdn.net/silver1225/article/details/100094788


Introduction to TensorFlow


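An Overview of Deep Learning Frameworks

Why use a mature framework instead of writing everything from scratch?

- These frameworks make it easier to scale machine learning code.
- They compute gradients automatically, so we can focus on the high-level math.
- They standardize machine learning applications, which makes sharing and communication easier.
- They bring together a variety of algorithms, ideas, abstractions, and programming languages.
- They provide an interface to GPUs for parallel computation.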

What is TensorFlow?

TensorFlow is a deep learning framework: an open-source software library from Google for numerical computation using flow graphs.

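How TensorFlow Works: The Basic Paradigm

The core idea

Express a numeric computation as a graph:
- Graph nodes are operations, which can have any number of inputs and outputs.
- Graph edges are tensors (n-dimensional arrays) that flow between nodes.

Example: a neural network with one hidden layer

The corresponding computation graph in TensorFlow evaluates:

$h = \mathrm{ReLU}(Wx + b)$

Here ReLU is the rectified linear unit activation function. Applying a nonlinear function to the linear input is what gives the network its expressive power. ReLU takes the maximum of its input and 0, i.e. max(input, 0).
$x$ is only a placeholder: it is filled with input at execution time, and when writing the program we only specify its type and shape.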
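
To make the formula concrete, here is a minimal pure-Python sketch of one hidden layer, independent of TensorFlow. The tiny sizes and the helper names `relu` and `hidden_layer` are illustrative assumptions, not part of the lecture code.

```python
def relu(values):
    """Element-wise max(v, 0) over a list of numbers."""
    return [max(v, 0.0) for v in values]

def hidden_layer(W, x, b):
    """Compute ReLU(W x + b) for a weight matrix W (list of rows), input x, bias b."""
    pre_activation = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(W, b)
    ]
    return relu(pre_activation)

# Toy example: 2 hidden units, 3 inputs (sizes chosen only for illustration).
W = [[1.0, -2.0, 0.5],
     [-1.0, 0.0, 2.0]]
x = [1.0, 2.0, 4.0]
b = [0.5, -1.0]
h = hidden_layer(W, x, b)  # the first unit is negative before ReLU, so it is clipped to 0
```

The first unit's pre-activation is -0.5 and the second's is 6.0, so ReLU zeroes the first and passes the second through unchanged.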

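Building the graph

To describe this flow graph, we only need to write the following code:

import tensorflow as tf

# b is a variable initialized to zeros, with 100 dimensions;
# w is a variable drawn from a uniform distribution between -1 and 1
b = tf.Variable(tf.zeros((100,)))
w = tf.Variable(tf.random_uniform((784, 100), -1, 1))

# x is a placeholder and is not initialized to any value; we only declare
# that it accepts 32-bit floats with shape (100, 784)
x = tf.placeholder(tf.float32, (100, 784))

# actually build our flow graph
h = tf.nn.relu(tf.matmul(x, w) + b)

This code only builds the graph. No input has been supplied yet, so the value of h cannot be obtained at this point. Note that the code never declares nodes or edges explicitly: TensorFlow constructs the graph automatically from the mathematical expression.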

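Getting output

So far we have only defined a graph. How do we execute it? We deploy the graph through a session onto some execution environment (CPU, GPU, Google's Tensor Processing Unit, ...). The session is the binding to the hardware environment in which all operations of the graph are executed.

sess.run(fetches, feeds)

Create a session object, then call its run method with two arguments, fetches and feeds:

fetches: a list of graph nodes; run returns the outputs of these nodes
feeds: a dictionary mapping graph nodes to concrete values

import numpy as np

# create a session object
sess = tf.Session()
# run the op that initializes all variables
sess.run(tf.global_variables_initializer())
# evaluate h, feeding a random (100, 784) array into the placeholder x
sess.run(h, {x: np.random.random((100, 784))})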
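
The split between building a graph and running it can be mimicked in a few lines of plain Python. This is only a toy sketch of the idea: the `Placeholder` and `Op` classes and the `run` function are hypothetical names invented here, and they say nothing about how TensorFlow is implemented internally.

```python
class Placeholder:
    """A graph node with no value of its own; it is filled in at run time."""
    def __init__(self, name):
        self.name = name

class Op:
    """A graph node that applies a function to the outputs of its input nodes."""
    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

def run(fetches, feeds):
    """Evaluate each node in `fetches`, taking placeholder values from `feeds`."""
    cache = {}

    def evaluate(node):
        if node in feeds:
            return feeds[node]
        if isinstance(node, Placeholder):
            raise ValueError("no value fed for placeholder " + node.name)
        if node not in cache:
            cache[node] = node.fn(*(evaluate(i) for i in node.inputs))
        return cache[node]

    return [evaluate(node) for node in fetches]

# Building the graph performs no arithmetic yet.
x = Placeholder("x")
doubled = Op(lambda v: 2 * v, x)
shifted = Op(lambda v: v + 1, doubled)

# Only run() produces numbers, and only for the fetched nodes.
outputs = run([doubled, shifted], {x: 3})
```

As with sess.run, the keys of the feed dictionary are graph nodes rather than strings, and fetching a node whose placeholder was not fed is an error.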

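How to train the model

For the optimization process, how do we define the loss?

# prediction is the network's output: a softmax over the top layer of the
# network, which yields a vector of class probabilities (softmax regression)
prediction = tf.nn.softmax(...)
# label is a placeholder for the true labels that the model is trained against
label = tf.placeholder(tf.float32, [100, 10])

# create the cross-entropy node
cross_entropy = -tf.reduce_sum(label * tf.log(prediction), axis=1)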
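
For a single example, the same loss can be worked out by hand. The sketch below uses only the standard library; the scores `[2.0, 1.0, 0.1]` and the helper names are made-up illustrations rather than anything from the lecture.

```python
import math

def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(label, prediction):
    """-sum(label * log(prediction)) for one example.

    Terms with a zero label are skipped because they contribute nothing.
    """
    return -sum(l * math.log(p) for l, p in zip(label, prediction) if l > 0)

prediction = softmax([2.0, 1.0, 0.1])
label = [1.0, 0.0, 0.0]  # one-hot: the true class is class 0
loss = cross_entropy(label, prediction)
```

With a one-hot label the sum collapses to the negative log-probability assigned to the true class, so the loss shrinks toward 0 as that probability approaches 1.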

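How to compute gradients
```
# create an optimizer and the op that performs one training step
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
```
TensorFlow has a general abstract class called the optimizer; each of its subclasses is an optimizer for a specific learning algorithm.

Running train_step computes gradients for all the variables in the model, because .minimize does two things here:
- First, it computes the gradient of its argument (here the cross-entropy) with respect to all the variables in our graph that it depends on.
- Second, it updates those variables according to the gradients.

Q: How are the gradients computed?
A: In TensorFlow, every graph node has an attached gradient operation, a prebuilt gradient of its output with respect to its inputs. So to compute the gradient of the cross-entropy with respect to all parameters, it is straightforward to apply the chain rule backward through the graph, which is backpropagation.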
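
The chain-rule idea in the answer above can be sketched on a three-node graph, loss = (w * x - y)^2. This is a hand-written illustration in plain Python, not TensorFlow's gradient machinery; the function names and the numbers are assumptions chosen for the example.

```python
def forward(w, x, y):
    """Forward pass through three nodes: multiply, subtract, square."""
    pred = w * x        # multiply node
    diff = pred - y     # subtract node
    loss = diff ** 2    # square node
    return pred, diff, loss

def backward(w, x, y):
    """Backward pass: multiply the local gradients of each node (chain rule)."""
    pred, diff, loss = forward(w, x, y)
    dloss_ddiff = 2 * diff   # local gradient of the square node
    ddiff_dpred = 1.0        # local gradient of the subtract node w.r.t. pred
    dpred_dw = x             # local gradient of the multiply node w.r.t. w
    return dloss_ddiff * ddiff_dpred * dpred_dw

w, x, y = 0.5, 3.0, 6.0
grad = backward(w, x, y)

# Sanity check against a central finite difference.
eps = 1e-6
numeric = (forward(w + eps, x, y)[2] - forward(w - eps, x, y)[2]) / (2 * eps)
```

Each node only needs to know its own local derivative; the product of those local derivatives along the path from the loss back to w gives the full gradient.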

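Building an iterative training schedule

# Build a dictionary that feeds values into the two placeholders defined earlier.
# x and label are graph nodes: the dictionary keys are graph nodes and the
# corresponding values are NumPy data.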
for i in range(1000):
    batch_x, batch_label = data.next_batch()
    sess.run(train_step,feed_dict={x: batch_x,
                                label: batch_label })
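
The loop relies on a `data` object that the notes do not define. Below is one possible stand-in, written in plain Python purely as an assumption about what such an object could look like: the `Dataset` class, its constructor arguments, and the reshuffling policy are all hypothetical.

```python
import random

class Dataset:
    """Hypothetical stand-in for the `data` object used in the training loop."""

    def __init__(self, inputs, labels, batch_size, seed=0):
        self.examples = list(zip(inputs, labels))
        self.batch_size = batch_size
        self.rng = random.Random(seed)
        self.position = 0
        self.rng.shuffle(self.examples)

    def next_batch(self):
        """Return the next (inputs, labels) mini-batch.

        When fewer than batch_size examples remain, reshuffle and start over.
        """
        if self.position + self.batch_size > len(self.examples):
            self.rng.shuffle(self.examples)
            self.position = 0
        batch = self.examples[self.position:self.position + self.batch_size]
        self.position += self.batch_size
        batch_x, batch_label = zip(*batch)
        return list(batch_x), list(batch_label)

data = Dataset(inputs=range(10), labels=[2 * i for i in range(10)], batch_size=4)
batch_x, batch_label = data.next_batch()
```

Shuffling inputs and labels together as pairs keeps every input aligned with its own label, which is the property the feed dictionary depends on.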

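Variable sharing

Q: Sometimes we want to create multiple instances of a graph, or train the same model on several GPUs across several machines, so the same variable appears in different places. How do we share one variable across those places?

A:

A naive approach:

Create a dictionary of variables at the top of the code, mapping name strings to the variables they stand for.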

variable_dict = {
    "weights":tf.Variable(tf.random_normal([782,100])),
    "biases":tf.Variable(tf.zeros([100]),name="biases")
}

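The drawback of this approach is that it breaks encapsulation.

A more mature approach:
- tf.variable_scope() provides a simple name-spacing scheme to avoid clashes.
- tf.get_variable() creates a variable with the given name if one does not already exist; otherwise it accesses the existing variable.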

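Summary

Using TensorFlow comes down to the following steps:

1. Create the graph
   a. Forward propagation / prediction
   b. Optimization op
2. Initialize a session
3. Run the graph in the session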

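Getting to Know TensorFlow Through Linear Regression

This part of the lecture was live coding in vim.

The starting skeleton is shown below. We need to complete the linear_regression() function to fit our data distribution.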

# linear regression
# Author: Nishith Khandwala (nishith@stanford.edu)
# Adapted from https://github.com/hans/ipython-notebooks/

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf
import matplotlib
matplotlib.use('TKAgg')
from matplotlib import pyplot as plt

'''
Good ole linear regression: find the best linear fit to our data
'''

def generate_dataset():
    # data is generated by y = 2x + e
    # where 'e' is sampled from a normal distribution
    x_batch = np.linspace(-1, 1, 101)
    y_batch = 2 * x_batch + np.random.randn(*x_batch.shape) * 0.3
    return x_batch, y_batch
    
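def linear_regression():
    # TODO: build the graph; left unimplemented in the starter code
    raise NotImplementedError

def run():
    # TODO: train the model and compute y_pred_batch before plotting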
    x_batch, y_batch = generate_dataset()

    plt.figure(1)
    plt.scatter(x_batch, y_batch)
    plt.plot(x_batch, y_pred_batch)
    plt.savefig('plot.png')
        
if __name__ == '__main__':
    run()

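The presenter explained nearly every line as he typed it, tying the code back to the TensorFlow programming workflow described earlier. The completed code is as follows: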

# linear regression
# Author: Nishith Khandwala (nishith@stanford.edu)
# Adapted from https://github.com/hans/ipython-notebooks/

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf
import matplotlib
matplotlib.use('TKAgg')
from matplotlib import pyplot as plt

'''
Good ole linear regression: find the best linear fit to our data
'''

def generate_dataset():
    # data is generated by y = 2x + e
    # where epsilon 'e' is sampled from a normal distribution
    x_batch = np.linspace(-1, 1, 101)
    y_batch = 2 * x_batch + np.random.randn(*x_batch.shape) * 0.3
    return x_batch, y_batch

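# We implement linear regression as a graph
def linear_regression():
    # First define the flow graph; step one is to create the placeholders.
    # x is a float placeholder. To keep the shape general we set shape=(None,),
    # so the number of examples per batch fed into the network can vary.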
    
    x = tf.placeholder(tf.float32, shape=(None,), name="x")
    y = tf.placeholder(tf.float32, shape=(None,), name="y")

    # define a variable scope
    with tf.variable_scope("lreg") as scope:
        w = tf.Variable(np.random.normal(),name="w")
        y_pred = tf.multiply(w,x)

        loss = tf.reduce_mean(tf.square(y_pred - y))    
    return x, y, y_pred, loss

def run():
    x_batch, y_batch = generate_dataset()

    x,y,y_pred, loss=linear_regression()
    # optimizer: gradient descent with learning rate 0.1, minimizing the model's loss
    optimizer = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

    init = tf.global_variables_initializer()
    with tf.Session() as session:
        session.run(init)

        feed_dict = {x: x_batch,y: y_batch}
        for _ in range(30):
            loss_val, _= session.run([loss,optimizer], feed_dict)
            print('loss_val:',loss_val.mean())

        y_pred_batch = session.run(y_pred,{x: x_batch})

    plt.figure(1)
    plt.scatter(x_batch, y_batch)
    plt.plot(x_batch, y_pred_batch)
    plt.savefig('plot.png')
    plt.show()
    
if __name__ == '__main__':
    run()

Results:

Loss values printed in the terminal:

loss_val: 0.6620599
loss_val: 0.587803
loss_val: 0.5233017
…
loss_val: 0.10778806
loss_val: 0.10634918
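
The slowly falling loss can be reproduced without TensorFlow. The sketch below applies the same update rule (gradient descent on the mean squared error of y_pred = w * x, learning rate 0.1, 30 steps) in plain Python. The fixed seed, the starting value w = 0.0, and the `fit_slope` helper are assumptions made for this example, so its numbers will not match the log above.

```python
import random

def fit_slope(xs, ys, learning_rate=0.1, steps=30, w=0.0):
    """Gradient descent on mean((w*x - y)^2) with respect to the single weight w."""
    n = len(xs)
    losses = []
    for _ in range(steps):
        errors = [w * x - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errors) / n)
        grad = sum(2 * e * x for e, x in zip(errors, xs)) / n
        w -= learning_rate * grad
    return w, losses

# Same data recipe as generate_dataset(): y = 2x + noise on 101 points in [-1, 1].
rng = random.Random(0)  # fixed seed, an assumption made for reproducibility
xs = [-1 + 2 * i / 100 for i in range(101)]
ys = [2 * x + rng.gauss(0, 0.3) for x in xs]

w, losses = fit_slope(xs, ys)
```

Because the loss is a convex quadratic in w and the step size is small, every step lowers the loss. After only 30 steps the slope is still short of its least-squares value, which is consistent with the loss in the log above still decreasing at the end of training.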

The script also saves the fitted line, plotted over the data points, to plot.png and displays it.
