Caffe2 - (15) A Simple Regression Example: Toy Regression

AIHGF

Article link: https://blog.csdn.net/zziahgf/article/details/79172040

Regression problem:

Assume a two-dimensional input $x$, a one-dimensional output $y$, a weight vector $w = [2.0, 1.5]$, and a bias $b = 0.5$, related by

$y = wx + b$
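As a quick check of the formula, the following minimal sketch evaluates the model for a single input. NumPy stands in for Caffe2 here, and the sample input `x` is a hypothetical value chosen only for illustration.

```python
import numpy as np

# Ground-truth parameters from the problem statement.
w = np.array([2.0, 1.5])
b = 0.5

# Hypothetical sample input, chosen only for illustration.
x = np.array([1.0, 2.0])

# y = w . x + b = 2.0 * 1.0 + 1.5 * 2.0 + 0.5
y = float(w @ x + b)
print(y)  # 5.5
```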

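Note:

The training data here is generated by Caffe2 operators; see [Toy Regression].

In real model training, the data is usually loaded from an external dataset, such as a Caffe DB (a key-value store); see [Caffe2 - (9) MNIST handwritten digit recognition].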

# Import Python packages
from caffe2.python import core, cnn, net_drawer, workspace, visualize
import numpy as np
import matplotlib.pyplot as plt

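1. Declaring the computation graphs

Two computation graphs are declared here:

init graph (init_net) - initializes the parameters and constants used in the computation.
main graph (train_net) - runs the SGD computation.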

init_net = core.Net("init")
# The ground truth parameters.
W_gt = init_net.GivenTensorFill([], "W_gt", shape=[1, 2], values=[2.0, 1.5])
B_gt = init_net.GivenTensorFill([], "B_gt", shape=[1], values=[0.5])
# Constant value ONE is used in weighted sum when updating parameters.
ONE = init_net.ConstantFill([], "ONE", shape=[1], value=1.)
# ITER is the iterator count.
ITER = init_net.ConstantFill([], "ITER", shape=[1], value=0, dtype=core.DataType.INT32)

# Parameters to be learned.
# The weights are initialized uniformly at random in [-1, 1]; the bias is initialized to 0.0.
W = init_net.UniformFill([], "W", shape=[1, 2], min=-1., max=1.)
B = init_net.ConstantFill([], "B", shape=[1], value=0.0)
print('Created init net.')
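To make the contents of the init net concrete, here is a rough NumPy sketch of the blobs it creates once it has run. This is an illustration only: a plain `dict` stands in for the Caffe2 workspace, the random seed is arbitrary, and `float32` is assumed as the default fill type.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# A plain dict stands in for the Caffe2 workspace (illustration only).
blobs = {
    # GivenTensorFill: fixed ground-truth parameters
    "W_gt": np.array([[2.0, 1.5]], dtype=np.float32),
    "B_gt": np.array([0.5], dtype=np.float32),
    # ConstantFill: the constant 1 and the integer iteration counter
    "ONE": np.full((1,), 1.0, dtype=np.float32),
    "ITER": np.zeros((1,), dtype=np.int32),
    # UniformFill in [-1, 1] and ConstantFill 0.0: the learnable parameters
    "W": rng.uniform(-1.0, 1.0, size=(1, 2)).astype(np.float32),
    "B": np.zeros((1,), dtype=np.float32),
}

for name, value in blobs.items():
    print(name, value.shape, value.dtype)
```

Running the init net again resets `W`, `B`, and `ITER` to fresh values, which is why it only needs to run once before training.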

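2. Defining the network

The network definition consists of three main steps:

The forward pass computes the loss;
The backward pass computes the derivatives (gradients) automatically;
The parameters are updated with SGD.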

train_net = core.Net("train")
# First, generate random data samples X and create the ground truth.
X = train_net.GaussianFill([], "X", shape=[64, 2], mean=0.0, std=1.0, run_once=0)
Y_gt = X.FC([W_gt, B_gt], "Y_gt")
# Add Gaussian noise to the ground truth.
noise = train_net.GaussianFill([], "noise", shape=[64, 1], mean=0.0, std=1.0, run_once=0)
Y_noise = Y_gt.Add(noise, "Y_noise")
# Y_noise needs no gradient, so StopGradient tells the auto-differentiation to skip it.
Y_noise = Y_noise.StopGradient([], "Y_noise")

# Linear regression prediction.
Y_pred = X.FC([W, B], "Y_pred")

# The loss is the squared L2 distance, averaged over the minibatch.
dist = train_net.SquaredL2Distance([Y_noise, Y_pred], "dist")
loss = dist.AveragedLoss([], ["loss"])

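The code above does the following:

Randomly generates a batch of data samples X (GaussianFill)
Generates the ground truth Y_gt from W_gt, B_gt, and the FC op, then adds Gaussian noise to obtain Y_noise
Makes a prediction with the current parameters W and B
Computes the loss of the output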
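The steps above can be sketched in plain NumPy as follows. This is not Caffe2 code: the helper `fc` is a hypothetical stand-in for the FC op, the seed is arbitrary, and the factor 0.5 in the distance reflects my reading of the SquaredL2Distance operator documentation, so treat it as an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

W_gt = np.array([[2.0, 1.5]])
B_gt = np.array([0.5])
W = rng.uniform(-1.0, 1.0, size=(1, 2))  # current (random) weights
B = np.zeros(1)                          # current bias


def fc(x, w, b):
    """Hypothetical stand-in for the FC op: x @ w.T + b."""
    return x @ w.T + b


# Sample a minibatch and build the noisy ground truth.
X = rng.normal(0.0, 1.0, size=(64, 2))
Y_gt = fc(X, W_gt, B_gt)
Y_noise = Y_gt + rng.normal(0.0, 1.0, size=(64, 1))

# Predict with the current parameters.
Y_pred = fc(X, W, B)

# Per-sample squared L2 distance (0.5 factor assumed), then the batch mean.
dist = 0.5 * np.sum((Y_noise - Y_pred) ** 2, axis=1)
loss = dist.mean()
print(loss)
```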

3. Visualizing the network

graph = net_drawer.GetPydotGraph(train_net.Proto().op, "train", rankdir="LR")
graph.write_png('graph.png')

[Figure: graph.png, the forward graph of train_net]

# Get gradients for all the computations above.
gradient_map = train_net.AddGradientOperators([loss])
graph = net_drawer.GetPydotGraph(train_net.Proto().op, "train", rankdir="LR")
graph.write_png('graph_gradient.png')

[Figure: graph_gradient.png, the train_net graph with gradient operators added]
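To see what the gradient operators compute for this model, the sketch below writes the gradients of the averaged loss by hand and checks them against central finite differences. It is a NumPy illustration, not the Caffe2 implementation; the function names are hypothetical, and the 0.5 factor in the loss is an assumption about SquaredL2Distance.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
X = rng.normal(size=(64, 2))
Y = X @ np.array([[2.0, 1.5]]).T + 0.5 + rng.normal(size=(64, 1))
W = rng.uniform(-1.0, 1.0, size=(1, 2))
B = np.zeros(1)


def loss_fn(W, B):
    """Mean over the batch of 0.5 * squared error."""
    pred = X @ W.T + B
    return np.mean(0.5 * np.sum((Y - pred) ** 2, axis=1))


def analytic_grads(W, B):
    """Hand-derived gradients of loss_fn with respect to W and B."""
    diff = (X @ W.T + B) - Y      # shape (64, 1)
    dW = diff.T @ X / len(X)      # shape (1, 2)
    dB = diff.mean(axis=0)        # shape (1,)
    return dW, dB


def numeric_grad(f, param, eps=1e-6):
    """Central finite differences, perturbing param in place."""
    grad = np.zeros_like(param)
    for idx in np.ndindex(param.shape):
        orig = param[idx]
        param[idx] = orig + eps
        up = f()
        param[idx] = orig - eps
        down = f()
        param[idx] = orig
        grad[idx] = (up - down) / (2 * eps)
    return grad


dW, dB = analytic_grads(W, B)
dW_num = numeric_grad(lambda: loss_fn(W, B), W)
dB_num = numeric_grad(lambda: loss_fn(W, B), B)
print(np.abs(dW - dW_num).max(), np.abs(dB - dB_num).max())
```

The two printed differences should be tiny, which indicates that the hand-written gradients match the loss.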

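4. Training the network

Once the gradients are available, the SGD part can be added to the graph:

Compute the learning rate for the current iteration.
Update the parameters.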

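# Increment the iteration counter by 1.
train_net.Iter(ITER, ITER)
# Compute the learning rate for the current iteration.
# base_lr is negative because the update below adds LR * gradient.
LR = train_net.LearningRate(ITER, "LR", base_lr=-0.1,
                            policy="step", stepsize=20, gamma=0.9)

# Weighted sum: W = ONE * W + LR * dW, and likewise for B.
train_net.WeightedSum([W, ONE, gradient_map[W], LR], W)
train_net.WeightedSum([B, ONE, gradient_map[B], LR], B)

# Visualize the graph again at this point.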
graph = net_drawer.GetPydotGraph(train_net.Proto().op, "train", rankdir="LR")
graph.write_png('graph2.png')

[Figure: graph2.png, the train_net graph with the SGD update operators added]
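The update rule can be sketched without Caffe2. The snippet below assumes the "step" policy computes `base_lr * gamma ** floor(iter / stepsize)`, which is my understanding of the LearningRate operator; `step_lr` and `weighted_sum` are hypothetical helper names, and the gradient value is made up for illustration.

```python
import numpy as np


def step_lr(it, base_lr=-0.1, stepsize=20, gamma=0.9):
    """Assumed 'step' policy: scale base_lr by gamma every `stepsize` iterations."""
    return base_lr * gamma ** (it // stepsize)


def weighted_sum(*pairs):
    """Sum of blob * weight over (blob, weight) pairs, like the WeightedSum op."""
    return sum(blob * weight for blob, weight in pairs)


W = np.array([[0.0, 0.0]])
dW = np.array([[1.0, -2.0]])   # made-up gradient, for illustration only
ONE = 1.0

lr = step_lr(1)                # the counter is incremented before it is read
W_new = weighted_sum((W, ONE), (dW, lr))
print(lr, W_new)               # lr is negative, so W moves against the gradient

# The learning rate shrinks in magnitude every 20 iterations.
print([step_lr(it) for it in (1, 20, 40)])
```

Keeping the sign in the learning rate lets a single WeightedSum op express the descent step, at the cost of a slightly unusual negative `base_lr`.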

Run the training:

workspace.RunNetOnce(init_net)
workspace.CreateNet(train_net)

# Inspect the parameters W and B before training.
print("Before training, W is: {}".format(workspace.FetchBlob("W")))
print("Before training, B is: {}".format(workspace.FetchBlob("B")))

# Train for 100 iterations.
for i in range(100):
    workspace.RunNet(train_net.Proto().name)

# The parameters after training.
print("After training, W is: {}".format(workspace.FetchBlob("W")))
print("After training, B is: {}".format(workspace.FetchBlob("B")))

print("Ground truth W is: {}".format(workspace.FetchBlob("W_gt")))
print("Ground truth B is: {}".format(workspace.FetchBlob("B_gt")))
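For readers without a Caffe2 installation, the same training procedure can be approximated in NumPy. This is a sketch under assumptions rather than an exact reproduction: the seed is arbitrary, the loss is assumed to carry a 0.5 factor, and the step learning-rate formula is my understanding of the LearningRate operator, so the numbers will differ from a real Caffe2 run.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
W_gt, B_gt = np.array([[2.0, 1.5]]), np.array([0.5])

# Same initialization scheme as the init net.
W = rng.uniform(-1.0, 1.0, size=(1, 2))
B = np.zeros(1)
print("Before training, W is:", W)
print("Before training, B is:", B)

for it in range(1, 101):  # counter starts at 1 because it is incremented first
    # Fresh minibatch and noisy targets on every iteration.
    X = rng.normal(size=(64, 2))
    Y = X @ W_gt.T + B_gt + rng.normal(size=(64, 1))

    # Gradients of mean(0.5 * squared error).
    diff = (X @ W.T + B) - Y
    dW = diff.T @ X / len(X)
    dB = diff.mean(axis=0)

    # Negative, step-decayed learning rate; the update adds lr * gradient.
    lr = -0.1 * 0.9 ** (it // 20)
    W = W + lr * dW
    B = B + lr * dB

print("After training, W is:", W)
print("After training, B is:", B)
print("Ground truth W is:", W_gt)
print("Ground truth B is:", B_gt)
```

Because every minibatch is noisy, the learned values end up close to, but not exactly at, the ground truth.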

Check how the parameters change during training:

workspace.RunNetOnce(init_net)
w_history = []
b_history = []
for i in range(50):
    workspace.RunNet(train_net.Proto().name)
    w_history.append(workspace.FetchBlob("W"))
    b_history.append(workspace.FetchBlob("B"))
w_history = np.vstack(w_history)
b_history = np.vstack(b_history)
plt.plot(w_history[:, 0], w_history[:, 1], 'r')
plt.axis('equal')
plt.xlabel('w_0')
plt.ylabel('w_1')
plt.grid(True)
plt.figure()
plt.plot(b_history)
plt.xlabel('iter')
plt.ylabel('b')
plt.grid(True)
plt.show()
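As a numeric complement to the plots, the distance between each recorded weight vector and the ground truth can be tracked. The helper `distance_to_truth` below is a hypothetical name, and the three-row history is made-up demo data; in practice the `w_history` array of shape (50, 2) built above would be passed in.

```python
import numpy as np


def distance_to_truth(history, truth):
    """Euclidean distance from each row of `history` to `truth`."""
    history = np.asarray(history, dtype=float)
    truth = np.asarray(truth, dtype=float).reshape(1, -1)
    return np.linalg.norm(history - truth, axis=1)


# Made-up demo history that moves in a straight line to the ground truth.
w_history_demo = np.array([[0.0, 0.0], [1.0, 0.75], [2.0, 1.5]])
errors = distance_to_truth(w_history_demo, [2.0, 1.5])
print(errors)  # distances 2.5, 1.25 and 0.0
```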