Dive into Deep Learning
3.1 Linear Regression
Starting with this chapter, we study linear regression, then work through various deep learning models step by step.
3.1.1 Basic Elements of Linear Regression
3.1.1.1 Model Definition
$\hat{y} = x_1 w_1 + x_2 w_2 + b$
- $w$: weight
- $b$: bias
3.1.1.2 Model Training
- Training data
  - training data set
  - training set
  - sample
  - label: $y$ (the value to predict; the model's output is $\hat{y}$)
  - features: $x_1, x_2$
- Loss function
  A commonly used loss function is the squared loss:
$$l^{(i)}(w_1, w_2, b) = \frac{1}{2}\left(\hat{y}^{(i)} - y^{(i)}\right)^2$$
- Optimization algorithm
  The directional derivative has the largest magnitude when the unit direction vector is aligned with the gradient (or exactly opposite to it), which is why gradient descent steps the parameters along the negative gradient; see the update rule sketched below.
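Concretely, this chapter optimizes with mini-batch stochastic gradient descent. A sketch of the update (following the book), with learning rate $\eta$ and a randomly sampled mini-batch $\mathcal{B}$:

$$(w_1, w_2, b) \leftarrow (w_1, w_2, b) - \frac{\eta}{|\mathcal{B}|} \sum_{i \in \mathcal{B}} \partial_{(w_1, w_2, b)}\, l^{(i)}(w_1, w_2, b)$$

This is what the `sgd` function in Section 3.2.6 implements: `lr` plays the role of $\eta$ and `batch_size` the role of $|\mathcal{B}|$.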
3.1.1.3 Model Prediction
After training, the learned parameters $\hat{w}_1, \hat{w}_2, \hat{b}$ are used to predict the label of any new feature vector.
3.1.2 Representations of Linear Regression
3.1.2.1 Neural Network Diagram
Omitted.
3.1.2.2 Vectorized Computation
First define two 10000-dimensional vectors and add them element by element in a Python loop:
import torch
from time import time

a = torch.ones(10000)
b = torch.ones(10000)

# add the two vectors element by element in a Python loop
start = time()
c = torch.zeros(10000)
for i in range(10000):
    c[i] = a[i] + b[i]
print(time() - start)
0.09472012519836426
# vectorized addition in a single call
start = time()
c = a + b
print(time() - start)
0.0010044574737548828
As the timings show, vectorized computation is dramatically faster (nearly 100x here).
3.2 Linear Regression from Scratch
%matplotlib inline
# render plots inline
import torch
#from IPython import display
#from matplotlib import pyplot as plt
import numpy as np
#import random
import d21zh_pytorch
from d21zh_pytorch import *
3.2.1 Generating the Dataset
num_inputs = 2
num_examples = 1000
true_w = [2,-3.4]
true_b = 4.2
features = torch.from_numpy(np.random.normal(0, 1, (num_examples, num_inputs)))  # standard-normal features (note: float64)
labels = true_w[0] * features[:, 0] + true_w[1] * features[:, 1] + true_b
labels += torch.from_numpy(np.random.normal(0, 0.01, size=labels.size()))  # add Gaussian noise
#def use_svg_display():
#    display.set_matplotlib_formats('svg')
#
#def set_figsize(figsize=(3.5, 3.5)):
#    use_svg_display()
#    plt.rcParams['figure.figsize'] = figsize
#set_figsize()
#plt.scatter(features[:,1].numpy(), labels.numpy(), 1);
# these helpers have all been moved into the d21zh_pytorch package
d21zh_pytorch.set_figsize()
d21zh_pytorch.plt.scatter(features[:, 0].numpy(), labels.numpy(), 1);
d21zh_pytorch.plt.scatter(features[:, 1].numpy(), labels.numpy(), 1);  # plotting helpers now live in d21zh_pytorch
# both scatter plots show a clear linear relationship between labels and each feature
3.2.2 Reading the Data
# note: data_iter is a generator, yielding one random mini-batch at a time
batch_size = 10
for x, y in d21zh_pytorch.data_iter(batch_size, features, labels):
    print(x)
    print(y)
    break
tensor([[-0.4564, 1.3566],
[-0.4095, 0.9336],
[ 0.2560, 0.5123],
[-0.5850, 0.8176],
[-0.2386, 0.0750],
[ 1.0011, -0.9369],
[ 1.1602, -1.5654],
[-0.1321, -0.4781],
[ 1.0826, -1.1944],
[ 1.1887, -0.3131]], dtype=torch.float64)
tensor([-1.3203, 0.2044, 2.9585, 0.2575, 3.4725, 9.3828, 11.8428, 5.5575,
10.4262, 7.6297], dtype=torch.float64)
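The `data_iter` used above lives in the note's `d21zh_pytorch` package and is not shown here. A minimal sketch of what it plausibly contains, following the book's reference implementation: a generator that shuffles example indices and yields random mini-batches, which is why the `for ... in` loop above works.

import random
import torch

def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)  # visit examples in random order
    for i in range(0, num_examples, batch_size):
        j = torch.LongTensor(indices[i: min(i + batch_size, num_examples)])  # last batch may be smaller
        yield features.index_select(0, j), labels.index_select(0, j)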
3.2.3 Initialization
w = torch.tensor(np.random.normal(0, 0.01, (num_inputs, 1)))  # float64, matching features
b = torch.zeros(1)  # float32
Gradients must be computed with respect to these parameters, so enable autograd tracking:
w.requires_grad_(True)
b.requires_grad_(True)
tensor([0.], requires_grad=True)
3.2.4 Defining the Model
Implemented as a function in d21zh_pytorch; a sketch follows.
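Following the book's reference implementation, `linreg` is plausibly just a matrix product plus a broadcast bias:

def linreg(X, w, b):
    # forward pass: Xw + b, with b broadcast over the batch
    return torch.mm(X, w) + b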
3.2.5 Defining the Loss Function
Implemented as a function in d21zh_pytorch; a sketch follows.
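Plausibly a direct transcription of the squared loss from Section 3.1.1.2, with `y` reshaped to match the shape of `y_hat` before subtracting:

def squared_loss(y_hat, y):
    # per-example squared loss; the factor 1/2 matches the formula in 3.1.1.2
    return (y_hat - y.view(y_hat.size())) ** 2 / 2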
3.2.6 Defining the Optimization Algorithm
Implemented as a function in d21zh_pytorch; a sketch follows.
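Plausibly the mini-batch SGD update from Section 3.1.1.2, applied in place via `param.data` so the update itself is not tracked by autograd:

def sgd(params, lr, batch_size):
    # one mini-batch SGD step: param <- param - lr * grad / |B|
    for param in params:
        param.data -= lr * param.grad / batch_size

Dividing by `batch_size` averages the gradient over the mini-batch, since the loss in Section 3.2.7 is summed rather than averaged.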
3.2.7 Training the Model
lr = 0.03
num_epochs = 3
net = linreg
loss = squared_loss

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y).sum()  # sum the per-example losses so backward() needs no argument
        l.backward()
        sgd([w, b], lr, batch_size)
        # reset gradients, since backward() accumulates them
        w.grad.data.zero_()
        b.grad.data.zero_()
    train_l = loss(net(features, w, b), labels)
    print('epoch %d,loss %f'%(epoch+1, train_l.mean().item()))
epoch 1,loss 0.042516
epoch 2,loss 0.000171
epoch 3,loss 0.000052
# compare the learned parameters with the ground truth
print(true_w, '\n', w, '\n')
print(true_b, '\n', b, '\n')
[2, -3.4]
tensor([[ 1.9987],
[-3.4007]], dtype=torch.float64, requires_grad=True)
4.2
tensor([4.1996], requires_grad=True)
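The learned parameters are close to the true ones ($w \approx [2, -3.4]$, $b \approx 4.2$). Prediction (Section 3.1.1.3) is then just a forward pass; a minimal sketch with hypothetical new inputs:

with torch.no_grad():  # no gradients needed at inference time
    x_new = torch.tensor([[1.0, 2.0], [3.0, -1.0]], dtype=torch.float64)  # hypothetical feature vectors
    print(net(x_new, w, b))  # predictions, approx. 2*x1 - 3.4*x2 + 4.2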