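DataWhale Team Study Check-in (2)

Preface

Notes for the second check-in of the "Dive into Deep Learning" (《动手学深度学习》) team study.

Check-in content

Linear regression code implementation (based on PyTorch)

Theory review

For the theory behind linear regression, see the previous blog post.

Implementing the linear regression model from scratch

The code is run in Jupyter, which makes it easy to show the output of each step clearly.
1. Import the basic modules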
In [ ]:

# import packages and modules
%matplotlib inline
import torch
from IPython import display
from matplotlib import pyplot as plt
from mpl_toolkits import mplot3d as p3d
import numpy as np
import random

print(torch.__version__)

2. Generate the dataset
We use a linear model to generate a dataset of 1000 samples. The linear relationship used to generate the data is:
$$price = w_{area} \cdot area + w_{age} \cdot age + b$$
In [ ]:

# set input feature number 
num_inputs = 2
# set example number
num_examples = 1000

# set the true weights and bias used to generate the corresponding labels
true_w = [2, -3.4]
true_b = 4.2

features = torch.randn(num_examples, num_inputs,
                      dtype=torch.float32)
labels = true_w[0] * features[:, 0] + true_w[1] * features[:, 1] + true_b
labels += torch.tensor(np.random.normal(0, 0.01, size=labels.size()),
                       dtype=torch.float32)

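The randomly generated `features` is a $1000 \times 2$ tensor: the first column represents the attribute $area$ and the second column the attribute $age$. `labels` corresponds to $price$ in the formula and holds one value per sample (in the code it is a 1-D tensor of shape `(1000,)`, not a $1000 \times 1$ matrix). The last statement adds a random perturbation to the labels computed from the formula, which makes the data more realistic (clearly, $price$ is not determined by $area$ and $age$ alone; adding random noise stands in for the influence of other unknown factors on $price$).
Of course, this is only a simple example, and no effort was made to simulate the attribute values so that they match reality more closely. (For example, in real life $area$ and $age$ must be positive, and $area$ and $age$ differ considerably in order of magnitude.)
This post focuses on helping readers understand linear regression through its code implementation. In a real project, the data would come from its own sources.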
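As a quick check of the shapes described above, the sketch below regenerates the data in matrix form. It is only an illustration: the fixed seed and the use of `torch.mv` and `torch.randn_like` are choices made here, not part of the original code.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility

num_inputs, num_examples = 2, 1000
true_w = torch.tensor([2.0, -3.4])
true_b = 4.2

features = torch.randn(num_examples, num_inputs)
# matrix-vector product: same as true_w[0] * area + true_w[1] * age
clean_labels = torch.mv(features, true_w) + true_b
# Gaussian noise with standard deviation 0.01
labels = clean_labels + 0.01 * torch.randn_like(clean_labels)

print(features.shape)  # torch.Size([1000, 2])
print(labels.shape)    # torch.Size([1000])
```

Note that `labels` is one-dimensional, which is why the loss function later reshapes it before subtracting.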
3. Visualize the generated data
In [ ]: Show a 2-D scatter plot of one feature column against labels

plt.scatter(features[:, 1].numpy(), labels.numpy(), 1);

In [ ]: Show a 3-D scatter plot of both feature columns against labels

fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # Axes3D(fig) no longer attaches itself to the figure in recent Matplotlib
X = features[:, 0].numpy()
Y = features[:, 1].numpy()
Z = labels.numpy()
ax.scatter3D(X, Y, Z);

4. Read the dataset
In [ ]:

def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)  # shuffle so that samples are read in random order
    for i in range(0, num_examples, batch_size):
        # the last batch may contain fewer than batch_size samples
        j = torch.LongTensor(indices[i: min(i + batch_size, num_examples)])
        yield features.index_select(0, j), labels.index_select(0, j)

In [ ]: Take one batch of 10 samples and inspect it

batch_size = 10

for X, y in data_iter(batch_size, features, labels):
    print(X, '\n', y)
    break
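To see how the iterator behaves over a whole epoch, the sketch below runs it on a toy dataset of 25 samples (an assumed size, chosen so that the last batch is short). It restates `data_iter` so that it runs on its own; Python slicing already stops at the end of the list, so `min` is not needed here.

```python
import random
import torch

def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)
    for i in range(0, num_examples, batch_size):
        j = torch.tensor(indices[i: i + batch_size], dtype=torch.long)
        yield features.index_select(0, j), labels.index_select(0, j)

# toy data (assumption): 25 samples, so the last batch is short
features = torch.arange(50, dtype=torch.float32).reshape(25, 2)
labels = torch.arange(25, dtype=torch.float32)

batch_sizes = [len(y) for _, y in data_iter(10, features, labels)]
seen = torch.cat([y for _, y in data_iter(10, features, labels)])

print(batch_sizes)                               # [10, 10, 5]
print(sorted(seen.tolist()) == labels.tolist())  # True: each sample appears exactly once
```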

5. Initialize the model parameters
In [ ]:

w = torch.tensor(np.random.normal(0, 0.01, (num_inputs, 1)), dtype=torch.float32)
b = torch.zeros(1, dtype=torch.float32)

w.requires_grad_(requires_grad=True)
b.requires_grad_(requires_grad=True)

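Initialize the parameters that the model needs to learn and enable gradient tracking on them (during optimization, these two parameters are updated iteratively using their gradients). $w$ is a $2 \times 1$ tensor, corresponding to $\begin{bmatrix} w_{area} \\ w_{age} \end{bmatrix}$, and $b$ is a scalar (stored as a one-element tensor).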
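A minimal sketch of an equivalent initialization that stays within PyTorch, assuming the same standard deviation of 0.01. Using `torch.randn` instead of NumPy is a choice made here for illustration, not the original code.

```python
import torch

num_inputs = 2

# small random weights (std 0.01) and a zero bias, both tracked by autograd
w = (0.01 * torch.randn(num_inputs, 1)).requires_grad_()
b = torch.zeros(1, requires_grad=True)

print(w.shape, w.requires_grad)  # torch.Size([2, 1]) True
print(b.shape, b.requires_grad)  # torch.Size([1]) True
print(w.grad)                    # None: no backward pass has run yet
```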
6. Define the model
Define the model whose parameters will be trained:
$$price = w_{area} \cdot area + w_{age} \cdot age + b$$
In [ ]:

def linreg(X, w, b):
    return torch.mm(X, w) + b
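A small hand-checkable example of what `linreg` computes. The input values are made up for illustration; the weights and bias are the true values used to generate the data.

```python
import torch

def linreg(X, w, b):
    return torch.mm(X, w) + b

X = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])
w = torch.tensor([[2.0], [-3.4]])
b = torch.tensor([4.2])

y_hat = linreg(X, w, b)
print(y_hat.shape)  # torch.Size([2, 1])
# row 0: 2*1 - 3.4*2 + 4.2 = -0.6
# row 1: 2*3 - 3.4*4 + 4.2 = -3.4
expected = torch.tensor([[-0.6], [-3.4]])
print(torch.allclose(y_hat, expected, atol=1e-5))  # True
```

The one-element `b` is broadcast across all rows, so the output has one column and one row per sample.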

7. Define the loss function
Use the mean squared error loss:
$$l^{(i)}(\bm w, b) = \frac{1}{2} \left(\hat y^{(i)} - y^{(i)}\right)^2$$
In [ ]:

def squared_loss(y_hat, y): 
    return (y_hat - y.view(y_hat.size())) ** 2 / 2

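Note: `y.view(y_hat.size())` works like `y.reshape(y_hat.size())` here. It reshapes `y` from `(batch_size,)` to `(batch_size, 1)` so that it matches `y_hat`; without it, the subtraction would broadcast to a `(batch_size, batch_size)` tensor.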
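The effect of the reshape can be seen on a three-sample toy example (values chosen here for illustration):

```python
import torch

y_hat = torch.tensor([[1.0], [2.0], [3.0]])  # shape (3, 1), like the model output
y = torch.tensor([1.0, 2.0, 5.0])            # shape (3,), like labels

wrong = (y_hat - y) ** 2 / 2                     # broadcasts to (3, 3)
right = (y_hat - y.view(y_hat.size())) ** 2 / 2  # stays (3, 1)

print(wrong.shape)              # torch.Size([3, 3])
print(right.shape)              # torch.Size([3, 1])
print(right.view(-1).tolist())  # [0.0, 0.0, 2.0]
```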
8. Define the optimization function
Use mini-batch stochastic gradient descent:
$$(\bm w, b) \leftarrow (\bm w, b) - \frac{\eta}{|B|} \sum_{i \in B} \partial_{(\bm w, b)} l^{(i)}(\bm w, b)$$
In [ ]:

def sgd(params, lr, batch_size): 
    for param in params:
        param.data -= lr * param.grad / batch_size  # use .data so the update itself is not tracked by autograd
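A possible variant of the same update, sketched with `torch.no_grad()` instead of `.data`. This is an alternative shown for comparison, not the original code; the one-parameter example below is made up so that the result can be checked by hand.

```python
import torch

def sgd(params, lr, batch_size):
    # torch.no_grad() keeps the in-place update out of the autograd graph
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size

w = torch.tensor([1.0], requires_grad=True)
loss = (w ** 2).sum()
loss.backward()                 # d(w^2)/dw = 2 at w = 1
sgd([w], lr=0.1, batch_size=1)  # w moves from 1.0 to 1.0 - 0.1 * 2 = 0.8
print(w)
```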

9. Training
Once the dataset, model, loss function, and optimization function have been defined, we can train the model.
In [ ]:

# hyperparameter initialization
lr = 0.03
num_epochs = 5

net = linreg
loss = squared_loss

# training
for epoch in range(num_epochs):  # repeat training num_epochs times
    # in each epoch, every sample in the dataset is used once

    # X holds the features and y the labels of one mini-batch
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y).sum()
        # compute the gradient of the mini-batch loss
        l.backward()
        # update the model parameters with mini-batch stochastic gradient descent
        sgd([w, b], lr, batch_size)
        # reset the parameter gradients
        w.grad.data.zero_()
        b.grad.data.zero_()
    train_l = loss(net(features, w, b), labels)
    print('epoch %d, loss %f' % (epoch + 1, train_l.mean().item()))
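The gradients are reset after every update because `backward()` adds to `.grad` rather than overwriting it. A minimal illustration on a single made-up parameter:

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

(3 * w).sum().backward()
first = w.grad.clone()        # gradient of 3*w is 3

(3 * w).sum().backward()      # no zeroing in between
accumulated = w.grad.clone()  # 3 + 3 = 6

w.grad.zero_()
(3 * w).sum().backward()
after_reset = w.grad.clone()  # back to 3

print(first.item(), accumulated.item(), after_reset.item())  # 3.0 6.0 3.0
```

Without the reset, each mini-batch update would use the sum of all gradients computed so far instead of the current one.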

In [ ]: Briefly compare the learned parameters with the true ones

w, true_w, b, true_b

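That completes the from-scratch implementation of the linear regression model.

Of course, we can also implement the linear regression model with PyTorch's built-in modules (`torch.nn`, `torch.optim`, `torch.utils.data`)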

1. Import the basic modules
In [ ]:

import torch
from torch import nn
import numpy as np
torch.manual_seed(1)

print(torch.__version__)
torch.set_default_dtype(torch.float32)  # replaces torch.set_default_tensor_type('torch.FloatTensor'), deprecated in recent PyTorch

2. Generate the dataset
In [ ]:

num_inputs = 2
num_examples = 1000

true_w = [2, -3.4]
true_b = 4.2

features = torch.tensor(np.random.normal(0, 1, (num_examples, num_inputs)), dtype=torch.float)
labels = true_w[0] * features[:, 0] + true_w[1] * features[:, 1] + true_b
labels += torch.tensor(np.random.normal(0, 0.01, size=labels.size()), dtype=torch.float)

3. Read the dataset
In [ ]:

import torch.utils.data as Data

batch_size = 10

# combine the features and labels of the dataset
dataset = Data.TensorDataset(features, labels)

# put the dataset into a DataLoader
data_iter = Data.DataLoader(
    dataset=dataset,            # torch TensorDataset format
    batch_size=batch_size,      # mini-batch size
    shuffle=True,               # whether to shuffle the data
    num_workers=2,              # load data with 2 worker subprocesses
)

In [ ]: Take one batch of 10 samples and inspect it

for X, y in data_iter:
    print(X, '\n', y)
    break
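The sketch below shows the same `DataLoader` on a toy dataset of 25 samples (an assumed size). It uses `num_workers=0`, which loads data in the main process; this is a choice made here so that the snippet also runs as a plain script on platforms where worker subprocesses need an `if __name__ == '__main__':` guard.

```python
import torch
import torch.utils.data as Data

features = torch.arange(50, dtype=torch.float32).reshape(25, 2)
labels = torch.arange(25, dtype=torch.float32)

dataset = Data.TensorDataset(features, labels)
loader = Data.DataLoader(dataset, batch_size=10, shuffle=True, num_workers=0)

batch_shapes = [tuple(X.shape) for X, _ in loader]

print(len(dataset))  # 25
print(len(loader))   # 3 batches per epoch
print(batch_shapes)  # [(10, 2), (10, 2), (5, 2)]
```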

4. Define the model
In [ ]:

class LinearNet(nn.Module):
    def __init__(self, n_feature):
        super(LinearNet, self).__init__()      # call the parent class constructor
        self.linear = nn.Linear(n_feature, 1)  # function prototype: `torch.nn.Linear(in_features, out_features, bias=True)`

    def forward(self, x):
        y = self.linear(x)
        return y
    
net = LinearNet(num_inputs)
print(net)
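An `nn.Module` subclass such as `LinearNet` exposes its layer by attribute name (`net.linear`) and cannot be indexed. If integer indexing is preferred, one possible alternative is to build the model as an `nn.Sequential` container. The name `seq_net` below is made up for this sketch.

```python
import torch
from torch import nn

num_inputs = 2
seq_net = nn.Sequential(nn.Linear(num_inputs, 1))

print(seq_net[0])  # the Linear layer, reachable by index
out = seq_net(torch.zeros(4, num_inputs))
print(out.shape)   # torch.Size([4, 1])
```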

5. Initialize the model parameters
In [ ]:

from torch.nn import init

# LinearNet is not subscriptable, so the layer is accessed by its attribute name
init.normal_(net.linear.weight, mean=0.0, std=0.01)
init.constant_(net.linear.bias, val=0.0)  # or use `net.linear.bias.data.fill_(0)` to modify it directly

In [ ]: Inspect the network parameters

for param in net.parameters():
    print(param)

6. Define the loss function
In [ ]:

loss = nn.MSELoss()    # nn built-in squared loss function
                       # function prototype: `torch.nn.MSELoss(size_average=None, reduce=None, reduction='mean')`
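`nn.MSELoss` with the default `reduction='mean'` averages the squared errors and has no factor of $\frac{1}{2}$, so its value is twice the mean of the hand-written `squared_loss`. A small check on made-up values:

```python
import torch
from torch import nn

def squared_loss(y_hat, y):
    return (y_hat - y.view(y_hat.size())) ** 2 / 2

y_hat = torch.tensor([[1.0], [2.0]])
y = torch.tensor([0.0, 0.0])

mse = nn.MSELoss()(y_hat, y.view(-1, 1))  # (1 + 4) / 2 = 2.5
half = squared_loss(y_hat, y).mean()      # (0.5 + 2) / 2 = 1.25

print(mse.item(), half.item())  # 2.5 1.25
```

This is why loss values printed by the two implementations are not directly comparable.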

7. Define the optimization function
In [ ]:

import torch.optim as optim

optimizer = optim.SGD(net.parameters(), lr=0.03)   # built-in stochastic gradient descent optimizer
print(optimizer)  # function prototype: `torch.optim.SGD(params, lr=, momentum=0, dampening=0, weight_decay=0, nesterov=False)`
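Unlike the hand-written `sgd`, `optim.SGD` does not divide by the batch size; the averaging over the mini-batch is already done by `nn.MSELoss`. With the default settings (no momentum, no weight decay), one `step()` is simply `param <- param - lr * grad`, as this single-parameter sketch with made-up values suggests:

```python
import torch
import torch.optim as optim

w = torch.tensor([1.0], requires_grad=True)
optimizer = optim.SGD([w], lr=0.1)

optimizer.zero_grad()
loss = (w ** 2).sum()
loss.backward()   # gradient is 2 * w = 2
optimizer.step()  # w <- 1.0 - 0.1 * 2 = 0.8

print(torch.allclose(w.detach(), torch.tensor([0.8])))  # True
```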

8. Training
In [ ]:

num_epochs = 3
for epoch in range(1, num_epochs + 1):
    for X, y in data_iter:
        output = net(X)
        l = loss(output, y.view(-1, 1))
        optimizer.zero_grad() # reset gradient, equal to net.zero_grad()
        l.backward()
        optimizer.step()
    print('epoch %d, loss: %f' % (epoch, l.item()))

In [ ]: Briefly compare the learned parameters with the true ones

# result comparison
dense = net.linear  # LinearNet is not subscriptable; access the layer by name
print(true_w, dense.weight.data)
print(true_b, dense.bias.data)
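For reference, the steps of the concise implementation can be assembled into one script that runs outside Jupyter. This is a sketch under a few assumptions made here: the data is generated with `torch.randn` so that `torch.manual_seed` controls it, the `DataLoader` uses its default `num_workers=0`, and the loss printed per epoch is computed on the full dataset rather than on the last mini-batch.

```python
import torch
from torch import nn, optim
import torch.utils.data as Data

torch.manual_seed(1)

num_inputs, num_examples = 2, 1000
true_w, true_b = [2, -3.4], 4.2

features = torch.randn(num_examples, num_inputs)
labels = true_w[0] * features[:, 0] + true_w[1] * features[:, 1] + true_b
labels += 0.01 * torch.randn(num_examples)

loader = Data.DataLoader(Data.TensorDataset(features, labels),
                         batch_size=10, shuffle=True)

class LinearNet(nn.Module):
    def __init__(self, n_feature):
        super().__init__()
        self.linear = nn.Linear(n_feature, 1)

    def forward(self, x):
        return self.linear(x)

net = LinearNet(num_inputs)
nn.init.normal_(net.linear.weight, mean=0.0, std=0.01)
nn.init.constant_(net.linear.bias, val=0.0)

loss = nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.03)

for epoch in range(1, 4):
    for X, y in loader:
        l = loss(net(X), y.view(-1, 1))
        optimizer.zero_grad()
        l.backward()
        optimizer.step()
    with torch.no_grad():  # evaluate on the full dataset without tracking gradients
        epoch_loss = loss(net(features), labels.view(-1, 1)).item()
    print('epoch %d, loss: %f' % (epoch, epoch_loss))

print(true_w, net.linear.weight.data)
print(true_b, net.linear.bias.data)
```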

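Comparison of the two implementations

  1. Implementation from scratch (recommended for learning)
    Gives a better understanding of the model and of the underlying principles of neural networks
  2. Concise implementation using PyTorch
    Allows the model to be designed and implemented more quickly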