[Deep Learning Notes] 4: Linear Regression

1. References

Hung-yi Lee, Regression (lecture)
Dive-into-DL-Pytorch

2. Mathematical Formulation of Linear Regression

Difference between regression and classification: regression outputs a continuous value, while classification outputs a discrete value.

$$\hat{y} = Xw + b$$
where

$$\hat{y} = \begin{pmatrix} \hat{y}^{(1)} \\ \hat{y}^{(2)} \\ \vdots \\ \hat{y}^{(n)} \end{pmatrix}, \qquad y = \begin{pmatrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(n)} \end{pmatrix}, \qquad X = \begin{pmatrix} x^{(1)}_{1} & x^{(1)}_{2} & \dots & x^{(1)}_{m} \\ x^{(2)}_{1} & x^{(2)}_{2} & \dots & x^{(2)}_{m} \\ \vdots & \vdots & \ddots & \vdots \\ x^{(n)}_{1} & x^{(n)}_{2} & \dots & x^{(n)}_{m} \end{pmatrix}, \qquad w = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_m \end{pmatrix}$$

**Note**: $(x^{(i)}_{1}, x^{(i)}_{2}, \dots, x^{(i)}_{m}, y^{(i)})$ denotes the $i$-th sample; there are $n$ samples in total and $m$ features. $w$ is called the weight and $b$ the bias (a scalar suffices, added to the product via broadcasting); $\hat{y}$ is the model's prediction:
$$\hat{y}^{(i)} = \sum_{j=1}^{m} x^{(i)}_{j} w_{j} + b$$
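For instance, with $m = 2$ features and the ground-truth parameters used in the code below ($w_1 = 2$, $w_2 = -3.4$, $b = 4.2$), the sample $x^{(i)} = (1, 1)$ gives

$$\hat{y}^{(i)} = 2 \cdot 1 + (-3.4) \cdot 1 + 4.2 = 2.8$$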

**Loss function**: using the squared loss, for a single sample we have

$$l^{(i)}(\hat{y}^{(i)}, y^{(i)}) = \frac{1}{2}\left(\hat{y}^{(i)} - y^{(i)}\right)^2 = \frac{1}{2}\left(\sum_{j=1}^{m} x^{(i)}_{j} w_{j} + b - y^{(i)}\right)^2$$

with partial derivatives

$$\frac{\partial l^{(i)}(\hat{y}^{(i)}, y^{(i)})}{\partial w_j} = \left(\sum_{k=1}^{m} x^{(i)}_{k} w_{k} + b - y^{(i)}\right) x^{(i)}_{j}, \qquad \frac{\partial l^{(i)}(\hat{y}^{(i)}, y^{(i)})}{\partial b} = \sum_{k=1}^{m} x^{(i)}_{k} w_{k} + b - y^{(i)}$$

Over all $n$ samples, the total loss is

$$l(\hat{y}, y; \theta) = \frac{1}{2}\sum_{i=1}^{n}\left(\hat{y}^{(i)} - y^{(i)}\right)^2 = \frac{1}{2}(\hat{y} - y)^T(\hat{y} - y)$$

**Mini-batch stochastic gradient descent**: we optimize with mini-batch stochastic gradient descent. At each step, sample a minibatch $\mathcal{B}$ and update the parameters:

$$w_j \leftarrow w_j - \frac{\eta}{\vert\mathcal{B}\vert}\sum_{i\in\mathcal{B}} \frac{\partial l^{(i)}(\hat{y}^{(i)}, y^{(i)})}{\partial w_j} = w_j - \frac{\eta}{\vert\mathcal{B}\vert}\sum_{i\in\mathcal{B}}\left(\sum_{k=1}^{m} x^{(i)}_{k} w_{k} + b - y^{(i)}\right) x^{(i)}_{j}$$

$$b \leftarrow b - \frac{\eta}{\vert\mathcal{B}\vert}\sum_{i\in\mathcal{B}} \frac{\partial l^{(i)}(\hat{y}^{(i)}, y^{(i)})}{\partial b} = b - \frac{\eta}{\vert\mathcal{B}\vert}\sum_{i\in\mathcal{B}}\left(\sum_{k=1}^{m} x^{(i)}_{k} w_{k} + b - y^{(i)}\right)$$

Stacking the parameters into a single vector $\theta = (w_1, \dots, w_m, b)^T$, both updates become

$$\theta \leftarrow \theta - \frac{\eta}{\vert\mathcal{B}\vert}\sum_{i\in\mathcal{B}} \nabla_{\theta}\, l^{(i)}(\theta)$$

where

$$\nabla_{\theta} l^{(i)}(\theta) = \begin{pmatrix} \left(\sum_{k=1}^{m} x^{(i)}_{k} w_{k} + b - y^{(i)}\right) x^{(i)}_{1} \\ \left(\sum_{k=1}^{m} x^{(i)}_{k} w_{k} + b - y^{(i)}\right) x^{(i)}_{2} \\ \vdots \\ \left(\sum_{k=1}^{m} x^{(i)}_{k} w_{k} + b - y^{(i)}\right) x^{(i)}_{m} \\ \sum_{k=1}^{m} x^{(i)}_{k} w_{k} + b - y^{(i)} \end{pmatrix} = \begin{pmatrix} x^{(i)}_{1} \\ x^{(i)}_{2} \\ \vdots \\ x^{(i)}_{m} \\ 1 \end{pmatrix}\left(\hat{y}^{(i)} - y^{(i)}\right)$$
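As a quick sanity check, the derived gradient can be compared against PyTorch's autograd on a single random sample. A minimal sketch (the dimension `m = 3` and the random values are arbitrary):

```python
import torch

m = 3
x = torch.randn(m)
w = torch.randn(m, requires_grad=True)
b = torch.randn(1, requires_grad=True)
y = torch.randn(1)

y_hat = torch.dot(x, w) + b    # prediction for one sample
l = 0.5 * (y_hat - y) ** 2     # squared loss
l.backward()

# analytic gradient from the derivation above: (x, 1)^T * (y_hat - y)
dw = (y_hat - y).detach() * x  # dl/dw_j = (y_hat - y) * x_j
db = (y_hat - y).detach()      # dl/db   = (y_hat - y)
print(torch.allclose(w.grad, dw), torch.allclose(b.grad, db))  # True True
```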

3. PyTorch Implementation of Linear Regression

```python
%matplotlib inline
import torch
from torch import nn, autograd
from matplotlib import pyplot as plt
import numpy as np
import random
from IPython import display
```

Generate the dataset

```python
num_inputs = 2
num_examples = 1000
true_w = torch.tensor([[2], [-3.4]])  # tensors default to torch.float32
true_b = torch.tensor([4.2])
features = torch.randn(num_examples, num_inputs)     # torch.Size([1000, 2])
labels = torch.mm(features, true_w) + true_b
labels += torch.normal(0, 0.01, size=labels.size())  # torch.Size([1000, 1])
```
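As a quick check that the synthetic labels follow the linear model up to the injected noise:

```python
residual = labels - (torch.mm(features, true_w) + true_b)
print(residual.abs().max())  # small, on the order of the 0.01 noise std
```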

Inspect the dataset

```python
print(features[0], labels[0])
# tensor([ 0.6882, -1.0295]) tensor([9.0837])

def use_svg_display():
    # render figures as vector (SVG) graphics
    display.set_matplotlib_formats('svg')

def set_figure(figsize=(3.5, 2.5)):
    use_svg_display()
    # set the figure size
    plt.rcParams['figure.figsize'] = figsize

set_figure()
plt.scatter(features[:, 1].numpy(), labels.numpy(), 1)
```

(Scatter plot of `features[:, 1]` against `labels`, showing a clear linear relationship.)

Read the data

```python
def data_iter(batch_size, features, labels):
    # yield (features, labels) minibatches of size batch_size
    num_examples = len(labels)
    indices = list(range(num_examples))
    random.shuffle(indices)
    for i in range(0, num_examples, batch_size):
        # index_select requires a LongTensor (torch.int64) index
        j = torch.LongTensor(indices[i:min(i + batch_size, num_examples)])
        yield features.index_select(0, j), labels.index_select(0, j)
```
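For example, one can pull a single minibatch to confirm the shapes (a batch size of 10 here is just for illustration):

```python
X, y = next(data_iter(10, features, labels))
print(X.shape, y.shape)  # torch.Size([10, 2]) torch.Size([10, 1])
```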

Initialize model parameters

```python
w = torch.normal(0, 0.01, size=(num_inputs, 1))  # torch.float32
b = torch.zeros(1)
w.requires_grad = True
b.requires_grad = True
```

Define the model

```python
def linreq(X, w, b):
    # linear model: Xw + b (b is broadcast across the batch)
    return torch.mm(X, w) + b
```
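A quick shape check of the model on the full design matrix:

```python
print(linreq(features, w, b).shape)  # torch.Size([1000, 1])
```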

Define the loss function

```python
def squared_loss(y, y_hat):
    # reshape y_hat to y's shape before subtracting, to avoid unwanted broadcasting
    return (y - y_hat.view(y.size())) ** 2 / 2
```
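The `y_hat.view(y.size())` guards against a subtle broadcasting bug: subtracting a tensor of shape `(n,)` from one of shape `(n, 1)` broadcasts to an `(n, n)` matrix instead of an elementwise difference. A small illustration:

```python
u = torch.ones(3)     # shape (3,)
v = torch.ones(3, 1)  # shape (3, 1)
print((u - v).shape)                 # torch.Size([3, 3]) -- silent broadcast
print((u - v.view(u.size())).shape)  # torch.Size([3])    -- elementwise
```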

Define the optimizer

```python
def SGD(params, lr, batch_size):
    for param in params:
        # update .data directly so the step itself is not tracked by autograd
        param.data -= lr * param.grad / batch_size
```
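Writing to `param.data` updates the parameter without the step being recorded by autograd. An equivalent sketch in the style newer PyTorch code tends to prefer wraps the update in `torch.no_grad()`:

```python
def sgd_no_grad(params, lr, batch_size):
    # same update as SGD above, but guarded by no_grad instead of using .data
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size
```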

Train the model

```python
# hyperparameters
batch_size = 10  # minibatch size (e.g. 10)
lr = 0.03
num_epochs = 10
net = linreq
loss = squared_loss

# training loop
for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        y_hat = net(X, w, b)
        l = loss(y, y_hat).sum()
        l.backward()
        SGD([w, b], lr, batch_size)
        # don't forget to zero the gradients
        w.grad.data.zero_()
        b.grad.data.zero_()

    train_l = loss(labels, net(features, w, b)).sum().item()
    print("epochs:{} loss:{}".format(epoch + 1, train_l))
print("w:{} b:{}".format(w, b))
```

Output:

```
epochs:1 loss:33.648372650146484
epochs:2 loss:0.12297464162111282
epochs:3 loss:0.051339346915483475
epochs:4 loss:0.05117468535900116
epochs:5 loss:0.051132265478372574
epochs:6 loss:0.05133380368351936
epochs:7 loss:0.0512143149971962
epochs:8 loss:0.05118824541568756
epochs:9 loss:0.05112023279070854
epochs:10 loss:0.05111289396882057
w:tensor([[ 2.0003],
        [-3.3996]], requires_grad=True) b:tensor([4.2001], requires_grad=True)
```
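For comparison, here is a sketch of the same model written with `torch.nn` and `torch.utils.data`, along the lines of the concise implementation in Dive-into-DL-Pytorch (reusing `features`, `labels`, and the hyperparameters defined above):

```python
from torch import optim
from torch.utils.data import TensorDataset, DataLoader

dataset = TensorDataset(features, labels)
data_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

net2 = nn.Linear(num_inputs, 1)  # built-in affine layer: Xw + b
mse = nn.MSELoss()               # mean squared error loss
optimizer = optim.SGD(net2.parameters(), lr=lr)

for epoch in range(num_epochs):
    for X, y in data_loader:
        l = mse(net2(X), y)
        optimizer.zero_grad()    # zero gradients before backward
        l.backward()
        optimizer.step()         # apply the SGD update
    print("epochs:{} loss:{}".format(epoch + 1, mse(net2(features), labels).item()))
print("w:{} b:{}".format(net2.weight.data, net2.bias.data))
```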