# Deep Learning With PyTorch - Perceptrons, Sequential Models/Classification, and Fitting


### How do you build a simple network model with PyTorch?

#### Step 1. The perceptron and the basic structure of a multilayer perceptron (MLP)

$$a_1^{[1]} = \mathrm{sigmoid}\left(\omega_{1,1}^{[1]} x_1 + \omega_{2,1}^{[1]} x_2 + b_1^{[1]}\right)$$

$$a_2^{[1]} = \mathrm{sigmoid}\left(\omega_{1,2}^{[1]} x_1 + \omega_{2,2}^{[1]} x_2 + b_2^{[1]}\right)$$

$$a_1^{[2]} = \mathrm{sigmoid}\left(\omega_{1,1}^{[2]} a_1^{[1]} + \omega_{2,1}^{[2]} a_2^{[1]} + b^{[2]}\right) \qquad \text{(the logical output)}$$
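These three equations can be checked numerically. Here is a minimal plain-Python sketch of the 2-2-1 forward pass; the weight values are illustrative choices (not taken from the text) that make the network behave roughly like an AND gate:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x1, x2, W1, b1, W2, b2):
    # Hidden layer: a_j^[1] = sigmoid(w_{1,j} x1 + w_{2,j} x2 + b_j^[1])
    a1 = sigmoid(W1[0][0] * x1 + W1[1][0] * x2 + b1[0])
    a2 = sigmoid(W1[0][1] * x1 + W1[1][1] * x2 + b1[1])
    # Output layer: a_1^[2] = sigmoid(w_{1,1} a1 + w_{2,1} a2 + b^[2])
    return sigmoid(W2[0] * a1 + W2[1] * a2 + b2)

# Hypothetical AND-gate-like weights, chosen by hand for illustration
W1 = [[5.0, 5.0], [5.0, 5.0]]
b1 = [-8.0, -8.0]
W2 = [10.0, 10.0]
b2 = -15.0

print(mlp_forward(1, 1, W1, b1, W2, b2))  # close to 1
print(mlp_forward(0, 1, W1, b1, W2, b2))  # close to 0
```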

#### Step 2. The hyperplane $\omega^{T} x + b = 0$, i.e. $\omega^{T} x = -b$

$(A, B, C)$ is the normal vector of the plane, so the plane can be written in point-normal form: $A(x-x_0)+B(y-y_0)+C(z-z_0)=0$, where $(x_0, y_0, z_0)$ is a point on the plane. Expanding the point-normal form gives $Ax+By+Cz=Ax_0+By_0+Cz_0$.

In $n$ dimensions this generalizes to

$$w_1 x_1 + w_2 x_2 + \cdots + w_n x_n = w_1 x_1^{(0)} + w_2 x_2^{(0)} + \cdots + w_n x_n^{(0)},$$

which is exactly the hyperplane $\omega^{T} x = \omega^{T} x^{(0)}$.
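A quick numeric check of the point-normal form, using an arbitrary example plane invented here (not one from the text):

```python
# Plane with normal (A, B, C) = (1, 2, 3) through point (x0, y0, z0) = (1, 1, 1):
# A(x-x0) + B(y-y0) + C(z-z0) = 0, which expands to x + 2y + 3z = 6.
A, B, C = 1, 2, 3
x0, y0, z0 = 1, 1, 1
d = A * x0 + B * y0 + C * z0  # right-hand side A*x0 + B*y0 + C*z0 = 6

def on_plane(x, y, z):
    return A * x + B * y + C * z == d

print(on_plane(1, 1, 1))  # the anchor point itself lies on the plane
print(on_plane(6, 0, 0))  # 6 + 0 + 0 = 6, also on the plane
print(on_plane(0, 0, 0))  # 0 != 6, not on the plane
```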

##### The perceptron function

In Step 1 we saw that a perceptron plus an activation function can realize an AND gate:

$$\mathrm{Sigmoid}(\omega^{T} x+b) = \begin{cases} 1, & \omega^{T}x+b \ge 0\\ 0, & \omega^{T}x+b < 0\end{cases}$$

$$\mathrm{Sign}(\omega^{T} x+b) = \begin{cases} +1, & \omega^{T}x+b \ge 0\\ -1, & \omega^{T}x+b < 0\end{cases}$$

Over the set $M$ of misclassified samples, the perceptron loss is $\mathcal{L}(\omega,b) = -\sum_{x_i \in M} y_i(\omega^{T} x_i + b)$, with gradients

$$\begin{cases}\dfrac{\partial\mathcal{L}}{\partial\omega} = -\displaystyle\sum_{x_i \in M} y_i x_i\\[6pt] \dfrac{\partial\mathcal{L}}{\partial b} = -\displaystyle\sum_{x_i \in M} y_i\end{cases}$$

so gradient descent gives the update rule

$$\begin{cases}\omega \leftarrow \omega - \alpha\dfrac{\partial\mathcal{L}}{\partial\omega} = \omega + \alpha\displaystyle\sum_{x_i \in M} y_i x_i\\[6pt] b \leftarrow b - \alpha\dfrac{\partial\mathcal{L}}{\partial b} = b + \alpha\displaystyle\sum_{x_i \in M} y_i\end{cases}$$
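The sign activation plus this update rule is the entire perceptron algorithm. A minimal NumPy sketch of the per-sample (SGD-style) update, on a toy linearly separable dataset invented here for illustration:

```python
import numpy as np

def sign(z):
    return 1 if z >= 0 else -1

def perceptron_step(w, b, x, y, lr=0.5):
    """If the sample is misclassified, apply w += lr*y*x, b += lr*y."""
    if sign(w @ x + b) != y:
        w = w + lr * y * x
        b = b + lr * y
    return w, b

# Toy linearly separable data (hypothetical)
X = np.array([[2.0, 3.0], [1.0, 4.0], [4.0, 1.0], [5.0, 2.0]])
Y = [1, 1, -1, -1]

w, b = np.zeros(2), 0.0
for _ in range(100):            # epochs; converges long before this on separable data
    for x, y in zip(X, Y):
        w, b = perceptron_step(w, b, x, y)

print([sign(w @ x + b) for x in X])  # [1, 1, -1, -1]
```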

#### Step 3. Training a perceptron for binary classification - [MATLAB code]

```matlab
clc, clear, close all;

%% Define variables
n = 50;        % number of positive/negative samples; 2n samples in total
r = 0.5;       % learning rate
m = 2;         % sample dimension
i_max = 100;   % maximum number of iterations

%% Generate samples (2-D example)
pix = linspace(-pi, pi, n);
randx = 2*pix.*rand(1,n) - pi;
x1 = [cos(randx) + 2*rand(1,n); 3 + sin(randx) + 2*rand(1,n)];
x2 = [3 + cos(randx) + 2*rand(1,n); sin(randx) + 2*rand(1,n)];
x = [x1'; x2'];               % 2n points in total
y = [ones(n,1); -ones(n,1)];  % attach labels
figure(1)
hold on;
plot(x1(1,:), x1(2,:), 'rx');
plot(x2(1,:), x2(2,:), 'go');

%% Train the perceptron
x = [ones(2*n,1) x];    % prepend a constant bias column: [1, x1, x2]
w = zeros(1, m+1);      % initialize the weights [w0, w1, w2]
for i = 1:i_max
    flag = true;        % stays true if this pass makes no mistakes
    for j = 1:2*n
        if sign(x(j,:)*w') ~= y(j)  % hyperplane plus activation: sign(w'x + w0)
            flag = false;
            w = w + r*y(j)*x(j,:);  % SGD update on a misclassified sample
            % redraw the current decision boundary
            pause(0.3);
            cla('reset');
            axis([-1, 6, -1, 6]);
            hold on
            plot(x1(1,:), x1(2,:), 'rx');
            plot(x2(1,:), x2(2,:), 'go');
            x_test = linspace(0, 5, 20);
            y_test = -w(2)/w(3).*x_test - w(1)/w(3);
            plot(x_test, y_test, 'm-.');
            % capture the frame into an animated GIF
            M = getframe(gcf);
            nn = frame2im(M);
            [nn, cm] = rgb2ind(nn, 256);
            if i == 1
                imwrite(nn, cm, 'out.gif', 'gif', 'LoopCount', inf, 'DelayTime', 0.1);
            else
                imwrite(nn, cm, 'out.gif', 'gif', 'WriteMode', 'append', 'DelayTime', 0.5)
            end
        end
    end
    if flag   % a full pass with no errors: training has converged
        break;
    end
end

%% Plot the final separating line
cla('reset');
hold on
axis([-1, 6, -1, 6]);
plot(x1(1,:), x1(2,:), 'rx');
plot(x2(1,:), x2(2,:), 'go');
x_test = linspace(0, 5, 20);
y_test = -w(2)/w(3).*x_test - w(1)/w(3);
plot(x_test, y_test, 'linewidth', 2);
legend('positive samples', 'negative samples', 'separating hyperplane');
M = getframe(gcf);
nn = frame2im(M);
[nn, cm] = rgb2ind(nn, 256);
imwrite(nn, cm, 'out.gif', 'gif', 'WriteMode', 'append', 'DelayTime', 0.5)
```



### From linear regression to a sequential model

```python
# Sequential model with an activation layer added
seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(1, 12)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(12, 1))
]))
```


#### nn.Linear(_, _)

nn.Linear is a class in PyTorch (you can also think of it as a callable) that defines a linear transformation - also called a fully connected layer or affine transformation - mapping input features to output features. It is one of the most commonly used building blocks in the nn module. `nn.Linear(in_features, out_features, bias=True)` takes:

- `in_features`: the number (dimension) of input features. This determines the input size - usually the number of features in the dataset, i.e. the size of the input tensor's last dimension.
- `out_features`: the number (dimension) of output features, i.e. the size of the output tensor's last dimension.
- `bias`: whether the transformation includes a bias term (bias vector). It defaults to True; set it to False to omit the bias.

$$\text{output} = \text{input} \times \text{weight}^{\top} + \text{bias}$$

Rows 0-505 of the dataset are what batch_size indexes into: batch_size is how many rows are fed in for one training step. Features such as crim, age, tax, and so on - 14 features in total - give the in_features of nn.Linear; its size naturally depends on which features you choose to include in training.

The input and output shapes of an nn.Linear layer follow directly from these two arguments: the layer accepts any tensor whose last dimension is in_features and returns one whose last dimension is out_features.

nn.Linear appears everywhere in neural networks; it implements one layer (or several, stacked) that transforms input features into output features. Stacking multiple such layers with nonlinear activation functions in between yields the more complex network models used for all kinds of tasks.
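To make the shape convention concrete, here is a small sketch (the values are random; 14 input features as in the housing-data example above, and out_features=3 is an arbitrary illustrative choice):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

layer = nn.Linear(14, 3)   # in_features=14, out_features=3
x = torch.randn(8, 14)     # a batch of 8 samples, 14 features each

out = layer(x)             # computes x @ layer.weight.T + layer.bias
print(out.shape)           # torch.Size([8, 3])
print(layer.weight.shape)  # torch.Size([3, 14]) -- (out_features, in_features)
print(layer.bias.shape)    # torch.Size([3])

# The same affine formula applied by hand matches the layer's output:
manual = x @ layer.weight.T + layer.bias
print(torch.allclose(out, manual))  # True
```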

#### Model training

```python
def training_loop(n_epochs, optimizer, model, loss_fn, t_u_train, t_u_val,
                  t_c_train, t_c_val):
    for epoch in range(1, n_epochs + 1):
        t_p_train = model(t_u_train)
        loss_train = loss_fn(t_p_train, t_c_train)
        t_p_val = model(t_u_val)
        loss_val = loss_fn(t_p_val, t_c_val)

        optimizer.zero_grad()   # clear the gradients left over from the last step
        loss_train.backward()   # backpropagate to compute fresh gradients
        optimizer.step()        # SGD update using those gradients

        if epoch == 1 or epoch % 500 == 0:
            print(f"Epoch {epoch}, Training loss {loss_train.item():.4f},"
                  f" Validation loss {loss_val.item():.4f}")

def loss_fn(t_p, t_c):
    squared_diffs = (t_p - t_c) ** 2
    return squared_diffs.mean()

linear_model = nn.Linear(1, 1)
optimizer = optim.SGD(linear_model.parameters(), lr=1e-2)

# Try training the linear model
training_loop(
    n_epochs=3000,
    optimizer=optimizer,
    model=linear_model,
    # loss_fn=loss_fn,      # the hand-written MSE above...
    loss_fn=nn.MSELoss(),   # ...or PyTorch's built-in equivalent
    t_u_train=t_un_train,
    t_u_val=t_un_val,
    t_c_train=t_c_train,
    t_c_val=t_c_val)
```


#### Sequential model example - [complete PyTorch code]

```python
import torch
import torch.optim as optim
import torch.nn as nn
from collections import OrderedDict
from matplotlib import pyplot as plt

torch.set_printoptions(edgeitems=2, linewidth=75)

# Prepare the dataset: temperatures in unknown units (t_u) and in Celsius (t_c)
t_c = [0.5, 14.0, 15.0, 28.0, 11.0, 8.0, 3.0, -4.0, 6.0, 13.0, 21.0]
t_u = [35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4]
t_c = torch.tensor(t_c).unsqueeze(1)
t_u = torch.tensor(t_u).unsqueeze(1)

n_samples = t_u.shape[0]      # number of samples in the dataset
n_val = int(0.2 * n_samples)  # hold out 20% of the samples as the validation set

# torch.randperm produces a random permutation of the integer indices,
# shuffling the dataset's index order before the train/validation split
shuffled_indices = torch.randperm(n_samples)
train_indices = shuffled_indices[:-n_val]
val_indices = shuffled_indices[-n_val:]
t_u_train = t_u[train_indices]
t_c_train = t_c[train_indices]
t_u_val = t_u[val_indices]
t_c_val = t_c[val_indices]
t_un_train = 0.1 * t_u_train  # crude input normalization
t_un_val = 0.1 * t_u_val

linear_model = nn.Linear(1, 1)
linear_model(t_un_val)

def training_loop(n_epochs, optimizer, model, loss_fn, t_u_train, t_u_val,
                  t_c_train, t_c_val):
    for epoch in range(1, n_epochs + 1):
        t_p_train = model(t_u_train)
        loss_train = loss_fn(t_p_train, t_c_train)
        t_p_val = model(t_u_val)
        loss_val = loss_fn(t_p_val, t_c_val)

        optimizer.zero_grad()   # clear the gradients left over from the last step
        loss_train.backward()   # backpropagate to compute fresh gradients
        optimizer.step()        # SGD update using those gradients

        if epoch == 1 or epoch % 500 == 0:
            print(f"Epoch {epoch}, Training loss {loss_train.item():.4f},"
                  f" Validation loss {loss_val.item():.4f}")

def loss_fn(t_p, t_c):
    squared_diffs = (t_p - t_c) ** 2
    return squared_diffs.mean()

linear_model = nn.Linear(1, 1)
optimizer = optim.SGD(linear_model.parameters(), lr=1e-2)

# Try training the linear model
training_loop(
    n_epochs=3000,
    optimizer=optimizer,
    model=linear_model,
    # loss_fn=loss_fn,      # the hand-written MSE above...
    loss_fn=nn.MSELoss(),   # ...or PyTorch's built-in equivalent
    t_u_train=t_un_train,
    t_u_val=t_un_val,
    t_c_train=t_c_train,
    t_c_val=t_c_val)

print()
print(linear_model.weight)
print(linear_model.bias)

# Sequential model with an activation layer added
seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(1, 12)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(12, 1))
]))

print(seq_model)
print([param.shape for param in seq_model.parameters()])

for name, param in seq_model.named_parameters():
    print(name, param.shape)

optimizer = optim.SGD(seq_model.parameters(), lr=1e-3)

training_loop(
    n_epochs=5000,
    optimizer=optimizer,
    model=seq_model,        # retrain, now with the sequential model
    loss_fn=nn.MSELoss(),
    t_u_train=t_un_train,
    t_u_val=t_un_val,
    t_c_train=t_c_train,
    t_c_val=t_c_val)

# Print the trained model's predictions:
# print('output', seq_model(t_un_val))

t_range = torch.arange(20., 90.).unsqueeze(1)

fig = plt.figure(dpi=100)
plt.xlabel("Fahrenheit")
plt.ylabel("Celsius")
plt.plot(t_u.numpy(), t_c.numpy(), 'o')
plt.plot(t_range.numpy(), seq_model(0.1 * t_range).detach().numpy(), 'c-')
plt.plot(t_u.numpy(), seq_model(0.1 * t_u).detach().numpy(), 'kx')
plt.show()
```


```
Epoch 1, Training loss 287.7947, Validation loss 243.3686
Epoch 500, Training loss 6.3782, Validation loss 5.3946
Epoch 1000, Training loss 2.9283, Validation loss 6.1271
Epoch 1500, Training loss 2.4918, Validation loss 6.4090
Epoch 2000, Training loss 2.4366, Validation loss 6.5120
Epoch 2500, Training loss 2.4296, Validation loss 6.5489
Epoch 3000, Training loss 2.4288, Validation loss 6.5621

Parameter containing:
Parameter containing:
Sequential(
  (hidden_linear): Linear(in_features=1, out_features=12, bias=True)
  (hidden_activation): Tanh()
  (output_linear): Linear(in_features=12, out_features=1, bias=True)
)
[torch.Size([12, 1]), torch.Size([12]), torch.Size([1, 12]), torch.Size([1])]
hidden_linear.weight torch.Size([12, 1])
hidden_linear.bias torch.Size([12])
output_linear.weight torch.Size([1, 12])
output_linear.bias torch.Size([1])
Epoch 1, Training loss 200.8066, Validation loss 149.6482
Epoch 500, Training loss 8.0419, Validation loss 6.9692
Epoch 1000, Training loss 2.8967, Validation loss 9.0610
Epoch 1500, Training loss 1.7860, Validation loss 8.2857
Epoch 2000, Training loss 1.4266, Validation loss 7.6947
Epoch 2500, Training loss 1.3101, Validation loss 7.3714
Epoch 3000, Training loss 1.2710, Validation loss 7.2340
Epoch 3500, Training loss 1.2550, Validation loss 7.2035
Epoch 4000, Training loss 1.2451, Validation loss 7.2175
Epoch 4500, Training loss 1.2367, Validation loss 7.2404
Epoch 5000, Training loss 1.2289, Validation loss 7.2637
```


