Deep Learning Principles and PyTorch Basics

These are my reading notes on 《深度学习原理与Pytorch实战》 (Deep Learning Principles and PyTorch in Practice) by 张伟振, Tsinghua University Press.


Deep Learning Principles and PyTorch Basics

This chapter introduces the basic principles of deep learning and the basic usage of PyTorch.

The three steps of deep learning

  • Prepare the data
  • Define the model, loss function, and optimizer
  • Train the model
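
As a preview, the three steps map onto a short PyTorch program like the following (a toy sketch of my own, not from the book; concrete versions appear throughout this chapter):

import torch

# 1. Prepare data: a toy dataset where y = 3x
x = torch.tensor([[1.0], [2.0], [3.0]])
y = torch.tensor([[3.0], [6.0], [9.0]])

# 2. Define the model, loss function, and optimizer
model = torch.nn.Linear(1, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr = 0.01)

# 3. Train the model
for epoch in range(200):
  loss = criterion(model(x), y)
  optimizer.zero_grad()
  loss.backward()
  optimizer.step()
print(model.weight.item())  # approaches 3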

Gradient descent

# Gradient descent: minimize f(x) = x^6 - x + 1 by stepping against the derivative

import math
def f(x):
  return math.pow(x,6)-x+1
def df(x):
  return 6*math.pow(x,5)-1
x = 0
learning_rate = 0.001
for i in range(10000):
  x = x-learning_rate*df(x)

print("The minimization of f(x) is f({})={}".format(x,f(x)))
The minimization of f(x) is f(0.6988271187715716)=0.4176440676903507

Backpropagation

# Define a linear layer class: forward computes f(x) = w*x + b,
# backward computes ∂f/∂x (and prints ∂f/∂w) via the chain rule
class LinearLayer:
  def __init__(self, index):
    self.index = index
    self.w = index  # the weight of layer i is initialized to i
    self.b = 0
    self.x = 0

  def forward(self, x) -> int:
    self.x = x  # cache the input for the backward pass
    return self.w * self.x + self.b

  def backward(self, grad):
    # grad is the gradient flowing in from the layer above;
    # the gradient w.r.t. w is grad*x, and grad*w is passed downstream
    print("w{}.grad:{}".format(self.index, grad*self.x))
    return grad * self.w

# Stack five of these linear layers; a forward pass produces the output:
input_data = 1
models = []
for i in [1,2,3,4,5]:
  model = LinearLayer(i)
  input_data = model.forward(input_data)
  models.append(model)
grad = 1

# Backward pass to compute the gradients
for i in [4,3,2,1,0]:
  grad = models[i].backward(grad)
w5.grad:24
w4.grad:30
w3.grad:40
w2.grad:60
w1.grad:120
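
The same gradients can be cross-checked with PyTorch's autograd, introduced below (a sketch I added; the loop multiplies the input by w1..w5 just like the stacked layers):

import torch
ws = [torch.tensor(float(i), requires_grad=True) for i in [1,2,3,4,5]]
x = torch.tensor(1.0)
for w in ws:
  x = w * x  # each layer computes w*x with b = 0
x.backward()
print([w.grad.item() for w in ws])  # [120.0, 60.0, 40.0, 30.0, 24.0]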

PyTorch Basics

import torch

Matrix operations with PyTorch

x = torch.tensor(1.5, dtype = torch.float)
x
tensor(1.5000)

When an array has more than two dimensions, we call it a tensor.
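
For a quick illustration (my own example, not from the book), tensors of each dimensionality can be created like this:

scalar = torch.tensor(1.5)       # 0-dimensional
vector = torch.randn(3)          # 1-dimensional
matrix = torch.randn(3, 4)       # 2-dimensional
tensor3d = torch.randn(2, 3, 4)  # 3-dimensional: a tensor in the sense above
print(tensor3d.shape)            # torch.Size([2, 3, 4])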

W = torch.randn(784, 1000)
W.requires_grad = True
print(W,W.shape)
tensor([[-1.0966, -1.1496, -0.1923,  ..., -0.1416,  0.5489, -0.0825],
        [-0.1091,  0.7622, -1.1342,  ..., -0.4880,  0.8766, -0.5395],
        [ 0.5576, -0.8388, -0.2424,  ...,  0.1547, -0.5483,  0.4185],
        ...,
        [-0.2393,  0.7847,  0.2732,  ...,  0.0349,  1.1623, -0.6032],
        [ 0.2384,  0.7915,  1.3915,  ..., -0.1151,  1.3205, -1.4021],
        [ 1.8065,  0.6703,  0.5813,  ..., -2.2780, -1.4849,  2.1285]],
       requires_grad=True) torch.Size([784, 1000])
W1 = torch.zeros(1000)
W2 = torch.ones(100,1)
W3 = torch.eye(3)

# Tensor (matrix) multiplication
a = torch.randn(3,2)
b = torch.randn(2,3)
print(torch.matmul(a,b))
tensor([[1.0118, 0.4389, 1.2995],
        [1.8889, 0.4002, 0.2843],
        [0.8712, 0.2845, 0.6419]])
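
Note that the * operator is element-wise multiplication, which is different from torch.matmul; a quick check (my own example):

c = torch.ones(2, 3)
d = torch.full((2, 3), 2.0)
print(c * d)                 # element-wise: every entry is 2.0
print(torch.matmul(c, d.T))  # matrix product: shape (2, 2), every entry is 6.0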

Defining neural network models with PyTorch

  • Automatic differentiation

Automatic differentiation is the most fundamental and indispensable capability of a deep learning framework. Once the expression has been built, PyTorch can backpropagate and compute the gradients with the .backward() method.

x = torch.tensor(5.)
w = torch.tensor(2., requires_grad = True)
b = torch.tensor(3., requires_grad = True)
y = w*x + b  # y = 2*x + 3
y.backward()
print(w.grad)  # dy/dw = x = 5
print(b.grad)  # dy/db = 1
tensor(5.)
tensor(1.)
  • Linear model

The linear model W*x+b is available as torch.nn.Linear(input_size, output_size); its arguments are the dimensionality of the data entering the layer and the dimensionality of the data the layer outputs.

dummy_input = torch.randn(784)
linear_layer = torch.nn.Linear(784,1000)  # fully connected layer
output = linear_layer(dummy_input)
print(output.shape)
torch.Size([1000])
print(dummy_input.shape)
torch.Size([784])

linear_layer is a callable object, obtained here from the constructor of the torch.nn.Linear class.
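
To see what the layer computes, we can reproduce its output by hand (a sketch of my own; torch.nn.Linear stores its parameters in .weight, with shape (output_size, input_size), and .bias):

manual = dummy_input @ linear_layer.weight.T + linear_layer.bias
print(torch.allclose(manual, output))  # True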

  • Loss functions and optimizers

The most commonly used loss functions are torch.nn.MSELoss() and torch.nn.CrossEntropyLoss(); MSELoss() is usually used for regression tasks, CrossEntropyLoss() for classification tasks. The loss value can be printed with the .item() method.

y_predict = torch.tensor(0.3)
y_label = torch.tensor(1.)

criterion = torch.nn.MSELoss()
loss = criterion(y_predict, y_label)
print(loss.item())
0.4899999797344208
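
CrossEntropyLoss works a little differently: it takes raw class scores (logits) and an integer class label (a minimal sketch of my own, not from the book):

logits = torch.tensor([[2.0, 0.5, 0.3]])  # raw scores for 3 classes, batch size 1
target = torch.tensor([0])                # index of the correct class
criterion = torch.nn.CrossEntropyLoss()
print(criterion(logits, target).item())   # small, since class 0 has the highest score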
  • The optimization of a neural network is gradient descent: through .step(), a PyTorch optimizer updates every parameter in the network as $W = W - \eta\frac{dLoss}{dW}$

  • Adapting the learning rate automatically: $\eta$ is decreased as the number of iterations grows

  • Avoiding getting stuck at a local minimum: even where the gradient is 0, the optimizer keeps "coasting" forward for a stretch to see whether the flat region ahead eventually starts descending again

The commonly used optimizers are torch.optim.Adam, an adaptive-learning-rate optimizer, and torch.optim.SGD.

Note that before the optimizer's .step() is called, you should call optimizer.zero_grad() to clear the gradients from the previous iteration and loss.backward() to compute the new gradients; then call optimizer.step() to update the parameters.
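
For plain SGD, what .step() does can be written out by hand (a sketch for intuition, using a single scalar parameter):

w = torch.tensor(2.0, requires_grad=True)
optimizer = torch.optim.SGD([w], lr = 0.1)
loss = (w - 5.0)**2
optimizer.zero_grad()
loss.backward()   # dloss/dw = 2*(w-5) = -6
optimizer.step()  # w <- w - 0.1*(-6) = 2.6
print(w.item())   # ≈ 2.6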

import torch
import matplotlib.pyplot as plt

x_data = torch.tensor([[131.0],[132.38],[198.0],[134.0],[81.0],[53.0],[73.0],[161.55],[48.0],[68.0],[266.0],[48.0],[238.0],
                       [97.7],[80.13],[59.4],[178.96],[64.28],[111.3],[52.0],[308.41],[60.69],[59.65],[210.67],[218.35],
                       [58.49],[53.0],[52.0],[205.0],[159.99]], dtype = torch.float32)
y_label = torch.tensor([[415.0],[575.0],[1030.0],[297.5],[392.0],[275.6],[275.0],[800.0],[134.0],[380.0],[840.0],[126.0],
                        [948.0],[896.0],[285.0],[360.0],[700.0],[212.0],[336.0],[174.6],[1950.0],[176.0],[520.0],[1580.0],
                        [1150.0],[213.0],[160.0],[210.0],[1750.0],[630.0]], dtype = torch.float32)

# Define the linear model
linear_model = torch.nn.Linear(1,1)

# Define the loss function
criterion = torch.nn.MSELoss()

# Define the optimizer
optimizer = torch.optim.Adam(linear_model.parameters(), lr = 0.1)

# Loop over the data and train
for epoch in range(50):
  y_predict = linear_model(x_data)
  loss = criterion(y_predict, y_label)
  optimizer.zero_grad() # clear the gradients from the previous iteration
  loss.backward() # compute gradients
  optimizer.step() # update the parameters
print(linear_model.weight)
print(linear_model.bias)

predicted = linear_model(x_data).detach().numpy()

plt.plot(x_data, y_label, 'ro', label = 'Data')
plt.plot(x_data, predicted, label='Linear Model Predict')

plt.legend()
plt.show()

Parameter containing:
tensor([[4.6414]], requires_grad=True)
Parameter containing:
tensor([2.9697], requires_grad=True)

[Figure: training data (red points) and the fitted line (output_27_1.png)]

  • Hyperparameters
import torch
import matplotlib.pyplot as plt

# Set the hyperparameters
input_size = 1
output_size = 1
num_epochs = 10000
learning_rate = 0.001

x_data = torch.tensor([[131.0],[132.38],[198.0],[134.0],[81.0],[53.0],[73.0],[161.55],[48.0],[68.0],[266.0],[48.0],[238.0],
                       [97.7],[80.13],[59.4],[178.96],[64.28],[111.3],[52.0],[308.41],[60.69],[59.65],[210.67],[218.35],
                       [58.49],[53.0],[52.0],[205.0],[159.99]], dtype = torch.float32)
y_label = torch.tensor([[415.0],[575.0],[1030.0],[297.5],[392.0],[275.6],[275.0],[800.0],[134.0],[380.0],[840.0],[126.0],
                        [948.0],[896.0],[285.0],[360.0],[700.0],[212.0],[336.0],[174.6],[1950.0],[176.0],[520.0],[1580.0],
                        [1150.0],[213.0],[160.0],[210.0],[1750.0],[630.0]], dtype = torch.float32)

# Define the linear model
linear_model = torch.nn.Linear(input_size,output_size)

# Define the loss function
criterion = torch.nn.MSELoss()

# Define the optimizer
optimizer = torch.optim.Adam(linear_model.parameters(), lr = learning_rate)

# Loop over the data and train
for epoch in range(num_epochs):
  y_predict = linear_model(x_data)
  loss = criterion(y_predict, y_label)
  optimizer.zero_grad() # clear the gradients from the previous iteration
  loss.backward() # compute gradients
  optimizer.step() # update the parameters
print(linear_model.weight)
print(linear_model.bias)

predicted = linear_model(x_data).detach().numpy()

plt.plot(x_data, y_label, 'ro', label = 'Data')
plt.plot(x_data, predicted, label='Linear Model Predict')

plt.legend()
plt.show()

Parameter containing:
tensor([[5.0478]], requires_grad=True)
Parameter containing:
tensor([2.2756], requires_grad=True)

[Figure: training data (red points) and the fitted line after 10000 epochs (output_29_1.png)]

  • Building a fully connected neural network
import torch
class NeuralNetwork(torch.nn.Module):
  def __init__(self):
    super().__init__()  # call the torch.nn.Module constructor
    self.linear_layer1 = torch.nn.Linear(1,1)
    self.active_function1 = torch.nn.ReLU()
  def forward(self,x):
    x = self.linear_layer1(x)
    x = self.active_function1(x)
    return x

# Instantiate the network
model = NeuralNetwork()
test_data = torch.randn(1)

output = model(test_data)
print(output)
tensor([0.], grad_fn=<ReluBackward0>)
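The all-zero output is ReLU at work: any negative pre-activation is clamped to 0, and here the random test input happens to land below zero (a quick check, my own example):

relu = torch.nn.ReLU()
print(relu(torch.tensor([-1.5, 0.0, 2.0])))  # tensor([0., 0., 2.])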
  • Classifying the MNIST dataset with PyTorch
pip install torchvision
# Download the MNIST dataset from TorchVision
import torchvision
train_dataset = torchvision.datasets.MNIST(root = "./",download=True)
sample = train_dataset[0]
print(sample)
(<PIL.Image.Image image mode=L size=28x28 at 0x7F97A3EDF970>, 5)
sample = train_dataset[0]
plt.imshow(sample[0])
plt.show()

[Figure: the first MNIST training image, a handwritten digit 5 (output_35_0.png)]

ToTensor = torchvision.transforms.ToTensor()
tensor = ToTensor(sample[0])
print(tensor)
tensor([[[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0118, 0.0706, 0.0706, 0.0706,
          0.4941, 0.5333, 0.6863, 0.1020, 0.6510, 1.0000, 0.9686, 0.4980,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.1176, 0.1412, 0.3686, 0.6039, 0.6667, 0.9922, 0.9922, 0.9922,
          0.9922, 0.9922, 0.8824, 0.6745, 0.9922, 0.9490, 0.7647, 0.2510,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.1922,
          0.9333, 0.9922, 0.9922, 0.9922, 0.9922, 0.9922, 0.9922, 0.9922,
          0.9922, 0.9843, 0.3647, 0.3216, 0.3216, 0.2196, 0.1529, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0706,
          0.8588, 0.9922, 0.9922, 0.9922, 0.9922, 0.9922, 0.7765, 0.7137,
          0.9686, 0.9451, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.3137, 0.6118, 0.4196, 0.9922, 0.9922, 0.8039, 0.0431, 0.0000,
          0.1686, 0.6039, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0549, 0.0039, 0.6039, 0.9922, 0.3529, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.5451, 0.9922, 0.7451, 0.0078, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0431, 0.7451, 0.9922, 0.2745, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.1373, 0.9451, 0.8824, 0.6275,
          0.4235, 0.0039, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.3176, 0.9412, 0.9922,
          0.9922, 0.4667, 0.0980, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.1765, 0.7294,
          0.9922, 0.9922, 0.5882, 0.1059, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0627,
          0.3647, 0.9882, 0.9922, 0.7333, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.9765, 0.9922, 0.9765, 0.2510, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.1804, 0.5098,
          0.7176, 0.9922, 0.9922, 0.8118, 0.0078, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.1529, 0.5804, 0.8980, 0.9922,
          0.9922, 0.9922, 0.9804, 0.7137, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0941, 0.4471, 0.8667, 0.9922, 0.9922, 0.9922,
          0.9922, 0.7882, 0.3059, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0902, 0.2588, 0.8353, 0.9922, 0.9922, 0.9922, 0.9922, 0.7765,
          0.3176, 0.0078, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0706, 0.6706,
          0.8588, 0.9922, 0.9922, 0.9922, 0.9922, 0.7647, 0.3137, 0.0353,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.2157, 0.6745, 0.8863, 0.9922,
          0.9922, 0.9922, 0.9922, 0.9569, 0.5216, 0.0431, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.5333, 0.9922, 0.9922, 0.9922,
          0.8314, 0.5294, 0.5176, 0.0627, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
          0.0000, 0.0000, 0.0000, 0.0000]]])
import torch
import torchvision

# Set the hyperparameters
input_size = 784
hidden_size = 1000
num_classes = 10
num_epochs = 5
batch_size = 100
learning_rate = 0.001

# Download the data
train_dataset = torchvision.datasets.MNIST(root = "./",
                                           train = True,
                                           transform = torchvision.transforms.ToTensor(),
                                           download=True)
test_dataset = torchvision.datasets.MNIST(root = "./",
                                           train = False,
                                           transform = torchvision.transforms.ToTensor())
                                          #  download=True
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size = batch_size,
                                           shuffle = True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                           batch_size = batch_size,
                                           shuffle = False)

# Build the fully connected neural network
class NeuralNetwork(torch.nn.Module):
  def __init__(self, input_size, hidden_size, num_classes):
    super(NeuralNetwork,self).__init__()
    self.fc1 = torch.nn.Linear(input_size, hidden_size)
    self.relu = torch.nn.ReLU()
    self.fc2 = torch.nn.Linear(hidden_size, num_classes)
  def forward(self,x):
    x = self.fc1(x)
    x = self.relu(x)
    x = self.fc2(x)
    return x

# Instantiate the model
model = NeuralNetwork(input_size, hidden_size, num_classes)

# Set up the loss function and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate)

# Train the model
for epoch in range(num_epochs):
  for images, labels in train_loader:
    images = images.reshape(-1, 28*28)

    # Forward pass to get the model's predictions
    outputs = model(images)
    loss = criterion(outputs, labels)

    # Backward pass to compute the gradient of the loss w.r.t. each parameter
    optimizer.zero_grad()
    loss.backward()

    # Update the parameters
    optimizer.step()
# To measure how well the predictions match the labels, test the model as follows:
correct = 0
total = 0
for images, labels in test_loader:
    images = images.reshape(-1, 28*28)
    outputs = model(images)
    _, predicted = torch.max(outputs,1)
    total += labels.size(0)
    correct += (predicted == labels).sum().item()
print('Accuracy on test_set:{} %'.format(100*correct/total))
Accuracy on test_set:97.83 %

Tuning neural networks

Matching the scale of data and model

When the data is scarce and the model is very large, overfitting is likely. When the data is plentiful and the features complex but the model is small, underfitting is likely.

import torch
import torchvision

# Set the hyperparameters
input_size = 784
hidden_size = 1000
num_classes = 10
num_epochs = 5
batch_size = 100
learning_rate = 0.001

# Download the data
train_dataset = torchvision.datasets.MNIST(root = "./",
                                           train = True,
                                           transform = torchvision.transforms.ToTensor(),
                                           download=True)

# Train on only the first 100 samples of the training set
train_dataset = list(train_dataset)[:100]
test_dataset = torchvision.datasets.MNIST(root = "./",
                                           train = False,
                                           transform = torchvision.transforms.ToTensor())
                                          #  download=True
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size = batch_size,
                                           shuffle = True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                           batch_size = batch_size,
                                           shuffle = False)

# Build the fully connected neural network
class NeuralNetwork(torch.nn.Module):
  def __init__(self, input_size, hidden_size, num_classes):
    super(NeuralNetwork,self).__init__()
    self.fc1 = torch.nn.Linear(input_size, hidden_size)
    self.relu = torch.nn.ReLU()
    self.fc2 = torch.nn.Linear(hidden_size, num_classes)
  def forward(self,x):
    x = self.fc1(x)
    x = self.relu(x)
    x = self.fc2(x)
    return x

# Instantiate the model
model = NeuralNetwork(input_size, hidden_size, num_classes)

# Set up the loss function and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate)

# Train the model
for epoch in range(num_epochs):
  for images, labels in train_loader:
    images = images.reshape(-1, 28*28)

    # Forward pass to get the model's predictions
    outputs = model(images)
    loss = criterion(outputs, labels)

    # Backward pass to compute the gradient of the loss w.r.t. each parameter
    optimizer.zero_grad()
    loss.backward()

    # Update the parameters
    optimizer.step()


# Measure accuracy on the training set:
correct = 0
total = 0
for images, labels in train_loader:
    images = images.reshape(-1, 28*28)
    outputs = model(images)
    _, predicted = torch.max(outputs,1)
    total += labels.size(0)
    correct += (predicted == labels).sum().item()
print('Accuracy on train_set:{} %'.format(100*correct/total))

# Measure accuracy on the test set:
correct = 0
total = 0
for images, labels in test_loader:
    images = images.reshape(-1, 28*28)
    outputs = model(images)
    _, predicted = torch.max(outputs,1)
    total += labels.size(0)
    correct += (predicted == labels).sum().item()
print('Accuracy on test_set:{} %'.format(100*correct/total))
Accuracy on train_set:91.0 %
Accuracy on test_set:57.19 %

Feature scaling

  • Min-max normalization (rescaling)

$x' = \frac{x - \min(x)}{\max(x) - \min(x)}$

def rescaling(x):
  # find the minimum and maximum of x
  max_value = x[0]
  min_value = x[0]
  for value in x:
    if value>max_value:
      max_value = value
    if value<min_value:
      min_value = value
  # map every element into [0, 1]
  for i in range(len(x)):
    x[i] = (x[i]-min_value)/(max_value-min_value)
  return x
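A quick usage check (my own example):

x = [5.0, 10.0, 15.0]
print(rescaling(x))  # [0.0, 0.5, 1.0]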
  • Standardization

$x' = \frac{x - \overline{x}}{\sigma}$

# sample mean and (unbiased) sample standard deviation, written out by hand
def mean(x):
  return x.sum()/len(x)
def std(x):
  return ((abs(x-mean(x))**2).sum()/(len(x)-1))**0.5

def standardization(x):
  x_mean = x.mean()
  x_std = x.std()

  x = (x-x_mean)/x_std
  return x
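A quick usage check (my own example; x.std() uses the same unbiased estimator as the hand-written std above):

x = torch.tensor([1.0, 2.0, 3.0, 4.0])
print(standardization(x))  # roughly tensor([-1.1619, -0.3873, 0.3873, 1.1619])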
  • In PyTorch, inserting a BatchNorm1d layer into the network performs this standardization.
x_data = torch.randn(100,3)

norm_layer = torch.nn.BatchNorm1d(3)
print(norm_layer(x_data))