Introductory notes on deep learning with Python (小土堆).

Getting Started with PyTorch

Notes summarized while watching the 小土堆 (Xiaotudui) introductory PyTorch deep learning tutorial on Bilibili.

Official PyTorch documentation

Chinese documentation

1. PyTorch Environment Setup and Installation

  • Install Anaconda

The video covers this in detail.

Link


Issue:

If you do not have an Nvidia GPU, torch.cuda.is_available() is False. I am on macOS, so it is False for me as well; this does not affect the rest of the tutorial.

2. Python IDE Setup and Installation


  • PyCharm
  • Jupyter
  • Python console

Configuring the Anaconda environment in PyCharm:

Reference blog

3. Two Handy Helper Functions for Learning Python


  • dir(): tells us what is inside a toolbox (a package) and its compartments (its modules).

  • help(): tells us how each individual tool is used, i.e. its usage documentation.
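As a quick illustration with Python's built-in math module (the same idea applies to torch):

```python
import math

# dir() lists the names inside a module -- "what is in the toolbox"
names = dir(math)
print("sqrt" in names)

# help(math.sqrt) prints the usage documentation; that text comes from
# the object's docstring, which can also be read directly:
print(math.sqrt.__doc__)
```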

4. First Look at Loading Data in PyTorch


  • Two library classes are used:

    • Dataset()
    • DataLoader()
  • How does PyTorch load data?

The __init__ method performs basic initialization such as reading in the dataset; it is the function called automatically when an instance of the class is created. Its job is to set up attributes on the class that the later methods will need.

4.1 Dataset()

  • An abstract class representing a Dataset.
  • All other datasets should subclass it. Every subclass should override __len__ and __getitem__: the former returns the size of the dataset, and the latter supports integer indexing from 0 to len(self) - 1.

Test code:

from torch.utils.data import Dataset
from PIL import Image
import os

class MyData(Dataset):

    def __init__(self,root_dir,label_dir):
        self.root_dir = root_dir
        self.label_dir = label_dir
        # join the two paths
        self.path = os.path.join(self.root_dir,self.label_dir)
        # list the image file names
        self.img_path = os.listdir(self.path)

    def __getitem__(self,idx):
        img_name = self.img_path[idx]
        img_item_path = os.path.join(self.root_dir,self.label_dir,img_name)
        img = Image.open(img_item_path)
        label = self.label_dir
        return img,label

    def __len__(self):
        return len(self.img_path)


root_dir = "dataSet/train"
ants_label_dir = "ants_image"
bees_label_dir = "bees_image"
ants_dataset = MyData(root_dir,ants_label_dir)
bees_dataset = MyData(root_dir,bees_label_dir)

train_dataset = ants_dataset + bees_dataset

Then, in the console, run:

img,label = ants_dataset[0]
img.show()
img,label = bees_dataset[0]
img.show()
img,label = train_dataset[123]
img.show()
img,label = train_dataset[124]
img.show()

View the result!

4.2 DataLoader()

  • Data loader. Combines a dataset and a sampler, and provides a single- or multi-process iterator over the dataset.

Parameters:

DataLoader(dataset, batch_size=1, shuffle=False, num_workers=0, drop_last=False)

  • dataset (Dataset) – the dataset from which to load the data.

  • batch_size (int, optional) – how many samples per batch to load (default: 1).

  • shuffle (bool, optional) – set to True to reshuffle the data at every epoch (default: False).

  • num_workers (int, optional) – how many subprocesses to use for data loading. 0 means the data is loaded in the main process (default: 0).

  • drop_last (bool, optional) – set to True to drop the last incomplete batch if the dataset size is not divisible by the batch size. If False and the dataset size is not divisible by the batch size, the last batch is simply smaller. (default: False)
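The batch_size and drop_last semantics can be illustrated without PyTorch; the sketch below (a simplified stand-in, not the real DataLoader) slices a dataset into batches the same way:

```python
def make_batches(dataset, batch_size=1, drop_last=False):
    """Slice a sequence into consecutive batches, DataLoader-style."""
    batches = [dataset[i:i + batch_size] for i in range(0, len(dataset), batch_size)]
    # drop_last=True discards a trailing batch smaller than batch_size
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches

data = list(range(10))
print(make_batches(data, batch_size=4))                  # last batch has only 2 items
print(make_batches(data, batch_size=4, drop_last=True))  # trailing partial batch dropped
```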

5. Using TensorBoard

  • TensorBoard is a utility provided by TensorFlow that can graphically display the computational graph.

Test code:

from torch.utils.tensorboard import SummaryWriter
import numpy as np
from PIL import Image

writer = SummaryWriter("logs")

image_path = "dataSet/train/ants_image/6743948_2b8c096dda.jpg"
img_PIL = Image.open(image_path)
# add_image only accepts images of type torch.Tensor or numpy.ndarray, so the PIL image must be converted first.
img_array = np.array(img_PIL)
print(type(img_array))
print(img_array.shape)

writer.add_image("test2",img_array,1,dataformats='HWC') # HWC: height, width, channel

# plot y = 2x
for i in range(100):
    writer.add_scalar("y=2x",2*i,i)

writer.close()

You will notice that a logs folder has appeared in our project.

How to open the TensorBoard event files (logs):

In the Terminal, run:

tensorboard --logdir=logs

Then open the printed URL:


View the result!


We can also change the port, e.g. to 6007:

tensorboard --logdir=logs --port=6007

6. Using Transforms

Install OpenCV: pip install opencv-python


  • ToTensor(): converts a PIL.Image or a numpy.ndarray of shape (H, W, C) with values in [0, 255] into a torch.FloatTensor of shape [C, H, W] with values in [0, 1.0].

Test code:

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

# Python usage --> the tensor data type
# Use transforms.ToTensor() to look at two questions:
# 1. How is transforms used (in Python)?
# 2. Why do we need the Tensor data type?

# Absolute path:
# /Users/wangcheng/PycharmProjects/protorchLearn/dataSet/train/ants_image/0013035.jpg
# Relative path: dataSet/train/ants_image/0013035.jpg
img_path = "dataSet/train/ants_image/0013035.jpg"
img = Image.open(img_path)

tensor_trans = transforms.ToTensor()
tensor_img = tensor_trans(img)
# print(tensor_img)

writer = SummaryWriter("logs")
writer.add_image("Tensor_img",tensor_img)

writer.close()

View the result!


6.1 Common Transforms

First, a quick look at how def __call__ differs from an ordinary method:

class Person:
    def __call__(self, name):
        print("__call__ " + "Hello " + name)

    def hello(self, name):
        print("hello " + name)

person = Person()
person("zhangsan")      # calling the instance invokes __call__
person.hello("lisi")    # an ordinary method must be called by name

  • Normalize(mean, std)

    Given per-channel means (R, G, B) and standard deviations (R, G, B), normalizes the tensor.

    That is: normalized_image = (image - mean) / std

  • Resize

    Changes the image size to (H, W).

  • Compose(transforms)

    Chains several image transforms together; they are applied in the order given.

  • RandomCrop(size, padding=0)

    Crops at a randomly chosen center position. size can be a tuple or an integer.

Test code:

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

writer = SummaryWriter("logs")
img = Image.open("imgs/airplane.png")
print(img)

# ToTensor
# ToTensor scales pixel values into [0, 1]
tran_totensor = transforms.ToTensor()
img_tensor = tran_totensor(img)
writer.add_image("ToTensor",img_tensor)

# Normalize
print(img_tensor[0][0][0])
trans_norm = transforms.Normalize([1,3,5],[3,2,1])
img_norm = trans_norm(img_tensor)
print(img_norm[0][0][0])
writer.add_image("Normalize",img_norm,1)

# Resize: change the image size
print(img.size)
trans_resize = transforms.Resize((512,512))
# img PIL -> resize -> img_resize PIL
img_resize = trans_resize(img)
# img_resize PIL -> totensor -> img_resize tensor
img_resize = tran_totensor(img_resize)
print(img_resize)
writer.add_image("Resize",img_resize,0)

# Compose: resize with a single int, keeping the aspect ratio of width and height
trans_resize_2 = transforms.Resize(512)
# PIL -> PIL -> tensor
trans_compose = transforms.Compose([trans_resize_2,tran_totensor])
img_resize_2 = trans_compose(img)
writer.add_image("Resize",img_resize_2,1)

# RandomCrop: random cropping
trans_random = transforms.RandomCrop(256)
trans_compose_2 = transforms.Compose([trans_random,tran_totensor])
for i in range(10):
    img_crop = trans_compose_2(img)
    writer.add_image("RandomCrop",img_crop,i)

writer.close()

View the result!

7. Using the Datasets in torchvision

The torchvision package contains popular datasets, model architectures, and common image transformation utilities.

Downloading a dataset from the PyTorch site (CIFAR-10 as an example):

torchvision.datasets.CIFAR10(root, train=True, transform=None, target_transform=None, download=False)

Parameter description:

  • root : root directory of the dataset
  • train : True = training set, False = test set
  • transform : a function/transform applied to each image (e.g. ToTensor)
  • download : True = download the data from the internet and put it under root. If the dataset is already downloaded, do nothing.
  • target_transform : a function/transform that takes in the target and transforms it.

Test code:

import torchvision
from torch.utils.tensorboard import SummaryWriter

dataset_transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor()
])

train_set = torchvision.datasets.CIFAR10(root="./dataset02",train=True,transform=dataset_transform,download=True)
test_set = torchvision.datasets.CIFAR10(root="./dataset02",train=False,transform=dataset_transform,download=True)

# print(test_set[0])
# print(test_set.classes)
  
# img,target = test_set[0]
# print(img)
# print(target)
# print(test_set.classes[target])
# img.show()

# print(test_set[0])

writer = SummaryWriter('logs2')
for i in range(10):
    img,target = test_set[i]
    writer.add_image("test_set",img,i)
writer.close()

View the result!

8. Using DataLoader

Briefly introduced in Section 4!

Test code:

import torchvision

# the test dataset to use
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

test_data = torchvision.datasets.CIFAR10("./dataset02",train=False,transform=torchvision.transforms.ToTensor())
# dataset – the dataset from which to load the data.
# batch_size – how many samples per batch to load (default: 1).
# shuffle – set to True to reshuffle the data at every epoch (default: False).
# num_workers – number of subprocesses to use for data loading. 0 means the data is loaded in the main process. (default: 0)
# drop_last – set to True to drop the last incomplete batch if the dataset size is not divisible by the batch size. If False, the last batch is simply smaller. (default: False)
test_loader = DataLoader(dataset=test_data,batch_size=64,shuffle=True,num_workers=0,drop_last=False)

# the first image in the test dataset and its target
img,target = test_data[0]
print(img.shape)
print(target)

writer = SummaryWriter("dataloader")
step = 0
for epoch in range(2):
    for data in test_loader:
        imgs, target = data
        # print(imgs.shape)
        # print(target)
        writer.add_images("Epoch:{}".format(epoch), imgs, step)
        step = step + 1

writer.close()

View the result!

9. nn.Module, the Basic Skeleton of a Neural Network

class torch.nn.Module : the base class for all neural network modules. Your models should also subclass it.


Test code:

import torch
from torch import nn

class Kuang(nn.Module):

    def __init__(self):
        super().__init__()

    def forward(self, input):
        output = input + 1
        return output

kuang = Kuang()
x = torch.tensor(1.0)
output = kuang(x)
print(output)

View the result!

10. Neural Networks: the Convolution Layer Conv2d

The kernel is slid over the input matrix; at each position the sum of element-wise products produces one entry of the output matrix.

class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)

torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)

Parameters

  • in_channels (int) – number of channels in the input signal
  • out_channels (int) – number of channels produced by the convolution
  • kernel_size (int or tuple) – size of the convolution kernel
  • stride (int or tuple, optional) – stride of the convolution
  • padding (int or tuple, optional) – number of layers of zeros added to each side of the input
  • dilation (int or tuple, optional) – spacing between kernel elements
  • groups (int, optional) – number of blocked connections from input channels to output channels
  • bias (bool, optional) – if bias=True, adds a learnable bias

Shape (useful when writing papers):

H_out = floor((H_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1), and likewise for W_out.
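PyTorch's documented Conv2d output-size formula, H_out = floor((H_in + 2*padding - dilation*(kernel_size - 1) - 1)/stride + 1), can be checked with a small helper (conv2d_out_size is our own name for illustration, not a PyTorch function):

```python
import math

def conv2d_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output spatial size of Conv2d, per the standard shape formula."""
    return math.floor((size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)

# The 5x5 input with a 3x3 kernel used in the test code below:
print(conv2d_out_size(5, 3, stride=1))             # 3 -> output is 3x3
print(conv2d_out_size(5, 3, stride=2))             # 2 -> output is 2x2
print(conv2d_out_size(5, 3, stride=1, padding=1))  # 5 -> output is 5x5
```

These sizes match the three outputs printed by test code 1.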

Variables:

  • weight (Tensor) – the learnable weights of the convolution, of shape (out_channels, in_channels, kernel_size, kernel_size)
  • bias (Tensor) – the learnable bias of the convolution, of shape (out_channels)


When padding = 1, a one-pixel border of zeros is added around the input before convolving (see output3 in the example below).

Test code 1:

import torch
import torch.nn.functional as F

input = torch.tensor([[1,2,0,3,1],
                      [0,1,2,3,1],
                      [1,2,1,0,0],
                      [5,2,3,1,1],
                      [2,1,0,1,1]])

kernel = torch.tensor([[1,2,1],
                      [0,1,0],
                      [2,1,0]])

input = torch.reshape(input,(1,1,5,5))
kernel = torch.reshape(kernel,(1,1,3,3))

print(input.shape)
print(kernel.shape)

output = F.conv2d(input,kernel,stride=1)
print(output)

output2 = F.conv2d(input,kernel,stride=2)
print(output2)

output3 = F.conv2d(input,kernel,stride=1,padding=1)
print(output3)

Output:

torch.Size([1, 1, 5, 5])
torch.Size([1, 1, 3, 3])
tensor([[[[10, 12, 12],
          [18, 16, 16],
          [13,  9,  3]]]])
tensor([[[[10, 12],
          [13,  3]]]])
tensor([[[[ 1,  3,  4, 10,  8],
          [ 5, 10, 12, 12,  6],
          [ 7, 18, 16, 16,  8],
          [11, 13,  9,  3,  4],
          [14, 13,  9,  7,  4]]]])

Test code 2:

import torch
import torchvision
from torch import nn
from torch.nn import Conv2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("data",train=False,transform=torchvision.transforms.ToTensor(),download=True)
dataloader = DataLoader(dataset,batch_size = 64)

# in_channels (int) – number of channels in the input image
# out_channels (int) – number of channels produced by the convolution
# kernel_size (int or tuple) – size of the convolution kernel
# stride (int or tuple, optional) – stride of the convolution. Default: 1
# padding (int, tuple or str, optional) – padding added to all four sides of the input. Default: 0
class Kuang(nn.Module):
    def __init__(self):
        super(Kuang, self).__init__()
        self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)

    def forward(self,x):
        x = self.conv1(x)
        return x

kuang = Kuang()

writer = SummaryWriter("logs03")

step = 0
for data in dataloader:
    imgs,targets = data
    output = kuang(imgs)
    print(imgs.shape)
    print(output.shape)
    # torch.Size([64,3,32,32])
    writer.add_images("input",imgs,step)
    # torch.Size([64, 6, 30, 30])  -> [xxx, 3, 30, 30]
    output = torch.reshape(output,(-1,3,30,30)) # -1 means this dimension is inferred from the remaining sizes
    writer.add_images("output",output,step)
    step = step+1

writer.close()

View the result:


11. Neural Networks: Using Max Pooling

Pooling: reduces the amount of data while preserving its essential features.

Applies 2D max pooling over the input channels of the input signal.

class torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)

Parameters:

  • kernel_size (int or tuple) – the size of the max pooling window
  • stride (int or tuple, optional) – the stride of the pooling window. Default value is kernel_size
  • padding (int or tuple, optional) – number of layers of zeros added to each side of the input
  • dilation (int or tuple, optional) – a parameter that controls the stride of elements within the window
  • return_indices – if True, also returns the indices of the max values, which is useful for unpooling/upsampling
  • ceil_mode – if True, a window that extends past the edge is kept and pooled on its own; if False, such an incomplete window is discarded.

Shape:

H_out = floor((H_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1) (with ceil instead of floor when ceil_mode=True), and likewise for W_out.
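The effect of ceil_mode on the output size can be sketched in plain Python (maxpool2d_out_size is our own helper, assuming the standard MaxPool2d shape formula with dilation=1):

```python
import math

def maxpool2d_out_size(size, kernel_size, stride=None, padding=0, ceil_mode=False):
    """Output spatial size of MaxPool2d; stride defaults to kernel_size."""
    if stride is None:
        stride = kernel_size
    value = (size + 2 * padding - (kernel_size - 1) - 1) / stride + 1
    return math.ceil(value) if ceil_mode else math.floor(value)

# 5x5 input, 3x3 window, as in the test code below:
print(maxpool2d_out_size(5, 3, ceil_mode=True))   # 2 -> the partial window is kept
print(maxpool2d_out_size(5, 3, ceil_mode=False))  # 1 -> the partial window is dropped
```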



Test code 1:

import torch
import torchvision
from torch import nn
from torch.nn import MaxPool2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("data", train=False, download=True,
                                       transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset, batch_size=64)

input = torch.tensor([[1,2,0,3,1],
                      [0,1,2,3,1],
                      [1,2,1,0,0],
                      [5,2,3,1,1],
                      [2,1,0,1,1]],dtype=torch.float32)
input = torch.reshape(input,(-1,1,5,5))
print(input.shape)

# ceil_mode – if True, a window that extends past the edge is kept and pooled on its own; if False, it is discarded.
class Kuang(nn.Module):
    def __init__(self):
        super(Kuang, self).__init__()
        self.maxpoll1 = MaxPool2d(kernel_size=3,ceil_mode=True)

    def forward(self,input):
        output = self.maxpoll1(input)
        return output

kuang = Kuang()
output = kuang(input)
print(output)

Output:

torch.Size([1, 1, 5, 5])
tensor([[[[2., 3.],
          [5., 1.]]]])

Test code 2:

import torch
import torchvision
from torch import nn
from torch.nn import MaxPool2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("data", train=False, download=True,
                                       transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset, batch_size=64)

# ceil_mode – if True, a window that extends past the edge is kept and pooled on its own; if False, it is discarded.
class Kuang(nn.Module):
    def __init__(self):
        super(Kuang, self).__init__()
        self.maxpoll1 = MaxPool2d(kernel_size=3,ceil_mode=True)

    def forward(self,input):
        output = self.maxpoll1(input)
        return output

kuang = Kuang()

writer = SummaryWriter("logs03")
step = 0
for data in dataloader:
    imgs,targets = data
    writer.add_images("input",imgs,step)
    output = kuang(imgs)
    writer.add_images("output",output,step)
    step = step+1
writer.close()

View the result:


12. Neural Networks: Non-linear Activations


class torch.nn.ReLU(inplace=False)

Parameters:

  • inplace – whether to perform the operation in place. Default: False


Shape:

  • Input: (∗), where ∗ means any number of additional dimensions.
  • Output: (∗), the same shape as the input.

Test code 1:

import torch
import torchvision.datasets
from torch import nn
from torch.nn import ReLU
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

input = torch.tensor([[1,0.5],
                      [-1,3]])
input = torch.reshape(input,(-1,1,2,2))
print(input.shape)

dataset = torchvision.datasets.CIFAR10("data",train=False,download=True,transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset,batch_size=64)

class Kuang(nn.Module):
    def __init__(self):
        super(Kuang, self).__init__()
        self.relu1 = ReLU() # inplace: optionally perform the operation in place. Default: False

    def forward(self,input):
        output = self.relu1(input)
        return output

kuang = Kuang()
output = kuang(input)
print(output)

Output:

torch.Size([1, 1, 2, 2])
tensor([[[[1.0000, 0.5000],
          [0.0000, 3.0000]]]])

Test code 2:

import torch
import torchvision.datasets
from torch import nn
from torch.nn import ReLU
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("data",train=False,download=True,transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset,batch_size=64)

class Kuang(nn.Module):
    def __init__(self):
        super(Kuang, self).__init__()
        self.relu1 = ReLU() # inplace: optionally perform the operation in place. Default: False

    def forward(self,input):
        output = self.relu1(input)
        return output

kuang = Kuang()

writer = SummaryWriter("logs03")
step = 0
for data in dataloader:
    imgs,targets = data
    writer.add_images("input",imgs,global_step=step)
    output = kuang(imgs)
    writer.add_images("output",output,step)
    step += 1
writer.close()

View the result:


13. Neural Networks: Linear Layers and Other Layers


Applies a linear transformation to the input data: y = Ax + b.

class torch.nn.Linear(in_features, out_features, bias=True)

Parameters:

  • in_features – size of each input sample
  • out_features – size of each output sample
  • bias – if set to False, the layer will not learn a bias. Default: True

Shape:

  • Input: (N, in_features)
  • Output: (N, out_features)

Variables:

  • weight – the learnable weights of the module, of shape (out_features, in_features).
  • bias – the learnable bias of the module, of shape (out_features).
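Since the layer is just y = Ax + b, a minimal pure-Python sketch (our own illustration, not the torch implementation) makes the shapes concrete:

```python
def linear(x, weight, bias):
    """y = A x + b, where weight has shape (out_features, in_features)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(weight, bias)]

x = [1.0, 2.0]                                  # in_features = 2
weight = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]   # out_features = 3
bias = [0.0, 1.0, -1.0]
print(linear(x, weight, bias))                  # [1.0, 2.5, 1.0]
```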


Test code:

import torch
import torchvision
from torch import nn
from torch.nn import Linear
from torch.utils.data import DataLoader

dataset = torchvision.datasets.CIFAR10("data", train=False, transform=torchvision.transforms.ToTensor(),
                                       download=True)
dataloader = DataLoader(dataset, batch_size=64, drop_last=True)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.linear1 = Linear(196608, 10)

    def forward(self, input):
        output = self.linear1(input)
        return output

tudui = Tudui()

for data in dataloader:
    imgs, targets = data
    print(imgs.shape)
    output = torch.flatten(imgs) # flatten the whole batch into a single row
    print(output.shape)
    output = tudui(output)
    print(output.shape)

Output (only one batch shown):

torch.Size([64, 3, 32, 32])
torch.Size([196608])
torch.Size([10])

14. Neural Networks: Using Sequential

The CIFAR-10 model structure:


class torch.nn.Sequential(* args)

A sequential container. Modules are added to it in the order they are passed in. An OrderedDict can also be passed in.

Test code:

import torch
from torch import nn
 
class Test(nn.Module):
    def __init__(self):
        super(Test, self).__init__()
        # since size_in and size_out are both 32, the shape formula gives padding=2, stride=1
        self.conv1 = nn.Conv2d(3, 32, 5, padding=2, stride=1)
        self.pool1 = nn.MaxPool2d(2)
        # size unchanged, same as above
        self.conv2 = nn.Conv2d(32, 32, 5, stride=1, padding=2)
        self.pool2 = nn.MaxPool2d(2)
        # size unchanged, same as above
        self.conv3 = nn.Conv2d(32, 64, 5, stride=1, padding=2)
        self.pool3 = nn.MaxPool2d(2)
        self.flatten = nn.Flatten()
        # in_features: 64*4*4, out_features: 64
        self.linear1 = nn.Linear(1024, 64)
        self.linear2 = nn.Linear(64, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.pool2(x)
        x = self.conv3(x)
        x = self.pool3(x)
        x = self.flatten(x)
        x = self.linear1(x)
        x = self.linear2(x)
        return x

test1 = Test()
# sanity-check the network structure
input = torch.ones((64, 3, 32, 32))
output = test1(input)
print(output.shape)

Using the nn.Sequential module:

import torch
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential
from torch.utils.tensorboard import SummaryWriter

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

tudui = Tudui()
print(tudui)
input = torch.ones((64, 3, 32, 32)) # a tensor filled entirely with ones
output = tudui(input)
print(output.shape)

writer = SummaryWriter("logs03")
writer.add_graph(tudui,input)
writer.close()

Output:

Tudui(
  (model1): Sequential(
    (0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (4): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Flatten(start_dim=1, end_dim=-1)
    (7): Linear(in_features=1024, out_features=64, bias=True)
    (8): Linear(in_features=64, out_features=10, bias=True)
  )
)
torch.Size([64, 10])

View the result!


15. Loss Functions and Backpropagation

torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')

Parameters:

  • weight – a manual rescaling weight given to each class.
  • ignore_index – specifies a target value that is ignored and does not contribute to the input gradient. ignore_index only applies when the target contains class indices.
  • reduction – specifies the reduction applied to the output: 'none' | 'mean' | 'sum'. 'none': no reduction; 'mean': the weighted mean of the output; 'sum': the output is summed. Default: 'mean'.

The cross-entropy loss for a single sample can be expressed as:

loss(x, class) = -x[class] + log(sum_j exp(x[j]))
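We can verify the tensor(1.1019) value printed by the test code below by evaluating the cross-entropy formula loss(x, class) = -x[class] + log(sum_j exp(x[j])) by hand in plain Python (no PyTorch needed):

```python
import math

def cross_entropy(logits, target):
    """loss(x, class) = -x[class] + log(sum_j exp(x[j]))"""
    return -logits[target] + math.log(sum(math.exp(v) for v in logits))

# Same numbers as the CrossEntropyLoss example: x = [0.1, 0.2, 0.3], class = 1
loss = cross_entropy([0.1, 0.2, 0.3], 1)
print(round(loss, 4))  # 1.1019
```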

Test code 1:

import torch
from torch import nn
from torch.nn import L1Loss

inputs = torch.tensor([1,2,3],dtype=torch.float32)
targets = torch.tensor([1,2,5],dtype=torch.float32)

inputs = torch.reshape(inputs,(1,1,1,3))
targets = torch.reshape(targets,(1,1,1,3))

loss = L1Loss(reduction="sum")
result = loss(inputs,targets)

loss_mse = nn.MSELoss()
result_mse = loss_mse(inputs,targets)

print(result)
print(result_mse)


x = torch.tensor([0.1,0.2,0.3])
y = torch.tensor([1])
x = torch.reshape(x,(1,3))
loss_cross = nn.CrossEntropyLoss()
result_cross = loss_cross(x,y)
print(result_cross)

Output:

tensor(2.)
tensor(1.3333)
tensor(1.1019)

Test code 2:

import torch
import torchvision.datasets
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("data",train=False,transform=torchvision.transforms.ToTensor(),download=True)
dataloader = DataLoader(dataset,batch_size=1)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

loss = nn.CrossEntropyLoss()
tudui = Tudui()
for data in dataloader:
    imgs,targets = data
    outputs = tudui(imgs)
    result_loss = loss(outputs,targets)
    print(result_loss)
    # result_loss.backward()

Partial output:

tensor(2.2556, grad_fn=<NllLossBackward0>)
tensor(2.3297, grad_fn=<NllLossBackward0>)
tensor(2.3425, grad_fn=<NllLossBackward0>)

16. Optimizers

import torch
import torchvision.datasets
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("data", train=False, transform=torchvision.transforms.ToTensor(), download=True)
dataloader = DataLoader(dataset, batch_size=1)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x


loss = nn.CrossEntropyLoss()
tudui = Tudui()
optim = torch.optim.SGD(tudui.parameters(),lr=0.01)
for epoch in range(20):
    running_loss = 0.0
    for data in dataloader:
        imgs, targets = data
        outputs = tudui(imgs)
        result_loss = loss(outputs, targets)
        # print(result_loss)
        # result_loss.backward()
        
        # zero the gradients
        optim.zero_grad()
        # backpropagate to compute a gradient for every trainable parameter (grad is no longer None)
        result_loss.backward()
        # optimization step: update every parameter
        optim.step()
        
        running_loss = running_loss + result_loss
    print(running_loss)

17. Using and Modifying Existing Network Models

Modifying VGG16: VGG16 outputs 1000 classes; we want to change those 1000 classes into 10.

Test code:

import torchvision
from torch import nn

vgg16_false = torchvision.models.vgg16(pretrained=False) # parameters are freshly initialized, not trained
vgg16_true = torchvision.models.vgg16(pretrained=True) # parameters are pretrained

# print(vgg16_false)
# print(vgg16_true)

train_data = torchvision.datasets.CIFAR10("data",train=True,transform=torchvision.transforms.ToTensor(),download=True)

# VGG16 outputs 1000 classes at the end; change those 1000 classes into 10:
vgg16_true.classifier.add_module('add_linear',nn.Linear(1000,10))
print(vgg16_true)

vgg16_false.classifier[6] = nn.Linear(4096,10)
print(vgg16_false)

Output:

VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
    (add_linear): Linear(in_features=1000, out_features=10, bias=True)
  )
)
VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=10, bias=True)
  )
)

18. Saving and Loading Network Models

Saving and loading the VGG16 model.

Saving:

from torch import nn
import torch
import torchvision

vgg16 = torchvision.models.vgg16(pretrained=False)
# saving method 1: model structure + model parameters
torch.save(vgg16,"vgg16_method1.pth")

# saving method 2: model parameters only (officially recommended)
torch.save(vgg16.state_dict(),"vgg16_method2.pth")

# pitfall
class Kuang(nn.Module):
    def __init__(self):
        super(Kuang, self).__init__()
        self.model1 = nn.Conv2d(3,64,kernel_size=3)

    def forward(self, x):
        x = self.model1(x)
        return x

kuang = Kuang()
torch.save(kuang,"kuang_method1.pth")

Loading:

from torch import nn
import torch
import torchvision

# method 1: load the model saved with saving method 1
model = torch.load("vgg16_method1.pth")
# print(model)

# method 2: load the state dict into a freshly constructed model
vgg16 = torchvision.models.vgg16(pretrained=False)
vgg16.load_state_dict(torch.load("vgg16_method2.pth"))
# model2 = torch.load("vgg16_method2.pth")
# print(model2)
# print(vgg16)

# pitfall
# the class definition must be available (defined or imported) when loading.
class Kuang(nn.Module):
    def __init__(self):
        super(Kuang, self).__init__()
        self.model1 = nn.Conv2d(3,64,kernel_size=3)

    def forward(self, x):
        x = self.model1(x)
        return x
model = torch.load('kuang_method1.pth')
print(model)

19. A Complete Model Training Recipe

Model training steps:

  1. Prepare and load the data
  2. Create the network model
  3. Initialize the loss function / optimizer
  4. Run the training loop

Test code:

import torch
import torchvision.datasets
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

# import the model
from model import *

# prepare the datasets
train_data = torchvision.datasets.CIFAR10(root="data",train=True,transform=torchvision.transforms.ToTensor(),download=True)
test_data = torchvision.datasets.CIFAR10(root="data",train=False,transform=torchvision.transforms.ToTensor(),download=True)

# dataset lengths
train_data_size = len(train_data)
test_data_size = len(test_data)

# if train_data_size = 10, this prints: Training dataset size: 10
print("Training dataset size: {}".format(train_data_size))
print("Test dataset size: {}".format(test_data_size))

# load the datasets with DataLoader
train_dataloader = DataLoader(train_data,batch_size=16)
test_dataloader = DataLoader(test_data,batch_size=16)

# create the network model
tudui = Tudui()

# loss function
loss_fn = nn.CrossEntropyLoss()

# optimizer
# learning_rate = 0.01
# 1e-2 = 1 x 10^(-2) = 1/100 = 0.01
learning_rate = 1e-2
optimizer = torch.optim.SGD(tudui.parameters(),lr=learning_rate)

# some bookkeeping for training
# number of training steps so far
total_train_step = 0
# number of evaluation rounds so far
total_test_step = 0
# number of epochs
epoch = 10

# add TensorBoard
writer = SummaryWriter("dataloader")

for i in range(epoch):
    print("-------- Epoch {} starting ----------".format(i+1))

    # training phase
    # tudui.train()
    for data in train_dataloader:
        imgs,targets = data
        outputs = tudui(imgs)
        loss = loss_fn(outputs,targets)

        # optimizer updates the model
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_train_step = total_train_step+1
        # log every 100 steps
        if total_train_step % 100 == 0:
            print("Train step: {}, Loss: {}".format(total_train_step,loss))
            writer.add_scalar("train_loss",loss.item(),total_train_step)

    # evaluation phase
    # tudui.eval()
    total_test_loss = 0
    total_accuracy = 0
    with torch.no_grad(): # no gradients are computed
        for data in test_dataloader:
            imgs,targets = data
            outputs = tudui(imgs)
            loss = loss_fn(outputs,targets)
            total_test_loss = total_test_loss + loss.item()
            accuracy = (outputs.argmax(1) == targets).sum() # count correct predictions
            total_accuracy = total_accuracy + accuracy

    print("Test set total Loss: {}".format(total_test_loss))
    print("Test set accuracy: {}".format(total_accuracy/test_data_size))
    writer.add_scalar("test_loss", total_test_loss, total_test_step)
    writer.add_scalar("test_accuracy",total_accuracy/test_data_size,total_test_step)
    total_test_step = total_test_step + 1

    torch.save(tudui,"tudui_{}.pth".format(i))
    # torch.save(tudui.state_dict(),"tudui_{}.pth".format(i))
    print("Model saved!")

writer.close()
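The loop above logs `loss.item()` rather than `loss` itself: the loss is a zero-dimensional tensor, and `.item()` extracts the plain Python number, which is what `add_scalar` and the formatted print expect. A minimal sketch of the difference:

```python
import torch

loss = torch.tensor(0.25)   # a zero-dimensional (single-element) tensor
print(loss)                 # tensor(0.2500)
print(loss.item())          # 0.25 — a plain Python float
```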

The model file:

import torch
from torch import nn

# Build the neural network
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = nn.Sequential(
            nn.Conv2d(3, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64*4*4, 64),
            nn.Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

# Main method: sanity-check the output shape
if __name__ == '__main__':
    tudui = Tudui()
    input = torch.ones((64,3,32,32))
    output = tudui(input)
    print(output.shape)
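The `64*4*4` input size of the first Linear layer follows from the shape arithmetic: each Conv2d here (kernel 5, stride 1, padding 2) preserves the 32×32 spatial size, and each MaxPool2d(2) halves it (32 → 16 → 8 → 4), leaving 64 channels of 4×4 feature maps. A quick check of that arithmetic:

```python
# Conv2d output size: out = (in + 2*padding - kernel) // stride + 1
size = 32
for _ in range(3):                      # three conv+pool stages
    size = (size + 2*2 - 5) // 1 + 1    # conv keeps the size: 32 -> 32
    size = size // 2                    # MaxPool2d(2) halves it
print(size)             # 4
print(64 * size * size) # 1024, the Flatten output fed to the first Linear
```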

Partial output:

Length of training dataset: 50000
Length of test dataset: 10000
-------- Training round 1 begins ----------
Training step: 100, Loss: 2.277003049850464
Training step: 200, Loss: 2.284518241882324
Training step: 300, Loss: 2.2926323413848877
Training step: 400, Loss: 2.233567714691162
Training step: 500, Loss: 1.9439252614974976
Training step: 600, Loss: 2.029322624206543
...
Training step: 2900, Loss: 1.5494470596313477
Training step: 3000, Loss: 1.2179138660430908
Training step: 3100, Loss: 1.4762766361236572
Loss on the whole test set: 2178.9752430915833
Accuracy on the whole test set: 0.09480000287294388
Model saved!
-------- Training round 2 begins ----------

Testing the argmax function:

import torch

# argmax(): returns the INDEX of the maximum value. dim=0 works down the columns, dim=1 across the rows.
outputs = torch.tensor([[0.1,0.2],
                        [0.3,0.4]])
print(outputs.argmax(1))
preds = outputs.argmax(1)
targets = torch.tensor([0,1])
print((preds == targets).sum())


# Output:
# tensor([1, 1])
# tensor(1)

20. Training on a GPU

Method 1

Changes needed to compute on the GPU:

  • Add .cuda() to the network model instance, to the loss function, and to the unpacked data (inputs and labels).
# Create the network model
tudui = Tudui()
if torch.cuda.is_available():
    tudui = tudui.cuda()
# Loss function
loss_fn = nn.CrossEntropyLoss()
if torch.cuda.is_available():
    loss_fn = loss_fn.cuda()
# Unpack the data (inputs and labels)
imgs, targets = data
if torch.cuda.is_available():
    imgs = imgs.cuda()
    targets = targets.cuda()

Method 2:

# Define the training device
device = torch.device("cuda")
# Create the network model
tudui = Tudui()
tudui = tudui.to(device)
# Loss function
loss_fn = nn.CrossEntropyLoss()
loss_fn = loss_fn.to(device)
# Unpack the data (inputs and labels)
imgs, targets = data
imgs = imgs.to(device)
targets = targets.to(device)
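A common idiom (not from the original notes, but widely used) combines Method 2 with the availability check from Method 1: pick the device once, falling back to CPU when CUDA is unavailable, so the same script runs unchanged on any machine:

```python
import torch

# Choose the device once; everything else just calls .to(device).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
```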

21. Complete workflow for validating a model

import torch
import torchvision
from PIL import Image
from torch import nn

image_path = "imgs/airplane.png"
image = Image.open(image_path)
print(image)
# Without .convert('RGB'), the image is read as four-channel RGBA.
# The A (alpha/transparency) channel is not needed for training, so convert('RGB') drops it.
image = image.convert('RGB')
transform = torchvision.transforms.Compose([torchvision.transforms.Resize((32, 32)),
                                            torchvision.transforms.ToTensor()])

image = transform(image)
print(image.shape)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64*4*4, 64),
            nn.Linear(64, 10)
        )

    def forward(self, x):
        x = self.model(x)
        return x

# model = torch.load("tudui_29_gpu.pth", map_location=torch.device('cpu'))
model = torch.load("tudui_0.pth", map_location=torch.device('cpu'))
print(model)
image = torch.reshape(image, (1, 3, 32, 32))
model.eval()
with torch.no_grad():
    output = model(image)
print(output)

print(output.argmax(1))

Output:

<PIL.PngImagePlugin.PngImageFile image mode=RGB size=478x354 at 0x7FC138437D90>
torch.Size([3, 32, 32])
Tudui(
  (model1): Sequential(
    (0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (4): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Flatten(start_dim=1, end_dim=-1)
    (7): Linear(in_features=1024, out_features=64, bias=True)
    (8): Linear(in_features=64, out_features=10, bias=True)
  )
)
tensor([[-1.4530,  0.8039, -0.5174,  0.0565, -1.4587, -0.7926,  2.2505, -2.7748,
          0.7415,  1.2671]])
tensor([6])
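`output.argmax(1)` gives only the predicted class index. For CIFAR10 the index-to-name mapping is fixed, so the prediction can be made human-readable (index 6 corresponds to "frog" in CIFAR10's standard class order):

```python
# CIFAR10's fixed class order (indices 0-9).
classes = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]
pred = 6                 # e.g. the index from output.argmax(1)
print(classes[pred])     # frog
```

Here the model trained for only one round misclassifies the airplane image, which is consistent with the ~9.5% test accuracy shown above.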