PyTorch Notes

Installing Anaconda

Browse past releases at https://repo.anaconda.com/archive/ and choose the version you need.

Search for "Anaconda Prompt" in the Start menu to open the Anaconda console.

Create an isolated environment, then install the PyTorch package inside that environment.

conda create -n pytorch python=3.6    # -n names the environment "pytorch"; it comes with Python 3.6
conda activate pytorch	# switch to the environment you need; the default is the base environment

After installation, check that it succeeded:

import torch	# verify that PyTorch imports
torch.cuda.is_available()	# check GPU availability; True means the GPU can be used

Installing Jupyter in Anaconda

conda install jupyter notebook		# install Jupyter Notebook

PyTorch tutorial

Two handy functions for exploring torch

dir(torch.cuda)		# open the torch.cuda "box" and see which utility functions are inside
help(torch.cuda.is_available)		# the function's manual: see what it does

Ready-made models in PyTorch

A model-zoo-like feature: load a model with a single line of code.

GitHub: https://github.com/pytorch/hub

hub_list = torch.hub.list("pytorch/vision:v0.8.2")		# list the available models

PyTorch basics

Create a tensor with the same shape as an existing tensor

x = torch.rand(4, 4)
x = torch.randn_like(x, dtype=torch.float)		# random tensor with the same shape as x

view reshapes a tensor

x = torch.rand(4, 4)
x.view(16)		# like reshape: flattens x into a 1-D vector
x.view(-1, 8)	# reshape x to 2×8; -1 is inferred automatically

Converting between tensors and arrays

Tensor to array

a = torch.ones(5)	# create a tensor
b = a.numpy()		# convert the tensor to a NumPy array

Array to tensor

import numpy as np

a = np.ones(5)
b = torch.from_numpy(a)		# convert to a tensor

Common kinds of tensors

Scalar

Usually a single number.

from torch import tensor
x = tensor(42)		# x is a scalar
x.dim()		# dimensionality of x: 0 (a scalar has no axes)
x.item()	# read the value
Vector

In deep learning this usually means a feature vector, e.g. a word embedding.

x = tensor([11, 12, 13, 14, 15, 16])	# x is a vector
x.dim()		# dimensionality of the vector: 1
x.shape		# size of each dimension
Matrix
x = tensor([[1,2,3],[4,5,6],[7,8,9]])	# create a matrix
x.dim()		# dimensionality of the matrix: 2
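A quick sketch verifying those dimensionalities:

```python
from torch import tensor

assert tensor(42).dim() == 0                   # scalar: no axes
assert tensor([11, 12, 13]).dim() == 1         # vector: one axis
m = tensor([[1, 2], [3, 4]])
assert m.dim() == 2                            # matrix: two axes
print(m.shape)                                 # torch.Size([2, 2])
```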

SummaryWriter

You can also use the tensorboardX tool for visualization (https://github.com/lanpa/tensorboardX).

Run tensorboard --logdir=11-09_13.03 against the given folder to display the saved charts.

Code

Plotting scalars

from torch.utils.tensorboard import SummaryWriter
import cv2

# the generated event data goes into the logs folder
writer = SummaryWriter("logs")

for i in range(100):
    # the first argument is the chart tag (same tags share one chart), the second is the Y value, the third is the X step
    writer.add_scalar("y=2x", 3 * i, i)

img_data = cv2.imread("D:\\proj\\myself\\pytorch_learn\\datas\\hymen\\train\\ants\\0013035.jpg")

# draw the image in TensorBoard; global_step marks which step the image belongs to, dataformats describes the data layout
writer.add_image("img1", img_tensor=img_data, global_step=1, dataformats="HWC")

writer.close()

shell

# run tensorboard and display the charts in the browser
tensorboard --logdir=logs --port=6607

transform

Mainly applies transformations to images.

ToTensor
from torchvision import transforms

# the ToTensor class
transT = transforms.ToTensor()
# calling the instance (its __call__) converts the image to a tensor
tensor_data = transT(img)
Normalize
# normalization: the first argument is the per-channel mean, the second the per-channel std; normalizing unifies the feature scales and speeds up network convergence
transN = transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
# the argument passed to __call__ must be a tensor
img_data1 = transN(img_tensor)
RandomRotation

Random rotation.

transforms.RandomRotation(45)		# rotate by a random angle between -45° and 45°
RandomHorizontalFlip

Horizontal flip.

transforms.RandomHorizontalFlip(p=0.5)		# flip with probability 0.5
Resize
# resize the image: with two values it sets (H, W); with a single value the shorter side is scaled to it, keeping the aspect ratio
transR = transforms.Resize((512, 512))
# accepts either PIL images or tensors
img_resize = transR(img_tensor)
CenterCrop

Crop around the image center.

transforms.CenterCrop(224)		# crop a 224×224 patch from the center
RandomCrop

Crop the image at a random location.

# randomly crop to 256×128
transR = transforms.RandomCrop((256, 128))
# convert the cropped data to a tensor before output
transT = transforms.ToTensor()
# chain the two operations
transC = transforms.Compose([transR, transT])
writer = SummaryWriter("logs")
for i in range(10):
    img_compose = transC(img_data)
    writer.add_image("compose", img_compose, i)

writer.close()
Compose

Chains several transforms together.

transT = transforms.ToTensor()
transR = transforms.Resize(512)
# the argument is a list of transforms: the data is resized, then converted to a tensor, then returned
# within the list, each transform's output must be a valid input for the next one
transC = transforms.Compose([transR, transT])
# the input here must be a PIL image
img_compose = transC(img_data)
transpose: moving the channel axis

Put the image's channel axis first, producing data of shape (c, h, w).

img = img_data		# image data (a NumPy array)
img = img.transpose((2, 0, 1))		# move the channel axis from position 2 to the front; the other axes shift back
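A minimal sketch of the axis reordering, using a small dummy array instead of a real image; for torch tensors the equivalent operation is permute:

```python
import numpy as np
import torch

# dummy "image": height 4, width 5, 3 channels (HWC layout)
img = np.zeros((4, 5, 3), dtype=np.uint8)

chw = img.transpose((2, 0, 1))      # HWC -> CHW
print(chw.shape)                    # (3, 4, 5)

# torch tensors use permute for the same reordering
t_chw = torch.from_numpy(img).permute(2, 0, 1)
print(t_chw.shape)                  # torch.Size([3, 4, 5])
```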

Public datasets

The CIFAR10 dataset

from torchvision import datasets	# public datasets
from torchvision import transforms	# transforms
from torch.utils.tensorboard import SummaryWriter	# TensorBoard plotting

dataset_trans = transforms.Compose([transforms.ToTensor()])
# root is where the data is stored, train selects the train/test split, transform lists the transforms to apply, download fetches the data
train_set = datasets.CIFAR10(root="./datas/cr10", train=True, transform=dataset_trans, download=True)
test_set = datasets.CIFAR10(root="./datas/cr10", train=False, transform=dataset_trans, download=True)

# display the images in TensorBoard
writer = SummaryWriter("logs")
for i in range(10):
    img, target = train_set[i]
    writer.add_image("cifar", img, global_step=i)

writer.close()

Dataset

Handles individual samples.

Provides a way to fetch each piece of data together with its label.

import os
from PIL import Image
from torch.utils.data import Dataset

class MyData(Dataset):
    def __init__(self, root_dir, label_dir):
        self.root_dir = root_dir
        self.label_dir = label_dir
        self.path = os.path.join(self.root_dir, self.label_dir)
        self.img_path = os.listdir(self.path)

    # fetch one sample and its label
    def __getitem__(self, idx):
        img_name = self.img_path[idx]
        img_item_path = os.path.join(self.root_dir, self.label_dir, img_name)
        img = Image.open(img_item_path)
        label = self.label_dir

        return img, label

    # total number of samples
    def __len__(self):
        return len(self.img_path)

ants_dataset = MyData("D:\\proj\\myself\\pytorch_learn\\datas\\hymen\\train", "ants")
bees_dataset = MyData("D:\\proj\\myself\\pytorch_learn\\datas\\hymen\\train", "bees")

# concatenate the two datasets
train_dataset = ants_dataset + bees_dataset
img, label = train_dataset[123]
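The + above builds a ConcatDataset under the hood. A minimal sketch with dummy TensorDatasets (the data here is made up for illustration):

```python
import torch
from torch.utils.data import TensorDataset

ds_a = TensorDataset(torch.zeros(3, 2), torch.zeros(3))   # 3 dummy "ant" samples
ds_b = TensorDataset(torch.ones(5, 2), torch.ones(5))     # 5 dummy "bee" samples

combined = ds_a + ds_b        # a ConcatDataset
print(len(combined))          # 8: ds_a's samples come first, then ds_b's
print(combined[3][1])         # index 3 is the first sample of ds_b -> tensor(1.)
```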

Image data with ImageFolder

# define different preprocessing for each data split
data_transforms = {
    "train": transforms.Compose([transforms.CenterCrop(224),
                                 transforms.RandomHorizontalFlip(p=0.5),
                                 transforms.RandomVerticalFlip(p=0.5),
                                 transforms.RandomGrayscale(p=0.025), transforms.ToTensor(),
                                 transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])]),
    "valid": transforms.Compose([transforms.Resize(256),
                                 transforms.CenterCrop(224),
                                 transforms.ToTensor(),
                                 transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])])
}
# process the raw dataset with ImageFolder, using the matching transform for each split
image_ds = {x: datasets.ImageFolder(root="./datas", transform=data_transforms[x]) for x in ["train", "valid"]}
image_dl = {x: DataLoader(image_ds[x], batch_size=8, shuffle=True) for x in ["train", "valid"]}

DataLoader

Batching data for NLP

# batches holds every sample in the set; device is the training device (GPU or CPU)
class DatasetIterater(object):
    def __init__(self, batches, batch_size, device):
        self.batch_size = batch_size
        self.batches = batches
        # number of full batches (floor division)
        self.n_batches = len(batches) // batch_size
        # whether there is a leftover partial batch; False means the samples divide evenly
        self.residue = False
        if len(batches) % self.batch_size != 0:
            self.residue = True
        self.index = 0
        self.device = device

    def _to_tensor(self, datas):
        x = torch.LongTensor([_[0] for _ in datas]).to(self.device)
        y = torch.LongTensor([_[1] for _ in datas]).to(self.device)

        # sequence length before padding (values above pad_size are clipped to pad_size)
        seq_len = torch.LongTensor([_[2] for _ in datas]).to(self.device)
        return (x, seq_len), y

    def __next__(self):
        # yield the final partial batch, if any
        if self.residue and self.index == self.n_batches:
            batches = self.batches[self.index * self.batch_size: len(self.batches)]
            self.index += 1
            batches = self._to_tensor(batches)
            return batches

        elif self.index >= self.n_batches:
            self.index = 0
            raise StopIteration
        else:
            batches = self.batches[self.index * self.batch_size: (self.index + 1) * self.batch_size]
            self.index += 1
            batches = self._to_tensor(batches)
            return batches

    def __iter__(self):
        return self

    def __len__(self):
        if self.residue:
            return self.n_batches + 1
        else:
            return self.n_batches

Batching a public dataset

from torchvision import datasets
from torchvision import transforms
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

# Compose converts the image data to tensors
transT = transforms.Compose([transforms.ToTensor()])
train_set = datasets.CIFAR10("./dataset", train=True, transform=transT, download=True)

# dataset is the tensor-typed dataset, batch_size is the number of samples per batch, num_workers enables multi-process loading, drop_last decides whether a final batch smaller than batch_size is dropped
train_loader = DataLoader(dataset=train_set, batch_size=64, shuffle=True, num_workers=0, drop_last=False)

writer = SummaryWriter("dataloader")
step = 0
for imgs, target in train_loader:
    writer.add_images("train_set", img_tensor=imgs, global_step=step)
    step += 1

writer.close()

Neural network

PyTorch model-building reference: https://pytorch.org/docs/stable/index.html

Convolutional layers

import torch
from torchvision import datasets
from torchvision import transforms
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from torch import nn
from torch.nn import Conv2d

transT = transforms.Compose([transforms.ToTensor()])
dataset = datasets.CIFAR10("./dataset", train=False, transform=transT, download=True)
# use a DataLoader to produce the batches
dataloader = DataLoader(dataset=dataset, batch_size=64, shuffle=True)

class ZJModel(nn.Module):
    def __init__(self):
        super(ZJModel, self).__init__()
        # define the layers we need
        # in_channels is the input channel count, out_channels the desired output channel count, stride is (sH, sW) for the H and W directions
        # dilation enables dilated (atrous) convolution; rarely used
        self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=(3, 3), stride=(1, 1))

    def forward(self, x):
        # the input passes through the convolution and is returned
        x = self.conv1(x)
        return x

# instantiate our model
zm = ZJModel()
writer = SummaryWriter("./dataloader")
step = 0
for data in dataloader:
    imgs, target = data

    # feed the batch produced by the DataLoader into the network
    output = zm(imgs)
    writer.add_images("input", imgs, global_step=step)
    # -1 infers the batch dimension from the remaining sizes (the 6 output channels are split into extra 3-channel images for display)
    output = torch.reshape(output, (-1, 3, 30, 30))

    # draw the outputs in TensorBoard
    writer.add_images("output", output, step)
    step += 1

writer.close()

Pooling layers

Like downscaling a 1024p video to 512p: the resolution drops, but the important features remain.

ceil_mode: when True, max pooling is still applied at the end even when the remaining window is smaller than 3×3; when False, the leftover region that doesn't fill a window is discarded.
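A minimal sketch of the difference on a 5×5 input with a 3×3 kernel (stride defaults to the kernel size):

```python
import torch
from torch import nn

x = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)

# ceil_mode=True keeps the leftover 2-wide window, so the output is 2x2
out_ceil = nn.MaxPool2d(kernel_size=3, ceil_mode=True)(x)
# ceil_mode=False drops it, so the output is 1x1
out_floor = nn.MaxPool2d(kernel_size=3, ceil_mode=False)(x)

print(out_ceil.shape)   # torch.Size([1, 1, 2, 2])
print(out_floor.shape)  # torch.Size([1, 1, 1, 1])
```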

from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision import transforms
from torch.utils.tensorboard import SummaryWriter

tranT = transforms.Compose([transforms.ToTensor()])
dataset = datasets.CIFAR10("./dataset", train=False, transform=tranT, download=True)
dataloader = DataLoader(dataset, batch_size=64, shuffle=True)


class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        # custom max-pooling layer: kernel_size is 3×3, stride defaults to the kernel size (3), ceil_mode=True pools the final partial window too
        self.maxpool1 = nn.MaxPool2d(kernel_size=3, ceil_mode=True)

    def forward(self, x):
        x = self.maxpool1(x)
        return x


mm = MyModel()
writer = SummaryWriter("maxpoolTB")
step = 0
for data in dataloader:
    imgs, targets = data
    writer.add_images("old_imgs", imgs, global_step=step)
    # the output still has 3 channels; only the spatial size changes
    output = mm(imgs)
    writer.add_images("new_imgs", output, global_step=step)
    step += 1

writer.close()

Non-linear activations

# inplace – can optionally do the operation in-place. Default: False
# with inplace=True, the input tensor is overwritten with the transformed values
# with inplace=False, a new output tensor is produced and the original input is kept
ReLU(inplace=True)
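A minimal sketch of the difference, on a dummy tensor:

```python
import torch
from torch import nn

x = torch.tensor([-1.0, 2.0])
out = nn.ReLU(inplace=False)(x)   # new tensor; x is left untouched
print(out)                        # tensor([0., 2.])
print(x)                          # tensor([-1., 2.])

y = torch.tensor([-1.0, 2.0])
nn.ReLU(inplace=True)(y)          # y itself is overwritten
print(y)                          # tensor([0., 2.])
```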

Batch normalization

Speeds up network training.

# num_features matches the channel count C of the input
BatchNorm2d(num_features=100)
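A quick sketch of the shape requirement, on dummy data: the input must be (N, C, H, W) with C equal to num_features, and the output keeps the same shape with each channel normalized toward zero mean:

```python
import torch
from torch import nn

bn = nn.BatchNorm2d(num_features=3)
x = torch.randn(8, 3, 4, 4)     # N=8, C=3, H=W=4
y = bn(x)

print(y.shape)                  # torch.Size([8, 3, 4, 4])
print(y.mean(dim=(0, 2, 3)))    # per-channel means, all close to 0
```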

Loss functions

Cross-entropy: CrossEntropyLoss

  • Input: (N, C) — N is the number of samples per batch, C the number of classes
  • Target: the true class indices, shape (N,)
import torch
from torch.nn import CrossEntropyLoss

# scores for the three classes
x = torch.tensor([0.05, 0.9, 0.05])
# the true class is index 1 (the second class)
y = torch.tensor([1])
x = torch.reshape(x, (1, 3))
ce = CrossEntropyLoss()
# the cross-entropy value: -x[class] + log(Σᵢ exp(xᵢ))
res = ce(x, y)
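The formula in the comment can be checked by hand; a small sketch using the same x and y:

```python
import torch
from torch.nn import CrossEntropyLoss

x = torch.tensor([[0.05, 0.9, 0.05]])
y = torch.tensor([1])

res = CrossEntropyLoss()(x, y)
# manual computation: -x[class] + log(sum_i exp(x_i))
manual = -x[0, 1] + torch.log(torch.exp(x[0]).sum())
print(torch.isclose(res, manual))   # tensor(True)
```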

Optimizers

# define the loss function
loss = nn.CrossEntropyLoss()
mm = MyModel()
# define the optimizer, passing in the model's parameters
optim = torch.optim.SGD(mm.parameters(), lr=0.00001)
# train for 20 epochs
for epoch in range(20):
    # running loss for this epoch starts at 0
    epoch_loss = 0.0
    for data in dataloader:
        imgs, targets = data
        outputs = mm(imgs)
        # compute the cross-entropy loss
        result_loss = loss(outputs, targets)
        # reset the optimizer's gradients to zero
        optim.zero_grad()
        # backpropagate to compute each parameter's gradient
        result_loss.backward()
        # take a gradient-descent step on each parameter
        optim.step()
        # accumulate the loss; item() detaches it from the graph
        epoch_loss += result_loss.item()
    print(epoch_loss)

LSTM

Outputs (when batch_first=False):

  1. output is a 3-D tensor (seq_len (how many consecutive samples are used to predict the next input, e.g. 4 or 5), batch_size, hidden_size * num_directions) holding the last layer's output h at every time step; for a bidirectional LSTM, each time step's output concatenates the forward and backward h.
  2. h_n is a 3-D tensor (num_layers × num_directions, batch_size, hidden size (e.g. 128)) holding each layer's output h at the final time step; a bidirectional LSTM stores the forward and backward final-step outputs separately. For example, with num_layers=3 and bidirectional=True, the first dimension of h_n is 6 (2 × 3): h_n[0] is layer 1's forward output at the last time step, h_n[1] layer 1's backward output, h_n[2] layer 2's forward output, h_n[3] layer 2's backward output, and h_n[4] and h_n[5] are layer 3's forward and backward outputs at the last time step.
  3. c_n has the same layout as h_n, but stores the cell state c.
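The layout described above can be checked with a small sketch (the sizes here are dummy values):

```python
import torch
from torch import nn

# 3-layer bidirectional LSTM, hidden size 16, batch_first=False (the default)
lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=3, bidirectional=True)
x = torch.randn(5, 4, 8)          # (seq_len=5, batch_size=4, input_size=8)
output, (h_n, c_n) = lstm(x)

print(output.shape)   # torch.Size([5, 4, 32])  -> hidden_size * num_directions
print(h_n.shape)      # torch.Size([6, 4, 16])  -> num_layers * num_directions = 6
print(c_n.shape)      # torch.Size([6, 4, 16])  -> same layout as h_n
```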
class Model(nn.Module):
    def __init__(self, config):
        super(Model, self).__init__()
        if config.embedding_pretrained is not None:
            self.embedding = nn.Embedding.from_pretrained(config.embedding_pretrained, freeze=False)
        else:
            self.embedding = nn.Embedding(config.n_vocab, config.embed, padding_idx=config.n_vocab - 1)
        # config.embed is the dimension of each input word vector, config.hidden_size the number of hidden units, config.num_layers the number of LSTM layers; batch_first puts the batch dimension first, dropout sets the dropout probability
        self.lstm = nn.LSTM(config.embed, config.hidden_size, config.num_layers,
                            bidirectional=True, batch_first=True, dropout=config.dropout)
        self.fc = nn.Linear(config.hidden_size * 2, config.num_classes)

    def forward(self, x):
        x, _ = x
        out = self.embedding(x)  # [batch_size, seq_len, embedding] = [128, 32, 300]
        out, _ = self.lstm(out)
        # take the output h at the sentence's last time step
        out = self.fc(out[:, -1, :])
        return out

Linear layers

# in_features is the input dimension, out_features the output dimension; usually preceded by a Flatten layer that flattens the data first
linear1 = Linear(in_features=196608, out_features=10)
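A minimal sketch of the Flatten-then-Linear pattern; note 196608 = 3 × 256 × 256, so the dummy input below is a batch of 3-channel 256×256 images:

```python
import torch
from torch.nn import Flatten, Linear

x = torch.randn(2, 3, 256, 256)      # dummy batch of 2 images
flat = Flatten()(x)                  # flatten everything after the batch dim
out = Linear(in_features=196608, out_features=10)(flat)

print(flat.shape)   # torch.Size([2, 196608])
print(out.shape)    # torch.Size([2, 10])
```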

Data augmentation

For text data, this mainly means embedding.

Building a neural network

Building it with hand-written computation

x = torch.tensor(input_features, dtype=float)		# convert the inputs to a tensor; shape 348*14
y = torch.tensor(labels, dtype=float)				# convert the labels to a tensor

# initialize the weights from a standard normal distribution; layer 1: input * weights + biases
weights = torch.randn((14, 128), dtype=float, requires_grad=True)		# 128 hidden units represent the hidden features
biases = torch.randn(128, dtype=float, requires_grad=True)				# the bias fine-tunes the result
weights1 = torch.randn((128, 1), dtype=float, requires_grad=True)		# one output value per input sample, so the output size is 1
biases1 = torch.randn(1, dtype=float, requires_grad=True)				# the bias fine-tunes the final result

learning_rate = 0.001
losses = []

for i in range(1000):
    hidden = x.mm(weights) + biases
    hidden = torch.relu(hidden)
    predictions = hidden.mm(weights1) + biases1

    loss = torch.mean((predictions - y) ** 2)
    losses.append(loss.data.numpy())

    loss.backward()

    # update the parameters
    weights.data.add_(- learning_rate * weights.grad.data)
    biases.data.add_(- learning_rate * biases.grad.data)
    weights1.data.add_(- learning_rate * weights1.grad.data)
    biases1.data.add_(- learning_rate * biases1.grad.data)

    # zero the gradients
    weights.grad.data.zero_()
    biases.grad.data.zero_()
    weights1.grad.data.zero_()
    biases1.grad.data.zero_()

Stacking the layers one by one

import torch
from torchvision import datasets
from torchvision import transforms
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Linear, Flatten, Softmax, Sequential

transT = transforms.Compose([transforms.ToTensor()])
dataset = datasets.CIFAR10("./dataset", train=True, transform=transT, download=True)

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        # work out the padding from the official conv output-size formula
        self.conv1 = Conv2d(3, 32, (5, 5), padding=2)
        self.maxpool1 = MaxPool2d(kernel_size=(2, 2))
        self.conv2 = Conv2d(32, 32, (5, 5), padding=2)
        self.maxpool2 = MaxPool2d(kernel_size=(2, 2))
        self.conv3 = Conv2d(32, 64, (5, 5), padding=2)
        self.maxpool3 = MaxPool2d(kernel_size=(2, 2))
        self.flatten1 = Flatten()
        # printing the flattened shape gives the input size of the following fully connected layer
        self.linear1 = Linear(1024, 64)
        self.linear2 = Linear(64, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.maxpool1(x)
        x = self.conv2(x)
        x = self.maxpool2(x)
        x = self.conv3(x)
        x = self.maxpool3(x)
        x = self.flatten1(x)
        x = self.linear1(x)
        x = self.linear2(x)
        return x

mm = MyModel()
# generate a batch of 64 all-ones samples with 3 channels of size 32×32
input = torch.ones((64, 3, 32, 32))
output = mm(input)
print(output.shape)

Implementing it with Sequential

import torch
from torchvision import datasets
from torchvision import transforms
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Linear, Flatten, Softmax, Sequential

transT = transforms.Compose([transforms.ToTensor()])
dataset = datasets.CIFAR10("./dataset", train=True, transform=transT, download=True)


class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        # a sequential model: just add all the layers in order
        self.model1 = Sequential(
            Conv2d(3, 32, (5, 5), padding=2),
            MaxPool2d(kernel_size=(2, 2)),
            Conv2d(32, 32, (5, 5), padding=2),
            MaxPool2d(kernel_size=(2, 2)),
            Conv2d(32, 64, (5, 5), padding=2),
            MaxPool2d(kernel_size=(2, 2)),
            Flatten(),
            Linear(1024, 10),
            # Softmax over dim=1: the class probabilities sum to 1
            Softmax(1)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

mm = MyModel()
print(mm)
input = torch.ones((64, 3, 32, 32))
output = mm(input)
print(output.shape)

Multiple Sequential stages

import torch
from torchvision import datasets
from torchvision import transforms
from torch.utils.data import DataLoader
from torch import optim
from torch import nn

input_size = 28 * 28
num_classes = 10
num_epoch = 3
batch_size = 64
learning_rate = 1e-2

trans = transforms.Compose([transforms.ToTensor()])
train_ds = datasets.MNIST("./datas", train=True, transform=trans, download=True)
test_ds = datasets.MNIST("./datas", train=False, transform=trans, download=True)
train_dl = DataLoader(dataset=train_ds, batch_size=batch_size, shuffle=True)
test_dl = DataLoader(dataset=test_ds, batch_size=batch_size, shuffle=True)


class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        # each Sequential is one composite stage; several stages together form the full network
        self.conv1 = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=16, kernel_size=(5, 5), padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=32, kernel_size=(5, 5), padding=2),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.out = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features=32 * 7 * 7, out_features=num_classes)
        )

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.out(x)

        return x

net = CNN()
loss_func = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=learning_rate)

for epoch in range(1, num_epoch + 1):
    train_rights = []
    total_train_loss = 0
    for train_data in train_dl:
        train_inputs, train_targets = train_data
        net.train()
        train_outputs = net(train_inputs)
        train_loss = loss_func(train_outputs, train_targets)
        optimizer.zero_grad()
        train_loss.backward()
        optimizer.step()

        total_train_loss += train_loss.item()

    print("Epoch {}: total training loss {}".format(epoch, total_train_loss))

    net.eval()
    total_test_loss = 0
    test_total_accuracy = 0
    with torch.no_grad():
        for test_data in test_dl:
            test_inputs, test_targets = test_data
            test_outputs = net(test_inputs)
            test_loss = loss_func(test_outputs, test_targets)

            total_test_loss += test_loss.item()
            accuracy = (test_outputs.argmax(1) == test_targets).sum().item()
            test_total_accuracy += accuracy

    print("Validation loss: {}, accuracy: {}".format(total_test_loss, test_total_accuracy / len(test_ds)))

ReLU via the functional API (F)

Layers without learnable parameters can use the functional API.

from torch.nn import functional as F

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()

        # 128 neurons; each processes every sample, and each sample is 784-dimensional
        self.linear1 = nn.Linear(784, 128)
        self.linear2 = nn.Linear(128, 256)
        self.linear3 = nn.Linear(256, 10)

    def forward(self, x):
        x = F.relu(self.linear1(x))
        x = F.relu(self.linear2(x))
        x = self.linear3(x)
        return x

Inspecting each layer's name and parameters

for name, parameter in mm.named_parameters():
    # name is the layer's name, parameter its parameters: the initialized w and bias
    print(name, parameter, len(parameter), parameter.size())

Transfer learning with classic networks

Note: usually you first keep the pretrained layers' parameters fixed and train only your own layers. Once your layers are trained reasonably well, keep the current parameters and continue fine-tuning all layers from there.

Adding and modifying layers in a PyTorch network

import torchvision.models
from torch import nn

# load a VGG16 model pretrained on ImageNet
vgg16 = torchvision.models.vgg16(pretrained=True)
# append a fully connected layer named "add_linear" to the end of the model
vgg16.add_module(name="add_linear", module=nn.Linear(1000, 10, True))
# append a fully connected layer named "7" inside vgg16's classifier
vgg16.classifier.add_module(name="7", module=nn.Linear(1000, 10, True))

# replace layer 6 of vgg16's classifier with a fully connected layer producing 10 features
vgg16.classifier[6] = nn.Linear(in_features=4096, out_features=10, bias=True)
print(vgg16)

Freezing the parameters you need

import copy

from torchvision import models
from torch import nn, optim

# common models: ["resnet", "alexnet", "vgg", "squeezenet", "densenet", "inception"]
model_name = "resnet"
# whether to build on features someone else already trained
if_retain_param = True

# freeze the parameters of every layer in the model
def set_parameter_requires_grad(pretrain_model, if_retain_param):
    if if_retain_param:
        for param in pretrain_model.parameters():
            # with requires_grad=False, no gradients are computed for these parameters in the backward pass
            param.requires_grad = False

# freeze the chosen model's parameters and replace the layers that need changing
def initialize_model(model_name, num_classes, if_retain_param):
    pretrain_model = None
    input_size = 0
    if model_name == "resnet":
        pretrain_model = models.resnet152(pretrained=True)
        set_parameter_requires_grad(pretrain_model, if_retain_param=if_retain_param)
        model_linear_in_features = pretrain_model.fc.in_features
        pretrain_model.fc = nn.Linear(in_features=model_linear_in_features, out_features=num_classes, bias=True)
        pretrain_model.add_module(name="softmax", module=nn.LogSoftmax(dim=1))
        input_size = 224

    return pretrain_model, input_size

pretrain_model, input_size = initialize_model(model_name=model_name, num_classes=102, if_retain_param=if_retain_param)
file_name = "checkpoint.pth"

if if_retain_param:
    # collect the parameters we still need to train into a list
    param_to_update = []
    for name, param in pretrain_model.named_parameters():
        if param.requires_grad == True:
            param_to_update.append(param)
            print("learn:", name)

optimizer = optim.Adam(param_to_update, lr=1e-2)
# step_size is how many epochs between decays, gamma is the factor the learning rate is multiplied by
scheduler = optim.lr_scheduler.StepLR(optimizer=optimizer, step_size=7, gamma=0.1)
# with LogSoftmax as the last layer you can no longer use nn.CrossEntropyLoss(), since nn.CrossEntropyLoss() already combines LogSoftmax with nn.NLLLoss()
loss_func = nn.NLLLoss()

# then start training the model
# deep-copy the model parameters so the best ones can be kept
copy.deepcopy(pretrain_model.state_dict())
# the optimizer's state and some hyperparameter info
optimizer.state_dict()

Loading model and optimizer parameters

After this, all parameters are trained and updated.

# load the saved model checkpoint
checkpoint = torch.load("./model.pth")
best_acc = checkpoint["best_acc"]
# load the model parameters
pretrain_model.load_state_dict(checkpoint["state_dict"])
# load the optimizer parameters
optimizer.load_state_dict(checkpoint["optimizer"])

Saving and loading models

Saving both the parameters and the architecture

To load a custom model this way, the model class must be defined (or imported) beforehand.

import torch
from torchvision import models

vgg16 = models.vgg16(pretrained=False)
# save both the model's architecture and parameters; f is the target path
torch.save(vgg16, f="./vgg16.pth")
# load the architecture and parameters
vgg16 = torch.load("./vgg16.pth")
print(vgg16)

Saving only the parameter dict

import torch
from torchvision import models

vgg16 = models.vgg16(pretrained=False)
# save vgg16's parameters
torch.save(vgg16.state_dict(), f="vgg162.pth")

# declare the model architecture before loading
vgg16 = models.vgg16(pretrained=False)
# load the parameters into the model
vgg16.load_state_dict(torch.load("./vgg162.pth"))
print(vgg16)

A complete model training run

Training on the CPU

import torch
from model import MyModel
from torchvision import datasets
from torchvision import transforms
from torch.utils.data import DataLoader
from torch.nn import CrossEntropyLoss
from torch.optim import SGD
from torch.utils.tensorboard import SummaryWriter

transT = transforms.Compose([transforms.ToTensor()])
train_set = datasets.CIFAR10(root="./dataset", train=True, transform=transT, download=True)
test_set = datasets.CIFAR10(root="./dataset", train=False, transform=transT, download=True)

train_dataloader = DataLoader(dataset=train_set, batch_size=64, shuffle=True)
test_dataloader = DataLoader(dataset=test_set, batch_size=64, shuffle=True)

mm = MyModel()
# define the loss function
loss_func = CrossEntropyLoss()
# define the optimizer
optim = SGD(mm.parameters(), 1e-2)

total_train_step = 0
total_test_step = 0
epoch = 20
# plot each epoch's losses in TensorBoard
writer = SummaryWriter("tb_logs")
for i in range(epoch):
    print("------ Epoch {} training starts ------".format(i + 1))
    # call train() during training so dropout and batchnorm behave correctly
    mm.train()
    for data in train_dataloader:
        imgs, targets = data
        output = mm(imgs)
        loss_val = loss_func(output, targets)

        # let the optimizer tune the model
        # reset the optimizer's gradients to zero
        optim.zero_grad()
        # backpropagate to compute each parameter's gradient
        loss_val.backward()
        # take a gradient-descent step on each parameter
        optim.step()

        total_train_step += 1
        if total_train_step % 100 == 0:
            # loss_val.item() extracts the loss value
            print("Step {}, loss: {}".format(total_train_step, loss_val.item()))
            # show the loss every 100 steps in TensorBoard
            writer.add_scalar("train_loss", loss_val.item(), global_step=total_train_step)

    # evaluate on the test set
    # call eval() during validation so dropout and batchnorm are disabled
    mm.eval()
    total_test_loss = 0
    total_accuracy = 0
    # no gradient computation needed
    with torch.no_grad():
        for data_test in test_dataloader:
            imgs_test, targets_test = data_test
            outputs_test = mm(imgs_test)
            loss_val_test = loss_func(outputs_test, targets_test)

            total_test_loss += loss_val_test.item()
            # count the correct predictions
            accuracy = (outputs_test.argmax(1) == targets_test).sum().item()
            total_accuracy += accuracy

    print("Total test-set loss: {}".format(total_test_loss))
    print("Test-set accuracy: {}".format(total_accuracy / len(test_set)))

    writer.add_scalar("test_loss", total_test_loss, global_step=total_test_step)
    writer.add_scalar("test_accuracy", total_accuracy / len(test_set), global_step=total_test_step)
    total_test_step += 1
    # save the model parameters
    torch.save(mm.state_dict(), f="mymodel_{}.pth".format(i + 1))
    print("Model saved.")

writer.close()

Training on the GPU

Method 1:

Call cuda() in three places and use the returned objects to train on the GPU. The model and loss function do not strictly need to be reassigned, but the data must be.

Call cuda() on the model, the data (inputs and labels), and the loss function.
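A hedged sketch of this first method; the model and data below are stand-ins, and the cuda() calls are guarded so the sketch also runs on CPU-only machines:

```python
import torch
from torch import nn

model = nn.Linear(4, 2)                # stand-in for a real model
loss_func = nn.CrossEntropyLoss()
imgs = torch.randn(8, 4)               # stand-in batch
targets = torch.randint(0, 2, (8,))

if torch.cuda.is_available():
    model = model.cuda()               # move the model to the GPU
    loss_func = loss_func.cuda()       # move the loss function to the GPU
    imgs = imgs.cuda()                 # data must be reassigned
    targets = targets.cuda()

loss = loss_func(model(imgs), targets)
print(loss.item())
```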

Method 2:

# first define a device, which can be cpu or cuda; "cuda:0" runs on the first GPU
device = torch.device("cuda:0")
# then call to(device) on the model, the data (inputs and labels), and the loss function
# move the model to cuda
mm.to(device)
# move the loss function to cuda
loss.to(device)
# move the data to cuda
imgs = imgs.to(device)

Testing the model

import torch
from PIL import Image
from torchvision import transforms
from model import MyModel

image_path = "images/1.jpg"
img_data = Image.open(image_path)
# resize the image to 32×32 and convert it to a tensor
transC = transforms.Compose([transforms.Resize((32, 32)), transforms.ToTensor()])
img_tensor = transC(img_data)
# the model expects 4-D data, so reshape
img_tensor = torch.reshape(img_tensor, (1, 3, 32, 32))

mm = MyModel()
# if the model was trained on the GPU, map_location must map it onto the cpu device
mm.load_state_dict(torch.load("./mymodel_13.pth", map_location=torch.device("cpu")))
# put the module in evaluation mode
mm.eval()
# no gradient computation needed
with torch.no_grad():
    output = mm(img_tensor)

# the final prediction
print(output.argmax(1))