Getting Started with PyTorch: 60-Minute Blitz, Part 3

Reference tutorial: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

Reference translation: https://blog.csdn.net/scutjy2015/article/details/71214928 (a Chinese version I only recently discovered; thanks to the author for sharing)

Training a Classifier

When dealing with image, text, audio, or video data, you can use standard Python packages to load the data into a NumPy array and then convert that array into a torch.*Tensor (a minimal sketch follows the list below):

  • Images: Pillow, OpenCV
  • Audio: scipy and librosa
  • Text: raw Python or Cython-based loading, NLTK and SpaCy
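
A minimal sketch of that load-then-convert path, assuming Pillow is installed and using a hypothetical image file example.jpg:

import numpy as np
import torch
from PIL import Image

img = Image.open("example.jpg")          # hypothetical file; any RGB image works
arr = np.asarray(img, dtype=np.float32)  # H x W x C NumPy array
tensor = torch.from_numpy(arr)           # Tensor sharing memory with the array
print(tensor.shape, tensor.dtype)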

For vision specifically there is the torchvision package, which provides data loaders for common datasets such as ImageNet, CIFAR10, and MNIST, as well as image transforms; the key entry points are torchvision.datasets and torch.utils.data.DataLoader.

The download/installation step also gave me trouble here:

After installing torchvision, importing it failed with an error saying the specified module could not be found. Uninstalling and reinstalling revealed a missing dependency; pip install PyHamcrest fixed it.

A similar report: https://blog.csdn.net/qq_36048987/article/details/89452858

(base) C:\Users\Administrator>python
Python 3.7.0 (default, Jun 28 2018, 08:04:48) [MSC v.1912 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchvision
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Anaconda3\lib\site-packages\torchvision\__init__.py", line 1, in <module>
    from torchvision import models
  File "D:\Anaconda3\lib\site-packages\torchvision\models\__init__.py", line 11, in <module>
    from . import detection
  File "D:\Anaconda3\lib\site-packages\torchvision\models\detection\__init__.py", line 1, in <module>
    from .faster_rcnn import *
  File "D:\Anaconda3\lib\site-packages\torchvision\models\detection\faster_rcnn.py", line 7, in <module>
    from torchvision.ops import misc as misc_nn_ops
  File "D:\Anaconda3\lib\site-packages\torchvision\ops\__init__.py", line 1, in <module>
    from .boxes import nms, box_iou
  File "D:\Anaconda3\lib\site-packages\torchvision\ops\boxes.py", line 2, in <module>
    from torchvision import _C
ImportError: DLL load failed: 找不到指定的模块。
>>> exit()

(base) C:\Users\Administrator>pip uninstall torchvision
Uninstalling torchvision-0.3.0:
  Would remove:
    d:\anaconda3\lib\site-packages\torchvision-0.3.0.dist-info\*
    d:\anaconda3\lib\site-packages\torchvision\*
Proceed (y/n)? y
  Successfully uninstalled torchvision-0.3.0
You are using pip version 10.0.1, however version 19.2.3 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

(base) C:\Users\Administrator>pip install https://download.pytorch.org/whl/cpu/torchvision-0.3.0-cp37-cp37m-win_amd64.whl
Collecting torchvision==0.3.0 from https://download.pytorch.org/whl/cpu/torchvision-0.3.0-cp37-cp37m-win_amd64.whl
  Using cached https://download.pytorch.org/whl/cpu/torchvision-0.3.0-cp37-cp37m-win_amd64.whl
Requirement already satisfied: numpy in d:\anaconda3\lib\site-packages (from torchvision==0.3.0) (1.16.0)
Requirement already satisfied: torch>=1.1.0 in d:\anaconda3\lib\site-packages (from torchvision==0.3.0) (1.1.0)
Requirement already satisfied: pillow>=4.1.1 in d:\anaconda3\lib\site-packages (from torchvision==0.3.0) (5.2.0)
Requirement already satisfied: six in d:\anaconda3\lib\site-packages (from torchvision==0.3.0) (1.11.0)
twisted 18.7.0 requires PyHamcrest>=1.9.0, which is not installed.
Installing collected packages: torchvision
Successfully installed torchvision-0.3.0
You are using pip version 10.0.1, however version 19.2.3 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

(base) C:\Users\Administrator>pip install PyHamcrest
Collecting PyHamcrest
  Downloading https://files.pythonhosted.org/packages/9a/d5/d37fd731b7d0e91afcc84577edeccf4638b4f9b82f5ffe2f8b62e2ddc609/PyHamcrest-1.9.0-py2.py3-none-any.whl (52kB)
    100% |████████████████████████████████| 61kB 257kB/s
Requirement already satisfied: six in d:\anaconda3\lib\site-packages (from PyHamcrest) (1.11.0)
Requirement already satisfied: setuptools in d:\anaconda3\lib\site-packages (from PyHamcrest) (40.2.0)
Installing collected packages: PyHamcrest
Successfully installed PyHamcrest-1.9.0
You are using pip version 10.0.1, however version 19.2.3 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

(base) C:\Users\Administrator>python
Python 3.7.0 (default, Jun 28 2018, 08:04:48) [MSC v.1912 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchvision
>>>

torchvision is shipped separately from PyTorch and is installed with pip install torchvision.

It provides great convenience and avoids writing boilerplate code.

Dataset for this experiment: CIFAR10.

Classes: 'plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'.

The images are 3x32x32, i.e. 3-channel color images of 32x32 pixels.

(Figure: sample CIFAR10 images)

Training an Image Classifier

We will do the following steps in order:

  1. Load and normalize the CIFAR10 training and test datasets using torchvision
  2. Define a convolutional neural network
  3. Define a loss function
  4. Train the network on the training data
  5. Test the network on the test data

1. Load and Normalize CIFAR10

Using torchvision, loading CIFAR10 is very easy.

import torch
import torchvision
import torchvision.transforms as transforms

The output of torchvision datasets are PILImage images in the range [0, 1]. We transform them into Tensors normalized to the range [-1, 1]: with mean 0.5 and std 0.5 per channel, Normalize maps a value x to (x - 0.5) / 0.5.
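
A quick sanity check of that mapping (a standalone sketch, not part of the tutorial code):

import torch
import torchvision.transforms as transforms

norm = transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
x = torch.tensor([0.0, 0.5, 1.0]).view(3, 1, 1)  # one value per channel
print(norm(x).view(-1))                          # tensor([-1.,  0.,  1.])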

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

While this code runs, a download progress bar scrolls past at a furious pace.

100%|███████████████████████▉| 170467328/170498071 [24:03<00:00, 176975.53it/s]
>>> trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
...                                           shuffle=True, num_workers=2)
>>>
>>> testset = torchvision.datasets.CIFAR10(root='./data', train=False,
...                                        download=True, transform=transform)
Files already downloaded and verified
>>> testloader = torch.utils.data.DataLoader(testset, batch_size=4,
...                                          shuffle=False, num_workers=2)
>>>
>>> classes = ('plane', 'car', 'bird', 'cat',  'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
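
With batch_size=4, each batch drawn from trainloader is an image tensor of shape [4, 3, 32, 32] together with a label tensor of shape [4]; a quick check (a sketch, assuming the loaders above have been built):

images, labels = next(iter(trainloader))
print(images.shape)  # torch.Size([4, 3, 32, 32]): 4 images, 3 channels, 32x32 pixels
print(labels.shape)  # torch.Size([4]): one class index per image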

Let's display some of the training images:

import matplotlib.pyplot as plt
import numpy as np

# functions to show an image


def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()


# get some random training images
dataiter = iter(trainloader)
images, labels = dataiter.next()  # on newer PyTorch versions, use next(dataiter) instead

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

2. Define a Convolutional Neural Network

Copy the neural network from the Neural Networks section of the tutorial and modify it to take 3-channel images (instead of the 1-channel images it was defined for there). The sketch below traces where the 16 * 5 * 5 input size of fc1 comes from.
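
A rough shape trace with random data (a sketch, separate from the tutorial code): a 3x32x32 input goes through conv1 (5x5 kernel) to 6x28x28, pooling to 6x14x14, conv2 to 16x10x10, pooling to 16x5x5, i.e. 400 features.

import torch
import torch.nn as nn
import torch.nn.functional as F

pool = nn.MaxPool2d(2, 2)
conv1, conv2 = nn.Conv2d(3, 6, 5), nn.Conv2d(6, 16, 5)
x = torch.randn(1, 3, 32, 32)                # a random "image" batch of size 1
x = pool(F.relu(conv1(x))); print(x.shape)   # torch.Size([1, 6, 14, 14])
x = pool(F.relu(conv2(x))); print(x.shape)   # torch.Size([1, 16, 5, 5]) -> 16*5*5 = 400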

 

import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)  # max pooling over a (2, 2) window (pool is kept as an attribute of self)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()
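
A quick sanity check that the modified network accepts a 3x32x32 input and returns 10 class scores (a sketch meant to be run right after the definitions above):

dummy = torch.randn(1, 3, 32, 32)  # random stand-in for one CIFAR10 image
print(net(dummy).shape)            # torch.Size([1, 10])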

3. Define a Loss Function and Optimizer

Use a classification cross-entropy loss and SGD with momentum.

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
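
As a side note, CrossEntropyLoss takes the raw scores from fc3 (there is no softmax in the network) together with integer class indices, and applies log-softmax plus negative log-likelihood internally; a small check with made-up values:

import torch
import torch.nn as nn
import torch.nn.functional as F

scores = torch.randn(4, 10)            # made-up raw outputs for a batch of 4
targets = torch.tensor([3, 0, 9, 1])   # made-up class indices
ce = nn.CrossEntropyLoss()(scores, targets)
manual = F.nll_loss(F.log_softmax(scores, dim=1), targets)
print(torch.allclose(ce, manual))      # True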

4. Train the Network

Simply loop over the data iterator, feed the inputs to the network, and optimize.

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

Putting it all together (note that num_workers is set to 0 here because of the Windows issue described below):

import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim



class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=0)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=0)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


net = Net()


criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

Output: (the running loss printed every 2000 mini-batches; screenshot not reproduced here)

Problems encountered when running the code:

First, the whole script must be laid out in order: load the dataset, instantiate the network, define the loss function, then run the training loop.

BrokenPipeError: [Errno 32] Broken pipe
This is caused by a multiprocessing problem on Windows, related to the worker processes spawned by the DataLoader class.

Official discussion: https://github.com/pytorch/pytorch/pull/5585

Official fix: https://github.com/pytorch/pytorch/pull/5585/files/1dfcd28a04e5a1db16da7894f2e2305a5bc41935

Reference solution: https://blog.csdn.net/qq_33666011/article/details/81873217

Solution:

    Change the num_workers argument passed when constructing torch.utils.data.DataLoader(). The official API documents this parameter as follows:

  • num_workers (int, optional) – how many subprocesses to use for data loading. 0 
    means that the data will be loaded in the main process. (default: 0)

    This parameter is the number of worker subprocesses used for data loading. As of the referenced post (2018-05-09), if the bug has not been fixed upstream, you can work around it by setting num_workers to 0, so that only the main process loads the data and multiprocessing is avoided on Windows.

That is:

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=0)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=0)
 

Error output (note that the tracebacks from the main process and the spawned worker process are interleaved):


(base) C:\Users\Administrator>python C:\Users\Administrator\Desktop\CNet.py
Files already downloaded and verified
Files already downloaded and verified
Files already downloaded and verified
Files already downloaded and verified
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Anaconda3\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "D:\Anaconda3\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "D:\Anaconda3\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "D:\Anaconda3\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "D:\Anaconda3\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "D:\Anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Administrator\Desktop\CNet.py", line 56, in <module>
    for i, data in enumerate(trainloader, 0):
  File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 193, in __iter__
    return _DataLoaderIter(self)
  File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 469, in __init__
    w.start()
  File "D:\Anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "D:\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
Traceback (most recent call last):
  File "C:\Users\Administrator\Desktop\CNet.py", line 56, in <module>
    for i, data in enumerate(trainloader, 0):
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 193, in __iter__
    return _DataLoaderIter(self)
  File "D:\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
  File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 469, in __init__
    w.start()
    return Popen(process_obj)
  File "D:\Anaconda3\lib\multiprocessing\process.py", line 112, in start
  File "D:\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    self._popen = self._Popen(self)
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "D:\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
  File "D:\Anaconda3\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    return _default_context.get_context().Process._Popen(process_obj)
    _check_not_importing_main()
  File "D:\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
  File "D:\Anaconda3\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    return Popen(process_obj)
    is not going to be frozen to produce an executable.''')
  File "D:\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
    RuntimeError:

        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
  File "D:\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

(base) C:\Users\Administrator>
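
Alternatively, as the error message above itself suggests, you can keep num_workers=2 on Windows by guarding the script's entry point with if __name__ == '__main__':, so that the spawned worker processes can re-import the module safely. A minimal sketch (the hypothetical main() just wraps the loader construction and training loop shown earlier):

import torch
import torchvision
import torchvision.transforms as transforms

def main():
    transform = transforms.Compose(
        [transforms.ToTensor(),
         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
    trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                            download=True, transform=transform)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                              shuffle=True, num_workers=2)
    # ... define the net, loss, and optimizer, then run the training loop here ...

if __name__ == '__main__':
    main()  # only the parent process runs this; workers just import the module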

5. Test the Network on the Test Data

We have now trained the network for 2 passes over the training dataset.

We check the network by taking the class label it predicts and comparing it against the ground truth. If the prediction is correct, we add the sample to the list of correct predictions.

After displaying some ground-truth images, we let the network predict their labels: for each image the network produces scores for the 10 classes, and the class with the highest score is taken as the prediction.

dataiter = iter(testloader)
images, labels = dataiter.next()

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

# predicted images

outputs = net(images)

# the outputs are energies for the 10 classes. 
# Higher the energy for a class, the more the network 
# thinks that the image is of the particular class 
# So, let's get the index of the highest energy

_, predicted = torch.max(outputs, 1)
print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(4)))
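
torch.max(outputs, 1) returns a (values, indices) pair per row, which is why the index tensor is unpacked into predicted; a tiny illustration with made-up scores:

import torch

outputs = torch.tensor([[0.1, 2.3, -0.5],
                        [1.0, 0.0,  4.2]])  # made-up scores for 3 classes
values, predicted = torch.max(outputs, 1)
print(values)     # tensor([2.3000, 4.2000])
print(predicted)  # tensor([1, 2])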

Note that imshow is the helper function we defined earlier; if everything lives in a single .py file, its definition must appear before this point.

Output of the full run:

Prediction results:

Let's look at how the network performs on the whole test set:

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

Accuracy of the network on the 10000 test images: 54 %
 

This is well above 10% (the accuracy of picking one of the ten classes at random), so the network did learn something.

Let's also look at the accuracy for each individual class:

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))

Accuracy of plane : 68 %
Accuracy of   car : 77 %
Accuracy of  bird : 30 %
Accuracy of   cat : 62 %
Accuracy of  deer : 50 %
Accuracy of   dog : 32 %
Accuracy of  frog : 61 %
Accuracy of horse : 54 %
Accuracy of  ship : 64 %
Accuracy of truck : 57 %

Judging from these results, cars are recognized best and birds worst, but every class is above 10%, which again shows the network learned something.
