PyTorch Official Tutorial Study Notes (5)

Latest recommended article published on 2022-11-28 22:19:29

ECODER-MXQ

Data Loading and Processing

Author: Sasank Chilamkurthy (https://chsasank.github.io)

A lot of effort in solving any machine learning problem goes into
preparing the data. PyTorch provides many tools to make data loading
easy and, hopefully, to make your code more readable. In this tutorial,
we will see how to load and preprocess/augment data from a non-trivial
dataset.

To run this tutorial, please make sure the following packages are
installed:

scikit-image: For image io and transforms
pandas: For easier csv parsing

from __future__ import print_function, division
import os
import torch
import pandas as pd
from skimage import io, transform
import numpy as np
import matplotlib.pyplot as plt
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, utils

# Ignore warnings
import warnings
warnings.filterwarnings("ignore")

plt.ion()   # interactive mode

The dataset we are going to deal with is that of facial pose.
This means that a face is annotated like this:

[Figure: landmarked_face2.png, a face with its landmark points marked]

Overall, 68 different landmark points are annotated for each face.

Note

Download the dataset from `here

The dataset comes with a CSV file of annotations which looks like this:

image_name,part_0_x,part_0_y,part_1_x,part_1_y,part_2_x, ... ,part_67_x,part_67_y
0805personali01.jpg,27,83,27,98, ... ,84,134
1084239450_e76e00b7e7.jpg,70,236,71,257, ... ,128,312

Read the annotations from the CSV file and store them in an array of shape (N, 2), where N is the number of landmarks.

landmarks_frame = pd.read_csv('F:/工作学习/编程与操作系统/Pytorch/datasets/faces/faces/face_landmarks.csv')

n = 65
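img_name = landmarks_frame.iloc[n, 0]  # column 0 holds the image name
# the remaining columns hold the landmark coordinates
# (as_matrix() was removed in pandas 1.0; to_numpy() replaces it)
landmarks = landmarks_frame.iloc[n, 1:].to_numpy()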
landmarks = landmarks.astype('float').reshape(-1, 2)

print('Image name: {}'.format(img_name))
print('Landmarks shape: {}'.format(landmarks.shape))
print('First 4 Landmarks: {}'.format(landmarks[:4]))

Image name: person-7.jpg
Landmarks shape: (68, 2)
First 4 Landmarks: [[32. 65.]
 [33. 76.]
 [34. 86.]
 [34. 97.]]
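The reshape step above can be tried on its own. The following is a minimal sketch, assuming a made-up annotation row with only three landmarks (the values are illustrative, not taken from the real CSV file):

```python
import numpy as np

# One made-up annotation row laid out as x0, y0, x1, y1, x2, y2.
flat_row = np.array([32, 65, 33, 76, 34, 86])

# reshape(-1, 2) pairs consecutive values into (x, y) rows;
# the -1 lets NumPy infer the number of rows from the data.
points = flat_row.astype('float').reshape(-1, 2)

print(points.shape)   # number of landmarks, 2
print(points[:, 0])   # all x coordinates
print(points[:, 1])   # all y coordinates
```

Column 0 then holds every x coordinate and column 1 every y coordinate, which is the layout that show_landmarks relies on.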

Let’s write a simple helper function to show an image and its landmarks
and use it to show a sample.

def show_landmarks(image, landmarks):
    """Show image with landmarks"""
    plt.imshow(image)
    plt.scatter(landmarks[:, 0], landmarks[:, 1], s=10, marker='o', c='g')  # draw the landmark points
    plt.pause(0.001)  # pause a bit so that plots are updated


plt.figure()
image = io.imread(os.path.join('F:/工作学习/编程与操作系统/Pytorch/datasets/faces/faces/', img_name))

show_landmarks(image, landmarks)


Dataset class

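torch.utils.data.Dataset is an abstract class representing a dataset. A custom
dataset should inherit from Dataset and override the following two methods:

__len__: so that len(dataset) returns the size of the dataset.
__getitem__: to support indexing, so that dataset[i] can be used to get the i-th sample.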
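To see the two methods in isolation, here is a minimal sketch that uses a plain Python class instead of inheriting from torch.utils.data.Dataset. The class SquaresDataset is hypothetical and exists only for illustration:

```python
class SquaresDataset:
    """Hypothetical toy dataset: sample i is the pair (i, i * i)."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        # Called by len(dataset).
        return self.size

    def __getitem__(self, idx):
        # Called by dataset[idx]; IndexError marks the end of the data.
        if not 0 <= idx < self.size:
            raise IndexError(idx)
        return idx, idx * idx


dataset = SquaresDataset(5)
n_samples = len(dataset)
third = dataset[3]
print(n_samples, third)
```

Any object that provides these two methods can be indexed and measured the same way, which is the interface the rest of this note builds on.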

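Next we create a dataset class for the face landmarks dataset. In this class, __init__ reads
the CSV file, while __getitem__ reads the images. This is memory efficient because the images
are not all loaded into memory at once; each one is read only when it is needed.

A sample of the dataset is a dict: {'image': image, 'landmarks': landmarks}. The dataset takes
an optional argument transform, which specifies the processing to apply to each sample. The use
of transform is covered in the next section.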
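The lazy-loading idea can be sketched without any image files. In the example below, LazyDataset and fake_load are hypothetical names, and fake_load merely stands in for an expensive call such as io.imread:

```python
class LazyDataset:
    """Hypothetical dataset that loads an image only when it is indexed."""

    def __init__(self, rows, loader, transform=None):
        self.rows = rows            # cheap metadata: (name, landmarks) pairs
        self.loader = loader        # expensive function, called on demand
        self.transform = transform  # optional callable applied to each sample

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        name, landmarks = self.rows[idx]
        sample = {'image': self.loader(name), 'landmarks': landmarks}
        if self.transform:
            sample = self.transform(sample)
        return sample


loaded = []  # records which "files" were actually read


def fake_load(name):
    loaded.append(name)
    return 'pixels of ' + name


rows = [('a.jpg', [(1.0, 2.0)]), ('b.jpg', [(3.0, 4.0)]), ('c.jpg', [(5.0, 6.0)])]
dataset = LazyDataset(rows, fake_load)
sample = dataset[1]
print(loaded)
```

Only the indexed item appears in loaded, while the metadata for all three rows is available from the start.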

class FaceLandmarksDataset(Dataset):
    """Face Landmarks dataset."""

    def __init__(self, csv_file, root_dir, transform=None):
        """
        Args:
            csv_file (string): Path to the csv file with annotations.
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.landmarks_frame = pd.read_csv(csv_file)  # read the csv file
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.landmarks_frame)

    def __getitem__(self, idx):
        img_name = os.path.join(self.root_dir,
                                self.landmarks_frame.iloc[idx, 0])
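        image = io.imread(img_name)  # read the image
        # read the landmarks that belong to this image
        # (as_matrix() was removed in pandas 1.0; to_numpy() replaces it)
        landmarks = self.landmarks_frame.iloc[idx, 1:].to_numpy()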
        landmarks = landmarks.astype('float').reshape(-1, 2)
        sample = {'image': image, 'landmarks': landmarks}

        # apply the transform only if one was given
        if self.transform:
            sample = self.transform(sample)

        return sample

Next we instantiate this class and iterate over the data samples, printing the sizes of the first four samples and showing their landmarks.

face_dataset = FaceLandmarksDataset(csv_file='F:/工作学习/编程与操作系统/Pytorch/datasets/faces/faces/face_landmarks.csv',
                                    root_dir='F:/工作学习/编程与操作系统/Pytorch/datasets/faces/faces/')

fig = plt.figure()

for i in range(len(face_dataset)):
    sample = face_dataset[i]  # indexing calls __getitem__

    print(i, sample['image'].shape, sample['landmarks'].shape)

    ax = plt.subplot(1, 4, i + 1)
    plt.tight_layout()
    ax.set_title('Sample #{}'.format(i))
    ax.axis('off')
    show_landmarks(**sample)

    if i == 3:
        plt.show()
        break

0 (324, 215, 3) (68, 2)
1 (500, 333, 3) (68, 2)
2 (250, 258, 3) (68, 2)
3 (434, 290, 3) (68, 2)


Transforms

One issue we can see from the above is that the samples are not of the
same size. Most neural networks expect images of a fixed size.
Therefore, we will need to write some preprocessing code.
Let’s create three transforms:

Rescale: to resize the image.
RandomCrop: to crop the image randomly (a form of data augmentation).
ToTensor: to convert numpy images to torch images (the axes need to be swapped).

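We will write them as callable classes instead of simple functions, so that the parameters of a
transform do not need to be passed every time it is called. For this, we only need to implement
the __call__ method and, if required, the __init__ method.
We can then use a transform like this: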


    tsfm = Transform(params)
    transformed_sample = tsfm(sample)
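As a concrete instance of this calling pattern, the sketch below defines a hypothetical transform, ShiftLandmarks, that is not part of the tutorial. Its parameters are fixed once in __init__ and reused on every call:

```python
class ShiftLandmarks:
    """Hypothetical transform: shift every landmark by a fixed (dx, dy)."""

    def __init__(self, dx, dy):
        # Parameters are stored once, when the transform is created.
        self.dx = dx
        self.dy = dy

    def __call__(self, sample):
        # Invoked by tsfm(sample); returns a new sample, leaving the input intact.
        shifted = [(x + self.dx, y + self.dy) for x, y in sample['landmarks']]
        return {'image': sample['image'], 'landmarks': shifted}


tsfm = ShiftLandmarks(dx=-10, dy=5)
sample = {'image': 'placeholder image', 'landmarks': [(32, 65), (33, 76)]}
transformed_sample = tsfm(sample)
print(transformed_sample['landmarks'])
```

Because the object is callable, it can be passed anywhere a function taking a sample is expected.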

Note that each transform has to be applied to both the image and the landmarks.

class Rescale(object):
    """Rescale the image in a sample to a given size.

    Args:
        output_size (tuple or int): Desired output size. If tuple, output is
            matched to output_size. If int, the smaller of the image edges is
            matched to output_size, keeping the aspect ratio the same.
    """

    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        self.output_size = output_size

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']

        h, w = image.shape[:2]
        # int: match the shorter side to output_size, keeping the aspect ratio
        if isinstance(self.output_size, int):
            if h > w:
                new_h, new_w = self.output_size * h / w, self.output_size
            else:
                new_h, new_w = self.output_size, self.output_size * w / h
        # tuple: use output_size directly as (height, width)
        else:
            new_h, new_w = self.output_size

        new_h, new_w = int(new_h), int(new_w)

        img = transform.resize(image, (new_h, new_w))

        # h and w are swapped for landmarks because for images,
        # x and y axes are axis 1 and 0 respectively
        landmarks = landmarks * [new_w / w, new_h / h]

        return {'image': img, 'landmarks': landmarks}


class RandomCrop(object):
    """Crop randomly the image in a sample.

    Args:
        output_size (tuple or int): Desired output size. If int, square crop
            is made.
    """

    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        if isinstance(output_size, int):
            self.output_size = (output_size, output_size)
        else:
            assert len(output_size) == 2
            self.output_size = output_size

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']

        h, w = image.shape[:2]
        new_h, new_w = self.output_size
        
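        # pick the top-left corner of the random crop; the + 1 keeps the
        # call valid when the image is exactly as large as the crop
        top = np.random.randint(0, h - new_h + 1)
        left = np.random.randint(0, w - new_w + 1)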

        image = image[top: top + new_h,
                      left: left + new_w]

        landmarks = landmarks - [left, top]

        return {'image': image, 'landmarks': landmarks}


class ToTensor(object):
    """Convert ndarrays in sample to Tensors."""

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']

        # swap color axis because
        # numpy image: H x W x C
        # torch image: C X H X W
        image = image.transpose((2, 0, 1))  # a transpose is all that is needed
        return {'image': torch.from_numpy(image),
                'landmarks': torch.from_numpy(landmarks)}
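The coordinate bookkeeping in RandomCrop and ToTensor can be checked on a tiny array. This is a minimal sketch with a fixed crop offset instead of a random one; the array sizes and landmark values are made up for illustration:

```python
import numpy as np

h, w = 6, 8
image = np.arange(h * w * 3).reshape(h, w, 3)    # numpy image: H x W x C
landmarks = np.array([[5.0, 2.0], [3.0, 4.0]])   # each row is (x, y)

# A fixed crop window in place of the random one.
top, left, new_h, new_w = 1, 2, 4, 5
cropped = image[top: top + new_h, left: left + new_w]
shifted = landmarks - [left, top]                # x shifts by left, y by top

# The first landmark should still point at the same pixel after the crop.
x, y = landmarks[0].astype(int)
cx, cy = shifted[0].astype(int)
same_pixel = np.array_equal(image[y, x], cropped[cy, cx])

# ToTensor moves the channel axis to the front: H x W x C becomes C x H x W.
chw = cropped.transpose((2, 0, 1))
print(same_pixel, chw.shape)
```

Note that images are indexed as [row, column], i.e. [y, x], while landmarks are stored as (x, y). This is why the subtraction uses [left, top] rather than [top, left].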

Composing transforms

Now, we apply the transforms to a sample.

Let’s say we want to rescale the shorter side of the image to 256 and
then randomly crop a square of size 224 from it, i.e., we want to compose
the Rescale and RandomCrop transforms.
torchvision.transforms.Compose is a simple callable class which allows us
to do this.
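To illustrate what composing means, the sketch below chains two size-only functions with a hypothetical SimpleCompose class. It is not torchvision's implementation, and the helper functions only track (height, width) so that the arithmetic is easy to follow:

```python
class SimpleCompose:
    """Hypothetical stand-in for Compose: apply callables one after another."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, sample):
        for transform in self.transforms:
            sample = transform(sample)
        return sample


def rescale_shorter_side(size, target=256):
    # Same arithmetic as Rescale with an int output_size.
    h, w = size
    if h > w:
        return int(target * h / w), target
    return target, int(target * w / h)


def square_crop(size, target=224):
    # Size after cropping a target x target square.
    h, w = size
    assert h >= target and w >= target, "crop larger than image"
    return target, target


pipeline = SimpleCompose([rescale_shorter_side, square_crop])
after_rescale = rescale_shorter_side((324, 215))
final_size = pipeline((324, 215))
print(after_rescale, final_size)
```

For an input of height 324 and width 215, the shorter side (the width) becomes 256 and the height scales to 385, after which the crop yields a 224 by 224 result.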

scale = Rescale(256)
crop = RandomCrop(128)
composed = transforms.Compose([Rescale(256),
                               RandomCrop(224)])

# Apply each of the above transforms on sample.
fig = plt.figure()
sample = face_dataset[65]
for i, tsfrm in enumerate([scale, crop, composed]):
    transformed_sample = tsfrm(sample)

    ax = plt.subplot(1, 3, i + 1)
    plt.tight_layout()
    ax.set_title(type(tsfrm).__name__)
    show_landmarks(**transformed_sample)

plt.show()


Iterating through the dataset

Let’s put this all together to create a dataset with composed
transforms.
To summarize, every time this dataset is sampled:

An image is read from the file on the fly
Transforms are applied on the read image
Since one of the transforms is random, data is augmented on
sampling

We can iterate over the created dataset with a for i in range
loop as before.

transformed_dataset = FaceLandmarksDataset(csv_file='F:/工作学习/编程与操作系统/Pytorch/datasets/faces/faces/face_landmarks.csv', 
                                           root_dir='F:/工作学习/编程与操作系统/Pytorch/datasets/faces/faces/',
                                           transform=transforms.Compose([
                                               Rescale(256),
                                               RandomCrop(224),
                                               ToTensor()
                                           ]))

for i in range(len(transformed_dataset)):
    sample = transformed_dataset[i]

    print(i, sample['image'].size(), sample['landmarks'].size())

    if i == 3:
        break

0 torch.Size([3, 224, 224]) torch.Size([68, 2])
1 torch.Size([3, 224, 224]) torch.Size([68, 2])
2 torch.Size([3, 224, 224]) torch.Size([68, 2])
3 torch.Size([3, 224, 224]) torch.Size([68, 2])

However, we are losing a lot of features by using a simple for loop to
iterate over the data. In particular, we are missing out on:

Batching the data
Shuffling the data
Loading the data in parallel using multiprocessing workers.
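The first two items in this list can be sketched in plain Python. The helper iter_batches below is hypothetical and only shows what batching and shuffling of sample indices mean; it is not how DataLoader is implemented:

```python
import random


def iter_batches(n_samples, batch_size, shuffle=False, seed=None):
    """Yield lists of sample indices, batch_size at a time."""
    indices = list(range(n_samples))
    if shuffle:
        # Shuffle the visiting order once per pass over the data.
        random.Random(seed).shuffle(indices)
    for start in range(0, n_samples, batch_size):
        # The last batch may be smaller than batch_size.
        yield indices[start:start + batch_size]


ordered = list(iter_batches(10, batch_size=4))
shuffled = list(iter_batches(10, batch_size=4, shuffle=True, seed=0))
print(ordered)
print([len(batch) for batch in shuffled])
```

Each index still appears exactly once per pass when shuffling is on; only the order in which samples are grouped changes.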

torch.utils.data.DataLoader is an iterator which provides all these
features. Parameters used below should be clear. One parameter of
interest is collate_fn. You can specify how exactly the samples need
to be batched using collate_fn. However, default collate should work
fine for most use cases.
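To make the role of collate_fn more concrete, here is a minimal sketch on a toy dataset. ToyLandmarksDataset and collate_images_only are hypothetical names, the tensors are filled with placeholder values, and num_workers is left at 0 so that no worker processes are started:

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyLandmarksDataset(Dataset):
    """Hypothetical dataset returning dict samples with fixed-size tensors."""

    def __init__(self, n_samples=10):
        self.n_samples = n_samples

    def __len__(self):
        return self.n_samples

    def __getitem__(self, idx):
        return {'image': torch.zeros(3, 8, 8),
                'landmarks': torch.full((68, 2), float(idx))}


def collate_images_only(samples):
    # Custom batching rule: keep only the images and stack them.
    return torch.stack([s['image'] for s in samples])


# The default collate stacks each dict entry along a new batch dimension.
loader = DataLoader(ToyLandmarksDataset(), batch_size=4, shuffle=False,
                    num_workers=0)
first_batch = next(iter(loader))

custom_loader = DataLoader(ToyLandmarksDataset(), batch_size=4,
                           collate_fn=collate_images_only, num_workers=0)
images_only = next(iter(custom_loader))

print(first_batch['image'].size(), first_batch['landmarks'].size())
print(images_only.size())
```

With the default collate the batch is again a dict with the same keys, which is why the loop further down can index sample_batched['image'] and sample_batched['landmarks'].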

dataloader = DataLoader(transformed_dataset, batch_size=4,
                        shuffle=True, num_workers=4)


# Helper function to show a batch
def show_landmarks_batch(sample_batched):
    """Show image with landmarks for a batch of samples."""
    images_batch, landmarks_batch = \
            sample_batched['image'], sample_batched['landmarks']
    batch_size = len(images_batch)
    im_size = images_batch.size(2)

    grid = utils.make_grid(images_batch)
    plt.imshow(grid.numpy().transpose((1, 2, 0)))

    for i in range(batch_size):
        plt.scatter(landmarks_batch[i, :, 0].numpy() + i * im_size,
                    landmarks_batch[i, :, 1].numpy(),
                    s=10, marker='.', c='r')

        plt.title('Batch from dataloader')

for i_batch, sample_batched in enumerate(dataloader):
    print(i_batch, sample_batched['image'].size(),
          sample_batched['landmarks'].size())

    # observe 4th batch and stop.
    if i_batch == 3:
        plt.figure()
        show_landmarks_batch(sample_batched)
        plt.axis('off')
        plt.ioff()
        plt.show()
        break

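I have not got this part working yet; it fails with the error below. The traceback passes
through popen_spawn_win32, which suggests that the failure happens while Windows spawns the
DataLoader worker processes. Commonly suggested workarounds are to set num_workers=0, or to
run the loop from a script under an if __name__ == '__main__': guard.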
---------------------------------------------------------------------------

BrokenPipeError                           Traceback (most recent call last)

<ipython-input-16-7777157b058e> in <module>()
     20         plt.title('Batch from dataloader')
     21 
---> 22 for i_batch, sample_batched in enumerate(dataloader):
     23     print(i_batch, sample_batched['image'].size(),
     24           sample_batched['landmarks'].size())


E:\Anaconda\envs\python35\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
    499 
    500     def __iter__(self):
--> 501         return _DataLoaderIter(self)
    502 
    503     def __len__(self):


E:\Anaconda\envs\python35\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
    287             for w in self.workers:
    288                 w.daemon = True  # ensure that the worker exits on process exit
--> 289                 w.start()
    290 
    291             _update_worker_pids(id(self), tuple(w.pid for w in self.workers))


E:\Anaconda\envs\python35\lib\multiprocessing\process.py in start(self)
    103                'daemonic processes are not allowed to have children'
    104         _cleanup()
--> 105         self._popen = self._Popen(self)
    106         self._sentinel = self._popen.sentinel
    107         _children.add(self)


E:\Anaconda\envs\python35\lib\multiprocessing\context.py in _Popen(process_obj)
    210     @staticmethod
    211     def _Popen(process_obj):
--> 212         return _default_context.get_context().Process._Popen(process_obj)
    213 
    214 class DefaultContext(BaseContext):


E:\Anaconda\envs\python35\lib\multiprocessing\context.py in _Popen(process_obj)
    311         def _Popen(process_obj):
    312             from .popen_spawn_win32 import Popen
--> 313             return Popen(process_obj)
    314 
    315     class SpawnContext(BaseContext):


E:\Anaconda\envs\python35\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     64             try:
     65                 reduction.dump(prep_data, to_child)
---> 66                 reduction.dump(process_obj, to_child)
     67             finally:
     68                 context.set_spawning_popen(None)


E:\Anaconda\envs\python35\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     57 def dump(obj, file, protocol=None):
     58     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 59     ForkingPickler(file, protocol).dump(obj)
     60 
     61 #


BrokenPipeError: [Errno 32] Broken pipe

Afterword: torchvision

In this tutorial, we have seen how to write and use datasets, transforms
and dataloaders. The torchvision package provides some common datasets and
transforms. You might not even have to write custom classes. One of the
more generic datasets available in torchvision is ImageFolder.
It assumes that images are organized in the following way:

root/ants/xxx.png
root/ants/xxy.jpeg
root/ants/xxz.png
.
.
.
root/bees/123.jpg
root/bees/nsdf3.png
root/bees/asd932_.png

where ‘ants’, ‘bees’ etc. are class labels. Similarly, generic transforms
which operate on PIL.Image, like RandomHorizontalFlip and Resize (named
Scale in older torchvision releases), are also available. You can use
these to write a dataloader like this:

import torch
from torchvision import transforms, datasets
data_transform = transforms.Compose([
          transforms.RandomResizedCrop(224),  # called RandomSizedCrop in older torchvision releases
          transforms.RandomHorizontalFlip(),
          transforms.ToTensor(),
          transforms.Normalize(mean=[0.485, 0.456, 0.406],
                               std=[0.229, 0.224, 0.225])
      ])
hymenoptera_dataset = datasets.ImageFolder(root='hymenoptera_data/train',
                                             transform=data_transform)
dataset_loader = torch.utils.data.DataLoader(hymenoptera_dataset,
                                               batch_size=4, shuffle=True,
                                               num_workers=4)
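The directory convention that ImageFolder assumes can be illustrated without torchvision. The helper find_class_indices below is hypothetical; it mimics the idea of turning sorted sub-directory names into integer labels, a mapping that ImageFolder exposes as class_to_idx:

```python
import os
import tempfile


def find_class_indices(root):
    """Map each sub-directory name of root to an integer label, in sorted order."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: index for index, name in enumerate(classes)}


with tempfile.TemporaryDirectory() as root:
    # Build a tiny root/<class>/<file> tree with empty placeholder files.
    for relative in ['bees/123.jpg', 'ants/xxx.png', 'ants/xxy.jpeg']:
        path = os.path.join(root, *relative.split('/'))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, 'wb').close()
    class_to_idx = find_class_indices(root)

print(class_to_idx)
```

The label of an image therefore comes only from the name of the folder it sits in, not from the file name.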