1 Course plan
1.1 Part 1: PyTorch deep learning fundamentals
- Introduction to PyTorch and installation
- PyTorch basics
- Main PyTorch modules
- Hands-on basics: Fashion-MNIST clothing classification
1.2 Part 2: Advanced PyTorch operations
- Defining your own PyTorch model
- Saving and loading PyTorch models
- Using PyTorch models flexibly
- PyTorch visualization
- The PyTorch ecosystem
1.3 Part 3: PyTorch case studies
- CV: a quick implementation of semantic segmentation
- NLP: sentiment analysis
- Graph neural networks
- Medical imaging (competition)
2 PyTorch deep learning basics
2.1 Tensors
Definition: the tensor is the basic unit of computation in PyTorch.
In [3]:
import torch
?torch.tensor
Docstring:
tensor(data, *, dtype=None, device=None, requires_grad=False, pin_memory=False) -> Tensor

Constructs a tensor with :attr:`data`.

.. warning::
    :func:`torch.tensor` always copies :attr:`data`. If you have a Tensor
    ``data`` and want to avoid a copy, use :func:`torch.Tensor.requires_grad_`
    or :func:`torch.Tensor.detach`. If you have a NumPy ``ndarray`` and want to
    avoid a copy, use :func:`torch.as_tensor`.

.. warning::
    When data is a tensor `x`, :func:`torch.tensor` reads out 'the data' from
    whatever it is passed, and constructs a leaf variable. Therefore
    ``torch.tensor(x)`` is equivalent to ``x.clone().detach()`` and
    ``torch.tensor(x, requires_grad=True)`` is equivalent to
    ``x.clone().detach().requires_grad_(True)``. The equivalents using
    ``clone()`` and ``detach()`` are recommended.

Args:
    data (array_like): Initial data for the tensor. Can be a list, tuple,
        NumPy ``ndarray``, scalar, and other types.

Keyword args:
    dtype (:class:`torch.dtype`, optional): the desired data type of returned tensor.
        Default: if ``None``, infers data type from :attr:`data`.
    device (:class:`torch.device`, optional): the desired device of returned tensor.
        Default: if ``None``, uses the current device for the default tensor type
        (see :func:`torch.set_default_tensor_type`). :attr:`device` will be the CPU
        for CPU tensor types and the current CUDA device for CUDA tensor types.
    requires_grad (bool, optional): If autograd should record operations on the
        returned tensor. Default: ``False``.
    pin_memory (bool, optional): If set, returned tensor would be allocated in
        the pinned memory. Works only for CPU tensors. Default: ``False``.

Example::

    >>> torch.tensor([[0.1, 1.2], [2.2, 3.1], [4.9, 5.2]])
    tensor([[ 0.1000,  1.2000],
            [ 2.2000,  3.1000],
            [ 4.9000,  5.2000]])

    >>> torch.tensor([0, 1])  # Type inference on data
    tensor([ 0,  1])

    >>> torch.tensor([[0.11111, 0.222222, 0.3333333]],
    ...              dtype=torch.float64,
    ...              device=torch.device('cuda:0'))  # creates a torch.cuda.DoubleTensor
    tensor([[ 0.1111,  0.2222,  0.3333]], dtype=torch.float64, device='cuda:0')

    >>> torch.tensor(3.14159)  # Create a scalar (zero-dimensional tensor)
    tensor(3.1416)

    >>> torch.tensor([])  # Create an empty tensor (of size (0,))
    tensor([])
Type: builtin_function_or_method
tensor(data, *, dtype=None, device=None, requires_grad=False, pin_memory=False)
- data: the input data (list, tuple, NumPy ndarray, scalar, ...)
- *: the remaining arguments are keyword-only
- dtype: data type
- device: GPU or CPU
- requires_grad: whether autograd should record operations on the tensor
- pin_memory: whether to allocate the tensor in pinned (page-locked) memory; works only for CPU tensors
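A small sketch exercising these keyword arguments; the values here are placeholders, and the CUDA device is only used if one is actually available:

# Exercising torch.tensor's keyword arguments; values are placeholders.
t = torch.tensor([1.0, 2.0],
                 dtype=torch.float64,  # explicit data type
                 device="cuda:0" if torch.cuda.is_available() else "cpu",
                 requires_grad=True)   # record operations for autograd
print(t.dtype, t.device, t.requires_grad)

cpu_t = torch.tensor([1, 2], pin_memory=True)  # pinned memory; CPU tensors only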
In [4]:
a = torch.tensor(1.0, dtype=torch.float)
b = torch.tensor(1, dtype=torch.long)
c = torch.tensor(1.0, dtype=torch.int8)  # forces a cast to int8
print(a, b, c)
tensor(1.) tensor(1) tensor(1, dtype=torch.int8)
In [5]:
# tensors of a given size (contents uninitialized) or built from data
d = torch.FloatTensor(2, 3)        # uninitialized 2x3 float tensor
e = torch.IntTensor(2)             # uninitialized int tensor of size 2
f = torch.IntTensor([1, 2, 3, 4])  # int tensor built from a list
In [7]:
import numpy as np
g = np.array([[1, 2, 3], [4, 5, 6]])
h = torch.tensor(g)      # copies the data
print(h)
i = torch.from_numpy(g)  # shares memory with g
print(i)
j = h.numpy()            # back to a NumPy array
print(j)
tensor([[1, 2, 3],
        [4, 5, 6]], dtype=torch.int32)
tensor([[1, 2, 3],
        [4, 5, 6]], dtype=torch.int32)
[[1 2 3]
 [4 5 6]]
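The difference matters: torch.from_numpy shares memory with the source array, while torch.tensor copies it. A quick check, continuing from the cell above:

# from_numpy shares memory: mutating g is visible through i but not h
g[0, 0] = 100
print(i[0, 0])  # tensor(100, dtype=torch.int32) -- shares g's buffer
print(h[0, 0])  # tensor(1, dtype=torch.int32)   -- torch.tensor made a copy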
In [14]:
# common tensor constructors
k = torch.rand(2, 3)        # uniform random in [0, 1)
l = torch.ones(2, 3)
m = torch.zeros(2, 3)
n = torch.arange(0, 10, 2)  # 0, 2, 4, 6, 8
print(k)
tensor([[0.2229, 0.6146, 0.3432],
        [0.3783, 0.5949, 0.4072]])
In [12]:
print(k.shape)
print(k.size())
torch.Size([2, 3])
torch.Size([2, 3])
In [15]:
# arithmetic operations
o = torch.add(k, 1)  # equivalent to k + 1
print(o)
tensor([[1.2229, 1.6146, 1.3432],
        [1.3783, 1.5949, 1.4072]])
In [16]:
# indexing
print(o[:, 1])  # second column
print(o[0, :])  # first row
tensor([1.6146, 1.5949])
tensor([1.2229, 1.6146, 1.3432])
In [17]:
# reshape; passing -1 lets PyTorch infer that dimension
print(o.view(3, 2))
print(o.view(-1, 2))
tensor([[1.2229, 1.6146],
        [1.3432, 1.3783],
        [1.5949, 1.4072]])
tensor([[1.2229, 1.6146],
        [1.3432, 1.3783],
        [1.5949, 1.4072]])
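Note that view does not copy data: the returned tensor shares storage with its source. A quick check (on a clone, so the values of o used below stay intact):

# view shares memory with its source: a write through the view is visible
o2 = o.clone()
v = o2.view(6)
v[0] = 99.0
print(o2[0, 0])  # tensor(99.) -- the change shows through o2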
In [18]:
# broadcasting
p = torch.arange(1, 3).view(1, 2)
print(p)
q = torch.arange(1, 4).view(3, 1)
print(q)
print(p + q)  # shapes (1,2) and (3,1) broadcast to (3,2)
tensor([[1, 2]])
tensor([[1],
        [2],
        [3]])
tensor([[2, 3],
        [3, 4],
        [4, 5]])
In [19]:
# insert a new dimension of size 1 at index 1
r = o.unsqueeze(1)
print(r)
tensor([[[1.2229, 1.6146, 1.3432]],

        [[1.3783, 1.5949, 1.4072]]])
In [20]:
# squeezing dimension 0 has no effect: its size is 2, not 1
s = r.squeeze(0)
print(s)
tensor([[[1.2229, 1.6146, 1.3432]],

        [[1.3783, 1.5949, 1.4072]]])
In [21]:
# squeezing dimension 1 succeeds: its size is 1
t = r.squeeze(1)
print(t)
tensor([[1.2229, 1.6146, 1.3432],
        [1.3783, 1.5949, 1.4072]])
2.2 Automatic differentiation
In [23]:
import torch
x1 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(2.0, requires_grad=True)
y = x1 + 2*x2
print(y)
tensor(5., grad_fn=<AddBackward0>)
In [24]:
# check whether each variable requires gradients
print(x1.requires_grad)
print(x2.requires_grad)
print(y.requires_grad)
True
True
True
In [25]:
# inspect each variable's gradient
print(x1.grad.data)
print(x2.grad.data)
print(y.grad.data)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_17348\4001318552.py in <module>
      1 # inspect each variable's gradient
----> 2 print(x1.grad.data)
      3 print(x2.grad.data)
      4 print(y.grad.data)

AttributeError: 'NoneType' object has no attribute 'data'

The error occurs because .grad is still None: gradients only exist after backward() has been called.
In [26]:
x1
Out[26]:
tensor(1., requires_grad=True)
In [27]:
y = x1 + 2*x2
y.backward()  # populates x1.grad and x2.grad
print(x1.grad.data)
print(x2.grad.data)
tensor(1.)
tensor(2.)
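One caveat, shown with a short sketch continuing from the cell above: backward() accumulates into .grad rather than overwriting it, so gradients should be zeroed between steps.

# backward() accumulates gradients; zero them before the next pass
y = x1 + 2*x2
y.backward()
print(x1.grad)   # tensor(2.) -- accumulated over two backward passes
x1.grad.zero_()
x2.grad.zero_()
y = x1 + 2*x2
y.backward()
print(x1.grad)   # tensor(1.) -- fresh gradient after zeroing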
In [28]:
x1 = torch.tensor(1.0, requires_grad=False)
x2 = torch.tensor(2.0, requires_grad=False)
y = x1 + 2*x2
y.backward()  # error: nothing in the graph requires grad
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_17348\1754898500.py in <module>
      2 x2 = torch.tensor(2.0, requires_grad=False)
      3 y = x1 + 2*x2
----> 4 y.backward()  # error: nothing in the graph requires grad

d:\Anaconda3\envs\my_env\lib\site-packages\torch\_tensor.py in backward(self, gradient, retain_graph, create_graph, inputs)
    305                 create_graph=create_graph,
    306                 inputs=inputs)
--> 307         torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
    308 
    309     def register_hook(self, hook):

d:\Anaconda3\envs\my_env\lib\site-packages\torch\autograd\__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    154     Variable._execution_engine.run_backward(
    155         tensors, grad_tensors_, retain_graph, create_graph, inputs,
--> 156         allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
    157 
    158 

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
2.3 Parallel computing
Why: GPUs provide faster computation and support larger batches.
How: CUDA, on one or more GPUs.
Strategies: network partitioning, layer-wise partitioning, and data parallelism (the most common). A minimal sketch of data parallelism follows.
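A minimal sketch using torch.nn.DataParallel; the Linear model is a placeholder, and the wrapper only helps when more than one GPU is visible:

# Minimal data-parallelism sketch; the model here is a placeholder.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)            # any nn.Module works here
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # replicate across GPUs, split each batch
model = model.to("cuda" if torch.cuda.is_available() else "cpu")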
3 Main modules and practice
3.1 Workflow
Data preprocessing -> model design -> loss function and optimizer design -> forward pass -> backward pass -> parameter update. A minimal end-to-end sketch follows.
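The whole flow in code, as a minimal sketch; the model, data, and hyperparameters here are placeholders rather than the settings used later in this section:

# Workflow sketch with a placeholder model and random data.
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(28*28, 10)                         # model design
criterion = nn.CrossEntropyLoss()                    # loss function
optimizer = optim.Adam(model.parameters(), lr=1e-4)  # optimization scheme

x = torch.rand(32, 28*28)                            # a "preprocessed" batch
target = torch.randint(0, 10, (32,))

out = model(x)                                       # forward pass
loss = criterion(out, target)
optimizer.zero_grad()
loss.backward()                                      # backward pass
optimizer.step()                                     # parameter update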
3.2 Particularities of deep learning
- Large datasets must be loaded in batches
- Models are built layer by layer, module by module
- Diverse designs for loss functions and optimizers
- GPU acceleration
3.3 Practice
In [29]:
import os
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
torch.cuda.is_available()  # check for a GPU; False means everything below runs on CPU
Out[29]:
False
In [31]:
# use a GPU if available, otherwise fall back to CPU
device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu")
# move tensors and models with variable.to(device)
device
Out[31]:
device(type='cpu')
In [32]:
# hyperparameters
batch_size = 256
num_workers = 4
lr = 1e-4
epochs = 20
In [37]:
from torchvision import transforms

image_size = 28
data_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize(image_size),
    transforms.ToTensor()
])
In [38]:
# way 1: use the built-in torchvision dataset
from torchvision import datasets
train_data = datasets.FashionMNIST(root='./', train=True, download=True, transform=data_transform)
test_data = datasets.FashionMNIST(root='./', train=False, download=True, transform=data_transform)
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ./FashionMNIST\raw\train-images-idx3-ubyte.gz
Extracting ./FashionMNIST\raw\train-images-idx3-ubyte.gz to ./FashionMNIST\raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ./FashionMNIST\raw\train-labels-idx1-ubyte.gz
Extracting ./FashionMNIST\raw\train-labels-idx1-ubyte.gz to ./FashionMNIST\raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ./FashionMNIST\raw\t10k-images-idx3-ubyte.gz
Extracting ./FashionMNIST\raw\t10k-images-idx3-ubyte.gz to ./FashionMNIST\raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ./FashionMNIST\raw\t10k-labels-idx1-ubyte.gz
Extracting ./FashionMNIST\raw\t10k-labels-idx1-ubyte.gz to ./FashionMNIST\raw
In [ ]:
# way 2: read the data from csv files
class FMDataset(Dataset):
    def __init__(self, df, transform=None):
        self.df = df
        self.transform = transform
        self.images = df.iloc[:, 1:].values.astype(np.uint8)  # pixel columns
        self.labels = df.iloc[:, 0].values                    # first column is the label

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx].reshape(28, 28, 1)
        label = int(self.labels[idx])
        if self.transform is not None:
            image = self.transform(image)
        else:
            image = torch.tensor(image / 255., dtype=torch.float)
        label = torch.tensor(label, dtype=torch.long)
        return image, label
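A hedged usage sketch: wrap the csv-backed dataset in DataLoader with the hyperparameters defined above. The csv filenames are assumptions; point them at wherever the Fashion-MNIST csv files live locally.

# Hypothetical csv paths; replace with your local files.
train_df = pd.read_csv("fashion-mnist_train.csv")
test_df = pd.read_csv("fashion-mnist_test.csv")
train_data_csv = FMDataset(train_df, data_transform)
test_data_csv = FMDataset(test_df, data_transform)

train_loader = DataLoader(train_data_csv, batch_size=batch_size, shuffle=True, num_workers=num_workers)
test_loader = DataLoader(test_data_csv, batch_size=batch_size, shuffle=False, num_workers=num_workers)

On Windows notebooks, num_workers greater than 0 can cause the DataLoader to hang; setting it to 0 is a safe fallback.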