Downloading various PyTorch versions, and visualizing PyTorch parameters

知识在于分享

Link to this article: https://blog.csdn.net/baidu_40840693/article/details/91491315

https://download.pytorch.org/whl/cu100/torch_stable.html

Via pip

Install the wheel for the version you want with the pip command further down (you can replace 1.0.1 with the version you choose). Set the CUDA environment variables first:

export CUDA_HOME=/usr/local/cuda
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH



https://download.pytorch.org/whl/cpu/torch_stable.html # CPU-only build
https://download.pytorch.org/whl/cu80/torch_stable.html # CUDA 8.0 build
https://download.pytorch.org/whl/cu90/torch_stable.html # CUDA 9.0 build
https://download.pytorch.org/whl/cu92/torch_stable.html # CUDA 9.2 build
https://download.pytorch.org/whl/cu100/torch_stable.html # CUDA 10.0 build



pip install torch==1.0.1 -f https://download.pytorch.org/whl/cu100/torch_stable.html

Note: most pytorch versions are available only for specific CUDA versions. For example pytorch=1.0.1 is not available for CUDA 9.2
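
As a small illustration of how the index pages listed above map to a pip command, here is a minimal sketch. The helper names wheel_index_url and pip_command, and the KNOWN_TAGS table, are made up for this example; only the URLs and the form of the pip command come from the text above. It does not check whether a given torch version actually exists for a given CUDA build.

```python
# Hypothetical helpers: map a build tag to the wheel index page listed above.
BASE = "https://download.pytorch.org/whl/{tag}/torch_stable.html"
KNOWN_TAGS = ("cpu", "cu80", "cu90", "cu92", "cu100")  # tags listed in this post


def wheel_index_url(tag):
    """Return the torch_stable.html index URL for a build tag such as 'cu100'."""
    if tag not in KNOWN_TAGS:
        raise ValueError("unknown build tag: %r" % (tag,))
    return BASE.format(tag=tag)


def pip_command(version, tag):
    """Build the pip command in the same form as the one shown above."""
    return "pip install torch=={} -f {}".format(version, wheel_index_url(tag))


if __name__ == "__main__":
    print(pip_command("1.0.1", "cu100"))
```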

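Type the version you want in the terminal. pip prints the download URL and starts downloading; instead of letting it finish, copy the URL and fetch the file with a download manager such as Xunlei (Thunder).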

py2.7-cuda9.0-torch1.1.0

https://files.pythonhosted.org/packages/0f/ff/92aea60792d3b45c44ded21d6248690f69a6153af9685aad1424507ffe84/torch-1.1.0-cp27-cp27mu-manylinux1_x86_64.whl

py2.7-cuda9.0-torch0.4.1

http://61.155.190.114/torch-0.4.1-cp27-cp27mu-manylinux1_x86_64.whl?fid=bsMLDf3UtbOMoljzVM9sgaTS2blfIPceAAAAAOkilp7Hgo81WbO9p2GvEPNStQEH&mid=666&threshold=150&tid=15051D55DC4A804114D209EA1C4133BC&srcid=119&verno=1

torchvision-0.3.0-cp27

https://files.pythonhosted.org/packages/91/ec/3a5bd85c2655f4285b4ffb600fc05a2f6e8b317bcbda00b45688d790b914/torchvision-0.3.0-cp27-cp27mu-manylinux1_x86_64.whl

py2.7- numpy-1.16.4

https://files.pythonhosted.org/packages/1f/c7/198496417c9c2f6226616cff7dedf2115a4f4d0276613bab842ec8ac1e23/numpy-1.16.4-cp27-cp27mu-manylinux1_x86_64.whl

#Please make sure that
# -   PATH includes /usr/local/cuda-8.0/bin
# -   LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root

export PATH=/usr/local/cuda/bin:$PATH
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
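
To check that the three exports above actually took effect in the current shell, something like the following sketch may help. The function name missing_cuda_entries is hypothetical, and the default /usr/local/cuda location is simply the one used in the exports above; adjust it if your CUDA lives elsewhere.

```python
import os


def missing_cuda_entries(env, cuda_home="/usr/local/cuda"):
    """Return the names of the variables that do not contain the expected entry."""
    problems = []
    path_dirs = env.get("PATH", "").split(":")
    lib_dirs = env.get("LD_LIBRARY_PATH", "").split(":")
    if env.get("CUDA_HOME") != cuda_home:
        problems.append("CUDA_HOME")
    if cuda_home + "/bin" not in path_dirs:
        problems.append("PATH")
    if cuda_home + "/lib64" not in lib_dirs:
        problems.append("LD_LIBRARY_PATH")
    return problems


if __name__ == "__main__":
    # An empty list means all three variables look as expected.
    print(missing_cuda_entries(os.environ) or "CUDA paths look fine")
```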




export PYTHONPATH=/home/boyun/software/caffe/caffe-ssd_/python:$PYTHONPATH
#export PATH="/home/boyun/anaconda3/bin:$PATH"  # commented out by conda initialize

https://github.com/lanpa/tensorboardX

Environment dependencies for the code in this tutorial:

python 3.6+

Pytorch 0.4.0+

tensorboardX: pip install tensorboardX, pip install tensorflow

# demo.py

import torch
import torchvision.utils as vutils
import numpy as np
import torchvision.models as models
from torchvision import datasets
from tensorboardX import SummaryWriter

resnet18 = models.resnet18(False)
writer = SummaryWriter()
sample_rate = 44100
freqs = [262, 294, 330, 349, 392, 440, 440, 440, 440, 440, 440]

for n_iter in range(100):

    dummy_s1 = torch.rand(1)
    dummy_s2 = torch.rand(1)
    # data grouping by `slash`
    writer.add_scalar('data/scalar1', dummy_s1[0], n_iter)
    writer.add_scalar('data/scalar2', dummy_s2[0], n_iter)

    writer.add_scalars('data/scalar_group', {'xsinx': n_iter * np.sin(n_iter),
                                             'xcosx': n_iter * np.cos(n_iter),
                                             'arctanx': np.arctan(n_iter)}, n_iter)

    dummy_img = torch.rand(32, 3, 64, 64)  # output from network
    if n_iter % 10 == 0:
        x = vutils.make_grid(dummy_img, normalize=True, scale_each=True)
        writer.add_image('Image', x, n_iter)

        # build 2 seconds of a cosine tone; amplitude of sound should be in [-1, 1]
        # (the original loop ran over x.size(0), so only the first 3 samples were filled)
        t = torch.arange(sample_rate * 2, dtype=torch.float32)
        dummy_audio = torch.cos(freqs[n_iter // 10] * np.pi * t / float(sample_rate))
        writer.add_audio('myAudio', dummy_audio, n_iter, sample_rate=sample_rate)

        writer.add_text('Text', 'text logged at step:' + str(n_iter), n_iter)

        for name, param in resnet18.named_parameters():
            writer.add_histogram(name, param.clone().cpu().data.numpy(), n_iter)

        # needs tensorboard 0.4RC or later
        writer.add_pr_curve('xoxo', np.random.randint(2, size=100), np.random.rand(100), n_iter)

dataset = datasets.MNIST('mnist', train=False, download=True)
images = dataset.test_data[:100].float()
label = dataset.test_labels[:100]

features = images.view(100, 784)
writer.add_embedding(features, metadata=label, label_img=images.unsqueeze(1))

# export scalar data to JSON for external processing
writer.export_scalars_to_json("./all_scalars.json")
writer.close()

A simpler version also works:

from tensorboardX import SummaryWriter
writer = SummaryWriter('log')

# inside the training loop: log one point every 10 batches for the loss curve
if batch_idx % 10 == 0:
    niter = epoch * len(train_loader) + batch_idx
    writer.add_scalar('Train/Loss', loss.item(), niter)

# after each epoch's evaluation: log the test metric
writer.add_scalar('Test/Accu', test_loss, epoch)

The log folder now contains event files. Run the following on the command line to load the data that was just logged (replace ./log with the full path):

tensorboard --logdir=./log

Then open this address in a browser:

http://0.0.0.0:6006/

You can now see the two plots we logged.
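
The x-axis of the loss curve comes from the step index niter = epoch * len(train_loader) + batch_idx in the simplified snippet. The sketch below just works through that arithmetic without any training; logged_steps is a hypothetical helper, and batches_per_epoch stands in for len(train_loader).

```python
def logged_steps(num_epochs, batches_per_epoch, every=10):
    """Return the global step values at which a loss point would be logged."""
    steps = []
    for epoch in range(num_epochs):
        for batch_idx in range(batches_per_epoch):
            if batch_idx % every == 0:
                # same formula as niter in the training snippet
                steps.append(epoch * batches_per_epoch + batch_idx)
    return steps


if __name__ == "__main__":
    print(logged_steps(num_epochs=2, batches_per_epoch=25))
    # prints [0, 10, 20, 25, 35, 45]
```

Note that the points are evenly spaced across epoch boundaries only when the number of batches per epoch is a multiple of the logging interval.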

Documentation:

Chinese documentation:

https://tensorboard-pytorch.readthedocs.io/en/latest/tutorial_zh.html

https://github.com/lanpa/tensorboardX/blob/master/tensorboardX/writer.py

https://github.com/sksq96/pytorch-summary

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchsummary import summary

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # PyTorch v0.4.0
model = Net().to(device)

summary(model, (1, 28, 28))

>>>>>:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 10, 24, 24]             260
            Conv2d-2             [-1, 20, 8, 8]           5,020
         Dropout2d-3             [-1, 20, 8, 8]               0
            Linear-4                   [-1, 50]          16,050
            Linear-5                   [-1, 10]             510
================================================================
Total params: 21,840
Trainable params: 21,840
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.06
Params size (MB): 0.08
Estimated Total Size (MB): 0.15
----------------------------------------------------------------
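
The Param # column above can be checked by hand. A minimal sketch, assuming the usual counting of weights plus one bias per output channel or feature; the helper names are made up for this example:

```python
def conv2d_params(in_ch, out_ch, kernel):
    """Weights (in_ch * out_ch * k * k) plus one bias per output channel."""
    return in_ch * out_ch * kernel * kernel + out_ch


def linear_params(in_features, out_features):
    """Weights (in * out) plus one bias per output feature."""
    return in_features * out_features + out_features


# Layer sizes taken from the Net class above.
counts = {
    "conv1": conv2d_params(1, 10, 5),
    "conv2": conv2d_params(10, 20, 5),
    "fc1": linear_params(320, 50),
    "fc2": linear_params(50, 10),
}
total = sum(counts.values())

if __name__ == "__main__":
    for name, n in counts.items():
        print(name, n)
    print("total", total)
```

The per-layer values and the total of 21,840 agree with the table; Dropout2d has no parameters.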

import torch
from torchvision import models
from torchsummary import summary

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
vgg = models.vgg16().to(device)

summary(vgg, (3, 224, 224))

>>>>>:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 224, 224]           1,792
              ReLU-2         [-1, 64, 224, 224]               0
            Conv2d-3         [-1, 64, 224, 224]          36,928
              ReLU-4         [-1, 64, 224, 224]               0
         MaxPool2d-5         [-1, 64, 112, 112]               0
            Conv2d-6        [-1, 128, 112, 112]          73,856
              ReLU-7        [-1, 128, 112, 112]               0
            Conv2d-8        [-1, 128, 112, 112]         147,584
              ReLU-9        [-1, 128, 112, 112]               0
        MaxPool2d-10          [-1, 128, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]         295,168
             ReLU-12          [-1, 256, 56, 56]               0
           Conv2d-13          [-1, 256, 56, 56]         590,080
             ReLU-14          [-1, 256, 56, 56]               0
           Conv2d-15          [-1, 256, 56, 56]         590,080
             ReLU-16          [-1, 256, 56, 56]               0
        MaxPool2d-17          [-1, 256, 28, 28]               0
           Conv2d-18          [-1, 512, 28, 28]       1,180,160
             ReLU-19          [-1, 512, 28, 28]               0
           Conv2d-20          [-1, 512, 28, 28]       2,359,808
             ReLU-21          [-1, 512, 28, 28]               0
           Conv2d-22          [-1, 512, 28, 28]       2,359,808
             ReLU-23          [-1, 512, 28, 28]               0
        MaxPool2d-24          [-1, 512, 14, 14]               0
           Conv2d-25          [-1, 512, 14, 14]       2,359,808
             ReLU-26          [-1, 512, 14, 14]               0
           Conv2d-27          [-1, 512, 14, 14]       2,359,808
             ReLU-28          [-1, 512, 14, 14]               0
           Conv2d-29          [-1, 512, 14, 14]       2,359,808
             ReLU-30          [-1, 512, 14, 14]               0
        MaxPool2d-31            [-1, 512, 7, 7]               0
           Linear-32                 [-1, 4096]     102,764,544
             ReLU-33                 [-1, 4096]               0
          Dropout-34                 [-1, 4096]               0
           Linear-35                 [-1, 4096]      16,781,312
             ReLU-36                 [-1, 4096]               0
          Dropout-37                 [-1, 4096]               0
           Linear-38                 [-1, 1000]       4,097,000
================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 218.59
Params size (MB): 527.79
Estimated Total Size (MB): 746.96
----------------------------------------------------------------
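
The total parameter count and the Params size (MB) line can be reproduced from the layer widths alone. This is a sketch under two assumptions that are not stated in the output itself: the standard VGG16 layout (3x3 convolutions, a 512*7*7 input to the classifier) and 4 bytes per parameter (float32). CFG and both function names are hypothetical.

```python
# Layer widths of the VGG16 feature extractor; "M" marks a max-pool layer.
CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
       512, 512, 512, "M", 512, 512, 512, "M"]


def vgg16_param_count(num_classes=1000):
    total, in_ch = 0, 3
    for v in CFG:
        if v == "M":
            continue  # pooling layers have no parameters
        total += in_ch * v * 3 * 3 + v  # 3x3 conv weights + biases
        in_ch = v
    # classifier: 512*7*7 -> 4096 -> 4096 -> num_classes
    for fan_in, fan_out in [(512 * 7 * 7, 4096), (4096, 4096), (4096, num_classes)]:
        total += fan_in * fan_out + fan_out
    return total


def params_size_mb(n_params, bytes_per_param=4):
    """Size in MB, assuming float32 parameters (4 bytes each)."""
    return n_params * bytes_per_param / (1024 ** 2)


if __name__ == "__main__":
    n = vgg16_param_count()
    print(n, round(params_size_mb(n), 2))
```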

https://github.com/Swall0w/torchstat

$ torchstat -f example.py -m Net
[MAdd]: Dropout2d is not supported!
[Flops]: Dropout2d is not supported!
[Memory]: Dropout2d is not supported!
      module name  input shape output shape     params memory(MB)           MAdd         Flops  MemRead(B)  MemWrite(B) duration[%]   MemR+W(B)
0           conv1    3 224 224   10 220 220      760.0       1.85   72,600,000.0  36,784,000.0    605152.0    1936000.0      57.49%   2541152.0
1           conv2   10 110 110   20 106 106     5020.0       0.86  112,360,000.0  56,404,720.0    504080.0     898880.0      26.62%   1402960.0
2      conv2_drop   20 106 106   20 106 106        0.0       0.86            0.0           0.0         0.0          0.0       4.09%         0.0
3             fc1        56180           50  2809050.0       0.00    5,617,950.0   2,809,000.0  11460920.0        200.0      11.58%  11461120.0
4             fc2           50           10      510.0       0.00          990.0         500.0      2240.0         40.0       0.22%      2280.0
total                                        2815340.0       3.56  190,578,940.0  95,998,220.0      2240.0         40.0     100.00%  15407512.0
===============================================================================================================================================
Total params: 2,815,340
-----------------------------------------------------------------------------------------------------------------------------------------------
Total memory: 3.56MB
Total MAdd: 190.58MMAdd
Total Flops: 96.0MFlops
Total MemR+W: 14.69MB
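
The MAdd column in the table is consistent with the following counting rule. This is inferred from the numbers, not taken from the torchstat source, so treat it as an assumption: per output element of a convolution, k*k*in_ch multiplications, one fewer additions to sum them, and one addition for the bias; for the fully connected rows the bias is not counted. The function names are hypothetical.

```python
def conv2d_madd(in_ch, out_ch, kernel, out_h, out_w, bias=True):
    """Multiplications plus additions, summed over every output element."""
    mults = kernel * kernel * in_ch
    adds = mults - 1 + (1 if bias else 0)
    return (mults + adds) * out_ch * out_h * out_w


def linear_madd(in_features, out_features):
    """in multiplications and in - 1 additions per output feature (no bias)."""
    return (in_features + in_features - 1) * out_features


if __name__ == "__main__":
    print(conv2d_madd(3, 10, 5, 220, 220))     # conv1 row
    print(conv2d_madd(10, 20, 5, 106, 106))    # conv2 row
    print(linear_madd(56180, 50))              # fc1 row
    print(linear_madd(50, 10))                 # fc2 row
```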

from torchstat import stat
import torchvision.models as models

model = models.resnet18()
stat(model, (3, 224, 224))

https://github.com/sovrasov/flops-counter.pytorch

Flops counter for convolutional networks in pytorch framework

This script is designed to compute the theoretical amount of multiply-add operations in convolutional neural networks. It also can compute the number of parameters and print per-layer computational cost of a given network.

Supported layers:

Conv1d/2d/3d (including grouping)
ConvTranspose2d (including grouping)
BatchNorm1d/2d/3d
Activations (ReLU, PReLU, ELU, ReLU6, LeakyReLU)
Linear
Upsample
Poolings (AvgPool1d/2d/3d, MaxPool1d/2d/3d and adaptive ones)
Requirements: Pytorch >= 0.4.1, torchvision >= 0.2.1

Thanks to @warmspringwinds for the initial version of script.

Usage tips

This script doesn't take into account torch.nn.functional.* operations. For instance, if a semantic segmentation model uses torch.nn.functional.interpolate to upscale features, those operations won't contribute to the overall FLOPs count. To avoid that, use torch.nn.Upsample instead of torch.nn.functional.interpolate.

ptflops runs a given model on a random tensor and estimates the amount of computation during inference. Complicated models can have several inputs, some of which may be optional. To construct a non-trivial input, use the input_constructor argument of get_model_complexity_info. input_constructor is a function that takes the input spatial resolution as a tuple and returns a dict of named input arguments for the model. This dict is then passed to the model as keyword arguments.

Install the latest version

pip install --upgrade git+https://github.com/sovrasov/flops-counter.pytorch.git

Example
import torchvision.models as models
import torch
from ptflops import get_model_complexity_info

with torch.cuda.device(0):
  net = models.densenet161()
  flops, params = get_model_complexity_info(net, (3, 224, 224), as_strings=True, print_per_layer_stat=True)
  print('Flops:  ' + flops)
  print('Params: ' + params)
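
For the input_constructor argument described in the usage tips, a sketch of what such a function might look like. TwoInputNet and two_input_constructor are hypothetical names invented for this example, and the code only uses torch so that it runs without ptflops installed; it shows the contract (resolution tuple in, dict of keyword arguments out), not ptflops itself.

```python
import torch
import torch.nn as nn


class TwoInputNet(nn.Module):
    """Hypothetical model whose forward() takes two named tensors."""

    def __init__(self):
        super(TwoInputNet, self).__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, image, mask):
        return self.conv(image) * mask


def two_input_constructor(input_res):
    """Take the resolution tuple and return a dict of keyword arguments."""
    channels, height, width = input_res
    return {
        "image": torch.randn(1, channels, height, width),
        "mask": torch.ones(1, 1, height, width),
    }


if __name__ == "__main__":
    net = TwoInputNet()
    inputs = two_input_constructor((3, 32, 32))
    out = net(**inputs)  # the dict is passed as keyword arguments
    print(out.shape)
```

With ptflops installed, the same function would presumably be passed as input_constructor=two_input_constructor in the get_model_complexity_info call, next to the (3, 32, 32) resolution.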
Benchmark
torchvision
Model	Input Resolution	Params(M)	MACs(G)	Top-1 error	Top-5 error
alexnet	224x224	61.1	0.72	43.45	20.91
vgg11	224x224	132.86	7.63	30.98	11.37
vgg13	224x224	133.05	11.34	30.07	10.75
vgg16	224x224	138.36	15.5	28.41	9.62
vgg19	224x224	143.67	19.67	27.62	9.12
vgg11_bn	224x224	132.87	7.64	29.62	10.19
vgg13_bn	224x224	133.05	11.36	28.45	9.63
vgg16_bn	224x224	138.37	15.53	26.63	8.50
vgg19_bn	224x224	143.68	19.7	25.76	8.15
resnet18	224x224	11.69	1.82	30.24	10.92
resnet34	224x224	21.8	3.68	26.70	8.58
resnet50	224x224	25.56	4.12	23.85	7.13
resnet101	224x224	44.55	7.85	22.63	6.44
resnet152	224x224	60.19	11.58	21.69	5.94
squeezenet1_0	224x224	1.25	0.83	41.90	19.58
squeezenet1_1	224x224	1.24	0.36	41.81	19.38
densenet121	224x224	7.98	2.88	25.35	7.83
densenet169	224x224	14.15	3.42	24.00	7.00
densenet201	224x224	20.01	4.37	22.80	6.43
densenet161	224x224	28.68	7.82	22.35	6.20
inception_v3	224x224	27.16	2.85	22.55	6.44
Top-1 error - ImageNet single-crop top-1 error (224x224)
Top-5 error - ImageNet single-crop top-5 error (224x224)

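PyTorch Learning, Lecture 4: Loading Pretrained Models

1. Loading a pretrained model directly
Training sometimes has to be interrupted and resumed later, which simply means loading the parameter weights from a saved model: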

net = SNet()
net.load_state_dict(torch.load("model_1599.pkl"))
This approach applies when the model was saved as parameters only:

torch.save(net.state_dict(), "model/model_1599.pkl")
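The official PyTorch site recommends this way of saving a model, and it is reportedly a little faster than the next one.

The second way to save and load a model: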

net = SNet()
torch.save(net, "model_1599.pkl")
 
snet = torch.load("model_1599.pkl")

This approach saves the entire network, so the file is larger, saving takes longer, and memory use is higher.
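
To make the first (state_dict) approach concrete, here is a minimal round-trip sketch. make_net is a tiny stand-in for SNet so that the example runs on its own, and the temporary file path is arbitrary; neither comes from the original code.

```python
import os
import tempfile

import torch
import torch.nn as nn


def make_net():
    # Tiny stand-in for SNet so the example runs without any downloads.
    return nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))


def roundtrip_state_dict(net, path):
    """Save only the parameters, then load them into a freshly built network."""
    torch.save(net.state_dict(), path)
    restored = make_net()
    restored.load_state_dict(torch.load(path, map_location="cpu"))
    return restored


if __name__ == "__main__":
    net = make_net()
    path = os.path.join(tempfile.mkdtemp(), "model_demo.pkl")
    restored = roundtrip_state_dict(net, path)
    x = torch.randn(5, 4)
    print(torch.equal(net(x), restored(x)))
```

Recent PyTorch releases restrict what torch.load unpickles by default, so loading a whole pickled network may need extra arguments there; check the torch.load documentation for the version you use.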

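2. Loading part of a pretrained model
A model is often a classic network with some parts changed. For example, the feature-extraction network in many algorithms is simply the features part of vgg16, so training can start from parameters already pretrained on ImageNet. This can be done as follows: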

net = SNet()
model_dict = net.state_dict()
 
vgg16 = models.vgg16(pretrained=True)
pretrained_dict = vgg16.state_dict()
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
 
model_dict.update(pretrained_dict)
net.load_state_dict(model_dict)
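In other words, the entries of the network's state_dict that belong to vgg16 are replaced with the parameters from the pretrained vgg16 model (the {k: v for k, v in pretrained_dict.items() if k in model_dict} in the code), and everything else stays unchanged.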
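
The key filter can be tried on two toy networks without downloading any weights. In this sketch Backbone plays the role of the pretrained vgg16 and NewNet the role of SNet; both classes and load_matching are hypothetical. The extra shape check is an addition that is not in the code above: it skips entries whose names match but whose tensor sizes differ, which would otherwise make load_state_dict fail.

```python
import torch
import torch.nn as nn


class Backbone(nn.Module):
    """Stand-in for the pretrained network (plays the role of vgg16)."""

    def __init__(self):
        super(Backbone, self).__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.classifier = nn.Linear(8, 10)


class NewNet(nn.Module):
    """Stand-in for SNet: same features, different head."""

    def __init__(self):
        super(NewNet, self).__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.head = nn.Linear(8, 2)


def load_matching(net, pretrained_dict):
    """Copy entries whose key and shape both match; return the copied keys."""
    model_dict = net.state_dict()
    matched = {k: v for k, v in pretrained_dict.items()
               if k in model_dict and v.shape == model_dict[k].shape}
    model_dict.update(matched)
    net.load_state_dict(model_dict)
    return sorted(matched)


if __name__ == "__main__":
    src, dst = Backbone(), NewNet()
    print(load_matching(dst, src.state_dict()))
```

Printing the returned keys is a cheap way to confirm that something was actually copied; an empty list usually means the parameter names of the two networks do not line up.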
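3. Fine-tuning a classic network
torchvision ships many commonly used classic models together with pretrained weights. Making good use of these trained base networks can speed up your own training considerably.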

First, load vgg16 (with pretrained parameters), for example:

import torchvision.models as models
vgg16 = models.vgg16(pretrained=True)

Say the first layer of the network is Conv2d(3, 64, 3, 1, 1) and you want to change it to Conv2d(4, 64, 3, 1, 1). Assigning the new layer directly is enough:

import torch.nn as nn
vgg16.features[0]=nn.Conv2d(4, 64, 3, 1, 1)
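
Direct assignment gives the new first layer freshly initialised weights, so the pretrained filters of that layer are lost. One possible variation, which is not part of the original post, is to copy the old filters into the first three input channels and fill the extra channel with their mean. widen_first_conv is a hypothetical helper, the mean initialisation is just one assumption among several reasonable choices, and the helper assumes the old layer has a bias.

```python
import torch
import torch.nn as nn


def widen_first_conv(old_conv, new_in_channels=4):
    """Build a conv with more input channels, reusing the old filters."""
    new_conv = nn.Conv2d(new_in_channels, old_conv.out_channels,
                         kernel_size=old_conv.kernel_size,
                         stride=old_conv.stride,
                         padding=old_conv.padding)
    with torch.no_grad():
        old_in = old_conv.in_channels
        new_conv.weight[:, :old_in] = old_conv.weight
        # Assumption: initialise the extra channels with the mean old filter.
        new_conv.weight[:, old_in:] = old_conv.weight.mean(dim=1, keepdim=True)
        new_conv.bias.copy_(old_conv.bias)
    return new_conv


if __name__ == "__main__":
    # Stand-in for vgg16.features so the example needs no download.
    features = nn.Sequential(nn.Conv2d(3, 64, 3, 1, 1), nn.ReLU())
    features[0] = widen_first_conv(features[0])
    print(features(torch.randn(1, 4, 16, 16)).shape)
```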

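4. Modifying a classic network
This changes more than the fine-tuning above, but the approach is worth showing.

First, briefly, what I need to change: starting from the vgg16 base model, add a dropout layer after every convolution, replace the ReLU activations with PReLU, and change the stride of the last two pooling layers to 1. Straight to the code: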

def feature_layer():
    layers = []
    pool1 = ['4', '9', '16']
    pool2 = ['23', '30']
    vgg16 = models.vgg16(pretrained=True).features
    for name, layer in vgg16._modules.items():
        if isinstance(layer, nn.Conv2d):
            layers += [layer, nn.Dropout2d(0.5), nn.PReLU()]
        elif name in pool1:
            layers += [layer]
        elif name == pool2[0]:
            layers += [nn.MaxPool2d(2, 1, 1)]
        elif name == pool2[1]:
            layers += [nn.MaxPool2d(2, 1, 0)]
        else:
            continue
    features = nn.Sequential(*layers)
    #feat3 = features[0:24]
    return features

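The rough idea: create a new network (the layers list) and iterate over every layer of vgg16. When a convolution layer is encountered (if isinstance(layer, nn.Conv2d)), append that Conv2d layer unchanged, followed by a dropout layer and a PReLU layer. When one of the last two pooling layers is encountered, append a new pooling layer with the modified parameters; the other pooling layers are appended as they are. The original ReLU layers fall into the else branch and are skipped. Finally, the layers list is converted into an nn.Sequential and returned as features. Your new network can then load it like this: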

class SNet(nn.Module):
    def __init__(self):
        super(SNet, self).__init__()
        self.features = feature_layer()
    def forward(self, x):
        x = self.features(x)
        return x
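
Changing the last two pooling layers to stride 1 changes the spatial size of the output. The arithmetic can be traced without building the network, using the standard pooling output-size formula (floor mode, dilation 1). The function names are made up for this sketch, and it assumes the 3x3 convolutions with padding 1 leave the size unchanged, as in vgg16.

```python
def pool_out(size, kernel, stride, padding):
    """Output length of a pooling layer (floor mode, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_size(size=224):
    """Trace the spatial size through the five pooling layers of feature_layer()."""
    for _ in range(3):                  # layers '4', '9', '16': MaxPool2d(2, 2)
        size = pool_out(size, 2, 2, 0)
    size = pool_out(size, 2, 1, 1)      # layer '23' replaced by MaxPool2d(2, 1, 1)
    size = pool_out(size, 2, 1, 0)      # layer '30' replaced by MaxPool2d(2, 1, 0)
    return size


if __name__ == "__main__":
    print(feature_map_size(224))
```

For a 224x224 input the sizes go 224, 112, 56, 28, 29, 28, so the modified features end at 28x28 instead of the 7x7 of the unmodified vgg16.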