[Li Mu] Parameter Management in Deep Learning


Last modified: 2022-03-27 20:23:42


Category: Li Mu, "Dive into Deep Learning". Tags: deep learning, artificial intelligence

First published: 2022-03-09 19:47:54

Article link: https://blog.csdn.net/weixin_43476632/article/details/123383240


1. The parameters of a single layer are stored in net[i].state_dict(), where i is the index of that layer in the network.
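As a minimal sketch of what this looks like in practice, the code below builds a small network and inspects the parameters of one layer. The network itself, its layer sizes, and the index 2 are assumptions chosen for illustration; they do not come from the original notes.

```python
import torch
from torch import nn

# Hypothetical network for illustration: a 4 -> 8 -> 1 MLP.
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

# net[2] is the last Linear layer; state_dict() maps parameter names to tensors.
layer_params = net[2].state_dict()
print(layer_params.keys())       # parameter names of this layer
print(net[2].weight.shape)       # torch.Size([1, 8])
print(net[2].bias.data)          # the underlying tensor of the bias

# named_parameters() walks every parameter of the whole network.
param_shapes = {name: tuple(p.shape) for name, p in net.named_parameters()}
print(param_shapes)
```

Index 1 is the ReLU, which has no parameters, so its state_dict() would be empty.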

2. Parameter initialization.


# -*- coding: utf-8 -*-
"""
Created on 2019

@author: fancp
"""

import torch 
import torch.nn as nn

w = torch.empty(3,5)

# 1. Uniform distribution - U(a, b)
#torch.nn.init.uniform_(tensor, a=0.0, b=1.0)
print(nn.init.uniform_(w))
# =============================================================================
# tensor([[0.9160, 0.1832, 0.5278, 0.5480, 0.6754],
#         [0.9509, 0.8325, 0.9149, 0.8192, 0.9950],
#         [0.4847, 0.4148, 0.8161, 0.0948, 0.3787]])
# =============================================================================

# 2. Normal distribution - N(mean, std^2)
#torch.nn.init.normal_(tensor, mean=0.0, std=1.0)
print(nn.init.normal_(w))
# =============================================================================
# tensor([[ 0.4388,  0.3083, -0.6803, -1.1476, -0.6084],
#         [ 0.5148, -0.2876, -1.2222,  0.6990, -0.1595],
#         [-2.0834, -1.6288,  0.5057, -0.5754,  0.3052]])
# =============================================================================

# 3. Constant - fixed value val
#torch.nn.init.constant_(tensor, val)
print(nn.init.constant_(w, 0.3))
# =============================================================================
# tensor([[0.3000, 0.3000, 0.3000, 0.3000, 0.3000],
#         [0.3000, 0.3000, 0.3000, 0.3000, 0.3000],
#         [0.3000, 0.3000, 0.3000, 0.3000, 0.3000]])
# =============================================================================

# 4. All ones
#torch.nn.init.ones_(tensor)
print(nn.init.ones_(w))
# =============================================================================
# tensor([[1., 1., 1., 1., 1.],
#         [1., 1., 1., 1., 1.],
#         [1., 1., 1., 1., 1.]])
# =============================================================================

# 5. All zeros
#torch.nn.init.zeros_(tensor)
print(nn.init.zeros_(w))
# =============================================================================
# tensor([[0., 0., 0., 0., 0.],
#         [0., 0., 0., 0., 0.],
#         [0., 0., 0., 0., 0.]])
# =============================================================================

# 6. Ones on the diagonal, zeros elsewhere
#torch.nn.init.eye_(tensor)
print(nn.init.eye_(w))
# =============================================================================
# tensor([[1., 0., 0., 0., 0.],
#         [0., 1., 0., 0., 0.],
#         [0., 0., 1., 0., 0.]])
# =============================================================================

# 7. xavier_uniform initialization
#torch.nn.init.xavier_uniform_(tensor, gain=1.0)
#From - Understanding the difficulty of training deep feedforward neural networks - Glorot & Bengio 2010
print(nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu')))
# =============================================================================
# tensor([[-0.1270,  0.3963,  0.9531, -0.2949,  0.8294],
#         [-0.9759, -0.6335,  0.9299, -1.0988, -0.1496],
#         [-0.7224,  0.2181, -1.1219,  0.8629, -0.8825]])
# =============================================================================

# 8. xavier_normal initialization
#torch.nn.init.xavier_normal_(tensor, gain=1.0)
print(nn.init.xavier_normal_(w))
# =============================================================================
# tensor([[ 1.0463,  0.1275, -0.3752,  0.1858,  1.1008],
#         [-0.5560,  0.2837,  0.1000, -0.5835,  0.7886],
#         [-0.2417,  0.1763, -0.7495,  0.4677, -0.1185]])
# =============================================================================

# 9. kaiming_uniform initialization
#torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
#From - Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He et al. 2015
print(nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu'))
# =============================================================================
# tensor([[-0.7712,  0.9344,  0.8304,  0.2367,  0.0478],
#         [-0.6139, -0.3916, -0.0835,  0.5975,  0.1717],
#         [ 0.3197, -0.9825, -0.5380, -1.0033, -0.3701]])
# =============================================================================

# 10. kaiming_normal initialization
#torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
print(nn.init.kaiming_normal_(w, mode='fan_out', nonlinearity='relu'))
# =============================================================================
# tensor([[-0.0210,  0.5532, -0.8647,  0.9813,  0.0466],
#         [ 0.7713, -1.0418,  0.7264,  0.5547,  0.7403],
#         [-0.8471, -1.7371,  1.3333,  0.0395,  1.0787]])
# =============================================================================

# 11. Orthogonal matrix - (semi)orthogonal matrix
#torch.nn.init.orthogonal_(tensor, gain=1)
#From - Exact solutions to the nonlinear dynamics of learning in deep linear neural networks - Saxe 2013
print(nn.init.orthogonal_(w))
# =============================================================================
# tensor([[-0.0346, -0.7607, -0.0428,  0.4771,  0.4366],
#         [-0.0412, -0.0836,  0.9847,  0.0703, -0.1293],
#         [-0.6639,  0.4551,  0.0731,  0.1674,  0.5646]])
# =============================================================================

# 12. Sparse matrix
#torch.nn.init.sparse_(tensor, sparsity, std=0.01)
#From - Deep learning via Hessian-free optimization - Martens 2010
print(nn.init.sparse_(w, sparsity=0.1))
# =============================================================================
# tensor([[ 0.0000,  0.0000, -0.0077,  0.0000, -0.0046],
#         [ 0.0152,  0.0030,  0.0000, -0.0029,  0.0005],
#         [ 0.0199,  0.0132, -0.0088,  0.0060,  0.0000]])
# =============================================================================
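The calls above initialize a bare tensor. To initialize the layers of a network, one common pattern is to wrap an initializer in a function and pass it to `apply`. The following is only a sketch: the network and the helper names `init_xavier` and `init_const` are hypothetical and not part of the original notes.

```python
import torch
from torch import nn

# Hypothetical network for illustration.
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

def init_xavier(m):
    # Only touch Linear layers; modules such as ReLU have no weights.
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)

def init_const(m):
    # Fill the weights with a fixed value, as in item 3 above.
    if isinstance(m, nn.Linear):
        nn.init.constant_(m.weight, 0.3)
        nn.init.zeros_(m.bias)

net.apply(init_xavier)     # applied recursively to every submodule
net[2].apply(init_const)   # override the initialization of one layer only

print(net[0].weight.data[0])
print(net[2].weight.data)
```

Calling `apply` on a single layer makes it possible to mix initialization schemes within one network.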

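3. Parameter sharing: to share parameters, simply insert the same instantiated layer object at several positions in the network.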

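# Create the layer once and reuse the same instance, so both positions share its parameters.
shared = nn.Linear(8, 8)
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(),
                    shared, nn.ReLU(),
                    shared, nn.ReLU(),
                    nn.Linear(8, 1))
X = torch.rand(2, 4)  # example input (assumed): a batch of 2 samples with 4 features
net(X)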
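To check that the two positions really hold one set of parameters rather than two equal copies, a sketch along the following lines can help. The input `X` and the value 100.0 are assumptions for illustration.

```python
import torch
from torch import nn

shared = nn.Linear(8, 8)
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(),
                    shared, nn.ReLU(),
                    shared, nn.ReLU(),
                    nn.Linear(8, 1))

X = torch.rand(2, 4)   # assumed input: batch of 2, 4 features
out = net(X)

# Both positions hold the very same module object.
same_object = net[2] is net[4]

# Changing the weights through one position is visible through the other.
same_before = torch.equal(net[2].weight.data, net[4].weight.data)
net[2].weight.data[0, 0] = 100.0
same_after = torch.equal(net[2].weight.data, net[4].weight.data)

# parameters() counts a shared parameter only once.
n_params = sum(p.numel() for p in net.parameters())
print(same_object, same_before, same_after, n_params)
```

Because the layer is used twice in the forward pass, its gradient accumulates the contributions from both positions during backpropagation.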

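4. Saving and loading parameters

torch.save writes tensors to a file and torch.load reads them back.

torch.save(params, filename): params can be a single tensor, a list of tensors, or a dictionary of tensors.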

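x, y = torch.arange(4), torch.zeros(4)  # example tensors (assumed)
torch.save([x, y], 'x-files')
mydict = {'x': x, 'y': y}
torch.save(mydict, 'mydict')  # file name chosen for illustration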
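A round trip makes the behavior concrete. The sketch below saves a list and a dictionary of tensors and reads them back with torch.load. The tensor values, the 'mydict' file name, and the use of a temporary directory are assumptions made for illustration.

```python
import os
import tempfile
import torch

x = torch.arange(4)
y = torch.zeros(4)

# Write into a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    list_path = os.path.join(tmp, 'x-files')
    dict_path = os.path.join(tmp, 'mydict')

    # A list of tensors comes back as a list.
    torch.save([x, y], list_path)
    x2, y2 = torch.load(list_path)

    # A dictionary of tensors comes back as a dictionary.
    torch.save({'x': x, 'y': y}, dict_path)
    mydict2 = torch.load(dict_path)

print(x2, y2)
print(mydict2)
```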

torch.save(net.state_dict(), 'mlp.params'): saves only the model's parameters, not the entire model.
clone = MLP()
clone.load_state_dict(torch.load('mlp.params'))
The saved parameters can then be loaded into a newly instantiated model with the same architecture.
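The notes do not show how MLP is defined. As a self-contained sketch, the code below assumes a simple two-layer MLP (the layer sizes 20, 256, and 10 are hypothetical), saves its parameters, loads them into a fresh instance, and checks that both models produce the same output.

```python
import os
import tempfile
import torch
from torch import nn
from torch.nn import functional as F

class MLP(nn.Module):
    # Assumed architecture for illustration; not given in the original notes.
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(20, 256)
        self.output = nn.Linear(256, 10)

    def forward(self, x):
        return self.output(F.relu(self.hidden(x)))

net = MLP()
X = torch.randn(2, 20)
Y = net(X)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'mlp.params')
    # Save only the parameters (the state dict), not the model object.
    torch.save(net.state_dict(), path)

    # The architecture has to be rebuilt in code before loading.
    clone = MLP()
    clone.load_state_dict(torch.load(path))

clone.eval()
Y_clone = clone(X)
print(torch.allclose(Y, Y_clone))
```

Since only the parameters are stored, the class definition must be available wherever the file is loaded.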
