Getting Started with Deep Learning in PyTorch: Initializing the Parameters of a Neural Network Model

PingBryant

Category: ML_DL_CV. Tags: deep learning, pytorch

Article link: https://blog.csdn.net/pingbryant/article/details/121635711

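Key points

When we build a model from PyTorch's built-in layers, we do not need to initialize its parameters ourselves, because PyTorch does this for us. If we build the model ourselves (without PyTorch's built-in layers), or if we are not satisfied with the default initialization of the built-in layers, we have to initialize the parameters manually.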
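As a quick check that built-in layers arrive already initialized, the sketch below inspects a freshly created nn.Linear. It relies on the default documented for nn.Linear (values drawn uniformly from ±1/sqrt(in_features)); the variable names are illustrative and do not come from the article.

```python
import math
import torch
from torch import nn

layer = nn.Linear(4, 3)  # built-in layer: parameters are created and initialized here

# Documented default for nn.Linear: values drawn uniformly from
# [-1/sqrt(in_features), 1/sqrt(in_features)].
bound = 1 / math.sqrt(layer.in_features)

# List every parameter the layer already owns.
for name, param in layer.named_parameters():
    print(name, tuple(param.shape), param.dtype)

# Both checks should print True if the default scheme is in effect.
print(layer.weight.abs().max().item() <= bound)
print(layer.bias.abs().max().item() <= bound)
```

No initialization code was written here, yet weight and bias already hold small random float32 values.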

Building a network from PyTorch's built-in layers

import torch
import numpy as np
from torch import nn

class Simple_net(nn.Module):
	def __init__(self):
		super(Simple_net,self).__init__()
		self.layer1=nn.Linear(4,3)
		
	def forward(self,x):
		return self.layer1(x)
		
net=Simple_net()
print(net)
print(net.layer1.weight.shape)    # torch.Size([3, 4])
data=torch.randn(5,4)
out=net(data)
print(out.shape)    # torch.Size([5, 3])

Output:

Simple_net(
  (layer1): Linear(in_features=4, out_features=3, bias=True)
)
torch.Size([3, 4])
torch.Size([5, 3])

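Here we use PyTorch's built-in nn.Linear layer. One point deserves attention: the fully connected layer has input dimension 4 and output dimension 3, yet its weight has shape [3, 4]. If we define the fully connected layer ourselves instead of using nn.Linear and want to follow the same convention, the weight matrix for an input dimension of 4 and an output dimension of 3 should therefore have shape [3, 4], not [4, 3]. Below we reproduce the model above without nn.Linear.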

import torch
import numpy as np
from torch import nn

class Simple_net(nn.Module):
	def __init__(self,in_features,out_features):
		super(Simple_net,self).__init__()
		self.w=nn.Parameter(torch.randn(out_features,in_features))
		self.b=nn.Parameter(torch.randn(out_features))
		
	def forward(self,x):
		x=x@self.w.t()+self.b
		return x
		
net=Simple_net(4,3)
print(net.w.shape)   # torch.Size([3, 4])
data=torch.randn(5,4)
out=net(data)
print(out.shape)     # torch.Size([5, 3])

Output:

torch.Size([3, 4])
torch.Size([5, 3])

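Note that self.w is built with torch.randn(out_features,in_features), not torch.randn(in_features,out_features). The layer then has input dimension in_features and output dimension out_features.

The other point to note is how forward computes the output.

x has shape [5,4] and w has shape [3,4], so x must be multiplied by the transpose of w (x@self.w.t()), which gives an output of shape [5,3]. This is exactly what forward does.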
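The shape reasoning above can be checked against PyTorch's own linear operation. The following is a minimal sketch, assuming torch.nn.functional.linear as the reference; the names manual and builtin are illustrative only.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
in_features, out_features = 4, 3
w = torch.randn(out_features, in_features)  # same [out, in] layout as nn.Linear
b = torch.randn(out_features)
x = torch.randn(5, in_features)

manual = x @ w.t() + b       # [5, 4] @ [4, 3] + [3] -> [5, 3]
builtin = F.linear(x, w, b)  # the functional form used by nn.Linear

print(manual.shape)
print(torch.allclose(manual, builtin, atol=1e-6))
```

If the weight were stored as [4, 3] instead, the transpose in forward would have to be dropped; the [out, in] layout is simply the convention nn.Linear follows.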

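What this shows

What does the code above show? The main point is that a fully connected layer built with nn.Linear() already contains the parameters w and b, and they are already initialized. If we are not satisfied with that initialization, we can assign our own values to the parameters.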

import torch
from torch import nn
import numpy as np

net = nn.Sequential( 
		nn.Linear(50,100),
		nn.ReLU(True), 
		nn.Linear(100,200),
		nn.ReLU(True), 
		nn.Linear(200,100) 
		)
		
print(net)
print(net[0].weight.shape)
net[0].weight.data=torch.from_numpy(np.random.uniform(3,5,size=(100,50))).float()   # NumPy yields float64; cast to float32 to match the layer
print(net[0].weight.shape)

Output:

Sequential(
  (0): Linear(in_features=50, out_features=100, bias=True)
  (1): ReLU(inplace=True)
  (2): Linear(in_features=100, out_features=200, bias=True)
  (3): ReLU(inplace=True)
  (4): Linear(in_features=200, out_features=100, bias=True)
)
torch.Size([100, 50])
torch.Size([100, 50])

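np.random.uniform(3,5,size=(100,50)) produces a matrix with 100 rows and 50 columns of random numbers between 3 and 5. This shape matches net[0].weight exactly, so we can assign the random matrix to net[0].weight. Note that NumPy produces float64 values by default, whereas the layer's parameters are float32, so the tensor needs a cast to float32 (for example with .float()) before the layer can process float32 inputs.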
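An alternative that avoids the NumPy round trip is the torch.nn.init module. This is not the approach used in the article; it is a sketch of the same uniform initialization done in place, which keeps the weight's shape and float32 dtype.

```python
import torch
from torch import nn

net = nn.Sequential(
    nn.Linear(50, 100),
    nn.ReLU(True),
    nn.Linear(100, 200),
    nn.ReLU(True),
    nn.Linear(200, 100),
)

# Fill the first layer's weight in place with values drawn from U(3, 5).
# The tensor keeps its shape, its float32 dtype and its Parameter status.
nn.init.uniform_(net[0].weight, a=3.0, b=5.0)

print(net[0].weight.shape, net[0].weight.dtype)
out = net(torch.randn(8, 50))  # forward pass works with float32 input
print(out.shape)
```

Because the values are written into the existing tensor, no dtype mismatch can appear at forward time.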

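Here we used net[0] to get the first linear layer of net; net[2] would give the second one, and so on. Initializing each layer by hand like this does not scale, so we can loop over the layers instead.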

Initializing all linear layers in a loop (Sequential)

import torch
from torch import nn
import numpy as np

net = nn.Sequential(
		nn.Linear(50,100),
		nn.ReLU(True), 
		nn.Linear(100,200),
		nn.ReLU(True), 
		nn.Linear(200,100) 
		)
		
for layer in net: 
	if isinstance(layer,nn.Linear):
		layer.weight.data=torch.from_numpy(np.random.normal(0,0.5,size=layer.weight.shape)).float()   # cast float64 -> float32

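Here we assign through .data rather than through detach(): detach() returns a separate tensor object, and rebinding that result does not touch the layer, whereas assigning to layer.weight.data changes layer.weight itself, which is our goal.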
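The difference can be seen on a single layer. The sketch below is illustrative (the variable names are not from the article) and also shows an in-place copy under torch.no_grad(), a commonly used alternative to .data that the article itself does not use.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(4, 3)
before = layer.weight.detach().clone()

# Rebinding the name returned by detach() only changes a local variable.
detached = layer.weight.detach()
detached = torch.zeros(3, 4)
unchanged = torch.equal(layer.weight.detach(), before)

# Assigning through .data replaces the values held by the parameter itself.
layer.weight.data = torch.zeros(3, 4)
changed = torch.equal(layer.weight.detach(), torch.zeros(3, 4))

# Alternative: copy new values in place while autograd tracking is disabled.
with torch.no_grad():
    layer.weight.copy_(torch.ones(3, 4))

print(unchanged, changed, layer.weight.requires_grad)
```

In all three cases layer.weight remains an nn.Parameter that still requires gradients, so training proceeds as usual afterwards.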

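Looping over layers (class)

A model defined as a class does not support indexing, whereas nn.Sequential does. The layers of a class-based model have to be reached by traversal (or by attribute name), while the contents of an nn.Sequential can be reached either by iterating over it or directly by index.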
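A short sketch of this difference follows. The Net class here is a hypothetical model written only for this illustration.

```python
import torch
from torch import nn

seq = nn.Sequential(nn.Linear(30, 40), nn.ReLU(), nn.Linear(40, 10))

class Net(nn.Module):  # hypothetical class-based model for illustration
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(30, 40)
        self.layer2 = nn.Linear(40, 10)

    def forward(self, x):
        return self.layer2(torch.relu(self.layer1(x)))

model = Net()

first = seq[0]                 # indexing works on nn.Sequential
seq_layers = [m for m in seq]  # so does direct iteration

try:
    model[0]                   # a plain nn.Module subclass defines no __getitem__
    indexable = True
except TypeError:
    indexable = False

named = model.layer1           # attribute access works instead
print(type(first).__name__, len(seq_layers), indexable, type(named).__name__)
```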

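So far the model was built with nn.Sequential, which is why we could iterate over it to find the linear layers. A model built as a class cannot be iterated over directly; in that case we can use modules() to do the job.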

import torch
from torch import nn
import numpy as np

class net(nn.Module):
	def __init__(self):
		super(net,self).__init__()
		self.layer1=nn.Sequential(
			nn.Linear(30,40), 
			nn.ReLU() 
		)
		self.layer2=nn.Linear(40,10)
		
	def forward(self,x):
		x=self.layer1(x)
		x=self.layer2(x)
		return x
		
n=net()
print(n)
for layer in n.modules():
	if isinstance(layer,nn.Linear):
		param_size=layer.weight.shape
		layer.weight.data=torch.from_numpy(np.random.normal(0,0.5,size=param_size)).float()   # cast float64 -> float32
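The same traversal can also be written with nn.Module.apply, which calls a function on every submodule. This is a sketch of an alternative rather than the article's method: Net and init_linear are hypothetical names, and zeroing the bias is an assumption added here (the article leaves the bias at its default).

```python
import torch
from torch import nn

class Net(nn.Module):  # hypothetical model mirroring the structure above
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(nn.Linear(30, 40), nn.ReLU())
        self.layer2 = nn.Linear(40, 10)

    def forward(self, x):
        return self.layer2(self.layer1(x))

def init_linear(module):
    # Called once for every submodule; only Linear layers are touched.
    if isinstance(module, nn.Linear):
        nn.init.normal_(module.weight, mean=0.0, std=0.5)
        nn.init.zeros_(module.bias)  # assumption of this sketch, not in the article

model = Net()
model.apply(init_linear)

# children() yields only direct submodules; modules() also descends into layer1.
n_children = len(list(model.children()))
n_modules = len(list(model.modules()))
print(n_children, n_modules)

out = model(torch.randn(5, 30))
print(out.shape)
```

modules() includes the model itself and the nested nn.Sequential container, which is why the isinstance check is needed to pick out the linear layers.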
