When building a network with PyTorch, if you do not initialize the parameters yourself, the default initialization is used; for details, see "PyTorch default parameter initialization and custom parameter initialization".
PyTorch provides several initialization methods ( https://pytorch.org/docs/stable/nn.init.html ). If you want to use a custom initialization scheme, there are two approaches:
- Define an initialization function inside the model class and call it from the class's __init__() method
- Define an initialization function outside the model class and call it on the model after instantiation
Both approaches are described in detail below.
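As background for both approaches: the functions in torch.nn.init all modify their tensor argument in place (the trailing underscore is PyTorch's in-place naming convention). A minimal sketch of calling a few of them directly on a bare tensor:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

w = torch.empty(3, 3)

# Each nn.init function fills the tensor in place and returns it.
nn.init.xavier_uniform_(w)   # Glorot uniform: U(-a, a), a = gain * sqrt(6 / (fan_in + fan_out))
nn.init.kaiming_normal_(w)   # He initialization, commonly used with ReLU networks
nn.init.constant_(w, 0.5)    # fill every entry with a constant
print(w)                     # w now contains all 0.5
```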
1. Defining the initialization function inside the model class
1.1. To apply different initialization methods to different kinds of modules (e.g., linear layers vs. convolutional layers), iterate over self.modules() and check each module's type with isinstance:
import torch
import torch.nn as nn

torch.manual_seed(1)

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.linear1 = nn.Linear(3, 3)
        self.linear2 = nn.Linear(3, 3)
        self._init_parameters()

    def forward(self, x):
        output = self.linear1(x)
        output = self.linear2(output)
        return output

    def _init_parameters(self):
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight)
                nn.init.constant_(m.bias, 0)

net = MyModel()
for name, p in net.named_parameters():
    print(name)
    print(p)
Output:
linear1.weight
Parameter containing:
tensor([[-0.3204, 0.0479, 0.5961],
[ 0.5435, -0.9776, 0.6199],
[ 0.2794, 0.9486, 0.6601]], requires_grad=True)
linear1.bias
Parameter containing:
tensor([0., 0., 0.], requires_grad=True)
linear2.weight
Parameter containing:
tensor([[-0.9111, -0.9508, -0.4823],
[ 0.8781, -0.1666, 0.4280],
[-0.4647, 0.9812, -0.4231]], requires_grad=True)
linear2.bias
Parameter containing:
tensor([0., 0., 0.], requires_grad=True)
1.2. To apply the same initialization method to every kind of module, iterate over self.parameters() instead:
import torch
import torch.nn as nn

torch.manual_seed(1)

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.linear1 = nn.Linear(3, 3)
        self.linear2 = nn.Linear(3, 3)
        self._init_parameters()

    def forward(self, x):
        output = self.linear1(x)
        output = self.linear2(output)
        return output

    def _init_parameters(self):
        for p in self.parameters():
            if p.dim() > 1:
                # initialize the weight matrices
                nn.init.xavier_uniform_(p)
            else:
                # initialize the bias vectors
                nn.init.constant_(p, 0)

net = MyModel()
for name, p in net.named_parameters():
    print(name)
    print(p)
Output:
linear1.weight
Parameter containing:
tensor([[-0.3204, 0.0479, 0.5961],
[ 0.5435, -0.9776, 0.6199],
[ 0.2794, 0.9486, 0.6601]], requires_grad=True)
linear1.bias
Parameter containing:
tensor([0., 0., 0.], requires_grad=True)
linear2.weight
Parameter containing:
tensor([[-0.9111, -0.9508, -0.4823],
[ 0.8781, -0.1666, 0.4280],
[-0.4647, 0.9812, -0.4231]], requires_grad=True)
linear2.bias
Parameter containing:
tensor([0., 0., 0.], requires_grad=True)
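The Xavier-initialized weights above all fall inside [-1, 1], and that is no coincidence: xavier_uniform_ samples from U(-a, a) with a = gain * sqrt(6 / (fan_in + fan_out)), and for a 3x3 linear layer fan_in = fan_out = 3, so a = 1. A small sketch verifying the bound:

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(1)
linear = nn.Linear(3, 3)
nn.init.xavier_uniform_(linear.weight)

# For a 3x3 weight, fan_in = fan_out = 3, so with the default gain of 1
# the Xavier bound is sqrt(6 / (3 + 3)) = 1.0.
bound = math.sqrt(6.0 / (3 + 3))
assert linear.weight.abs().max().item() <= bound
```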
2. Defining the initialization function outside the model class
2.1. Likewise, to apply different initialization methods to different kinds of modules (e.g., linear layers vs. convolutional layers), iterate over net.modules() and check each module's type with isinstance:
import torch
import torch.nn as nn

torch.manual_seed(1)

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.linear1 = nn.Linear(3, 3)
        self.linear2 = nn.Linear(3, 3)

    def forward(self, x):
        output = self.linear1(x)
        output = self.linear2(output)
        return output

def init_parameters(net):
    for m in net.modules():
        if isinstance(m, nn.Linear):
            nn.init.xavier_uniform_(m.weight)
            nn.init.constant_(m.bias, 0)

net = MyModel()
for name, p in net.named_parameters():
    print(name)
    print(p)

init_parameters(net)
print("\nAfter custom initialization:")
for name, p in net.named_parameters():
    print(name)
    print(p)
Output:
linear1.weight
Parameter containing:
tensor([[ 0.2975, -0.2548, -0.1119],
[ 0.2710, -0.5435, 0.3462],
[-0.1188, 0.2937, 0.0803]], requires_grad=True)
linear1.bias
Parameter containing:
tensor([-0.0707, 0.1601, 0.0285], requires_grad=True)
linear2.weight
Parameter containing:
tensor([[ 0.2109, -0.2250, -0.0421],
[-0.0520, 0.0837, -0.0023],
[ 0.5047, 0.1797, -0.2150]], requires_grad=True)
linear2.bias
Parameter containing:
tensor([-0.3487, -0.0968, -0.2490], requires_grad=True)
After custom initialization:
linear1.weight
Parameter containing:
tensor([[-0.3204, 0.0479, 0.5961],
[ 0.5435, -0.9776, 0.6199],
[ 0.2794, 0.9486, 0.6601]], requires_grad=True)
linear1.bias
Parameter containing:
tensor([0., 0., 0.], requires_grad=True)
linear2.weight
Parameter containing:
tensor([[-0.9111, -0.9508, -0.4823],
[ 0.8781, -0.1666, 0.4280],
[-0.4647, 0.9812, -0.4231]], requires_grad=True)
linear2.bias
Parameter containing:
tensor([0., 0., 0.], requires_grad=True)
2.2. To apply the same initialization method to every kind of module, iterate over net.parameters() instead:
import torch
import torch.nn as nn

torch.manual_seed(1)

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.linear1 = nn.Linear(3, 3)
        self.linear2 = nn.Linear(3, 3)

    def forward(self, x):
        output = self.linear1(x)
        output = self.linear2(output)
        return output

def init_parameters(net):
    for p in net.parameters():
        if p.dim() > 1:
            # initialize the weight matrices
            nn.init.xavier_uniform_(p)
        else:
            # initialize the bias vectors
            nn.init.constant_(p, 0)

net = MyModel()
for name, p in net.named_parameters():
    print(name)
    print(p)

init_parameters(net)
print("\nAfter custom initialization:")
for name, p in net.named_parameters():
    print(name)
    print(p)
Output:
linear1.weight
Parameter containing:
tensor([[ 0.2975, -0.2548, -0.1119],
[ 0.2710, -0.5435, 0.3462],
[-0.1188, 0.2937, 0.0803]], requires_grad=True)
linear1.bias
Parameter containing:
tensor([-0.0707, 0.1601, 0.0285], requires_grad=True)
linear2.weight
Parameter containing:
tensor([[ 0.2109, -0.2250, -0.0421],
[-0.0520, 0.0837, -0.0023],
[ 0.5047, 0.1797, -0.2150]], requires_grad=True)
linear2.bias
Parameter containing:
tensor([-0.3487, -0.0968, -0.2490], requires_grad=True)
After custom initialization:
linear1.weight
Parameter containing:
tensor([[-0.3204,  0.0479,  0.5961],
        [ 0.5435, -0.9776,  0.6199],
        [ 0.2794,  0.9486,  0.6601]], requires_grad=True)
linear1.bias
Parameter containing:
tensor([0., 0., 0.], requires_grad=True)
linear2.weight
Parameter containing:
tensor([[-0.9111, -0.9508, -0.4823],
        [ 0.8781, -0.1666,  0.4280],
        [-0.4647,  0.9812, -0.4231]], requires_grad=True)
linear2.bias
Parameter containing:
tensor([0., 0., 0.], requires_grad=True)
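Besides the two approaches above, nn.Module also provides apply(fn), which recursively calls fn on the module and every one of its submodules; it is a common idiom for initializing a model from outside the class. A minimal sketch, reusing the same model shape as this section:

```python
import torch
import torch.nn as nn

torch.manual_seed(1)

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.linear1 = nn.Linear(3, 3)
        self.linear2 = nn.Linear(3, 3)

    def forward(self, x):
        return self.linear2(self.linear1(x))

def init_weights(m):
    # apply() passes every submodule (and the model itself) to this
    # function, so we filter by module type here.
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.constant_(m.bias, 0)

net = MyModel()
net.apply(init_weights)  # recursively applies init_weights to all submodules
```

Compared with the external init_parameters(net) function above, apply() keeps the type check and the traversal separate: the function only decides what to do with a single module, and the recursion is handled by PyTorch.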