Background:
I want to inspect a model's structure and parameters in order to work out the correspondence between its inputs and outputs.
1. Printing the model structure
Demo:
from transformers import BertTokenizer, BertForMaskedLM
import torch

tokenizer = BertTokenizer.from_pretrained("uer/chinese_roberta_L-2_H-128")
model = BertForMaskedLM.from_pretrained("uer/chinese_roberta_L-2_H-128")

text = "我是一个学生。"
tokenized_text = tokenizer.tokenize(text)
# Replace the token at position 2 with [MASK]
masked_index = 2
tokenized_text[masked_index] = '[MASK]'
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)
tokens_tensor = torch.tensor([indexed_tokens])  # add a batch dimension
# encoded_input = tokenizer(text, return_tensors='pt')  # one-step alternative
output = model(tokens_tensor)
print(model)
This prints the structure of the BertForMaskedLM model:
BertForMaskedLM(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(21128, 128, padding_idx=0)
      (position_embeddings): Embedding(512, 128)
      (token_type_embeddings): Embedding(2, 128)
      (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=128, out_features=128, bias=True)
              (key): Linear(in_features=128, out_features=128, bias=True)
              (value): Linear(in_features=128, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=128, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=128, out_features=512, bias=True)
          )
          (output): BertOutput(
            (dense): Linear(in_features=512, out_features=128, bias=True)
            (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (1): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=128, out_features=128, bias=True)
              (key): Linear(in_features=128, out_features=128, bias=True)
              (value): Linear(in_features=128, out_features=128, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=128, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=128, out_features=512, bias=True)
          )
          (output): BertOutput(
            (dense): Linear(in_features=512, out_features=128, bias=True)
            (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
  )
  (cls): BertOnlyMLMHead(
    (predictions): BertLMPredictionHead(
      (transform): BertPredictionHeadTransform(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
      )
      (decoder): Linear(in_features=128, out_features=21128, bias=True)
    )
  )
)
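Before moving on, a quick sanity check on the input/output correspondence: the MLM head produces one score per vocabulary entry at every position. A minimal sketch, assuming transformers 4.x where the model returns a MaskedLMOutput (on older versions the logits are output[0]); the shape is [1, 7, 21128] here because the seven tokens were encoded without [CLS]/[SEP]:

logits = output.logits  # shape [batch_size, seq_len, vocab_size] = [1, 7, 21128]
predicted_id = logits[0, masked_index].argmax(-1).item()
print(tokenizer.convert_ids_to_tokens([predicted_id]))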
One thing is still puzzling here: what shape does each layer's input and output have?
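One way to answer that directly is to register forward hooks that print each submodule's input and output shapes as a tensor flows through. This is a minimal sketch using plain PyTorch hooks (the shape_hook helper is my own addition, not part of the original demo); submodules that return model-output objects rather than tensors are simply skipped:

def shape_hook(name):
    def hook(module, inputs, output):
        in_shapes = [tuple(t.shape) for t in inputs if torch.is_tensor(t)]
        # Some submodules return tuples; report the first element
        out = output[0] if isinstance(output, tuple) else output
        if torch.is_tensor(out):
            print(f"{name}: in={in_shapes} out={tuple(out.shape)}")
    return hook

# Hook every named submodule (skip the root module itself)
handles = [m.register_forward_hook(shape_hook(n))
           for n, m in model.named_modules() if n]
with torch.no_grad():
    model(tokens_tensor)
for h in handles:
    h.remove()  # always remove hooks when done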
Either way, the structure printed by print(model) is not very detailed and the format is not intuitive. To get something like Keras's model summary, we can use the summary function from the torchkeras package.
2. Printing a tidy network summary
torchkeras can be installed as follows (it's safe to install and won't break your current environment; I've tried it myself. Every time I install something new I worry it will wreck my virtual environment, and even though an environment can be rolled back, it's a hassle I'd rather avoid):
pip install torchkeras
The calling code is below. I didn't reuse the BERT example above, since it is a bit more complex and I couldn't get it working with summary:
import torch
from torch import nn
from torchkeras import summary

def create_net():
    net = nn.Sequential()
    net.add_module('linear1', nn.Linear(15, 20))
    net.add_module('relu1', nn.ReLU())
    net.add_module('linear2', nn.Linear(20, 1))
    net.add_module('sigmoid', nn.Sigmoid())
    return net

# Create the model
net = create_net()
# Print the structure and parameter counts with torchkeras's summary;
# input_shape excludes the batch dimension. summary prints the table
# itself, so wrapping it in print() only adds a stray "None" at the end.
summary(net, input_shape=(15,))
输出:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Linear-1                   [-1, 20]             320
              ReLU-2                   [-1, 20]               0
            Linear-3                    [-1, 1]              21
           Sigmoid-4                    [-1, 1]               0
================================================================
Total params: 341
Trainable params: 341
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.000057
Forward/backward pass size (MB): 0.000320
Params size (MB): 0.001301
Estimated Total Size (MB): 0.001678
----------------------------------------------------------------
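The parameter counts are easy to verify by hand: a Linear layer has in_features × out_features weights plus out_features biases, so linear1 contributes 15 × 20 + 20 = 320 parameters and linear2 contributes 20 × 1 + 1 = 21, while ReLU and Sigmoid have none, for a total of 341.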
3. Printing model parameters
for name, param in model.named_parameters():
    if param.requires_grad:
        print("-----model.named_parameters()--{}: {}".format(name, param.shape))
Setting which parameters get updated (the 'module.' prefix in the names below indicates a model wrapped in nn.DataParallel, presumably a classification model; only the classifier head stays trainable):
opt_para = ['module.classifier.weight', 'module.classifier.bias']
for name, param in model.named_parameters():
    # Freeze everything except the classifier head
    param.requires_grad = name in opt_para
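After freezing, make sure the optimizer only receives the parameters that still require gradients. A common pattern (the optimizer choice and learning rate are placeholders, not from the original post):

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)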