The difference between model and model.named_modules() in neural network models

model vs. model.named_modules():

For a trained network model, every layer is given a name when the model is stored, together with its parameter data, which makes later use and extraction of a given layer's parameters convenient. This lets us operate on individual layers, for example to prune them.
To check what kind of layer the current one is, use:

for name, m0 in model.named_modules():
    if isinstance(m0, nn.Conv2d):   # written as `Conv` in the original; nn.Conv2d is the usual target
        ...  # operate on the layer here (read weights, prune channels, etc.)
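As a fuller, self-contained sketch (the tiny model, the name convs, and the pruning comment are illustrative assumptions, not the author's actual pruning code), the loop above can collect every convolution layer under its qualified name so it can be inspected or pruned later:

import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1),
)

convs = {}
for name, m0 in model.named_modules():
    if isinstance(m0, nn.Conv2d):
        convs[name] = m0              # keep a handle to the layer for later pruning

print(list(convs))                    # ['0', '3']
print(convs['3'].weight.shape)        # torch.Size([32, 16, 3, 3])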
		

So what exactly is the difference between print(model) and printing the entries of model.named_modules()?
When we use print(model), the entire structure of the model, with every layer, is printed. For example:
[The printout below is the model structure obtained while pruning ShuffleNetV2.]

ShuffleNetV2_2(
  (conv1): Conv2d(3, 24, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
  (stage2): Sequential(
    (ShuffleUnit_Stage2_0): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(24, 50, kernel_size=(1, 1), stride=(1, 1))
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 176, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(176, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_1): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_2): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_3): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
)
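Where do these names come from? A minimal standalone illustration (Tiny is a hypothetical module, not the article's ShuffleNetV2 code): print(model) walks the registered submodules, and each name in the printout is simply the attribute name under which the submodule was registered in __init__:

import torch.nn as nn

class Tiny(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 3)                             # printed as (conv1)
        self.block = nn.Sequential(nn.Conv2d(8, 8, 1), nn.ReLU())  # printed as (block)

print(Tiny())
# Tiny(
#   (conv1): Conv2d(3, 8, kernel_size=(3, 3), stride=(1, 1))
#   (block): Sequential(
#     (0): Conv2d(8, 8, kernel_size=(1, 1), stride=(1, 1))
#     (1): ReLU()
#   )
# )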

But when we print the entries of model.named_modules() (named_modules() returns a generator, so we iterate over it rather than printing the generator object itself):

For readability, only a small excerpt is shown here for the walkthrough.

  1. The first entry yielded by model.named_modules() is the model itself, under an empty name, so printing it reproduces the full structure, exactly like the print(model) result above.
  2. After that first entry, the iteration continues recursively, printing every layer of every block in the structure:
    1) first the current block, 2) then each layer inside that block,
    3) and so on, until the entire structure has been printed.
    Taking the code in this article as an example: the structure has 16 blocks in total, yet 166 entries come out in the end; exactly how many are printed depends on how many layers each block contains. (A loop that reproduces the numbered output is sketched below.)
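A minimal sketch of the loop that would produce the numbered output below (model is assumed to hold the ShuffleNetV2_2 instance; the leading index comes from enumerate, not from named_modules() itself):

for i, (name, module) in enumerate(model.named_modules()):
    print(i, name, module)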
0  ShuffleNetV2_2(
  (conv1): Conv2d(3, 24, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
  (stage2): Sequential(
    (ShuffleUnit_Stage2_0): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(24, 50, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50, bias=False)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 176, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(176, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_1): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50, bias=False)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_2): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50, bias=False)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_3): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50, bias=False)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )

The output below shows what "recursive output" means:

Entry 4 is the current block, which contains three layers: conv, bn, and relu.
Entries 5-7 iterate through the contents of block 4, printing 5 conv, 6 bn, and 7 relu in turn.

4 stage2.ShuffleUnit_Stage2_0.g_conv_1x1_compress Sequential(
  (conv1x1): Conv2d(24, 50, kernel_size=(1, 1), stride=(1, 1), bias=False)
  (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU()
)
5 stage2.ShuffleUnit_Stage2_0.g_conv_1x1_compress.conv1x1 Conv2d(24, 50, kernel_size=(1, 1), stride=(1, 1), bias=False)
6 stage2.ShuffleUnit_Stage2_0.g_conv_1x1_compress.batch_norm BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
7 stage2.ShuffleUnit_Stage2_0.g_conv_1x1_compress.relu ReLU()
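
The same parent-before-children order can be seen in a tiny standalone example (a toy block for illustration, not the article's model):

import torch.nn as nn

block = nn.Sequential(nn.Conv2d(3, 8, 1), nn.BatchNorm2d(8), nn.ReLU())
parent = nn.Sequential(block)

for name, m in parent.named_modules():
    print(repr(name), type(m).__name__)
# ''     Sequential   <- the root module always comes first, with an empty name
# '0'    Sequential   <- then the block itself
# '0.0'  Conv2d       <- then each layer inside the block, in order
# '0.1'  BatchNorm2d
# '0.2'  ReLU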

So when working with the blocks inside a model, make sure you are clear about the difference between the two,
and pay particular attention to the content and ordering of the latter's output!
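
One practical payoff of these qualified names (a sketch; model is again assumed to be the ShuffleNetV2_2 instance printed above): they can be used as dictionary keys to grab a specific layer directly.

modules = dict(model.named_modules())
conv = modules['stage2.ShuffleUnit_Stage2_0.g_conv_1x1_compress.conv1x1']
print(conv.weight.shape)  # torch.Size([50, 24, 1, 1])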

Best wishes.
