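(The Most Complete Guide) Printing and Saving All Weights, State, and Activations (Runtime Intermediate Values) of a PyTorch Neural Network, Plus Quantizing Weights and Activations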

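Embedded targets and new hardware often need the bare weights and activations (runtime intermediate values) of a model. This post shows a concise way to get them.
Assume you already have a model `model` and its .pt file. Create a weights folder in the current directory and run the code below to dump the model's weights in both text and binary form. Be sure to use state_dict() rather than named_parameters(), named_children(), and so on: some data, such as the running_mean and running_var of BN layers, are not parameters but do appear in the state dict. Inference needs this non-weight data too!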

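import sys
import numpy as np

# state_dict is assumed to have been loaded from the .pt file beforehand
model.load_state_dict(state_dict)

# print full arrays to the text files instead of "..."-truncated summaries
np.set_printoptions(threshold=sys.maxsize)
global_index = 0
for name, state in model.state_dict().items():
    print(name, state.size())
    array = state.detach().cpu().numpy()  # .cpu() so CUDA tensors also work
    with open(f"weights/{global_index}-{name}.txt", "w") as f:
        print(array, file=f)
    array.tofile(f"weights/{global_index}-{name}.bin")  # raw values only, no shape or dtype header
    global_index += 1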
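
To see why state_dict() matters, here is a minimal sketch that compares it with named_parameters(). The model `toy` is a hypothetical stand-in, not the model from this post; it only needs to contain a BN layer.

```python
import torch.nn as nn

# Hypothetical toy model, used only to illustrate the point.
toy = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3), nn.BatchNorm2d(4))

param_names = {name for name, _ in toy.named_parameters()}
state_names = set(toy.state_dict().keys())

# Entries in state_dict() that named_parameters() does not report are buffers.
buffers_only = sorted(state_names - param_names)
print(buffers_only)
# → ['1.num_batches_tracked', '1.running_mean', '1.running_var']
```

A dump based on named_parameters() alone would silently miss these three entries for every BN layer.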

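For the binary files, od -t f4 <binary file name> shows the floating-point values they contain; f4 means 4-byte floats, i.e. fp32. (Integer entries such as BN's num_batches_tracked are stored as int64, so use od -t d8 for those.)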
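
The same check can be done from Python. Since tofile() writes raw values with no header, the reader has to supply the dtype and the shape itself. The sketch below assumes a hypothetical 2x3 fp32 tensor and a temporary file name chosen only for illustration.

```python
import os
import tempfile
import numpy as np

# Hypothetical tensor standing in for one dumped weight.
original = np.arange(6, dtype=np.float32).reshape(2, 3)
path = os.path.join(tempfile.mkdtemp(), "0-demo.weight.bin")
original.tofile(path)

# fromfile() returns a flat array; dtype and shape must be known by the reader.
flat = np.fromfile(path, dtype=np.float32)
restored = flat.reshape(2, 3)
print(flat.shape, restored.shape)
```

In practice the shapes printed during the dump (name, state.size()) are what you would use for the reshape.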

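Printing the intermediate values of forward (create an activations folder first; the complexity here is unfortunately necessary):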

import torch

global_index = 0
def hook_fn(module, input, output):
    global global_index
    # use the module's repr, with whitespace stripped, as its name
    module_name = str(module)
    module_name = module_name.replace(" ", "")
    module_name = module_name.replace("\n", "")
    module_name = module_name[0:200]  # make sure full path <= 255
    # input is a tuple; output is a tensor for most modules, a tuple for some (e.g. nn.LSTM)
    outputs = output if isinstance(output, (tuple, list)) else (output,)
    intermediate_outputs = {}
    for i, inp in enumerate(input):
        intermediate_outputs[f"{global_index}-{module_name}-input-{i}"] = inp
    intermediate_outputs[f"{global_index}-{module_name}-output"] = output
    print(intermediate_outputs)
    print("Size input:", end=" ")
    for i, inp in enumerate(input):
        if isinstance(inp, torch.Tensor):
            print(f"{i}-th Size: {inp.size()}", end=", ")
            # detach().cpu() so tensors that require grad or live on the GPU can be converted
            inp.detach().cpu().numpy().tofile(f"activations/{global_index}-{module_name}-input-{i}.bin")
        else:
            print(f"{i}-th : {inp}", end=", ")
    for i, out in enumerate(outputs):
        if not isinstance(out, torch.Tensor):
            continue
        print(f"Size output: {out.size()}")
        # a single output keeps the "-output.bin" name; multiple outputs get an index
        suffix = "output" if len(outputs) == 1 else f"output-{i}"
        out.detach().cpu().numpy().tofile(f"activations/{global_index}-{module_name}-{suffix}.bin")
    global_index += 1

def register_hooks(model):
    for name, layer in model.named_children():
        # print(name, layer) # dump all layers, > layers.txt
        # Register the hook to the current layer
        layer.register_forward_hook(hook_fn)
        # Recursively apply the same to all submodules
        register_hooks(layer)

register_hooks(model)

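Here register_hooks is nearly equivalent to the following, which needs no recursion. The one difference is that named_modules() also yields the root model itself, so the whole model's input and output get dumped as well.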

def register_hooks(model):
    for name, layer in model.named_modules():
        # print(name, layer) # dump all layers
        layer.register_forward_hook(hook_fn)

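Next, you may also want to quantize the weights and activations (dynamic quantization, with each scale computed from the dumped tensor itself).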

# quantize data per tensor
import numpy as np
import os
import re

# read from weights dir
weight_binaries = os.listdir("weights")
# filter files containing 'weight'
weight_binaries = [w for w in weight_binaries if w.endswith('weight.bin')]
print(weight_binaries)
num_bit = 8
# note: this is symmetric quantization
# quantize weights
for w in weight_binaries:
    # load the raw fp32 weights
    weight = np.fromfile(f"weights/{w}", dtype=np.float32)
    max_weight_abs = np.max(np.abs(weight))
    fmax = 2**(num_bit-1)-1
    # fall back to scale 1 for an all-zero tensor to avoid dividing by zero
    scale = np.float32(max_weight_abs / fmax) if max_weight_abs > 0 else np.float32(1.0)
    print(scale)
    weight = np.round(weight/scale).astype(np.int8)
    print(weight)
    # insert "q" in the file name before ".bin"
    prefix = os.path.splitext(w)[0]
    extension = os.path.splitext(w)[1]
    weight.tofile(f"weights/{prefix}-q{extension}")
    scale.tofile(f"weights/{prefix}-s{extension}")

# quantize activations
activation_binaries = os.listdir("activations")
# regex pattern: -input-<n>.bin
input_activation = [a for a in activation_binaries if bool(re.search(r'-input-(\d+)\.bin', a))]
output_activation = [a for a in activation_binaries if a.endswith('output.bin')]
for ia in input_activation:
    # only quantize GEMM and CONV layers right now
    if '-Linear' not in ia and '-Conv' not in ia:
        continue
    # load the raw fp32 activations
    acts = np.fromfile(f"activations/{ia}", dtype=np.float32)
    max_acts_abs = np.max(np.abs(acts))
    fmax = 2**(num_bit-1)-1
    # fall back to scale 1 for an all-zero tensor to avoid dividing by zero
    scale = np.float32(max_acts_abs / fmax) if max_acts_abs > 0 else np.float32(1.0)
    print(scale)
    acts = np.round(acts/scale).astype(np.int8)
    print(acts)
    # insert "q" in the file name before ".bin"
    prefix = os.path.splitext(ia)[0]
    extension = os.path.splitext(ia)[1]
    acts.tofile(f"activations/{prefix}-q{extension}")
    scale.tofile(f"activations/{prefix}-s{extension}")

for oa in output_activation:
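    # only quantize GEMM and CONV layers right now
    if '-Linear' not in oa and '-Conv' not in oa:
        continue
    # load the raw fp32 activations
    acts = np.fromfile(f"activations/{oa}", dtype=np.float32)
    max_acts_abs = np.max(np.abs(acts))
    fmax = 2**(num_bit-1)-1
    # fall back to scale 1 for an all-zero tensor to avoid dividing by zero
    scale = np.float32(max_acts_abs / fmax) if max_acts_abs > 0 else np.float32(1.0)
    print(scale)
    acts = np.round(acts/scale).astype(np.int8)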
    print(acts)
    # insert "q" in the file name before ".bin"
    prefix = os.path.splitext(oa)[0]
    extension = os.path.splitext(oa)[1]
    acts.tofile(f"activations/{prefix}-q{extension}")
    scale.tofile(f"activations/{prefix}-s{extension}")
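
To make the arithmetic concrete, here is a small worked sketch of the same per-tensor symmetric scheme on a hand-picked array, including the dequantization step that the consumer of the -q and -s files would perform. The helper name `quantize_symmetric` and the sample values are assumptions for illustration; the clipping step is an extra safeguard not present in the script above.

```python
import numpy as np

def quantize_symmetric(x, num_bit=8):
    """Per-tensor symmetric quantization; assumes num_bit <= 8 so int8 can hold the result."""
    qmax = 2 ** (num_bit - 1) - 1
    max_abs = float(np.max(np.abs(x)))
    # Fall back to scale 1 for an all-zero tensor to avoid dividing by zero.
    scale = np.float32(max_abs / qmax) if max_abs > 0 else np.float32(1.0)
    # Clip as a safeguard against rounding pushing a value past qmax.
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

x = np.array([-1.27, 0.0, 0.5, 1.27], dtype=np.float32)
q, scale = quantize_symmetric(x)

# Dequantize: multiply the integers back by the scale.
x_hat = q.astype(np.float32) * scale
max_err = float(np.max(np.abs(x - x_hat)))
print(q, scale, max_err)
```

With a scale of max|x| / 127, the round-trip error of each element is at most half a quantization step, i.e. scale / 2.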