Artificial Intelligence: Brief Notes on Model and Hardware Optimization in an AI Project

鹏晓星

Last modified 2023-07-16 14:53:55

First published 2023-05-07 22:31:48

Article link: https://blog.csdn.net/weixin_44194638/article/details/130548884

Recording an error

model.load_state_dict(torch.load(origin_model_path))

Error message:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

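Fix:
Pass map_location to torch.load so that tensors saved from a CUDA device are remapped to the CPU when the checkpoint is read.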

# Keep every storage on the CPU while deserializing
model.load_state_dict(torch.load(origin_model_path,
                                 map_location=lambda storage, loc: storage))
# Or, equivalently, name the target device explicitly
model.load_state_dict(torch.load(origin_model_path,
                                 map_location=torch.device('cpu')))
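The same idea can be made independent of the machine by choosing the device at run time. The following is only a minimal sketch: the `nn.Linear` model and the in-memory buffer are hypothetical stand-ins for the article's `model` and `origin_model_path`, which are not shown here.

```python
import io

import torch
import torch.nn as nn

# Hypothetical stand-in for the article's model
model = nn.Linear(4, 2)

# Save the weights to an in-memory buffer (stands in for origin_model_path)
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# Use the GPU only if this machine actually has one
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# map_location remaps every stored tensor to the chosen device while loading
state_dict = torch.load(buffer, map_location=device)

restored = nn.Linear(4, 2)
restored.load_state_dict(state_dict)
restored.to(device)
```

With this pattern the same loading code runs on both the training machine and a CPU-only machine.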

GPU training + CPU deployment

# Load a model that was trained on the GPU onto the CPU
model.load_state_dict(torch.load(origin_model_path, map_location=lambda storage, loc: storage))
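Once the weights are on the CPU, inference is usually run in evaluation mode with gradient tracking switched off. The sketch below covers only that step; `TinyNet` is a hypothetical model invented for illustration, and the weight loading itself is left as a comment because the article's checkpoint is not available here.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical model; the dropout layer makes the effect of eval() visible."""

    def __init__(self):
        super().__init__()
        self.drop = nn.Dropout(p=0.5)
        self.fc = nn.Linear(8, 3)

    def forward(self, x):
        return self.fc(self.drop(x))


cpu_model = TinyNet()
# In the article's setting the weights would be loaded here, e.g.
# cpu_model.load_state_dict(torch.load(origin_model_path, map_location="cpu"))

cpu_model.eval()  # disable dropout (and use running statistics in batch norm)

x = torch.randn(2, 8)
with torch.no_grad():  # no autograd bookkeeping is needed for inference
    out_first = cpu_model(x)
    out_second = cpu_model(x)
```

In evaluation mode the two forward passes give identical results, which would not be the case with dropout active.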

CPU optimization: model quantization

Quantizing a network means converting it to use a reduced precision integer representation for the weights and/or activations. This saves on model size and allows the use of higher throughput math operations on your CPU or GPU.

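# Dynamic quantization runs on the CPU, so config.DEVICE should be the CPU here
model.load_state_dict(torch.load(origin_model_path, map_location=config.DEVICE))

# Use torch.quantization.quantize_dynamic to obtain a dynamically quantized model.
# The weights of every nn.Linear layer are quantized to int8.
quantized_model = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

# Compare the size of the model before and after quantization
# (print_size_of_model is a helper that is not defined in this article)
print_size_of_model(model)
print_size_of_model(quantized_model)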
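The article does not show `print_size_of_model`. One plausible version, given here as an assumption rather than the author's actual helper, serializes the `state_dict` to a temporary file and reports the file size. The toy `nn.Sequential` model is also hypothetical and only serves to make the sketch runnable on a CPU.

```python
import os
import tempfile

import torch
import torch.nn as nn


def model_size_bytes(model):
    """Save the state_dict to a temporary file and return the file size in bytes."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "model.pt")
        torch.save(model.state_dict(), path)
        return os.path.getsize(path)


def print_size_of_model(model):
    """Assumed shape of the article's helper: print the serialized size in MB."""
    print(f"Size (MB): {model_size_bytes(model) / 1e6:.3f}")


# Hypothetical toy model whose parameters live almost entirely in nn.Linear layers
toy_model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10)).eval()

# Quantize the nn.Linear weights to int8; activations are quantized on the fly
quantized_toy = torch.quantization.quantize_dynamic(
    toy_model, {nn.Linear}, dtype=torch.qint8
)

print_size_of_model(toy_model)
print_size_of_model(quantized_toy)

# The quantized model is called exactly like the original one
with torch.no_grad():
    output = quantized_toy(torch.randn(1, 256))
```

Since int8 weights take one byte instead of four, the quantized file should be noticeably smaller. The exact ratio depends on how much of the model sits in `nn.Linear` layers.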