ncnnqat

摸愚校尉

Article link: https://blog.csdn.net/chen13017535518/article/details/118162158

ncnnqat

ncnnqat is a quantization-aware training (QAT) package for NCNN, built on PyTorch.
https://www.github.com/ChenShisen/ncnnqat

Installation

Supported Platforms: Linux
Accelerators and GPUs: NVIDIA GPUs via CUDA driver 10.1.
Dependencies:
- python >= 3.5, < 4
- pytorch >= 1.6
- numpy >= 1.18.1
- onnx >= 1.7.0
- onnx-simplifier >= 0.3.6
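
The dependency list above can be checked from Python before installing. The following is a minimal sketch, not part of ncnnqat: the helper names (`parse_version`, `meets_minimum`, `check_dependencies`) are hypothetical, the distribution names are assumptions (PyTorch is published on PyPI as `torch`), and `importlib.metadata` needs Python 3.8 or newer, which is stricter than the minimum listed above.

```python
import re
from importlib import metadata

# Minimum versions copied from the dependency list above.
# Distribution names are assumptions (PyTorch ships on PyPI as "torch").
MINIMUMS = {
    "torch": (1, 6),
    "numpy": (1, 18, 1),
    "onnx": (1, 7, 0),
    "onnx-simplifier": (0, 3, 6),
}

def parse_version(text):
    """Turn '1.6.0+cu101' into (1, 6, 0), ignoring any trailing suffix."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()

def meets_minimum(installed, minimum):
    """Compare two version tuples after zero-padding them to equal length."""
    width = max(len(installed), len(minimum))
    pad = lambda v: v + (0,) * (width - len(v))
    return pad(installed) >= pad(minimum)

def check_dependencies(minimums=MINIMUMS):
    """Return a {name: 'ok' | 'too old' | 'missing'} report."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report

if __name__ == "__main__":
    for name, status in check_dependencies().items():
        print(f"{name}: {status}")
```

The check only looks at the leading numeric part of each version, so a CUDA build tag such as `+cu101` does not affect the comparison.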
Install ncnnqat via PyPI (to do):
```
$ pip install ncnnqat
```
Installing from source is recommended. To install ncnnqat from the repo:


$ git clone https://github.com/ChenShisen/ncnnqat
$ cd ncnnqat
$ make install

Usage

Merge BN weights into conv and freeze BN

Fine-tuning from a well-trained model is suggested; in that case, call register_quantization_hook and merge_freeze_bn at the beginning of training. Otherwise, call them after a few epochs of training.

from ncnnqat import unquant_weight, merge_freeze_bn, register_quantization_hook
...
...
    for epoch in range(epoch_train):
        model.train()
        if epoch == well_epoch:
            register_quantization_hook(model)
        if epoch >= well_epoch:
            model = merge_freeze_bn(model)  # it will change bn to eval() mode during training
...
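
To make the idea of merging BN into conv concrete, here is a minimal sketch of the standard folding arithmetic for a single output channel, in plain Python. It is an illustration under assumptions, not the code behind merge_freeze_bn: the function names are hypothetical, and the "convolution" is reduced to a dot product at one position.

```python
import math

def fold_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold one output channel's BN statistics into its conv weights and bias."""
    scale = gamma / math.sqrt(var + eps)
    folded_weights = [w * scale for w in weights]
    folded_bias = (bias - mean) * scale + beta
    return folded_weights, folded_bias

def conv(weights, bias, inputs):
    """One-channel 'convolution' at a single position: dot product plus bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch norm using fixed running statistics."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta

if __name__ == "__main__":
    w, b = [0.5, -1.0, 2.0], 0.1
    bn = dict(gamma=1.5, beta=-0.2, mean=0.3, var=0.8)
    x = [1.0, 2.0, 3.0]
    reference = batch_norm(conv(w, b, x), **bn)   # conv followed by BN
    fw, fb = fold_bn(w, b, **bn)                  # single folded conv
    print(math.isclose(reference, conv(fw, fb, x)))
```

The folding is only exact when the BN statistics are fixed, which is consistent with the note above that BN is switched to eval() mode during training.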

Restore the unquantized weights before updating them

...
...
    model.apply(unquant_weight)  # using original weight while updating
    optimizer.step()
...
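
The pattern behind this step can be sketched in plain Python: the forward pass sees quantized weights, but the optimizer should update the original float weights. The code below is a generic illustration under assumptions, not ncnnqat's implementation; `Param`, `fake_quantize`, and `sgd_step` are hypothetical names, and symmetric int8 quantization is assumed.

```python
def fake_quantize(weights, num_bits=8):
    """Symmetric quantize-dequantize: snap to the integer grid, map back to floats."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    scale = qmax / max_abs
    return [max(-qmax, min(qmax, round(w * scale))) / scale for w in weights]

class Param:
    """Hypothetical stand-in for a layer weight that keeps a float backup."""
    def __init__(self, values):
        self.values = list(values)
        self.backup = None

    def quantize_for_forward(self):
        # Save the float weights, then expose quantized ones to the forward pass.
        self.backup = list(self.values)
        self.values = fake_quantize(self.values)

    def restore(self):
        # Put the original float weights back before the optimizer touches them.
        if self.backup is not None:
            self.values, self.backup = self.backup, None

def sgd_step(param, grads, lr=0.1):
    """Plain gradient descent update on whatever values the parameter holds."""
    param.values = [v - lr * g for v, g in zip(param.values, grads)]

if __name__ == "__main__":
    p = Param([0.30, -0.71, 1.0])
    p.quantize_for_forward()   # forward and backward use quantized values
    p.restore()                # analogous in spirit to model.apply(unquant_weight)
    sgd_step(p, grads=[1.0, 1.0, 1.0])
    print(p.values)
```

Updating the float copy keeps small gradient steps from being lost to rounding, which is the usual motivation for restoring weights before optimizer.step().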

Save the weights and the NCNN quantization table after training

...
...
    onnx_path = "./xxx/model.onnx"
    table_path = "./xxx/model.table"
    # dummy input used only to trace the model for export
    dummy_input = torch.randn(1, 3, img_size, img_size, device='cuda')
    input_names = ["input"]
    output_names = ["fc"]
    torch.onnx.export(model, dummy_input, onnx_path, verbose=False,
                      input_names=input_names, output_names=output_names)
    # write the NCNN quantization table for the exported model
    save_table(model, onnx_path=onnx_path, table=table_path)

...

If you use "model = nn.DataParallel(model)", torch.onnx.export is not supported on the wrapped model. Save the state_dict first, prepare a new model on a single GPU, load the weights into it, and then export that model to ONNX.

...
...
    model_s = new_net()  # a fresh, unwrapped instance of the same network
    model_s.cuda()
    register_quantization_hook(model_s)
    #model_s = merge_freeze_bn(model_s)
    onnx_path = "./xxx/model.onnx"
    table_path = "./xxx/model.table"
    dummy_input = torch.randn(1, 3, img_size, img_size, device='cuda')
    input_names = ["input"]
    output_names = ["fc"]
    # model is the nn.DataParallel-wrapped model; drop the 'module.' prefix from its keys
    model_s.load_state_dict({k.replace('module.', ''): v for k, v in model.state_dict().items()})

    torch.onnx.export(model_s, dummy_input, onnx_path, verbose=False,
                      input_names=input_names, output_names=output_names)
    save_table(model_s, onnx_path=onnx_path, table=table_path)

...
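
The export snippet above renames keys with k.replace('module.', ''), which removes every occurrence of that text, not just the leading prefix added by nn.DataParallel. A minimal sketch of a stricter variant follows; `strip_module_prefix` is a hypothetical helper, not part of ncnnqat or PyTorch, and the example keys are made up for illustration.

```python
def strip_module_prefix(state_dict, prefix="module."):
    """Remove a leading DataParallel prefix from each key, leaving the rest intact."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }

if __name__ == "__main__":
    # Made-up keys; the second one contains "module." again inside "submodule.".
    wrapped = {"module.conv1.weight": 1, "module.submodule.fc.bias": 2}
    print(strip_module_prefix(wrapped))
    print({k.replace("module.", ""): v for k, v in wrapped.items()})
```

For networks whose layer names never contain "module." the two approaches give the same result. Another common route is to read the weights from model.module.state_dict(), since nn.DataParallel keeps the original network in its module attribute.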