Project page: https://www.eetree.cn/project/2604
Code download: (1) CSDN: https://download.csdn.net/download/qq_47941078/89713325
(2) free download at the bottom of the project page
Video: 基于MAX78000的手势识别人机交互系统_哔哩哔哩_bilibili
(The project is a bit rough. I thought there was plenty of time: I worked on it for a while right after getting the board, then left it untouched for months; by the time I realized the deadline was near it clashed with final exams, so it was wrapped up in a hurry.)
The model is modified from ResNet-18; only the final version of the code is kept.
conda activate Maxim
(1) To use a different dataset, place it under the corresponding path in ai8x-training/data and generate the txt file(s).
(2) Modify the data-loading path in ai8x-training/datasets/gesture.py; a sketch of this file's expected layout follows below.
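A minimal sketch of what ai8x-training/datasets/gesture.py might look like, following the dataset-module convention used elsewhere in the repo (the directory layout, ImageFolder structure, and transform details are my assumptions, not the actual project file):

import os

import torchvision
from torchvision import transforms

import ai8x

def gesture_get_datasets(data, load_train=True, load_test=True):
    # NOTE: illustrative sketch; paths and transforms are assumptions
    (data_dir, args) = data
    # ai8x.normalize() scales tensors to the range the device expects
    transform = transforms.Compose([
        transforms.Resize((64, 64)),
        transforms.ToTensor(),
        ai8x.normalize(args=args),
    ])
    train_dataset = None
    test_dataset = None
    if load_train:
        train_dataset = torchvision.datasets.ImageFolder(
            os.path.join(data_dir, 'gesture', 'train'), transform=transform)
    if load_test:
        test_dataset = torchvision.datasets.ImageFolder(
            os.path.join(data_dir, 'gesture', 'test'), transform=transform)
    return train_dataset, test_dataset

datasets = [
    {
        'name': 'gesture',
        'input': (3, 64, 64),  # matches the model's 3-channel 64x64 input
        'output': ('0', '1', '2', '3', '4', '5'),  # six gesture classes
        'loader': gesture_get_datasets,
    },
]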
1. Workflow and some details
Training
Under ai8x-training:
bash scripts/train-gesture.sh
#!/bin/sh
python train.py --epochs 200 --optimizer Adam --lr 0.001 --batch-size 64 --gpus 0 --deterministic --compress policies/schedule-gesture.yaml --model ai85net_gesture --dataset gesture --param-hist --pr-curves --embedding --device MAX78000 "$@"
Copy the training results from the log directory to self_proj/gesture under ai8x-synthesis (mainly the QAT checkpoint files).
Two lines of ai8x-training/ai8x.py were commented out:
line 1808: # b = target_attr.op.bias.data
line 1830: # target_attr.op.bias.data = b_new
Quantization
Under ai8x-synthesis:
scripts/quantize_gesture.sh
#!/bin/sh
python quantize.py self_proj/gesture/qat_best.pth.tar self_proj/gesture/qat_best-ai8x-q.pth.tar --device MAX78000 -v "$@"
One line of ai8x-synthesis/izer/quantize.py was changed:
line 159: # bias_name = '.'.join([layer, operation, 'bias'])
line 160: bias_name = 'bias'
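Why the key name matters can be checked by listing the checkpoint's keys (a quick sketch; it assumes the checkpoint follows the usual layout with a 'state_dict' entry):

import torch

# path taken from the quantize script above
ckpt = torch.load('self_proj/gesture/qat_best.pth.tar', map_location='cpu')
for key in ckpt['state_dict']:
    if 'bias' in key:
        print(key)  # shows how bias keys are actually named in this checkpoint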
Evaluation
Under ai8x-training:
scripts/evaluate_gesture.sh
#!/bin/sh
python ./train.py --model ai85net_gesture --dataset gesture --confusion --evaluate --exp-load-weights-from ../ai8x-synthesis/self_proj/gesture/qat_best-ai8x-q.pth.tar -8 --device MAX78000 "$@"
Generating the npy sample file
Under ai8x-training:
./train.py --model ai85net_gesture --save-sample 10 --dataset gesture --evaluate --exp-load-weights-from ../ai8x-synthesis/self_proj/gesture/qat_best-ai8x-q.pth.tar -8 --device MAX78000 "$@"
This generates sample_fpr2.npy under the ai8x-training directory.
Move it to the test directory: ai8x-synthesis/tests/sample_fpr2.npy
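A quick sanity check on the generated sample (run from ai8x-synthesis; the expected shape is my assumption based on the model's 3x64x64 input, so verify against your own file):

import numpy as np

sample = np.load('tests/sample_fpr2.npy')
print(sample.shape, sample.dtype, sample.min(), sample.max())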
Model conversion
Under ai8x-synthesis:
./ai8xize.py --verbose --test-dir demos --prefix ai85-gesture --checkpoint-file self_proj/gesture/qat_best-ai8x-q.pth.tar --config-file networks/gesture-chw.yaml --device MAX78000 --compact-data --mexpress --softmax --overwrite
The YAML configuration file used for model conversion is given below for reference; read it together with the model definition that follows:
YAML file:
---
# ai85net_gesture sequential model ending with avg_pool (adapted from the FaceNet example)
arch: ai85net_gesture
dataset: gesture
layers:
- out_offset: 0x1000
processors: 0x0000000000000007
operation: conv2d
max_pool: 1
pool_stride: 3
pad: 1
kernel_size: 3x3
activate: ReLU
data_format: HWC
- processors: 0x0ffff00000000000 # 16
out_offset: 0x0000
operation: conv2d
activate: ReLU
write_gap: 1
max_pool: 1
pool_stride: 2
kernel_size: 3x3
pad: 1
output_processors: 0x00000000ffffffff # 32
- processors: 0x00000000ffffffff
out_offset: 0x2000
operation: passthrough
write_gap: 1
output_processors: 0x00000000ffffffff # 32
name: res1
- pad: 1
operation: conv2d
kernel_size: 3x3
activate: ReLU
out_offset: 0x4000
processors: 0x00000000ffffffff
- operation: conv2d
out_offset: 0x2004
kernel_size: 3x3
pad: 1
name: res2
write_gap: 1
processors: 0x00000000ffffffff
# layer4 + blk2
- in_sequences: [res1, res2]
processors: 0x00000000ffffffff
in_offset: 0x2000
out_offset: 0x0000
operation: conv2d
eltwise: add
max_pool: 1
pool_stride: 2
kernel_size: 3x3
pad: 1
- processors: 0x00000000ffffffff
out_offset: 0x2000
operation: passthrough
write_gap: 1
output_processors: 0x00000000ffffffff
name: res3
- pad: 1
operation: conv2d
kernel_size: 3x3
activate: ReLU
out_offset: 0x4000
processors: 0x00000000ffffffff
- operation: conv2d
out_offset: 0x2004
kernel_size: 3x3
pad: 1
name: res4
write_gap: 1
processors: 0x00000000ffffffff
# layer8 + blk3
- in_sequences: [res3, res4]
processors: 0x00000000ffffffff
in_offset: 0x2000
out_offset: 0x0000
operation: conv2d
eltwise: add
max_pool: 1
pool_stride: 2
kernel_size: 3x3
pad: 1
- processors: 0xffffffffffffffff # 64
out_offset: 0x2000
operation: passthrough
write_gap: 1
output_processors: 0xffffffffffffffff
name: res5
- pad: 1
operation: conv2d
kernel_size: 3x3
activate: ReLU
out_offset: 0x4000
processors: 0xffffffffffffffff
- operation: conv2d
out_offset: 0x2004
kernel_size: 3x3
pad: 1
name: res6
write_gap: 1
processors: 0xffffffffffffffff
# layer12 + blk4
- in_sequences: [res5, res6]
processors: 0xffffffffffffffff
in_offset: 0x2000
out_offset: 0x0000
operation: conv2d
eltwise: add
max_pool: 1
pool_stride: 2
kernel_size: 3x3
pad: 1
- processors: 0xffffffffffffffff
out_offset: 0x2000
operation: passthrough
write_gap: 1
output_processors: 0xffffffffffffffff
name: res7
- pad: 1
operation: conv2d
kernel_size: 3x3
activate: ReLU
out_offset: 0x4000
processors: 0xffffffffffffffff
- operation: conv2d
out_offset: 0x2004
kernel_size: 3x3
pad: 1
name: res8
write_gap: 1
processors: 0xffffffffffffffff
# layer16
- in_sequences: [res7, res8]
in_offset: 0x2000
out_offset: 0x0000
eltwise: add
avg_pool: [2,2] # 64*2*2 -> 64*1*1; set to [1,1] it turns 64*2*2 into 16*2*2, so [1,1] doesn't work
pool_stride: 1
operation: None
processors: 0xffffffffffffffff
output_processors: 0xffffffffffffffff
# Layer 18 - Linear
- out_offset: 0x2000
processors: 0xffffffffffffffff
output_processors: 0x00000000000000f9
operation: fc
flatten: true
output_width: 32
Model:
import torch
import torch.nn as nn
from torch.nn import functional as F

import ai8x


class ResBlk(nn.Module):
    """ResNet block built from ai8x fused modules."""

    def __init__(self, ch_in, ch_out, stride=1, bias=False, **kwargs):
        """ch_in/ch_out: input and output channel counts."""
        super(ResBlk, self).__init__()
        self.ch_in = ch_in
        self.ch_out = ch_out
        # conv1 halves the spatial size through its pooling stage (pool_stride=2);
        # earlier nn.Conv2d/nn.BatchNorm2d variants were replaced with ai8x
        # modules, see note (1) in part 2 below
        self.conv1 = ai8x.FusedMaxPoolConv2dReLU(ch_in, ch_out, kernel_size=3, pool_size=1,
                                                 pool_stride=2, stride=stride, padding=1,
                                                 bias=bias, **kwargs)
        self.conv2 = ai8x.FusedConv2dReLU(ch_out, ch_out, kernel_size=3, stride=1, padding=1,
                                          bias=bias, **kwargs)
        # the residual add uses ai8x.Add() so it maps onto the hardware element-wise add
        self.resid1 = ai8x.Add()
        self.extra = ai8x.Conv2d(ch_out, ch_out, kernel_size=3, stride=stride, padding=1,
                                 bias=bias, **kwargs)

    def forward(self, x):
        """x: [b, ch, h, w]"""
        x = self.conv1(x)
        out = self.conv2(x)
        out = self.extra(out)
        # shortcut: element-wise add of the pooled input and the conv branch
        out = self.resid1(out, x)
        return out


class ResNet18(nn.Module):
    def __init__(self, num_classes=6, num_channels=1, dimensions=(64, 64), bias=False, **kwargs):
        super(ResNet18, self).__init__()
        # stem: 3x64x64 -> 16x22x22 (pool_stride=3); note that the 3 input
        # channels here and the 6 classes below are effectively hardcoded
        self.conv2 = ai8x.FusedMaxPoolConv2dReLU(3, 16, kernel_size=3, pool_size=1,
                                                 pool_stride=3, stride=1, padding=1,
                                                 bias=bias, **kwargs)
        # four residual blocks, each halving the spatial size
        self.blk1 = ResBlk(16, 32, stride=1)  # 22x22 -> 11x11
        self.blk2 = ResBlk(32, 32, stride=1)  # 11x11 -> 6x6
        self.blk3 = ResBlk(32, 64, stride=1)  # 6x6 -> 3x3
        self.blk4 = ResBlk(64, 64, stride=1)  # 3x3 -> 2x2
        self.outlayer = ai8x.Linear(64 * 1 * 1, 6)

    def forward(self, x):
        x = self.conv2(x)
        x = self.blk1(x)
        x = self.blk2(x)
        x = self.blk3(x)
        x = self.blk4(x)
        # whatever the spatial size at this point, adaptive average pooling
        # reduces it to [1, 1]
        x = F.adaptive_avg_pool2d(x, [1, 1])
        x = x.view(x.size(0), -1)  # flatten: [b, 64, 1, 1] -> [b, 64]
        x = self.outlayer(x)
        return x


def ai85net_gesture(pretrained=False, **kwargs):
    assert not pretrained
    return ResNet18(**kwargs)


models = [
    {
        'name': 'ai85net_gesture',
        'min_input': 1,
        'dim': 3,
    },
]
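A quick smoke test for the model definition above, assuming the ai8x-training environment (the set_device arguments follow the usual MAX78000/AI85 convention and are my assumption, not taken from the original write-up):

import torch
import ai8x

ai8x.set_device(85, False, False)  # assumed: device 85 = MAX78000 (AI85)
model = ai85net_gesture(num_classes=6, dimensions=(64, 64), bias=False)
x = torch.randn(1, 3, 64, 64)      # one 3-channel 64x64 frame
print(model(x).shape)              # expected: torch.Size([1, 6])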
2. Additional notes
(1) If accuracy drops badly when evaluating the quantized model (QAT is used throughout here), replace every nn.xx module in your model with its ai8x.xx counterpart; look up the corresponding names in ai8x.py. A sketch of the substitution follows below.
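A minimal sketch of the substitution (PlainBlock/Ai8xBlock are hypothetical names for illustration; the ai8x class name is the one used in the model above):

import torch.nn as nn
import ai8x

class PlainBlock(nn.Module):
    """Plain-PyTorch conv + ReLU pair."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(16, 32, kernel_size=3, padding=1, bias=False)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))

class Ai8xBlock(nn.Module):
    """The same block with one fused ai8x module replacing conv + relu."""
    def __init__(self, **kwargs):
        super().__init__()
        self.conv = ai8x.FusedConv2dReLU(16, 32, kernel_size=3, padding=1,
                                         bias=False, **kwargs)

    def forward(self, x):
        return self.conv(x)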
(2) Use ai8x.Add() for residual connections in ai8x models.
(3) avg_pool: [2,2]
I'm honestly not sure of the exact semantics, but the following mapping helps (checked with the snippet below):
[2,2]: 64*2*2 -> 64*1*1; set to [1,1] it turns 64*2*2 into 16*2*2, so [1,1] doesn't work.
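A quick check of the [2,2] arithmetic with plain PyTorch (this only verifies the pooling shape, not the ai8xize internals): a 2x2 window with stride 1 over a 64x2x2 tensor yields 64x1x1, matching the comment in the YAML above.

import torch
import torch.nn.functional as F

x = torch.randn(1, 64, 2, 2)  # batch of one 64x2x2 feature map
print(F.avg_pool2d(x, kernel_size=2, stride=1).shape)  # torch.Size([1, 64, 1, 1])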
(4) My understanding is that processors and output_processors essentially encode the channel counts: processors is the input channel count, output_processors the output channel count. Each mask has one bit per CNN processor, so the number of set bits gives the count:
0xffffffffffffffff = 64
0x00000000ffffffff = 32
0x0ffff00000000000 = 16
0x0000000000000007 = 3
0x00000000000000f9 = 6
These are rough values; I haven't dug into the exact computation.
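The counts above are just the number of set bits in each mask, which a few lines of Python confirm:

for mask in (0xffffffffffffffff, 0x00000000ffffffff, 0x0ffff00000000000,
             0x0000000000000007, 0x00000000000000f9):
    print(hex(mask), bin(mask).count('1'))  # prints 64, 32, 16, 3, 6 set bits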
(5) [res3, res4] can also be written with layer numbers instead, e.g. [1, 3].
(6) It seems stride cannot be set in the YAML (it presumably defaults to 1); only pool_stride can be, so I modified my model to use stride=1 everywhere.
(7) YAML configuration reference: MaximAI_Documentation/Guides/YAML Quickstart.md at main · analogdevicesinc/MaximAI_Documentation · GitHub
One more note, on gesture-chw.yaml:
When the kernel is 1*1, pad must be set to 0
(kernel_size should be set explicitly together with pad, otherwise pad takes a default value); see the sketch below.
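A minimal sketch of the 1*1 case, assuming the ai8x-training environment (the set_device arguments follow the usual MAX78000/AI85 convention and are my assumption):

import torch
import ai8x

ai8x.set_device(85, False, False)  # assumed: device 85 = MAX78000
# a 1x1 kernel is paired with padding 0, mirroring the YAML note above
conv1x1 = ai8x.Conv2d(64, 32, kernel_size=1, stride=1, padding=0, bias=False)
print(conv1x1(torch.randn(1, 64, 2, 2)).shape)  # torch.Size([1, 32, 2, 2])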