Project page: https://www.eetree.cn/project/2604
Code download: (1) CSDN: https://download.csdn.net/download/qq_47941078/89713325
(2) free download at the bottom of the project page
Video: 基于MAX78000的手势识别人机交互系统_哔哩哔哩_bilibili
(The project is a bit rough. I thought there was plenty of time: I worked on it for a while right after getting the board, then left it untouched for months; by the time I realized the deadline was near it clashed with final exams, so it was wrapped up in a hurry.)
The model is modified from ResNet-18; only the final version of the code is kept.
conda activate Maxim
(1) To use a different dataset, place it under the corresponding path in ai8x-training/data and generate the txt file(s).
(2) Modify the data-loading path in ai8x-training/datasets/gesture.py; a sketch of this file's expected layout follows below.
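A minimal sketch of what ai8x-training/datasets/gesture.py might look like, following the dataset-module convention used elsewhere in the repo (the directory layout, ImageFolder structure, and transform details are my assumptions, not the actual project file):

import os

import torchvision
from torchvision import transforms

import ai8x

def gesture_get_datasets(data, load_train=True, load_test=True):
    # NOTE: illustrative sketch; paths and transforms are assumptions
    (data_dir, args) = data
    # ai8x.normalize() scales tensors to the range the device expects
    transform = transforms.Compose([
        transforms.Resize((64, 64)),
        transforms.ToTensor(),
        ai8x.normalize(args=args),
    ])
    train_dataset = None
    test_dataset = None
    if load_train:
        train_dataset = torchvision.datasets.ImageFolder(
            os.path.join(data_dir, 'gesture', 'train'), transform=transform)
    if load_test:
        test_dataset = torchvision.datasets.ImageFolder(
            os.path.join(data_dir, 'gesture', 'test'), transform=transform)
    return train_dataset, test_dataset

datasets = [
    {
        'name': 'gesture',
        'input': (3, 64, 64),  # matches the model's 3-channel 64x64 input
        'output': ('0', '1', '2', '3', '4', '5'),  # six gesture classes
        'loader': gesture_get_datasets,
    },
]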
1. Workflow and some details
Training
Under ai8x-training:
bash scripts/train-gesture.sh
#!/bin/sh
python train.py --epochs 200 --optimizer Adam --lr 0.001 --batch-size 64 --gpus 0 --deterministic --compress policies/schedule-gesture.yaml --model ai85net_gesture --dataset gesture --param-hist --pr-curves --embedding --device MAX78000 "$@"
Copy the training results from the log directory to self_proj/gesture under ai8x-synthesis (mainly the QAT checkpoint files).
Two lines of ai8x-training/ai8x.py were commented out:
line 1808: # b = target_attr.op.bias.data
line 1830: # target_attr.op.bias.data = b_new
Quantization
Under ai8x-synthesis:
scripts/quantize_gesture.sh
#!/bin/sh
python quantize.py self_proj/gesture/qat_best.pth.tar self_proj/gesture/qat_best-ai8x-q.pth.tar --device MAX78000 -v "$@"
One line of ai8x-synthesis/izer/quantize.py was changed:
line 159: # bias_name = '.'.join([layer, operation, 'bias'])
line 160: bias_name = 'bias'
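Why the key name matters can be checked by listing the checkpoint's keys (a quick sketch; it assumes the checkpoint follows the usual layout with a 'state_dict' entry):

import torch

# path taken from the quantize script above
ckpt = torch.load('self_proj/gesture/qat_best.pth.tar', map_location='cpu')
for key in ckpt['state_dict']:
    if 'bias' in key:
        print(key)  # shows how bias keys are actually named in this checkpoint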
Evaluation
Under ai8x-training:
scripts/evaluate_gesture.sh
#!/bin/sh
python ./train.py --model ai85net_gesture --dataset gesture --confusion --evaluate --exp-load-weights-from ../ai8x-synthesis/self_proj/gesture/qat_best-ai8x-q.pth.tar -8 --device MAX78000 "$@"
Generating the npy sample file
Under ai8x-training:
./train.py --model ai85net_gesture --save-sample 10 --dataset gesture --evaluate --exp-load-weights-from ../ai8x-synthesis/self_proj/gesture/qat_best-ai8x-q.pth.tar -8 --device MAX78000 "$@"
This generates sample_fpr2.npy under the ai8x-training directory.
Move it to the test directory: ai8x-synthesis/tests/sample_fpr2.npy
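A quick sanity check on the generated sample (run from ai8x-synthesis; the expected shape is my assumption based on the model's 3x64x64 input, so verify against your own file):

import numpy as np

sample = np.load('tests/sample_fpr2.npy')
print(sample.shape, sample.dtype, sample.min(), sample.max())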
Model conversion
Under ai8x-synthesis:
./ai8xize.py --verbose --test-dir demos --prefix ai85-gesture --checkpoint-file self_proj/gesture/qat_best-ai8x-q.pth.tar --config-file networks/gesture-chw.yaml --device MAX78000 --compact-data --mexpress --softmax --overwrite
The YAML configuration file used for model conversion is given below for reference; read it together with the model definition that follows:
YAML file:
---
# ai85net_gesture sequential model ending with avg_pool (adapted from the FaceNet example)
arch: ai85net_gesture
dataset: gesture
layers:
- out_offset: 0x1000
processors: 0x0000000000000007
operation: conv2d
max_pool: 1
pool_stride: 3
pad: 1
kernel_size: 3x3
activate: ReLU
data_format: HWC
- processors: 0x0ffff00000000000 # 16
out_offset: 0x0000
operation: conv2d
activate: ReLU
write_gap: 1
max_pool: 1
pool_stride: 2
kernel_size: 3x3
pad: 1
output_processors: 0x00000000ffffffff # 32
- processors: 0x00000000ffffffff
out_offset: 0x2000
operation: passthrough
write_gap: 1
output_processors: 0x00000000ffffffff # 32
name: res1
- pad: 1
operation: conv2d
kernel_size: 3x3
activate: ReLU
out_offset: 0x4000
processors: 0x00000000ffffffff
- operation: conv2d
out_offset: 0x2004
kernel_size: 3x3
pad: 1
name: res2
write_gap: 1
processors: 0x00000000ffffffff
# layer4 + blk2
- in_sequences: [res1, res2]
processors: 0x00000000ffffffff
in_offset: 0x2000
out_offset: 0x0000
operation: conv2d
eltwise: add
max_pool: 1
pool_stride: 2
kernel_size: 3x3
pad: 1
- processors: 0x00000000ffffffff
out_offset: 0x2000
operation: passthrough
write_gap: 1
output_processors: 0x00000000ffffffff
name: res3
- pad: 1
operation: conv2d
kernel_size: 3x3
activate: ReLU
out_offset: 0x4000
processors: 0x00000000ffffffff
- operation: conv2d
out_offset: 0x2004
kernel_size: 3x3
pad: 1
name: res4
write_gap: 1
processors: 0x00000000ffffffff
# layer8 + blk3
- in_sequences: [res3, res4]
processors: 0x00000000ffffffff
in_offset: 0x2000
out_offset: 0x0000
operation: conv2d
eltwise: add
max_pool: 1
pool_stride: 2
kernel_size: 3x3
pad: 1
- processors: 0xffffffffffffffff # 64
out_offset: 0x2000
operation: passthrough
write_gap: 1
output_processors: 0xffffffffffffffff
name: res5
- pad: 1
operation: conv2d
kernel_size: 3x3
activate: ReLU
out_offset: 0x4000
processors: 0xffffffffffffffff
- operation: conv2d
out_offset: 0x2004
kernel_size: 3x3
pad: 1
name: res6
write_gap: 1
processors: 0xffffffffffffffff
# layer12 + blk4
- in_sequences: [res5, res6]
processors: 0xffffffffffffffff
in_offset: 0x2000
out_offset: 0x0000
operation: conv2d
eltwise: add
max_pool: 1
pool_stride: 2
kernel_size: 3x3
pad: 1
- processors: 0xffffffffffffffff
out_offset: 0x2000
operation: passthrough
write_gap: 1
output_processors: 0xffffffffffffffff
name: res7
- pad: 1
operation: conv2d
kernel_size: 3x3
activate: ReLU
out_offset: 0x4000
processors: 0xffffffffffffffff
- operation: conv2d
out_offset: 0x2004
kernel_size: 3x3
pad: 1
name: res8
write_gap: 1
processors: 0xffffffffffffffff
# layer16
- in_sequences: [res7, res8]
in_offset: 0x2000
out_offset: 0x0000
eltwise: add
avg_pool: [2,2] # 64*2*2 -> 64*1*1; set to [1,1] it turns 64*2*2 into 16*2*2, so [1,1] doesn't work
pool_stride: 1
operation: None
processors: 0xffffffffffffffff
output_processors: 0xffffffffffffffff
# Layer 18 - Linear
- out_offset: 0x2000
processors: 0xffffffffffffffff
output_processors: 0x00000000000000f9
operation: fc
flatten: true
output_width: 32
Model:
import torch
import torch.nn as nn
from torch.nn import functional as F

import ai8x


class ResBlk(nn.Module):
    """ResNet block built from ai8x fused modules."""

    def __init__(self, ch_in, ch_out, stride=1, bias=False, **kwargs):
        """ch_in/ch_out: input and output channel counts."""
        super(ResBlk, self).__init__()
        self.ch_in = ch_in
        self.ch_out = ch_out
        # conv1 halves the spatial size through its pooling stage (pool_stride=2);
        # earlier nn.Conv2d/nn.BatchNorm2d variants were replaced with ai8x
        # modules, see note (1) in part 2 below
        self.conv1 = ai8x.FusedMaxPoolConv2dReLU(ch_in, ch_out, kernel_size=3, pool_size=1,
                                                 pool_stride=2, stride=stride, padding=1,
                                                 bias=bias, **kwargs)
        self.conv2 = ai8x.FusedConv2dReLU(ch_out, ch_out, kernel_size=3, stride=1, padding=1,
                                          bias=bias, **kwargs)
        # the residual add uses ai8x.Add() so it maps onto the hardware element-wise add
        self.resid1 = ai8x.Add()
        self.extra = ai8x.Conv2d(ch_out, ch_out, kernel_size=3, stride=stride, padding=1,
                                 bias=bias, **kwargs)

    def forward(self, x):
        """x: [b, ch, h, w]"""
        x = self.conv1(x)
        out = self.conv2(x)
        out = self.extra(out)
        # shortcut: element-wise add of the pooled input and the conv branch
        out = self.resid1(out, x)
        return out


class ResNet18(nn.Module):
    def __init__(self, num_classes=6, num_channels=1, dimensions=(64, 64), bias=False, **kwargs):
        super(ResNet18, self).__init__()
        # stem: 3x64x64 -> 16x22x22 (pool_stride=3); note that the 3 input
        # channels here and the 6 classes below are effectively hardcoded
        self.conv2 = ai8x.FusedMaxPoolConv2dReLU(3, 16, kernel_size=3, pool_size=1,
                                                 pool_stride=3, stride=1, padding=1,
                                                 bias=bias, **kwargs)
        # four residual blocks, each halving the spatial size
        self.blk1 = ResBlk(16, 32, stride=1)  # 22x22 -> 11x11
        self.blk2 = ResBlk(32, 32, stride=1)  # 11x11 -> 6x6
        self.blk3 = ResBlk(32, 64, stride=1)  # 6x6 -> 3x3
        self.blk4 = ResBlk(64, 64, stride=1)  # 3x3 -> 2x2
        self.outlayer = ai8x.Linear(64 * 1 * 1, 6)

    def forward(self, x):
        x = self.conv2(x)
        x = self.blk1(x)
        x = self.blk2(x)
        x = self.blk3(x)
        x = self.blk4(x)
        # whatever the spatial size at this point, adaptive average pooling
        # reduces it to [1, 1]
        x = F.adaptive_avg_pool2d(x, [1, 1])
        x = x.view(x.size(0), -1)  # flatten: [b, 64, 1, 1] -> [b, 64]
        x = self.outlayer(x)
        return x


def ai85net_gesture(pretrained=False, **kwargs):
    assert not pretrained
    return ResNet18(**kwargs)


models = [
    {
        'name': 'ai85net_gesture',
        'min_input': 1,
        'dim': 3,
    },
]
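A quick smoke test for the model definition above, assuming the ai8x-training environment (the set_device arguments follow the usual MAX78000/AI85 convention and are my assumption, not taken from the original write-up):

import torch
import ai8x

ai8x.set_device(85, False, False)  # assumed: device 85 = MAX78000 (AI85)
model = ai85net_gesture(num_classes=6, dimensions=(64, 64), bias=False)
x = torch.randn(1, 3, 64, 64)      # one 3-channel 64x64 frame
print(model(x).shape)              # expected: torch.Size([1, 6])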
2. Additional notes
(1) If accuracy drops badly when evaluating the quantized model (QAT is used throughout here), replace every nn.xx module in your model with its ai8x.xx counterpart; look up the corresponding names in ai8x.py. A sketch of the substitution follows below.
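A minimal sketch of the substitution (PlainBlock/Ai8xBlock are hypothetical names for illustration; the ai8x class name is the one used in the model above):

import torch.nn as nn
import ai8x

class PlainBlock(nn.Module):
    """Plain-PyTorch conv + ReLU pair."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(16, 32, kernel_size=3, padding=1, bias=False)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))

class Ai8xBlock(nn.Module):
    """The same block with one fused ai8x module replacing conv + relu."""
    def __init__(self, **kwargs):
        super().__init__()
        self.conv = ai8x.FusedConv2dReLU(16, 32, kernel_size=3, padding=1,
                                         bias=False, **kwargs)

    def forward(self, x):
        return self.conv(x)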
(2) Use ai8x.Add() for residual connections in ai8x models.
(3) avg_pool: [2,2]
I'm honestly not sure of the exact semantics, but the following mapping helps (checked with the snippet below):
[2,2]: 64*2*2 -> 64*1*1; set to [1,1] it turns 64*2*2 into 16*2*2, so [1,1] doesn't work.
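A quick check of the [2,2] arithmetic with plain PyTorch (this only verifies the pooling shape, not the ai8xize internals): a 2x2 window with stride 1 over a 64x2x2 tensor yields 64x1x1, matching the comment in the YAML above.

import torch
import torch.nn.functional as F

x = torch.randn(1, 64, 2, 2)  # batch of one 64x2x2 feature map
print(F.avg_pool2d(x, kernel_size=2, stride=1).shape)  # torch.Size([1, 64, 1, 1])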
(4) My understanding is that processors and output_processors essentially encode the channel counts: processors is the input channel count, output_processors the output channel count. Each mask has one bit per CNN processor, so the number of set bits gives the count:
0xffffffffffffffff = 64
0x00000000ffffffff = 32
0x0ffff00000000000 = 16
0x0000000000000007 = 3
0x00000000000000f9 = 6
These are rough values; I haven't dug into the exact computation.
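The counts above are just the number of set bits in each mask, which a few lines of Python confirm:

for mask in (0xffffffffffffffff, 0x00000000ffffffff, 0x0ffff00000000000,
             0x0000000000000007, 0x00000000000000f9):
    print(hex(mask), bin(mask).count('1'))  # prints 64, 32, 16, 3, 6 set bits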
(5) [res3, res4] can also be written with layer numbers instead, e.g. [1, 3].
(6) It seems stride cannot be set in the YAML (it presumably defaults to 1); only pool_stride can be, so I modified my model to use stride=1 everywhere.
(7) YAML configuration reference: MaximAI_Documentation/Guides/YAML Quickstart.md at main · analogdevicesinc/MaximAI_Documentation · GitHub
One more note, on gesture-chw.yaml:
When the kernel is 1*1, pad must be set to 0
(kernel_size should be set explicitly together with pad, otherwise pad takes a default value); see the sketch below.
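A minimal sketch of the 1*1 case, assuming the ai8x-training environment (the set_device arguments follow the usual MAX78000/AI85 convention and are my assumption):

import torch
import ai8x

ai8x.set_device(85, False, False)  # assumed: device 85 = MAX78000
# a 1x1 kernel is paired with padding 0, mirroring the YAML note above
conv1x1 = ai8x.Conv2d(64, 32, kernel_size=1, stride=1, padding=0, bias=False)
print(conv1x1(torch.randn(1, 64, 2, 2)).shape)  # torch.Size([1, 32, 2, 2])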