Embedded AI in Practice: Recognizing "Rock, Paper, Scissors" with the RT-Pi

Latest recommended article published on 2023-11-25 02:34:16

青锋断尘

Article link: https://blog.csdn.net/qq_40262357/article/details/119114516

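Background and goals

With the rapid development of artificial intelligence, AI deployment is gradually moving from the PC down to embedded devices. Embedded AI refers to running AI algorithms directly on the end device. Because embedded devices have limited computing power, they cannot handle both training and inference the way a PC can. The usual split is to train the model on a PC and run inference on the embedded device.

The goal of this project is to recognize rock, paper, and scissors through a camera on the RT-Pi board from RT-Thread. The model is a simple stack of convolutional layers followed by a fully connected layer. Training is done on the PC and inference runs on the RT-Pi.

The overall workflow is:

Build the network with TensorFlow on the PC and train the model
Quantize the model on the PC
Deploy the model on the embedded device

Results

[Image: recognition result]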

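Hardware and software

RT-Pi

[Image: the RT-Pi board]

The RT-Pi is a development board based on the STM32H750XBH6. For details, see: ART-Pi

AI conversion tool: RT-AK

RT-AK is an AI kit developed by the RT-Thread team for the RT-Thread real-time operating system. It deploys an AI model into an RT-Thread project with a single command, lets users write application code against a unified API, and aims for highly optimized performance on the target platform, which makes it simpler to develop on-device AI applications.

For details, see: RT-AK

Building and training the model

Data collection

I chose rock, paper, scissors because I am new to AI, the model is simple, and the dataset is easy to obtain. Dataset link: data

The images in this dataset are 300*300. The training set contains 840 images each of rock, paper, and scissors. The test set contains 117 images of paper and 124 each of rock and scissors. A dozen or so additional images are provided for prediction.

Reading and preprocessing the dataset

TensorFlow cannot train on image files directly; the images have to be converted into arrays or tensors first. Each image also needs a label as it is read. In addition, keeping images of the same class together hurts training, so the data must be shuffled.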

import os

import numpy as np
import tensorflow as tf

# Collect file paths and labels, shuffle them, and load the images
def get_files(file_dir):
    # file_dir: dataset folder containing the paper/, rock/ and scissors/ subfolders
    # return: shuffled images and labels

    paper = []
    laber_paper = []
    rock = []
    laber_rock = []
    scissors = []
    laber_scissors = []
    # Collect the image paths and assign a label to each class
    for file in os.listdir(file_dir):
        class_dir = os.path.join(file_dir, file)
        if file == 'paper':
            for i in os.listdir(class_dir):
                paper.append(os.path.join(class_dir, i))
                laber_paper.append(0)
        elif file == 'rock':
            for i in os.listdir(class_dir):
                rock.append(os.path.join(class_dir, i))
                laber_rock.append(1)
        elif file == 'scissors':
            for i in os.listdir(class_dir):
                scissors.append(os.path.join(class_dir, i))
                laber_scissors.append(2)
    print("There are %d paper\nThere are %d rock\nthere are %d scissors" % (len(paper), len(rock), len(scissors)))
    # Shuffle paths and labels together
    image_list = np.hstack((paper, rock, scissors))
    label_list = np.hstack((laber_paper, laber_rock, laber_scissors))
    temp = np.array([image_list, label_list])
    temp = temp.transpose()     # one (path, label) row per image
    np.random.shuffle(temp)

    image_list = list(temp[:, 0])
    label_list = list(temp[:, 1])
    label_list = [int(i) for i in label_list]

    # Decode each PNG into an RGB uint8 tensor (channels=3 drops the alpha channel)
    train_data = []
    for i in range(len(image_list)):
        im = tf.io.read_file(image_list[i])
        im = tf.image.decode_png(im, channels=3)
        train_data.append(im)
    train_data = tf.cast(train_data, tf.uint8)
    train_laber = tf.cast(label_list, tf.uint8)

    return train_data, train_laber
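
The shuffling step only has to keep each image paired with its label. As a minimal sketch of that idea (not the author's code), the paths and labels can be reordered with one shared random order, which avoids converting the integer labels to strings and back. The helper name `shuffle_together` and the sample paths are made up for illustration.

```python
import random


def shuffle_together(paths, labels, seed=None):
    """Shuffle paths and labels with the same random order."""
    if len(paths) != len(labels):
        raise ValueError("paths and labels must have the same length")
    order = list(range(len(paths)))
    random.Random(seed).shuffle(order)
    return [paths[i] for i in order], [labels[i] for i in order]


# Hypothetical file names, used only to show that pairs stay matched
paths = ["paper/a.png", "rock/b.png", "scissors/c.png", "rock/d.png"]
labels = [0, 1, 2, 1]
shuffled_paths, shuffled_labels = shuffle_together(paths, labels, seed=0)
```

Passing a fixed `seed` makes the order reproducible between runs, which helps when comparing training results.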

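The original images are 300*300 in RGBA format. To keep training manageable, they are scaled down to 60*60 in RGB format. The alpha channel is already dropped when the files are decoded with channels=3.

# Resize to 60x60, then normalize to [0, 1]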
train_image = tf.image.resize(train_data,[60,60])
test_image = tf.image.resize(test_data,[60,60])
predict_image = tf.image.resize(predict_data,[60,60])
train_image=train_image/255
test_image=test_image/255
predict_image=predict_image/255

Building the neural network

The network consists of two convolutional layers followed by one fully connected layer.

# build network
from tensorflow.keras import models, layers
tf.keras.backend.clear_session()
model = models.Sequential()
# conv1
model.add(layers.Conv2D(input_shape=(60, 60, 3), filters=8, 
    kernel_size=(3, 3), activation='relu', padding='same', name='conv1'))
model.add(layers.MaxPool2D(pool_size=(2,2), name='pool1'))
# conv2
model.add(layers.Conv2D(filters=16, kernel_size=(5, 5), 
                        activation='relu', name='conv2'))
model.add(layers.MaxPool2D(pool_size=(2,2), name='pool2'))
# flatten
model.add(layers.Flatten(name='flatten'))
# fully connected output layer: one softmax score per class
model.add(layers.Dense(3, activation='softmax', name="FC3"))
model.summary()
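
To see how small this network is without running TensorFlow, the layer shapes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults that the code above relies on: stride 1 for the convolutions, `valid` padding for `conv2`, and non-overlapping 2x2 pooling. The helper names are made up for illustration.

```python
def conv2d_out(size, kernel, padding):
    """Output width/height of a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def conv2d_params(kernel, in_ch, out_ch):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_ch * out_ch + out_ch


size, channels = 60, 3
params = {}

params["conv1"] = conv2d_params(3, channels, 8)
size, channels = conv2d_out(size, 3, "same"), 8     # 60 x 60 x 8
size //= 2                                          # pool1: 30 x 30 x 8

params["conv2"] = conv2d_params(5, channels, 16)
size, channels = conv2d_out(size, 5, "valid"), 16   # 26 x 26 x 16
size //= 2                                          # pool2: 13 x 13 x 16

flat = size * size * channels                       # features fed to FC3
params["FC3"] = flat * 3 + 3

total = sum(params.values())
print(params, total)
```

The total is 11,555 parameters, roughly 45 KB as 32-bit floats and about 11 KB at one byte per weight, so the model is a plausible fit for a microcontroller.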

Training and testing the model

model.compile(optimizer='adam',
             loss='sparse_categorical_crossentropy',
             metrics=['accuracy',])
history = model.fit(train_image, train_laber, batch_size=128, epochs=12)

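After 12 epochs of training, the loss is 0.189 and the accuracy is 0.9996. The loss and accuracy curves are shown below:
[Image: loss and accuracy during training]

Next, the model is evaluated on the test set: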


print("Evaluate on test data")
results = model.evaluate(test_image, test_laber, batch_size=12)
print("test loss, test acc:", results)

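The result is a loss of 0.371 and an accuracy of 0.8849.

Because the model is so simple, its performance on the test set falls short of its performance on the training set.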
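
A single accuracy figure does not show which gesture is being misclassified. One way to dig further, sketched here under the assumption that the labels follow the order used in `get_files` (0 paper, 1 rock, 2 scissors), is to compute the accuracy per class. The label lists below are toy values for illustration, not results from this model.

```python
def per_class_accuracy(y_true, y_pred, class_names):
    """Return {class name: fraction of that class predicted correctly}."""
    correct = {name: 0 for name in class_names}
    total = {name: 0 for name in class_names}
    for t, p in zip(y_true, y_pred):
        total[class_names[t]] += 1
        if t == p:
            correct[class_names[t]] += 1
    return {n: correct[n] / total[n] for n in class_names if total[n]}


class_names = ["paper", "rock", "scissors"]
# Toy values for illustration only, not results from the article's model
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0, 2]
acc = per_class_accuracy(y_true, y_pred, class_names)
```

With the trained model, `y_pred` could be obtained from `model.predict(test_image).argmax(axis=1)` and `y_true` from the test labels.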

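Model quantization

TensorFlow uses floating-point data for both the model and its inputs by default, but embedded devices are limited in storage and computing power. Switching to integer data types shrinks the model and can greatly speed up inference.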

# keras_file, tflite_file (a pathlib.Path) and representative_data_gen are defined elsewhere
# Load the saved Keras model
model = tf.keras.models.load_model(keras_file)

# Full-integer post-training quantization, calibrated with a representative dataset
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
# Ensure that if any ops can't be quantized, the converter throws an error
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Set the input and output tensors to uint8 (APIs added in r2.3)
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()

tflite_file.write_bytes(tflite_model)
print("convert model to tflite done...")
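
The article does not show `representative_data_gen`. The converter expects a callable that yields lists of input arrays, one list per calibration sample, so a possible version might look like the sketch below. The wrapper name `make_representative_data_gen`, the sample count, and the random stand-in data are assumptions; in practice the normalized training images would be passed in.

```python
import numpy as np


def make_representative_data_gen(images, num_samples=100):
    """Build a generator function yielding one float32 image batch at a time."""
    images = np.asarray(images, dtype=np.float32)

    def representative_data_gen():
        for i in range(min(num_samples, len(images))):
            # Shape (1, height, width, channels), same scaling as in training
            yield [images[i:i + 1]]

    return representative_data_gen


# Stand-in data so the sketch runs on its own; use the real training images instead
calibration_images = np.random.rand(8, 60, 60, 3)
representative_data_gen = make_representative_data_gen(calibration_images)
samples = list(representative_data_gen())
```

The calibration images should be preprocessed exactly like the training data (here, scaled to [0, 1]), because the converter derives the integer ranges from them.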

Before quantization the h5 model is 170 KB; after quantization the tflite model is 16 KB.
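
Integer quantization stores each real value as an 8-bit integer together with a scale and a zero point. The sketch below shows that mapping for the uint8 input and output. The scale and zero-point values are hypothetical; the real ones are stored in the .tflite file and can be read with `tf.lite.Interpreter(...).get_input_details()` and `get_output_details()`.

```python
def quantize(x, scale, zero_point):
    """Map a real value to uint8: q = round(x / scale) + zero_point, clamped."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))


def dequantize(q, scale, zero_point):
    """Map a uint8 value back to a real value: x = scale * (q - zero_point)."""
    return scale * (q - zero_point)


# Hypothetical input parameters: pixels normalized to [0, 1] as in training
in_scale, in_zero_point = 1 / 255, 0
pixel = 200
q = quantize(pixel / 255, in_scale, in_zero_point)

# Hypothetical output parameters for a softmax score in [0, 1)
out_scale, out_zero_point = 1 / 256, 0
score = dequantize(243, out_scale, out_zero_point)
```

If the input scale really is 1/255 with a zero point of 0, the quantized input equals the raw pixel value, so camera bytes could be fed to the model without a separate normalization step. This should be checked against the parameters of the converted model.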

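Deployment on the embedded device

Model conversion

Before converting the model, prepare a working project and download the AI-tools conversion tool provided by RT-Thread.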

python aitools.py --project=<your_project_path> --model=<your_model_path> --platform=stm32 --ext_tools=<your_x-cube-ai_path> --clear

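A single command deploys the AI model into the existing project.

The generated header file shows how much memory the model needs, as well as its input and output formats.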

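Deploying the model on the device

Image scaling

The camera delivers 240*320 images, but the model expects 60*60 input, so the image data is scaled with bilinear interpolation.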

#include <rtthread.h>

/* Return 1 if pixel (x, y) lies inside a width x height image, 0 otherwise. */
int is_in_array(short x, short y, short height, short width)
{
    return (x >= 0 && x < width && y >= 0 && y < height);
}

/* Resize an interleaved RGB888 image with bilinear interpolation. */
void bilinera_interpolation(rt_uint8_t* in_array, short height, short width,
                            rt_uint8_t* out_array, short out_height, short out_width)
{
    double h_times = (double)out_height / (double)height,
           w_times = (double)out_width / (double)width;

    for (int i = 0; i < out_height; i++){
        for (int j = 0; j < out_width; j++){
            /* source coordinates in pixels, not bytes, so the channels stay aligned */
            double x = j / w_times;
            double y = i / h_times;
            short x1 = (short)x, y1 = (short)y;
            /* clamp the far neighbours at the right and bottom edges */
            short x2 = is_in_array(x1 + 1, y1, height, width) ? x1 + 1 : x1;
            short y2 = is_in_array(x1, y1 + 1, height, width) ? y1 + 1 : y1;
            double dx = x - x1, dy = y - y1;

            for (int k = 0; k < 3; k++){
                short f11 = in_array[(y1*width + x1)*3 + k];
                short f21 = in_array[(y1*width + x2)*3 + k];
                short f12 = in_array[(y2*width + x1)*3 + k];
                short f22 = in_array[(y2*width + x2)*3 + k];
                /* weight the four neighbours by distance and round to nearest */
                out_array[(i*out_width + j)*3 + k] = (rt_uint8_t)(
                    f11 * (1 - dx) * (1 - dy) + f21 * dx * (1 - dy) +
                    f12 * (1 - dx) * dy       + f22 * dx * dy + 0.5);
            }
        }
    }
}
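
When porting image code to a microcontroller, a PC-side reference makes it easier to check the output of the C routine. The following is a minimal Python sketch of bilinear resizing for a flat, interleaved RGB buffer. It is an independent reference under that layout assumption, not a line-by-line port of the C function, and the name `resize_bilinear_rgb` is made up for illustration.

```python
def resize_bilinear_rgb(src, height, width, out_height, out_width):
    """Resize a flat, interleaved RGB byte sequence with bilinear interpolation."""
    out = bytearray(out_height * out_width * 3)
    for i in range(out_height):
        y = i * height / out_height          # source row, in pixels
        y1 = int(y)
        y2 = min(y1 + 1, height - 1)         # clamp at the bottom edge
        dy = y - y1
        for j in range(out_width):
            x = j * width / out_width        # source column, in pixels
            x1 = int(x)
            x2 = min(x1 + 1, width - 1)      # clamp at the right edge
            dx = x - x1
            for k in range(3):
                f11 = src[(y1 * width + x1) * 3 + k]
                f21 = src[(y1 * width + x2) * 3 + k]
                f12 = src[(y2 * width + x1) * 3 + k]
                f22 = src[(y2 * width + x2) * 3 + k]
                value = (f11 * (1 - dx) * (1 - dy) + f21 * dx * (1 - dy)
                         + f12 * (1 - dx) * dy + f22 * dx * dy)
                out[(i * out_width + j) * 3 + k] = int(value + 0.5)
    return bytes(out)


# A 240 x 320 frame filled with one colour must stay that colour after resizing
frame = bytes([10, 120, 250]) * (240 * 320)
small = resize_bilinear_rgb(frame, 240, 320, 60, 60)
```

Feeding the same test frame to both implementations and comparing the buffers byte by byte is a quick way to catch indexing mistakes such as mixed-up colour channels.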

Model inference

Running inference is straightforward: find the model, allocate its memory, initialize it, and then run it and read the result.

    /* find a registered model handle */
    static rt_ai_t model = NULL;
    model = rt_ai_find(RT_AI_GAME_MODEL_NAME);
    if(!model) {rt_kprintf("ai model find err\r\n"); return -1;}
    // allocate input memory
    rt_ai_buffer_t *input_image = rt_malloc(RT_AI_GAME_IN_1_SIZE_BYTES);
    if (!input_image) {rt_kprintf("malloc err\n"); return -1;}
    // allocate calculate memory
    rt_ai_buffer_t *work_buf = rt_malloc(RT_AI_GAME_WORK_BUFFER_BYTES);
    if (!work_buf) {rt_kprintf("malloc err\n"); return -1;}
    // allocate output memory
    rt_ai_buffer_t *_out = rt_malloc(RT_AI_GAME_OUT_1_SIZE_BYTES);
    if (!_out) {rt_kprintf("malloc err\n"); return -1;}
    // ai model init
    rt_err_t model_init = rt_ai_init(model, work_buf);
    if (model_init != 0) {rt_kprintf("ai init err\n"); return -1;}
    rt_ai_config(model, CFG_INPUT_0_ADDR, input_image);
    rt_ai_config(model, CFG_OUTPUT_0_ADDR, _out);

Run the model and read the inference result

rt_ai_run(model, NULL, NULL);
uint8_t *out = (uint8_t *)rt_ai_output(model, 0);
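
The output buffer holds one uint8 score per class, so the last step is to pick the largest one. The logic is sketched in Python here for brevity; on the board it would be a short C loop over `out`. The class order is taken from the labels assigned in `get_files`, while the `threshold` parameter and the sample scores are assumptions added for illustration.

```python
CLASS_NAMES = ["paper", "rock", "scissors"]  # label order used in get_files


def decode_output(out, threshold=0):
    """Return (class name, raw score) for the highest uint8 score."""
    best = max(range(len(out)), key=lambda i: out[i])
    if out[best] < threshold:
        return "unknown", out[best]
    return CLASS_NAMES[best], out[best]


# Made-up scores, only to show the decoding step
result = decode_output([12, 230, 14])
```

A non-zero threshold lets the application ignore frames in which no gesture is recognized with confidence.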

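Summary

I picked up the AI side as I went, and there are still many gaps in my understanding of the model, so its performance is not very good. In addition, the dataset comes from abroad, and its hand gestures differ from those used in China, so real-time recognition is not ideal either.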
