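Espressif's ESP32 chips can now run accelerated AI workloads too. This post covers three libraries:
- ESP-DSP, an optimized math and signal-processing library
- ESP-NN, an optimized deep-learning kernel library
- TensorFlow Lite Micro for Espressif Chipsets, an on-device AI inference framework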
ESP-DSP
ESP-DSP is the official DSP library for all Espressif chips. The library contains optimized functions for ESP32, ESP32-S3 and ESP32-P4 chips.
GitHub - espressif/esp-dsp: DSP library for ESP-IDF
The ESP-DSP library implements the following:
- Matrix multiplication: reference
- Dot product: reference, example
- FFT: reference, example
- IIR: reference, example
- FIR: reference
- Vector math operations: reference
- Kalman filter: reference
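To make two of the operations above concrete, here is a minimal host-side sketch in plain Python of what a dot product and a direct-form FIR filter compute. This is not ESP-DSP code: the names `dot_product` and `fir_filter` are made up for illustration, and the zero-padding at the start of the signal is an assumption of this sketch. A reference like this can be handy for checking values computed on the device.

```python
def dot_product(a, b):
    """Sum of element-wise products of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(x * y for x, y in zip(a, b))


def fir_filter(signal, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * signal[n - k].

    Samples before the start of the signal are treated as zero
    (an assumption of this sketch).
    """
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * signal[n - k]
        out.append(acc)
    return out


# A two-tap moving average applied to a short ramp.
print(dot_product([1, 2, 3], [4, 5, 6]))
print(fir_filter([2, 4, 6], [0.5, 0.5]))
```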
ESP-NN
GitHub - espressif/esp-nn: Optimised Neural Network functions for Espressif chipsets
The library contains optimised NN (Neural Network) functions for various Espressif chips.
Supported platforms:
- TensorFlow Lite Micro (TFLite Micro). Repo can be found here
Supported ESP chips include:
- ESP32-S3 (Assembly versions optimised to benefit from vector instructions of ESP32-S3)
- ESP32 (Generic optimisations)
- ESP32-C3 (Generic optimisations)
The speedup on the ESP32-S3 is substantial:
Kernelwise performance on ESP32-S3 chip
- Numbers are ticks taken for kernel to execute
- Chip config: 240MHz, SPI: QPI 80MHz, Data cache: 64KB
Function | ANSI C | Optimized | Opt Ratio | Data info | Memory |
---|---|---|---|---|---|
elementwise_add | 312327 | 71644 | 4.36 | size = 1615 | External |
elementwise_mul | 122046 | 30950 | 3.95 | size = 1615 | External |
convolution | 4642259 | 461398 | 10.06 | input(10,10), filter(64x1x1x64), pad(0,0), stride(1,1) | External |
convolution | 300032 | 43578 | 6.9 | input(8,8), filter(16x1x1x16), pad(0,0), stride(1,1) | External |
convolution | 2106801 | 643689 | 3.27 | input(10,10), filter(64x3x3x3), pad(0,0), stride(1,1) | External |
depthwise conv | 1192832 | 191931 | 6.2 | input (18, 18), pad(0,0), stride(1,1) filter: 1x3x3x16 | External |
depthwise conv | 1679406 | 366102 | 4.59 | input (12, 12), pad(1,1), stride(1,1) filter: 8x5x5x4 | External |
max pool | 485714 | 76747 | 6.33 | input(16,16), filter (1x3x3x16) | Internal |
avg pool | 541462 | 160580 | 3.37 | input(16,16), filter (1x3x3x16) | Internal |
fully connected | 12290 | 4439 | 2.77 | len: 265, ch = 3 | Internal |
prelu (relu6) | 18315 | 1856 | 9.87 | size, 1615 | Internal |
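The table shows a clear speedup on the ESP32-S3: the convolution kernels run roughly 3.3x to 10x faster depending on their shape, and relu6 runs about 9.9x faster.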
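The Opt Ratio column is the ANSI C tick count divided by the optimized tick count. The short Python sketch below reproduces it for a few rows copied from the table. The conversion to microseconds assumes that one tick equals one CPU cycle at 240 MHz, which the table does not state explicitly; the helper names are illustrative.

```python
CPU_HZ = 240_000_000  # chip config from the table; 1 tick = 1 cycle is an assumption

# (function, ANSI C ticks, optimized ticks), copied from the table above
ROWS = [
    ("elementwise_add", 312327, 71644),
    ("convolution, filter 64x1x1x64", 4642259, 461398),
    ("convolution, filter 64x3x3x3", 2106801, 643689),
    ("prelu (relu6)", 18315, 1856),
]


def opt_ratio(ansi_ticks, opt_ticks):
    """How many times faster the optimized kernel is than the ANSI C one."""
    return ansi_ticks / opt_ticks


def ticks_to_us(ticks, cpu_hz=CPU_HZ):
    """Convert a tick count to microseconds, assuming one tick per CPU cycle."""
    return ticks / cpu_hz * 1e6


for name, ansi, opt in ROWS:
    print(f"{name}: {opt_ratio(ansi, opt):.2f}x faster, "
          f"{ticks_to_us(opt):.1f} us optimized")
```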
TensorFlow Lite Micro for Espressif Chipsets
GitHub - espressif/esp-tflite-micro: TensorFlow Lite Micro for Espressif Chipsets
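TensorFlow Lite Micro for Espressif Chipsets is the TensorFlow Lite Micro framework packaged for Espressif's chips (for example ESP32 and ESP32-S3). TensorFlow Lite is a library developed by the TensorFlow team to bring machine learning to mobile and embedded devices. TensorFlow Lite Micro is a lightweight version of TensorFlow Lite designed specifically for resource-constrained devices such as microcontrollers.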
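Features:
- Lightweight machine learning for microcontrollers.
- Neural-network inference without requiring a full operating system.
- Fast prototyping and testing on Espressif development boards and chips.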
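Basic steps for using TensorFlow Lite Micro for Espressif Chipsets:
- Install the required software and tools, such as ESP-IDF and TensorFlow.
- Create a TensorFlow Lite model for the target hardware, or convert an existing model.
- Create a project with ESP-IDF and integrate the TensorFlow Lite Micro library.
- Port the model to the target hardware, then build and flash.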
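For the last step, one common way to get a model onto the chip is to embed the `.tflite` file in the firmware as a byte array, which has the same effect as running `xxd -i` on the file. The sketch below is a minimal Python version of that conversion. The function name `bytes_to_c_array`, the variable name `g_model`, and the `alignas(16)` alignment are assumptions chosen for illustration, not something the repository prescribes.

```python
def bytes_to_c_array(data: bytes, var_name: str = "g_model", per_line: int = 12) -> str:
    """Render raw bytes as C++ source defining a byte array and its length."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        # alignas(16) is a conservative, assumed alignment for the model buffer
        f"alignas(16) const unsigned char {var_name}[] = {{\n"
        f"{body}\n"
        f"}};\n"
        f"const unsigned int {var_name}_len = {len(data)};\n"
    )


# Placeholder bytes stand in for a real model here. For an actual file:
#   with open("model.tflite", "rb") as f:
#       source = bytes_to_c_array(f.read())
print(bytes_to_c_array(bytes(range(16)), "g_model"))
```

The generated text can be saved as a source file in the ESP-IDF project and compiled together with the application.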
With ESP-NN enabled, inference is much faster:
A quick summary of ESP-NN optimisations, measured on various chipsets:
Target | TFLite Micro Example | without ESP-NN | with ESP-NN | CPU Freq |
---|---|---|---|---|
ESP32-P4 | Person Detection | 1395ms | 73ms | 360MHz |
ESP32-S3 | Person Detection | 2300ms | 54ms | 240MHz |
ESP32 | Person Detection | 4084ms | 380ms | 240MHz |
ESP32-C3 | Person Detection | 3355ms | 426ms | 160MHz |
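To put the table in perspective, this short Python sketch derives the speedup factor and the resulting inference rate for each target from the latencies above. The figures are copied from the table; the helper names are illustrative, and the inference rate assumes back-to-back inferences with no other work in between.

```python
# (target, ms without ESP-NN, ms with ESP-NN), copied from the table above
RESULTS = [
    ("ESP32-P4", 1395, 73),
    ("ESP32-S3", 2300, 54),
    ("ESP32", 4084, 380),
    ("ESP32-C3", 3355, 426),
]


def speedup(before_ms, after_ms):
    """Ratio of the latency without ESP-NN to the latency with it."""
    return before_ms / after_ms


def inferences_per_second(latency_ms):
    """Throughput if inferences run back to back with no other work."""
    return 1000.0 / latency_ms


for target, before, after in RESULTS:
    print(f"{target}: {speedup(before, after):.1f}x faster, "
          f"{inferences_per_second(after):.1f} inferences/s with ESP-NN")
```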