Espressif's ESP32 chips can do AI acceleration too: ESP-DSP, ESP-NN, and TensorFlow Lite Micro for Espressif Chipsets

skywalk8163

Categories: IoT, Artificial Intelligence. Tags: artificial intelligence, embedded hardware, esp32

Article link: https://blog.csdn.net/skywalk8163/article/details/144335264

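Espressif's ESP32 chips can now run accelerated AI workloads. This post introduces three pieces. ESP-DSP is an accelerated compute (DSP) library.

ESP-NN is a deep learning acceleration library.

TensorFlow Lite Micro for Espressif Chipsets is the AI framework.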

ESP-DSP

ESP-DSP is the official DSP library for all Espressif chips. The library contains optimized functions for the ESP32, ESP32-S3, and ESP32-P4 chips.

GitHub - espressif/esp-dsp: DSP library for ESP-IDF

The ESP-DSP library implements the following functions:

Matrix multiplication
Dot product
FFT
IIR
FIR
Vector math operations
Kalman filter
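
To make the list above concrete, here is a minimal host-side sketch of what a float dot product computes. The function name `dotprod_f32_ref` is hypothetical and is not part of ESP-DSP. The library ships its own optimized routine for this operation (`dsps_dotprod_f32`, to the best of my knowledge); check the ESP-DSP API reference for its exact signature before relying on it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plain-C reference, NOT an ESP-DSP function.
 * Returns the sum over i of a[i] * b[i] for two float vectors of length len. */
float dotprod_f32_ref(const float *a, const float *b, size_t len)
{
    assert(a != NULL && b != NULL);

    float acc = 0.0f;
    for (size_t i = 0; i < len; i++) {
        acc += a[i] * b[i];   /* one multiply-accumulate per element */
    }
    return acc;
}
```

A plain loop like this is a useful baseline on a PC: the optimized library version should produce the same result, up to floating-point rounding, while running faster on the chip.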

ESP-NN

GitHub - espressif/esp-nn: Optimised Neural Network functions for Espressif chipsets

The library contains optimised NN (Neural Network) functions for various Espressif chips.

Supported platforms:
- TensorFlow Lite Micro (TFLite Micro)
Supported ESP chips include:
- ESP32-S3 (Assembly versions optimised to benefit from vector instructions of ESP32-S3)
- ESP32 (Generic optimisations)
- ESP32-C3 (Generic optimisations)

The speedup on the ESP32-S3 is substantial:

Kernelwise performance on ESP32-S3 chip

Numbers are ticks taken for kernel to execute
Chip config: 240MHz, SPI: QPI 80MHz, Data cache: 64KB

Function	ANSI C	Optimized	Opt Ratio	Data info	Memory
elementwise_add	312327	71644	4.36	size = 1615	External
elementwise_mul	122046	30950	3.95	size = 1615	External
convolution	4642259	461398	10.06	input(10,10), filter(64x1x1x64), pad(0,0), stride(1,1)	External
convolution	300032	43578	6.9	input(8,8), filter(16x1x1x16), pad(0,0), stride(1,1)	External
convolution	2106801	643689	3.27	input(10,10), filter(64x3x3x3), pad(0,0), stride(1,1)	External
depthwise conv	1192832	191931	6.2	input (18, 18), pad(0,0), stride(1,1) filter: 1x3x3x16	External
depthwise conv	1679406	366102	4.59	input (12, 12), pad(1,1), stride(1,1) filter: 8x5x5x4	External
max pool	485714	76747	6.33	input(16,16), filter (1x3x3x16)	Internal
avg pool	541462	160580	3.37	input(16,16), filter (1x3x3x16)	Internal
fully connected	12290	4439	2.77	len: 265, ch = 3	Internal
prelu (relu6)	18315	1856	9.87	size = 1615	Internal

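As the table shows, the speedup on the ESP32-S3 is substantial: convolution runs roughly 3.3x to 10x faster depending on the shape (about 7x on average across the three listed cases), and relu6 runs about 10x faster.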
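
The "Opt Ratio" column and the "about 7x" figure for convolution can be reproduced from the tick counts in the table. The sketch below is only that arithmetic; the helper names `opt_ratio` and `mean_opt_ratio` are made up for illustration and are not part of ESP-NN.

```c
#include <assert.h>

/* Speedup of one kernel: reference (ANSI C) ticks divided by optimized ticks. */
double opt_ratio(unsigned long ansi_c_ticks, unsigned long optimized_ticks)
{
    assert(optimized_ticks > 0);
    return (double)ansi_c_ticks / (double)optimized_ticks;
}

/* Arithmetic mean of the per-kernel speedups over n table rows. */
double mean_opt_ratio(const unsigned long *ansi_c_ticks,
                      const unsigned long *optimized_ticks, int n)
{
    assert(n > 0);

    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        sum += opt_ratio(ansi_c_ticks[i], optimized_ticks[i]);
    }
    return sum / n;
}
```

Feeding in the three convolution rows (4642259/461398, 300032/43578, 2106801/643689) gives per-row ratios of about 10.06, 6.9, and 3.27, which average to a little under 7.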

TensorFlow Lite Micro for Espressif Chipsets

GitHub - espressif/esp-tflite-micro: TensorFlow Lite Micro for Espressif Chipsets

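TensorFlow Lite Micro for Espressif Chipsets is a port of the TensorFlow Lite Micro framework to Espressif's chips, such as the ESP32, ESP32-S3, ESP32-C3, and ESP32-P4. TensorFlow Lite is a library developed by the TensorFlow team to bring machine learning to mobile and embedded devices. TensorFlow Lite Micro is a lightweight version of TensorFlow Lite designed specifically for resource-constrained devices such as microcontrollers.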

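Features:

Lightweight machine learning for microcontrollers.

Runs neural network inference without requiring a full operating system.

Fast prototyping and testing on Espressif development boards and chips.

Basic steps for using TensorFlow Lite Micro for Espressif Chipsets:

Install the required software and tools, such as ESP-IDF and TensorFlow.

Create a TensorFlow Lite model for the target hardware, or convert an existing model to that format.

Create a project with ESP-IDF and integrate the TensorFlow Lite Micro library.

Port the model to the target hardware, then build and flash.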
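
When porting the model, a common approach is to embed the `.tflite` file in the firmware as a byte array. The sketch below is a small sanity check that can run on a PC before flashing. It relies on one fact about the format: a TensorFlow Lite flatbuffer carries the file identifier `TFL3` at byte offsets 4 to 7. The function name `looks_like_tflite_model` is hypothetical and is not part of TensorFlow Lite Micro or ESP-IDF, and passing this check does not prove the model is valid or supported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT a TFLite Micro API.
 * Returns true if the buffer is long enough to hold a flatbuffer header
 * and carries the "TFL3" file identifier at offsets 4..7. */
bool looks_like_tflite_model(const unsigned char *data, size_t len)
{
    assert(data != NULL);

    if (len < 8) {
        return false;   /* too short to contain root offset + identifier */
    }
    return memcmp(data + 4, "TFL3", 4) == 0;
}
```

A check like this catches simple mistakes early, such as embedding the wrong file or a truncated download, before spending time on a build-and-flash cycle.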

With ESP-NN enabled, inference is much faster:

A quick summary of ESP-NN optimisations, measured on various chipsets:

Target	TFLite Micro Example	without ESP-NN	with ESP-NN	CPU Freq
ESP32-P4	Person Detection	1395ms	73ms	360MHz
ESP32-S3	Person Detection	2300ms	54ms	240MHz
ESP32	Person Detection	4084ms	380ms	240MHz
ESP32-C3	Person Detection	3355ms	426ms	160MHz
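
The latencies in this table translate directly into speedup and throughput figures. The sketch below shows that conversion; the helper names `speedup` and `inferences_per_second` are made up for illustration, and the throughput figure assumes inferences run back to back with no other work in between.

```c
#include <assert.h>

/* End-to-end speedup: latency without ESP-NN divided by latency with ESP-NN. */
double speedup(double without_ms, double with_ms)
{
    assert(with_ms > 0.0);
    return without_ms / with_ms;
}

/* Upper bound on throughput, assuming back-to-back inferences:
 * 1000 ms per second divided by the latency of one inference. */
double inferences_per_second(double latency_ms)
{
    assert(latency_ms > 0.0);
    return 1000.0 / latency_ms;
}
```

For the ESP32-S3 row, `speedup(2300, 54)` is about 42.6 and `inferences_per_second(54)` is about 18.5. The same calculation gives roughly 19x for the ESP32-P4, 10.7x for the ESP32, and 7.9x for the ESP32-C3.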