1. Install the libsophon development package on the SE5. Within the SDK it is located at sophon-img_/bsp-debs/sophon-soc-libsophon-dev<x.y.z>_arm64.deb . On my machine the deb packages are in ~/bsp-debs:
linaro@bm1684:~/bsp-debs$ ls
SDK_VERSION sophgo-bsp-images_2.4.0_arm64.deb
be_byteshift.h sophgo-bsp-qt5_1.0.0_arm64.deb
bm_panda_3.1.0_soc_arm64.deb sophgo-bsp-rootfs_1.0.0_arm64.deb
firefly-cluster-csa1-client-wdt_1.0.0_arm64.deb sophgo-bsp-tools_1.0.0_arm64.deb
firefly-cluster-csa1-client_1.7.2_arm64.deb sophgo-hdmi_1.0.0_arm64.deb
le_byteshift.h sophgo-system_1.0.0_arm64.deb
linux-headers-5.4.217-bm1684-gda40c6bd729e.deb sophon-mw-soc-sophon-ffmpeg_0.7.1_arm64.deb
linux-headers-install.sh sophon-mw-soc-sophon-opencv_0.7.1_arm64.deb
linux-image-5.4.217-bm1684-gda40c6bd729e-dbg.deb **sophon-soc-libsophon-dev_0.4.9-LTS_arm64.deb**
linux-image-5.4.217-bm1684-gda40c6bd729e.deb sophon-soc-libsophon_0.4.9-LTS_arm64.deb
Install command:
sudo dpkg -i sophon-soc-libsophon-dev_*_arm64.deb
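2. Run the bm_version command on the SE5. The installation succeeded if the output lists both sophon-soc-libsophon and sophon-soc-libsophon-dev with the same version number. In this run both are 0.4.9 (see the bm_version output in the full log at the end).
Command: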
bm_version
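The version check in step 2 can also be done programmatically. The following is a minimal sketch, assuming the `name : version` line format seen in the bm_version output in the full log; the helper names `package_versions` and `libsophon_installed_ok` are hypothetical and not part of the SDK.

```python
import re


def package_versions(bm_version_output: str) -> dict:
    """Collect 'sophon-... : version' lines from bm_version output."""
    versions = {}
    for line in bm_version_output.splitlines():
        # Only package lines start with "sophon-"; other lines are skipped.
        match = re.match(r"^\s*(sophon-[\w-]+)\s*:\s*(\S+)\s*$", line)
        if match:
            versions[match.group(1)] = match.group(2)
    return versions


def libsophon_installed_ok(bm_version_output: str) -> bool:
    """True when the runtime and dev packages are both listed with equal versions."""
    versions = package_versions(bm_version_output)
    runtime = versions.get("sophon-soc-libsophon")
    dev = versions.get("sophon-soc-libsophon-dev")
    return runtime is not None and runtime == dev


# Sample text copied from the log in this post.
sample = """SophonSDK version: 23.09 LTS
sophon-soc-libsophon : 0.4.9
sophon-soc-libsophon-dev : 0.4.9
"""
print(libsophon_installed_ok(sample))
```

On the device, the text could be obtained with `subprocess.run(["bm_version"], capture_output=True, text=True).stdout` and passed to the same function.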
3. Copy the tpu-kernel archive for the 1684 to the SE5. Within the SDK it is located at tpu-kernel__/tpu-kernel-1684_v<x.y.z>--.tar.gz .
For example:
tpu-kernel-1684_v3.1.7-e74df266-230710.tar.gz
4. On the SE5, extract the tpu-kernel archive, enter the extracted directory, and run the firmware update command.
Commands:
tar -xzvf tpu-kernel-1684_v<x.y.z>-<hash>-<date>.tar.gz
cd tpu-kernel-1684_v<x.y.z>-<hash>-<date>/
python3 ./scripts/load_firmware.py --firmware ./firmware/bm1684_ddr.bin_v<x.y.z>-<hash>-<date> --firmware_tcm ./firmware/bm1684_tcm.bin_v<x.y.z>-<hash>-<date> --device_id 0
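When the command finishes it prints status information. The version details shown may differ between tpu-kernel versions; this version prints the following: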
args: Namespace(device_id=0, firmware='./firmware/bm1684_ddr.bin_v3.1.7-e74df266-230710', firmware_tcm='./firmware/bm1684_tcm.bin_v3.1.7-e74df266-230710')
succeed to load firmware ./firmware/bm1684_ddr.bin_v3.1.7-e74df266-230710 on device_id=0
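Because the two firmware file names share the version suffix of the extracted directory, the command line can be derived from the directory name. This is a rough sketch under that assumption, which holds for the v3.1.7 package used here but may not hold for other releases; `load_firmware_argv` is a hypothetical helper, and only the flags shown in step 4 are used.

```python
def load_firmware_argv(kernel_dir: str, device_id: int = 0) -> list:
    """Build the load_firmware.py argument list from the extracted directory name."""
    prefix = "tpu-kernel-1684_"
    name = kernel_dir.rstrip("/").split("/")[-1]
    if not name.startswith(prefix):
        raise ValueError(f"unexpected directory name: {name}")
    # Suffix such as "v3.1.7-e74df266-230710", assumed to be shared
    # by the directory and both firmware files.
    suffix = name[len(prefix):]
    return [
        "python3", "./scripts/load_firmware.py",
        "--firmware", f"./firmware/bm1684_ddr.bin_{suffix}",
        "--firmware_tcm", f"./firmware/bm1684_tcm.bin_{suffix}",
        "--device_id", str(device_id),
    ]


argv = load_firmware_argv("~/tpu-kernel-1684_v3.1.7-e74df266-230710/")
print(" ".join(argv))
```

The returned list is meant to be run from inside the extracted directory, for example with `subprocess.run(argv, check=True)`.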
5. Enter the samples directory and compile the test program:
cd samples/
g++ test_instruction_perf.cpp -lbmlib -lpthread -ldl -L/opt/sophon/libsophon-current/lib/ -I/opt/sophon/libsophon-current/include/ -std=c++11 -o test_instruction_perf
6. Run the tests:
Scenario | Command |
---|---|
INT8, Winograd acceleration off | ./test_instruction_perf -d int8 |
INT8, Winograd acceleration on | ./test_instruction_perf -d int8 -w 1 |
FP32 | ./test_instruction_perf -d fp32 |
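Each of these commands ends with a one-line summary of total time, average time, and compute capability. To compare the three scenarios it can help to extract those numbers. The sketch below assumes the exact line format that appears in the full log of this post; `parse_result` is a hypothetical helper name.

```python
import re

# Pattern for the summary line printed by test_instruction_perf,
# as it appears in the log of this post.
RESULT_RE = re.compile(
    r"TPU total time=\s*(\d+)\(us\)\s*"
    r"TPU avg time=\s*([\d.]+)\(us\)\s*"
    r"TPU compute capability=\s*([\d.]+)T"
)


def parse_result(line: str):
    """Return total/average time in microseconds and the reported T figure, or None."""
    match = RESULT_RE.search(line)
    if match is None:
        return None
    return {
        "total_us": int(match.group(1)),
        "avg_us": float(match.group(2)),
        "tops": float(match.group(3)),
    }


line = "TPU total time= 141673(us) TPU avg time= 141.67(us) TPU compute capability= 17.05T"
print(parse_result(line))
```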
Appendix: the full run log follows:
linaro@bm1684:~/bsp-debs$ ls
SDK_VERSION sophgo-bsp-images_2.4.0_arm64.deb
be_byteshift.h sophgo-bsp-qt5_1.0.0_arm64.deb
bm_panda_3.1.0_soc_arm64.deb sophgo-bsp-rootfs_1.0.0_arm64.deb
firefly-cluster-csa1-client-wdt_1.0.0_arm64.deb sophgo-bsp-tools_1.0.0_arm64.deb
firefly-cluster-csa1-client_1.7.2_arm64.deb sophgo-hdmi_1.0.0_arm64.deb
le_byteshift.h sophgo-system_1.0.0_arm64.deb
linux-headers-5.4.217-bm1684-gda40c6bd729e.deb sophon-mw-soc-sophon-ffmpeg_0.7.1_arm64.deb
linux-headers-install.sh sophon-mw-soc-sophon-opencv_0.7.1_arm64.deb
linux-image-5.4.217-bm1684-gda40c6bd729e-dbg.deb sophon-soc-libsophon-dev_0.4.9-LTS_arm64.deb
linux-image-5.4.217-bm1684-gda40c6bd729e.deb sophon-soc-libsophon_0.4.9-LTS_arm64.deb
linaro@bm1684:~/bsp-debs$ sudo dpkg -i sophon-soc-libsophon-dev_0.4.9-LTS_arm64.deb
(Reading database ... 57609 files and directories currently installed.)
Preparing to unpack sophon-soc-libsophon-dev_0.4.9-LTS_arm64.deb ...
Unpacking sophon-soc-libsophon-dev (0.4.9) over (0.4.9) ...
Setting up sophon-soc-libsophon-dev (0.4.9) ...
linaro@bm1684:~/bsp-debs$ cd ..
linaro@bm1684:~$ ls
bsp-debs multimedia_test tpu-kernel-1684_v3.1.7-e74df266-230710
linaro@bm1684:~$ cd tpu-kernel-1684_v3.1.7-e74df266-230710/
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710$ ls
README.md doc firmware include lib samples scripts
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710$ python3 ./scripts/load_firmware.py --firmware ./firmware/bm1684_ddr.bin_v3.1.7-e74df266-230710 --firmware_tcm ./firmware/bm1684_tcm.bin_v3.1.7-e74df266-230710 --device_id 0
args: Namespace(device_id=0, firmware='./firmware/bm1684_ddr.bin_v3.1.7-e74df266-230710', firmware_tcm='./firmware/bm1684_tcm.bin_v3.1.7-e74df266-230710')
succeed to load firmware ./firmware/bm1684_ddr.bin_v3.1.7-e74df266-230710 on device_id=0
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710$ bm_version
SophonSDK version: 23.09 LTS
sophon-soc-libsophon : 0.4.9
sophon-soc-libsophon-dev : 0.4.9
sophon-mw-soc-sophon-ffmpeg : 0.7.1
sophon-mw-soc-sophon-opencv : 0.7.1
BL2 v2.7(release):424d1a71 Built : 09:02:27, Oct 28 2023
BL31 v2.7(release):424d1a71 Built : 09:02:27, Oct 28 2023
U-Boot 2022.10 424d1a71 (Oct 28 2023 - 09:02:21 +0000) Sophon BM1684
KernelVersion : Linux bm1684 5.4.217-bm1684-gda40c6bd729e #4 SMP Sat Oct 28 09:02:41 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
HWVersion: 0x12
MCUVersion: 0x38
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710$ python3 ./scripts/load_firmware.py --firmware ./firmware/bm1684_ddr.bin_v3.1.7-e74df266-230710 --firmware_tcm ./firmware/bm1684_tcm.bin_v3.1.7-e74df266-230710 --device_id 0
args: Namespace(device_id=0, firmware='./firmware/bm1684_ddr.bin_v3.1.7-e74df266-230710', firmware_tcm='./firmware/bm1684_tcm.bin_v3.1.7-e74df266-230710')
succeed to load firmware ./firmware/bm1684_ddr.bin_v3.1.7-e74df266-230710 on device_id=0
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710$ ls
README.md doc firmware include lib samples scripts
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710$ cd samples/
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710/samples$ ls
README.md test_instruction_perf.cpp
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710/samples$ g++ test_instruction_perf.cpp -lbmlib -lpthread -ldl -L/opt/sophon/libsophon-current/lib/ -I/opt/sophon/libsophon-current/include/ -std=c++11 -o test_instruction_perf
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710/samples$ ls
README.md test_instruction_perf test_instruction_perf.cpp
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710/samples$ ./test_instruction_perf -d int8
================= start test conv2d =================
data type: int8 loop times: 1000
conv param: input shape=(4 64 64 128), oc=64, kernel=(3 3), stride=(1 1), dilation=(1 1), insert=(0 0), pad=(1 1 1 1)
TPU total time= 141673(us) TPU avg time= 141.67(us) TPU compute capability= 17.05T
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710/samples$ ./test_instruction_perf -d int8 -w 1
================= start test conv2d =================
data type: int8 loop times: 1000
conv param: input shape=(4 64 64 128), oc=64, kernel=(3 3), stride=(1 1), dilation=(1 1), insert=(0 0), pad=(1 1 1 1)
TPU total time= 100729(us) TPU avg time= 100.73(us) TPU compute capability= 23.98T
linaro@bm1684:~/tpu-kernel-1684_v3.1.7-e74df266-230710/samples$ ./test_instruction_perf -d fp32
================= start test conv2d =================
data type: fp32 loop times: 1000
conv param: input shape=(4 64 64 128), oc=64, kernel=(3 3), stride=(1 1), dilation=(1 1), insert=(0 0), pad=(1 1 1 1)
TPU total time= 1267925(us) TPU avg time= 1267.93(us) TPU compute capability= 1.91T
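The reported compute capability can be cross-checked from the convolution parameters in the log. The sketch below is an assumption rather than the sample's actual source: it treats the input shape as (N, C, H, W), treats the pad tuple as (top, bottom, left, right), and counts each multiply-accumulate as two operations. Under those assumptions the result divided by the average time reproduces the figures printed above.

```python
def conv2d_ops(n, ic, h, w, oc, kh, kw,
               stride=(1, 1), dilation=(1, 1), pad=(1, 1, 1, 1)):
    """Operation count of a 2D convolution, counting one MAC as 2 ops."""
    # Effective kernel extent once dilation is applied.
    eff_kh = dilation[0] * (kh - 1) + 1
    eff_kw = dilation[1] * (kw - 1) + 1
    # Output spatial size; pad order assumed (top, bottom, left, right).
    oh = (h + pad[0] + pad[1] - eff_kh) // stride[0] + 1
    ow = (w + pad[2] + pad[3] - eff_kw) // stride[1] + 1
    macs = n * oc * oh * ow * ic * kh * kw
    return 2 * macs


# Parameters from the log: input shape=(4 64 64 128), oc=64, kernel=(3 3).
ops = conv2d_ops(n=4, ic=64, h=64, w=128, oc=64, kh=3, kw=3)

# Average times in microseconds, copied from the log.
for label, avg_us in [("int8", 141.67), ("int8 winograd", 100.73), ("fp32", 1267.93)]:
    tops = ops / (avg_us * 1e-6) / 1e12
    print(f"{label}: {tops:.2f}T")
```

This prints 17.05T, 23.98T, and 1.91T, matching the three runs. The same average times put the Winograd INT8 run at roughly 1.41 times the speed of the plain INT8 run (141.67 / 100.73).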