Deploying YOLO on Zynq with Hardware Acceleration (Part 1)

52Hz鲸_孑

Published 2024-08-29 20:23:00

Tags: YOLO

Article link: https://blog.csdn.net/a1915067127/article/details/140714619

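Required materials

Model: a configured, fully trained YOLO model (mainly for reference and comparison), plus the parameter files obtained by quantizing the trained model.

Hardware acceleration IP core: conv layer

Final resource utilization of the reference design: not yet available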

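Building the convolution-layer hardware acceleration IP core (HLS)

Import the convolution-layer source code (with the acceleration scheme added) into HLS, run simulation, then export the IP core.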

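Overall layout (Vivado)

Import the IP core built with HLS

The folder obtained by unzipping the zip file exported from HLS

DMA

PS-side control code

YOLOv3-tiny network structure and implementation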

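Bare-metal implementation (Vitis)

Not yet available

The flow is roughly: create an application project in Vitis, add the bare-metal code, build, then flash boot.bin.

For reasons I have not figured out yet, the header files added to the app produce a lot of errors.

Because there is no operating system, the code can operate the registers directly through the interface functions shipped with the generated IP core (which in turn call the register access functions provided by Xilinx), and in this way control the data flow on the PL side.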
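To make the register-level control a bit more concrete, here is a minimal C sketch of the start/done handshake. It assumes the IP uses the default HLS block-level control protocol (ap_ctrl_hs), where the control register sits at offset 0x00 of the AXI4-Lite register space with bit 0 = ap_start, bit 1 = ap_done, bit 2 = ap_idle and bit 7 = auto_restart. The names conv_start, conv_is_done and conv_is_idle are made up for illustration; in the real project the functions from the generated driver would be used, and base would point at the IP's AXI4-Lite base address.

```c
#include <stdint.h>

/* Assumed layout of the HLS control register (ap_ctrl_hs). */
#define AP_CTRL_OFFSET  0x00u
#define AP_START        (1u << 0)
#define AP_DONE         (1u << 1)
#define AP_IDLE         (1u << 2)
#define AP_AUTO_RESTART (1u << 7)

/* Write a 32-bit register at a byte offset from the base address. */
static inline void reg_write(volatile uint32_t *base, uint32_t offset,
                             uint32_t value)
{
    base[offset / 4u] = value;
}

/* Read a 32-bit register at a byte offset from the base address. */
static inline uint32_t reg_read(volatile uint32_t *base, uint32_t offset)
{
    return base[offset / 4u];
}

/* Set ap_start while preserving the auto_restart bit. */
void conv_start(volatile uint32_t *base)
{
    uint32_t ctrl = reg_read(base, AP_CTRL_OFFSET) & AP_AUTO_RESTART;
    reg_write(base, AP_CTRL_OFFSET, ctrl | AP_START);
}

/* Nonzero once the core reports ap_done. */
int conv_is_done(volatile uint32_t *base)
{
    return (reg_read(base, AP_CTRL_OFFSET) & AP_DONE) != 0;
}

/* Nonzero while the core reports ap_idle. */
int conv_is_idle(volatile uint32_t *base)
{
    return (reg_read(base, AP_CTRL_OFFSET) & AP_IDLE) != 0;
}
```

On a host machine the functions can be tried against an ordinary array standing in for the register block, for example `uint32_t regs[32] = {0}; conv_start(regs);`. On the board, a typical sequence would be to wait for idle, start the core, kick off the DMA transfers, then poll conv_is_done.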

In the IP files packaged by HLS, the driver sources _sinit.c and .c are used for the bare-metal driver, while _linux.c and .c are used for the Linux driver.

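Implementation under Linux

Under Linux, application code cannot operate the registers directly to control the PL. Instead, the app controls the PL through the device tree and the drivers for the PL-side modules: add the dma and conv nodes to the device tree, build the dma and conv drivers as modules and install them, and let the app use the PL modules through system calls.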

Linux drivers for dma and conv

dma

The open-source driver code is here:

xilinx_axidma: A zero-copy Linux driver and a userspace interface library for Xilinx's AXI DMA and VDMA IP blocks. These serve as bridges for communication between the processing system and FPGA programmable logic fabric, through one of the DMA ports on the Zynq processing system. Distributed under the MIT License. (GitCode)
https://gitcode.com/gh_mirrors/xil/xilinx_axidma/overview?utm_source=csdn_github_accelerator&isLogin=1

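The driver depends on the contiguous memory allocator (CMA), Xilinx's DMA and VDMA drivers, and DMA buffer sharing. All of these must be enabled in the kernel that the driver is compiled against.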

CONFIG_CMA=y
CONFIG_DMA_CMA=y
CONFIG_XILINX_DMAENGINES=y
CONFIG_DMA_SHARED_BUFFER=y
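As a quick sanity check before building the module, something like the following C sketch could scan the text of a kernel .config for the four options above. The helper names option_enabled and count_missing are hypothetical, and treating only "=y" as enabled is my assumption based on the list above.

```c
#include <stdio.h>
#include <string.h>

/* Kernel options listed above as required by the driver. */
static const char *const required_opts[] = {
    "CONFIG_CMA",
    "CONFIG_DMA_CMA",
    "CONFIG_XILINX_DMAENGINES",
    "CONFIG_DMA_SHARED_BUFFER",
};
#define NUM_REQUIRED (sizeof(required_opts) / sizeof(required_opts[0]))

/* Return 1 if `config` (the text of a kernel .config) has a line that is
 * exactly "<opt>=y", otherwise 0. */
int option_enabled(const char *config, const char *opt)
{
    size_t len = strlen(opt);
    const char *line = config;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, opt, len) == 0 &&
            strncmp(line + len, "=y", 2) == 0) {
            char end = line[len + 2];
            if (end == '\n' || end == '\r' || end == '\0')
                return 1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;             /* step past the newline to the next line */
    }
    return 0;
}

/* Count how many of the required options are not enabled in `config`. */
int count_missing(const char *config)
{
    int missing = 0;

    for (size_t i = 0; i < NUM_REQUIRED; i++) {
        if (!option_enabled(config, required_opts[i])) {
            fprintf(stderr, "missing: %s=y\n", required_opts[i]);
            missing++;
        }
    }
    return missing;
}
```

The caller would read the .config file from the kernel build directory into a string and pass it to count_missing; a result of 0 means all four options are set to y.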

# Edit config.mk
# The Makefile supports specifying these variables in a config file.
CROSS_COMPILE = arm-linux-gnueabihf-
ARCH = arm
KBUILD_DIR = /home/wrj/vivado/kernel/linux-xlnx-xilinx-v2018.3/
OUTPUT_DIR = outputs

Running make then produces the .ko file, the examples, the shared library, and other files.

File breakdown (tree xilinx_axidma/):
xilinx_axidma/
├── driver               /* driver; provides the ops used by the library */
│   ├── axi_dma.c        /* probe, DMA interface init, chardev init */
│   ├── axidma_chrdev.c  /* chardev init and fops for the library */
│   ├── axidma_dma.c     /* DMA interface init and ops behind the ioctls */
│   ├── axidma.h         /* internal header (declares the DMA interface functions) */
│   ├── axidma_of.c      /* device tree parsing */
│   ├── driver.mk
│   └── Kbuild
├── examples
├── include
│   ├── axidma_ioctl.h    /* ioctl interface for the library */
│   └── libaxidma.h       /* library function declarations */
├── libaxidma.dox
├── library
│   ├── libaxidma.c        /* library source */
│   └── library.mk
├── Makefile
├── outputs
│   ├── axidma.ko
│   └── libaxidma.so     /* library for the user application */
└── README.md

The lowest-level operations of the driver (the part that uses xilinx_dma.h) are in axidma_dma.c.

conv

File structure
.
├── xyolo_conv_top.c         /* interface functions */
├── xyolo_conv_top.h         /* interface header */
├── xyolo_conv_top_hw.h      /* register address definitions */
├── xyolo_conv_top_sinit.c   /* bare-metal init functions (get base address) */
└── xyolo_conv_top_linux.c   /* Linux init functions (UIO driver framework) */

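Compile the driver sources from the exported IP files; the _linux.c and .c files (xyolo_conv_top_linux.c and xyolo_conv_top.c) are the ones used for the Linux driver.

The Linux driver uses the UIO framework (a user-space driver). For custom FPGA logic, writing a full kernel driver is inconvenient. UIO lets a user-space program access the physical device resources directly, so the work is essentially application development.

The UIO framework has two parts: a kernel-space driver and a user-space driver. The kernel part mainly handles the memory mapping of the hardware registers and the read/write access (my guess is that it gets the register addresses from the device tree and exposes the mapping to the upper layer). The user-space part maps the UIO device's uio_mem into its own address space, so that the user-space program can access the device registers.

Accessing device memory from user space: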

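/* get addr; addr_path is "/sys/class/uio/uio%d/maps/map%d/addr" with the uio and map indices filled in */
addr_fd = open(addr_path, O_RDONLY);
read(addr_fd, uio_addr_buf, sizeof(uio_addr_buf));

/* get size; size_path is "/sys/class/uio/uio%d/maps/map%d/size" with the same indices filled in */
size_fd = open(size_path, O_RDONLY);
read(size_fd, uio_size_buf, sizeof(uio_size_buf));

/* open the device node (x is the uio index) */
uio_fd = open("/dev/uiox", O_RDWR);

/* map the device memory into user space; uio_size is the value parsed from uio_size_buf */
access_address = mmap(NULL, uio_size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, uio_fd, 0);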
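The addr and size files under sysfs contain hexadecimal text (for example "0x43c00000"), so the value has to be parsed before it can be passed to mmap. Below is a minimal sketch of that step in C. The helper names uio_read_attr and uio_map are hypothetical, and I am assuming the register block is map0, which UIO selects with an mmap offset of 0 (map N would use an offset of N pages).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Read one sysfs attribute such as .../maps/map0/addr or .../maps/map0/size
 * and parse its hex text. Returns 0 on success, -1 on failure. */
int uio_read_attr(const char *path, unsigned long *value)
{
    char buf[32];
    char *end;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    errno = 0;
    *value = strtoul(buf, &end, 0);  /* base 0 accepts the "0x" prefix */
    if (errno != 0 || end == buf)
        return -1;
    return 0;
}

/* Map `size` bytes of map0 of a UIO device node (e.g. /dev/uio0) into user
 * space. Returns NULL on failure. */
void *uio_map(const char *dev_path, size_t size)
{
    void *regs;
    int fd = open(dev_path, O_RDWR);

    if (fd < 0)
        return NULL;
    regs = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping stays valid after close */
    return (regs == MAP_FAILED) ? NULL : regs;
}
```

A caller would first read the size attribute with uio_read_attr, then pass the result to uio_map and treat the returned pointer as the base of the register block.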

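This can be called directly from the main function.

Note that __linux__ must be defined (#define __linux__), since the generated header uses it to select the Linux (UIO) variant of the driver.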

app coding

/ {
	amba_pl: amba_pl {
		#address-cells = <1>;
		#size-cells = <1>;
		compatible = "simple-bus";
		ranges ;
		axi_dma_0: dma@40400000 {
			#dma-cells = <1>;
			clock-names = "s_axi_lite_aclk", "m_axi_mm2s_aclk", "m_axi_s2mm_aclk";
			clocks = <&clkc 15>, <&clkc 15>, <&clkc 15>;
			compatible = "xlnx,axi-dma-7.1", "xlnx,axi-dma-1.00.a";
			reg = <0x40400000 0x10000>;
			xlnx,addrwidth = <0x20>;
			xlnx,sg-length-width = <0x1a>;
			dma-channel@40400000 {
				compatible = "xlnx,axi-dma-mm2s-channel";
				dma-channels = <0x1>;
				xlnx,datawidth = <0x40>;
				xlnx,device-id = <0x0>;
				xlnx,include-dre ;
			};
			dma-channel@40400030 {
				compatible = "xlnx,axi-dma-s2mm-channel";
				dma-channels = <0x1>;
				xlnx,datawidth = <0x40>;
				xlnx,device-id = <0x1>;                  /* must differ from the mm2s channel's device-id */
				xlnx,include-dre ;
			};
		};

		axidma_chrdev: axidma_chrdev@0 {
			compatible = "xlnx,axidma-chrdev";    /* matched by the axidma driver's probe */
			dmas = <&axi_dma_0 0 &axi_dma_0 1>;
			dma-names = "tx_channel", "rx_channel";
		};


		yolo_conv_top_0: yolo_conv_top@43c00000 {
			/* This is a place holder node for a custom IP, user may need to update the entries */
			clock-names = "ap_clk";
			clocks = <&clkc 15>;
			compatible = "xlnx,yolo-conv-top-1.0";
			reg = <0x43c00000 0x10000>;
			xlnx,s-axi-ctrl-bus-addr-width = <0x7>;
			xlnx,s-axi-ctrl-bus-data-width = <0x20>;
		};
	};
};
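For reference, the numbers in the yolo_conv_top node can be read as in the following C sketch. It is only an illustration: the macro and function names are hypothetical, and I am assuming that xlnx,s-axi-ctrl-bus-addr-width = <0x7> means the control bus decodes 2^7 = 128 bytes of register space inside the 0x10000-byte region given by reg.

```c
#include <stdint.h>

/* Values copied from the yolo_conv_top@43c00000 node. */
#define CONV_BASE_ADDR   0x43c00000u  /* reg, first cell: physical base  */
#define CONV_REGION_SIZE 0x10000u     /* reg, second cell: region size   */
#define CONV_ADDR_WIDTH  7u           /* xlnx,s-axi-ctrl-bus-addr-width  */
#define CONV_DATA_WIDTH  32u          /* xlnx,s-axi-ctrl-bus-data-width  */

/* Bytes of register space decoded by the control bus (assumed 2^width). */
uint32_t conv_decoded_bytes(void)
{
    return 1u << CONV_ADDR_WIDTH;
}

/* Return 1 if `offset` names a whole, aligned register inside the decoded
 * range, otherwise 0. */
int conv_offset_valid(uint32_t offset)
{
    uint32_t reg_bytes = CONV_DATA_WIDTH / 8u;

    return (offset % reg_bytes == 0u) &&
           (offset + reg_bytes <= conv_decoded_bytes());
}

/* Physical address of the register at `offset`. */
uint32_t conv_reg_phys(uint32_t offset)
{
    return CONV_BASE_ADDR + offset;
}
```

Under these assumptions the valid register offsets run from 0x00 to 0x7c in steps of 4, and the rest of the 0x10000-byte region is unused address space.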