DPU-PYNQ Installation and Usage Guide for the Ultra96-V2


Since the original Word document could not be uploaded directly, most of its images are missing from this post; I will add them when I have time. Apologies for the inconvenience!

Contents

Background
  Execution Model
  Host Program Build Process
  FPGA Binary Build Process
Part 1: Installing the Example
  Six-Step Installation
  Expanding the SD Card
  Network Connectivity
  Diagnosing Communication
  Switching Mirror Sources
  Running Jupyter
  Input Files and Library Files (pynq-dpu/dpu_resnet50_0.elf, overlay/dpu.hwh, dpu.bit, dpu.xclbin)
    Input Files
    Library Files
Part 2: Building the Hardware Files
  Copying the host Folder
  The Build Script build.sh (first download)
  Downloading the Hardware Projects (first download)
  Vivado
    (→ /dpu_overlay/*.hwh, *.bit, *.xsa; build.sh → PYNQ-derivative-overlays/dpu Makefile → dpu.tcl and build_bitstream.tcl)
  Vitis Platform
    (dpu/*.xsa → dpu/*.xpfm; build.sh → PYNQ-derivative-overlays/vitis_platform Makefile → xsct build_pfm.tcl)
  Vitis DPU
    (*.xpfm, DPU-TRD/prj/Vitis/dpu_conf.vh, prj_config → *.xclbin; build.sh → DPU-TRD/prj/Vitis Makefile → vivado scripts/gen_dpu_xo.tcl → v++ *.xclbin; first download)
    DPU Configuration
    Connectivity Files
Part 3: Installing Docker
  Downloading Vitis-AI
  Installing Docker
  Installing the NVIDIA Docker Runtime
  Managing Docker as a Non-Root User
  Loading and Running the Docker Container
  Docker File Operations
  Mounting Folders
Part 4: Building the Model Kernel
  (dpu_overlay/dpu.hwh, Ultra96.json → quantized/deploy.prototxt, deploy.caffemodel; first download of the AI model)
  The Compile Script compile.sh
  The .dcf and .json Files
  Downloading the AI Model
  Compiling with vai_c_caffe
  Running

Author: Dr. Wang Wei

Background
Execution Model

In the Vitis core development kit, an application program is split between a host application and hardware accelerated kernels with a communication channel between them. The host program, written in C/C++ and using API abstractions like OpenCL, runs on a host processor (such as an x86 server or an Arm processor for embedded platforms), while hardware accelerated kernels run within the programmable logic (PL) region of a Xilinx device.
The API calls, managed by XRT, are used to process transactions between the host program and the hardware accelerators. Communication between the host and the kernel, including control and data transfers, occurs across the PCIe® bus or an AXI bus for embedded platforms. While control information is transferred between specific memory locations in the hardware, global memory is used to transfer data between the host program and the kernels. Global memory is accessible by both the host processor and hardware accelerators, while host memory is only accessible by the host application.
For instance, in a typical application, the host first transfers data to be operated on by the kernel from host memory into global memory. The kernel subsequently operates on the data, storing results back to the global memory. Upon kernel completion, the host transfers the results back into the host memory.
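The same host → global memory → kernel → host flow is what PYNQ exposes in Python. Below is a minimal sketch of that data movement (my own illustration, not from the Xilinx documentation), assuming a PYNQ v2.5 image where pynq.allocate is available:

import numpy as np
from pynq import allocate

# Allocate buffers in global (DDR) memory, visible to both the host
# CPU and the accelerator.
in_buf = allocate(shape=(1024,), dtype=np.uint8)
out_buf = allocate(shape=(1024,), dtype=np.uint8)

in_buf[:] = 7               # host writes the input data
in_buf.sync_to_device()     # flush host caches into global memory

# ... the kernel would run here, reading in_buf and writing out_buf ...

out_buf.sync_from_device()  # read the results back into host memory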
Host Program Build Process
The main application is compiled and linked with the g++ compiler, using the following two-step process:

  1. Compile any required code into object files (.o).
  2. Link the object files (.o) with the XRT shared library to create the executable.

FPGA Binary Build Process
Kernels can be described in C/C++ or OpenCL C code, or can be created from packaged RTL designs. As shown in the figure above, each hardware kernel is independently compiled to a Xilinx object (.xo) file.
Xilinx object (.xo) files are linked with the hardware platform to create an FPGA binary file (.xclbin) that is loaded into the Xilinx device on the target platform.
The Vitis compiler build process generates the host program executable and the FPGA binary (.xclbin).

Part 1: Installing the Example

Six-Step Installation
Reference page:
DPU-PYNQ now available for ZU+ and RFSoC devices - Announcements – PYNQ
https://discuss.pynq.io/t/dpu-pynq-now-available-for-zu-and-rfsoc-devices/1251

  1. git clone --recursive --shallow-submodules https://github.com/Xilinx/DPU-PYNQ.git
  2. cd DPU-PYNQ/upgrade
  3. make
  4. pip3 install pynq-dpu
  5. cd $PYNQ_JUPYTER_NOTEBOOKS
  6. pynq get-notebooks pynq-dpu -p .

root@pynq:/home/xilinx# git clone --recursive --shallow-submodules https://github.com/Xilinx/DPU-PYNQ.git
root@pynq:/home/xilinx# cd DPU-PYNQ/upgrade
root@pynq:/home/xilinx/DPU-PYNQ/upgrade# make
root@pynq:/home/xilinx/DPU-PYNQ/upgrade# pip3 install pynq-dpu
root@pynq:/home/xilinx/DPU-PYNQ/upgrade# cd $PYNQ_JUPYTER_NOTEBOOKS
root@pynq:/home/xilinx/jupyter_notebooks# pynq get-notebooks pynq-dpu -p .
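After the six steps, a quick sanity check from Python on the board confirms the upgrade and the package install succeeded (a minimal sketch):

import pynq
import pynq_dpu

print(pynq.__version__)    # should report the upgraded PYNQ version
print(pynq_dpu.__file__)   # confirms where pip installed the package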

Expanding the SD Card
After writing the pynq-2.5.1 image to the TF card with Win32DiskImager, the root partition is only 6.05 GB.
Running the six steps above directly on that card fails with an out-of-space error; after I finished the upgrade, the command below showed the root partition at 8.3 GB.

You need to expand the TF card with gparted before running the six steps above.
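To confirm the expansion worked, the root partition size can also be checked from Python (a minimal sketch, equivalent to df -h /):

import shutil

# Report the root partition capacity after resizing with gparted.
total, used, free = shutil.disk_usage("/")
print("root partition: %.1f GB total, %.1f GB free"
      % (total / 1e9, free / 1e9))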
Network Connectivity
The upgrade process needs network access; a USB Ethernet adapter such as the one below is recommended.

Delete the original contents of /etc/network/interfaces and replace them with the following:
#Include files from /etc/network/interfaces.d:
#source-directory /etc/network/interfaces.d
allow-hotplug eth0
iface eth0 inet static
address 192.168.1.15
gateway 192.168.1.1
netmask 255.255.255.0
dns-nameservers 8.8.8.8
#network 192.168.1.0
#broadcast 192.168.1.255

Explanation:
auto: the interface should be configured during boot time.
iface: interface
inet: the interface uses TCP/IP networking.
That means interface eth0 should be configured during boot time, and interface eth0 uses the TCP/IP protocol.

Ping the router

Check that the IP address is correct

On the PC, MobaXterm gives a graphical view of the Ultra96's file system

Diagnosing Communication
The PC and the Ultra96 communicate over TCP and SSH.

Inspect the routing table
route -n
route del default
If it looks like the figure below, no default route is set.

Add a default route
route add default gw 192.168.1.1

Restart networking
sudo /etc/init.d/networking restart
systemctl restart networking

ssh

Configuring sshd_config

Check whether the ssh configuration restricts root: in /etc/ssh/sshd_config, change
PermitRootLogin without-password to PermitRootLogin yes.

Restart the ssh service
/etc/init.d/ssh restart
systemctl restart sshd

Check the ssh status
service ssh status
systemctl status ssh
ssh -V

resolv.conf
/etc/resolv.conf is only a symlink to another file.

Transferring files with scp
Once the SSH channel is established, it can be used to transfer files:
scp /home/music/1.mp3 root@www.runoob.com:/home/root/music
scp /home/music/1.mp3 root@www.runoob.com:/home/root/music/001.mp3
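The transfer can also be scripted from the PC side. A minimal sketch, assuming the static IP 192.168.1.15 configured in the interfaces file above and the notebook path used later in this guide:

import subprocess

# Push the model file to the board over the SSH channel.
subprocess.run(
    ["scp", "dpu_resnet50_0.elf",
     "root@192.168.1.15:/home/xilinx/jupyter_notebooks/pynq-dpu/"],
    check=True,
)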
Switching Mirror Sources
Browsing the Tsinghua pip mirror site, I was lucky to find a mirror of pynq-2.5.1.

Replace the foreign pip index with the Tsinghua mirror in /root/.pip/pip.conf:
[global]
timeout = 6000
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
trusted-host = pypi.tuna.tsinghua.edu.cn

Without switching the pip source, downloads are as slow as shown below.

After switching mirrors, the speed jumped from 4.6 kB/s to 17.8 MB/s.

For sources.list, switch from the foreign mirror to the Huawei mirror; Tsinghua has no arm64 source.

GitHub domain IPs
Map the GitHub domains to IP addresses in /etc/hosts; check the file before editing.

151.101.77.194 github.global.ssl.fastly.net
13.250.177.223 github.com

Without this IP mapping, downloads are astonishingly slow, as shown below.

Alternatively, download from GitHub on a PC over a VPN.
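To check that the hosts mapping is actually in effect, resolve the names and compare against the entries above (a minimal sketch):

import socket

# Should print the IP addresses written into /etc/hosts.
for host in ("github.com", "github.global.ssl.fastly.net"):
    print(host, "->", socket.gethostbyname(host))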

Running Jupyter
In a browser, open the static address 192.168.1.15 set in the interfaces file above; you may need to enter the password xilinx. You should then see:

Input Files and Library Files (pynq-dpu/dpu_resnet50_0.elf, overlay/dpu.hwh, dpu.bit, dpu.xclbin)
Input Files
Copy the overlay files (dpu.hwh, dpu.bit, and dpu.xclbin) into /usr/local/lib/python3.6/dist-packages/pynq_dpu/overlays/

Copy dpu_resnet50_0.elf into /home/xilinx/jupyter_notebooks/pynq-dpu/

Library Files
pynq-dpu/dpu.py
class DpuOverlay(pynq.Overlay):
    def download(self):
        """Download the overlay.
        This method overwrites the existing download() method defined in
        the overlay class. It will download the bitstream, set the AXI data
        width, and copy the xclbin and ML model files.
        """
    def copy_xclbin(self):
        """Copy the xclbin file to a specific location.
        This method will copy the xclbin file into the destination directory to
        make sure DNNDK libraries can work without problems.
        The xclbin file, if not set explicitly, is required to be located
        in the same folder as the bitstream and hwh files.
        The destination folder by default is /usr/lib.
        """
    def load_model(self, model_elf):
        """Load DPU models under a specific location."""
Part 2: Building the Hardware Files
Copying the host Folder
Initially, host contains only these five design files

The hardware flow must produce three files: dpu.hwh, dpu.bit, and dpu.xclbin

The Build Script build.sh (first download)
build.sh downloads PYNQ-derivative-overlays.git and Vitis-AI.git; avoid re-downloading them on later runs.

john@john-virtual-machine:~/ultra96$ xclbinutil -i ./root/root/home/xilinx/jupyter_notebooks/pynq-dpu/dpu.xclbin --info

#put under $CURDIR/PYNQ-derivative-overlays/vitis_platform/Ultra96/platforms

Downloading the Hardware Projects (first download)
Only the dpu folder from the download is used for now
git clone https://github.com/yunqu/PYNQ-derivative-overlays.git
cd PYNQ-derivative-overlays
git checkout -b temp tags/v2019.2
On later runs, comment out these two git lines

Vivado
(→ /dpu_overlay/*.hwh, *.bit, *.xsa; build.sh → PYNQ-derivative-overlays/dpu Makefile → dpu.tcl and build_bitstream.tcl)

The required outputs are the three files /dpu_overlay/*.hwh, *.bit, and *.xsa

Enter the hardware folder
cd dpu    # the Vivado folder
make

The Makefile

The Makefile invokes dpu.tcl and build_bitstream.tcl:
block_design:
	@sed -i "s/\(create_project \)\(.*\)\( -part \)\(.*\)/\1$(overlay_name) $(overlay_name)\3$(device)/" \
		$(overlay_name).tcl ; \
	sed -i 's/^set design_name \(.*\)/set design_name $(overlay_name)/g' \
		$(overlay_name).tcl ; \
	vivado -mode batch -source $(overlay_name).tcl -notrace    # create the block design

bitstream:
	vivado -mode batch -source build_bitstream.tcl -notrace    # build the bitstream

At this point the three hardware files .xsa, .bit, and .hwh exist, but the latter two are not yet the final versions.

Vitis Platform
(dpu/*.xsa → dpu/*.xpfm; build.sh → PYNQ-derivative-overlays/vitis_platform/Makefile → xsct build_pfm.tcl)
Required input: dpu/*.xsa; output: dpu/*.xpfm

cd $CURDIR/PYNQ-derivative-overlays/vitis_platform    # the Vitis folder
The following command generates the .xpfm hardware platform file; variables passed on the make command line take the highest priority (see makefile variable precedence: https://blog.csdn.net/test1280/article/details/81266207)
make XSA_PATH=../dpu/dpu.xsa BOARD=Ultra96
Key line in the Makefile:
xsct -sdx build_pfm.tcl $(XSA_PATH) $(OVERLAY) $(BOARD) $(PROC)
build_pfm.tcl
This enters the xsct command line and executes the platform and domain commands in build_pfm.tcl.

Vitis DPU
(*.xpfm, DPU-TRD/prj/Vitis/dpu_conf.vh, prj_config → *.xclbin; build.sh → DPU-TRD/prj/Vitis/Makefile → vivado scripts/gen_dpu_xo.tcl → v++ *.xclbin; first download)
Input: *.xpfm; output: *.xclbin. The configuration files /DPU-TRD/prj/Vitis/dpu_conf.vh and prj_config must be defined.
On later runs, comment out the two git lines.
The Makefile sources scripts/gen_dpu_xo.tcl: Vivado produces dpu.xo, and v++ produces the .xclbin.
Switch the working directory to /host/Vitis-AI/DPU-TRD/prj/Vitis and copy in the two files dpu_conf.vh and prj_config.

DPU Configuration

Connectivity Files

cd $CURDIR
git clone https://github.com/Xilinx/Vitis-AI.git
cd Vitis-AI
git checkout -b temp tags/v1.0
cd DPU-TRD/prj/Vitis
rm -rf dpu_conf.vh
rm -rf config_file/prj_config
cp -rf $CURDIR/dpu_conf.vh .
cp -rf $CURDIR/prj_config config_file
export SDX_PLATFORM=$CURDIR/PYNQ-derivative-overlays/vitis_platform/Ultra96/platforms/dpu/dpu.xpfm
make KERNEL=DPU DEVICE=Ultra96

Location in the Makefile, inside the PC's virtual machine:

The .hwh in the last line of the figure above describes all the hardware connections (handshake signals).

dpu_HDLSRCS=kernel_xml/dpu/kernel.xml \
	scripts/package_dpu_kernel.tcl \
	scripts/gen_dpu_xo.tcl \
	./dpu_conf.vh \
	../../dpu_ip/Vitis/dpu/hdl/dpu_xrt_top.v \
	../../dpu_ip/Vitis/dpu/inc/arch_def.vh \
	../../dpu_ip/Vitis/dpu/xdc/*.xdc \
	../../dpu_ip/dpu_eu_*/hdl/dpu_eu_*_dpu.sv \
	../../dpu_ip/dpu_eu_*/inc/function.vh \
	../../dpu_ip/dpu_eu_*/inc/arch_para.vh

dpu_TCL=scripts/gen_dpu_xo.tcl
binary_container_1/dpu.xo: $(dpu_HDLSRCS)
$(VIVADO) -mode batch -source $(dpu_TCL) -tclargs $@ $(DPU_KERN_NAME) ${TARGET} ${DEVICE}

binary_container_1/dpu.xclbin: $(kernel_xo)
	v++ $(XOCC_OPTS) -l --temp_dir binary_container_1 --log_dir binary_container_1/logs --remote_ip_cache binary_container_1/ip_cache -o "$@" $(+)

Contents of scripts/gen_dpu_xo.tcl:

Part 3: Installing Docker
Downloading Vitis-AI
Clone the Vitis-AI repository to obtain the examples, reference code, and scripts.
• git clone https://github.com/Xilinx/Vitis-AI
• cd Vitis-AI

Installing Docker
Add the Docker repository to your Ubuntu host:
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable"    # adds the Docker PPA to sources.list
After this command, the entry is added to sources.list

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -    # -sS: no progress output, errors only; -L: follow redirects; downloads the Docker key
sudo apt-get update && sudo apt install docker-ce docker-ce-cli containerd.io    # install Docker
Installing the NVIDIA Docker Runtime
Instructions from https://nvidia.github.io/nvidia-container-runtime/
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list |
sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
After this command, nvidia-container-runtime.list is downloaded

sudo apt-get update
sudo apt-get install nvidia-container-runtime
Then edit the Docker config to limit/allow users in the docker group to run Docker containers
sudo systemctl edit docker
with the following content:
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --group docker -H unix:///var/run/docker.sock --add-runtime=nvidia=/usr/bin/nvidia-container-runtime
Then reload the systemd daemon and restart Docker
sudo systemctl daemon-reload
sudo systemctl restart docker
Manage Docker as a non-root user
Create the docker group.
$ sudo groupadd docker
Add your user to the docker group.
$ sudo usermod -aG docker $USER
Activate the changes to groups:
$ newgrp docker
Verify that you can run docker commands without sudo.
$ docker run hello-world
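The same hello-world check can be driven from Python with the Docker SDK (a minimal sketch; assumes `pip install docker`, which is not part of this guide):

import docker

# Connect to the local Docker daemon as the current (non-root) user.
client = docker.from_env()
output = client.containers.run("hello-world")  # pulls and runs the image
print(output.decode())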
Load & Run the Docker Container
CPU tools docker
./docker_run.sh xilinx/vitis-ai:tools-<x.y.z>-cpu

Runtime docker
./docker_run.sh xilinx/vitis-ai:runtime-1.0.0-cpu

Docker File Operations
For Vitis-AI, files placed in the host folder are mounted automatically at startup
List the images
docker images

Start Docker
sudo service docker start
systemctl start docker

Stop Docker
sudo service docker stop
systemctl stop docker

Restart Docker
sudo service docker restart

docker run -t -i ubuntu /bin/bash

• docker run: runs a container.
• ubuntu: is the image you would like to run.
• -t: flag assigns a pseudo-tty or terminal inside the new container.
• -i: flag allows you to make an interactive connection by grabbing the standard in (STDIN) of the container.
• /bin/bash: launches a Bash shell inside our container.
This puts you inside the container:

docker ps lists the currently running containers; docker ps -a lists all containers, including those that have exited

Get the container ID
docker inspect -f '{{.ID}}' store-dev

docker cp copies data between a container and the host.

When copying files into Docker, note that the container name changes on every run; use the current name, for example:

docker cp /home/john/ultra96/host nifty_robinson:/workspace/
docker cp /home/john/ultra96/host affectionate_brown:/workspace/

Mounting Folders
Mount the local folder aaa onto folder bbb inside the container (the local path is on the left of the colon).
sudo docker run -itv /home/user/aaa:/home/bbb ubuntu:14.04 /bin/bash

The directory layout after starting Docker:

Part 4: Building the Model Kernel
(dpu_overlay/dpu.hwh, Ultra96.json, quantized/deploy.prototxt, deploy.caffemodel → resnet50; docker → compile.sh → vai_c_caffe; first download of the AI model)
The Compile Script compile.sh
Copy the host folder into Ubuntu. For compilation alone, only two design files are needed: dpu_overlay/dpu.hwh and Ultra96.json. dlet generates the .dcf file, the quantized deploy model is downloaded from the AI Model Zoo, and the output is the DPU kernel.

The .dcf and .json Files
/opt/vitis_ai/utility/dlet -f dpu_overlay/dpu.hwh    # generate the *.dcf file from dpu.hwh

sudo mkdir -p /opt/vitis_ai/compiler/arch/dpuv2/Ultra96
sudo cp *.dcf /opt/vitis_ai/compiler/arch/dpuv2/Ultra96/Ultra96.dcf
sudo cp -f Ultra96.json /opt/vitis_ai/compiler/arch/dpuv2/Ultra96/Ultra96.json
{
    "target": "dpuv2",
    "dcf": "/opt/vitis_ai/compiler/arch/dpuv2/Ultra96/Ultra96.dcf",
    "cpu_arch": "arm64"
}
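The same arch file can also be generated programmatically (a minimal sketch; the contents match the JSON above):

import json

arch = {
    "target": "dpuv2",
    "dcf": "/opt/vitis_ai/compiler/arch/dpuv2/Ultra96/Ultra96.dcf",
    "cpu_arch": "arm64",
}
with open("Ultra96.json", "w") as f:
    json.dump(arch, f, indent=4)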

Downloading the AI Model
wget -O resnet50.zip \
    https://www.xilinx.com/bin/public/openDownload?filename=cf_resnet50_imagenet_224_224_1.1.zip
unzip resnet50.zip

Compiling with vai_c_caffe
Use the Model Zoo's quantized model directly:
vai_c_caffe \
    --prototxt cf_resnet50_imagenet_224_224_7.7G/quantized/deploy.prototxt \
    --caffemodel cf_resnet50_imagenet_224_224_7.7G/quantized/deploy.caffemodel \
    --arch /opt/vitis_ai/compiler/arch/dpuv2/Ultra96/Ultra96.json \
    --output_dir . \
    --net_name resnet50

john@john-virtual-machine:/workspace/host$ ./compile.sh

Running
The run results are as follows:

Summary
Files required by PYNQ
At run time, Python needs five input files: dpu_resnet50_0.elf, libdpumodelresnet50.so, dpu.hwh, dpu.bit, and dpu.xclbin. The .so file must live in /usr/lib; the other files go in the program's own directory or in /usr/local/lib/python3.6/dist-packages/pynq_dpu/overlays.

Command to compile the elf file into a so file:
model=tf_resnetv1
aarch64-linux-gnu-gcc -fPIC -shared dpu_${model}_0.elf -o libdpumodel${model}.so

Note the naming convention for the elf and so files.
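That rule can be captured in a small helper (my own hypothetical sketch, not part of pynq_dpu): dpu_<model>_0.elf maps to libdpumodel<model>.so.

import re

def so_name_for_elf(elf_name):
    # Hypothetical helper: dpu_<model>_<n>.elf -> libdpumodel<model>.so
    m = re.match(r"dpu_(.+)_\d+\.elf$", elf_name)
    if m is None:
        raise ValueError("unexpected elf name: %s" % elf_name)
    return "libdpumodel%s.so" % m.group(1)

assert so_name_for_elf("dpu_resnet50_0.elf") == "libdpumodelresnet50.so"
assert so_name_for_elf("dpu_tf_resnetv1_0.elf") == "libdpumodeltf_resnetv1.so"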

User program
overlay = DpuOverlay("dpu.bit")
overlay.load_model("dpu_mnist_classifier_0.elf")
This call checks whether the elf file is in the current directory or the overlay directory.
KERNEL_NAME = "mnist_classifier_0"
n2cube.dpuOpen()
kernel = n2cube.dpuLoadKernel(KERNEL_NAME)

DpuOverlay: /usr/local/lib/python3.6/dist-packages/pynq_dpu/dpu.py
MODULE_PATH = os.path.dirname(os.path.realpath(__file__))
This is the absolute path of the dpu.py script itself.

OVERLAY_PATH = os.path.join(MODULE_PATH, 'overlays')
XCL_DST_PATH = "/usr/lib"

class DpuOverlay(pynq.Overlay):
“”"DPU overlay class.

This class inherits from the PYNQ overlay class. The initialization method
is similar except that we have additional bit file search path.

"""
def __init__(self, bitfile_name, dtbo=None,
             download=True, ignore_version=False, device=None):
    """Initialization method.

    Check PYNQ overlay class for more information on parameters.

    By default, the bit file will be searched in the following paths:
    (1) the `overlays` folder inside this module; (2) an absolute path;
    (3) the relative path of the current working directory.

    By default, this class will set the runtime to be `dnndk`.

    """

The following searches the two directories for the bit file:
    if os.path.isfile(bitfile_name):
        abs_bitfile_name = bitfile_name
    elif os.path.isfile(os.path.join(OVERLAY_PATH, bitfile_name)):
        abs_bitfile_name = os.path.join(OVERLAY_PATH, bitfile_name)
    else:
        raise FileNotFoundError('Cannot find {}.'.format(bitfile_name))
    super().__init__(abs_bitfile_name,
                     dtbo=dtbo,
                     download=download,
                     ignore_version=ignore_version,
                     device=device)
    self.overlay_dirname = os.path.dirname(self.bitfile_name)
    self.overlay_basename = os.path.basename(self.bitfile_name)
    self.runtime = 'dnndk'
    self.runner = None

def load_model(self, model):
    """Load DPU models for both DNNDK runtime and VART.

    For DNNDK, this method will compile the ML model `*.elf` binary file
    into a `*.so` file located in the destination directory
    on the target. This will make sure DNNDK libraries can work
    without problems.

    The ML model file, if not set explicitly, is required to be located
    in the same folder as the bitstream and hwh files.

    The destination folder by default is `/usr/lib`.

    Currently only `*.elf` files are supported as models. The reason is
    that `*.so` usually have to be recompiled targeting a specific
    rootfs.

    For VART, this method will automatically generate the `meta.json` file
    in the same folder as the model file.

    Parameters
    ----------
    model : str
        The name of the ML model binary. Can be absolute or relative path.

    """
    if os.path.isfile(model):
        abs_model = model

print(f"os.path.abspath {os.path.abspath(model)}")

The following checks whether the elf file is in one of the two directories:
    elif os.path.isfile(self.overlay_dirname + "/" + model):
        abs_model = self.overlay_dirname + "/" + model
    else:
        raise ValueError(
            "File {} does not exist.".format(model))
    if not os.path.isdir(XCL_DST_PATH):
        raise ValueError(
            "Folder {} does not exist.".format(XCL_DST_PATH))

    if not model.endswith(".elf"):
        raise RuntimeError("Currently only elf files can be loaded.")
    else:
        if self.runtime == 'dnndk':
            kernel_name = get_kernel_name_for_dnndk(abs_model)
            model_so = "libdpumodel{}.so".format(kernel_name)
            _ = subprocess.check_output(
                ["gcc", "-fPIC", "-shared", abs_model, "-o",
                 os.path.join(XCL_DST_PATH, model_so)])

This code simply loads the kernel through a pointer that has already been resolved; even if the elf file is deleted after the download, everything still works, because the firmware has already been loaded into the DPU.
/usr/local/lib/python3.6/dist-packages/dnndk/n2cube.py
def dpuLoadKernel(kernelName):
    """
    Load a DPU Kernel and allocate DPU memory space for
    its Code/Weight/Bias segments.
    kernelName: The pointer to the neural network name. Use the names
        produced by the Deep Neural Network Compiler (DNNC) after compiling
        the network. A DL application may contain many DPU kernels in its
        hybrid CPU+DPU binary executable; each DPU kernel has a unique name
        for differentiation purposes.
    Returns: The loaded DPU Kernel on success, or an error on failure.
    """
    return pyc_libn2cube.pyc_dpuLoadKernel(c_char_p(kernelName.encode("utf-8")))
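For reference, the DNNDK calls around dpuLoadKernel look like this (a minimal sketch; the kernel name resnet50_0 is my assumption, following the same pattern as mnist_classifier_0 above):

from dnndk import n2cube

n2cube.dpuOpen()
kernel = n2cube.dpuLoadKernel("resnet50_0")
task = n2cube.dpuCreateTask(kernel, 0)

# ... set the input tensor, n2cube.dpuRunTask(task), read the output ...

n2cube.dpuDestroyTask(task)
n2cube.dpuDestroyKernel(kernel)
n2cube.dpuClose()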

The hardware xsa file (this step is not needed)
Vivado outputs the three files dpu_overlay/*.hwh, *.bit, and *.xsa.
build.sh → PYNQ-derivative-overlays/dpu/Makefile → the IP script dpu.tcl and the bitstream script build_bitstream.tcl

The hardware xpfm platform (this step is not needed)
Input: dpu/*.xsa; output: vitis_platform/Ultra96/platforms/dpu/*.xpfm.
The Makefile uses the Vitis command-line tool xsct to run build_pfm.tcl and generate the .xpfm platform.
build.sh → PYNQ-derivative-overlays/vitis_platform/Makefile → xsct build_pfm.tcl

The hardware .xclbin file
This step can be regenerated from your own DPU configuration files dpu_conf.vh and prj_config.
cd DPU-TRD/prj/Vitis
Inputs (host top-level directory):
dpu_conf.vh, prj_config
PYNQ-derivative-overlays/vitis_platform/Ultra96/platforms/dpu/dpu.xpfm
Outputs (dpu_overlay directory):
*.xclbin, *.bit, and *.hwh.

build.sh copies dpu_conf.vh and prj_config from the host top-level directory into DPU-TRD/prj/Vitis/ and its config_file folder, respectively.
export SDX_PLATFORM=$CURDIR/PYNQ-derivative-overlays/vitis_platform/Ultra96/platforms/dpu/dpu.xpfm
Input: vitis_platform/Ultra96/platforms/dpu/*.xpfm.
The Makefile calls Vivado with scripts/gen_dpu_xo.tcl to generate the .xo file, then calls v++ to generate the xclbin file.
build.sh → DPU-TRD/prj/Vitis/Makefile → vivado scripts/gen_dpu_xo.tcl → v++ *.xclbin

build.sh then copies the generated *.xclbin under binary_container_1, and the *.bit and *.hwh under binary_container_1/link/vivado/vpl/prj, into the dpu_overlay directory.
The Model Kernel file
You can download a suitable AI model yourself.
Inputs: dpu_overlay/dpu.hwh, Ultra96.json, quantized/deploy.prototxt, deploy.caffemodel; output: resnet50.
Start Docker and run compile.sh, which invokes vai_c_caffe to generate the resnet50 executable.
dpu_overlay/dpu.hwh, Ultra96.json, quantized/deploy.prototxt, deploy.caffemodel → resnet50; docker → compile.sh → vai_c_caffe


Possible faults:
1. Overheating makes the program fail frequently or keep returning to the home page.
2. If the dataset is corrupted, delete it and download it again.

65 n02422106
confidence: 0.75 category index: 65 label: 'n01751748 sea snake'

ILSVRC 2012 CLS-LOC Annotations
CLS-LOC refers to classification and localization, respectively.

The classification ground truth of the validation images is in
data/ILSVRC2014_clsloc_validation_ground_truth.txt,
where each line contains one ILSVRC2014_ID for one image, in the
ascending alphabetical order of the image file names.
The localization ground truth for the validation images can be downloaded
in xml format.

http://image-net.org/download-images

Tcl study and usage: the original version and the current one

Mapping from caffe labels to the original ImageNet labels:
If the caffemodel's top-1 index is idx (e.g. 65), adding 1 gives 66, which corresponds to line 66 of synsets.txt, i.e. n01751748. Looking up n01751748 in ILSVRC2012_mapping.txt gives mapping id 490, so the original ImageNet label is 490 (1-based) or 489 (0-based).
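A worked version of this mapping (my own sketch; assumes synsets.txt has one "wnid description" per line and ILSVRC2012_mapping.txt has one "id wnid" per line):

idx = 65                             # top-1 index from the caffemodel

with open("synsets.txt") as f:
    synsets = [line.split()[0] for line in f]
wnid = synsets[idx]                  # 0-based index 65 = line 66 -> n01751748

with open("ILSVRC2012_mapping.txt") as f:
    wnid_to_id = {w: i for i, w in (line.split()[:2] for line in f)}

print(wnid, wnid_to_id[wnid])        # n01751748 490 (1-based original label)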
Fetch an image using the OpenCV function imread() and set it as the input to the DPU kernel resnet50 by calling dpuSetInputImage2() for a Caffe model. For a TensorFlow model, users should implement the pre-processing themselves (instead of directly using dpuSetInputImage2()) to feed the input image into the DPU.

When the Loadwords routine is commented out, compilation fails with a "file too short" error.

  • Top-1 accuracy baseline: must be > 68.5; accuracy below 68.5 is disqualified.
  • The winner is decided by latency and accuracy:
  • score = ((top-1 accuracy % - 68.5) / 15) * 0.4 + (10 / latency_ms) * 0.6

The following only reduces the displayed output; it does not help at all and actually lengthens the run time:

[DNNDK_XRT] Alloc BO Failed, size: 0xc1000
[DNNDK] Fail to alloc memory for Task inception_v1_0-1196 of DPU Kernel inception_v1_0: Size: 790272

Resnet50

Back up the whole TF card to a compressed image:
sudo dd if=/dev/sdb status=progress | gzip > /home/john/ultra96image.gz
