[CANN Training Camp, Season 3] Getting Started with AI Development

irrationality

Last modified 2023-01-03 23:04:37

Categories: Ascend, Machine Learning
Tags: artificial intelligence, computer vision, object detection

First published 2022-12-27 08:56:42

Article link: https://blog.csdn.net/weixin_54227557/article/details/127812889

1. Purchase a cloud server

2. Quick start

Reference:
https://gitee.com/ascend/samples/tree/master/cplusplus/level2_simple_inference/2_object_detection/YOLOV3_coco_detection_picture

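# For convenience, the download and conversion commands for the original model are given here and can be copied and run directly. You can also download the model from the ModelZoo as described in the table above and convert it manually to learn more of the details.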
 
cd $HOME/samples/cplusplus/level2_simple_inference/2_object_detection/YOLOV3_coco_detection_picture/model     
wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/003_Atc_Models/AE/ATC%20Model/Yolov3/yolov3.caffemodel
wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/003_Atc_Models/AE/ATC%20Model/Yolov3/yolov3.prototxt
wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/models/YOLOV3_coco_detection_picture/aipp_nv12.cfg
atc --model=yolov3.prototxt --weight=yolov3.caffemodel --framework=0 --output=yolov3 --soc_version=Ascend310 --insert_op_conf=aipp_nv12.cfg

cd $HOME/samples/cplusplus/level2_simple_inference/2_object_detection/YOLOV3_coco_detection_picture/scripts    
bash sample_build.sh
bash sample_run.sh

3. Image classification

Reference: https://gitee.com/ascend/samples/tree/master/cplusplus/level2_simple_inference/1_classification/resnet50_imagenet_classification

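4. Final assessment

Question 1

1.

Objective:

Have trainees build and run an application themselves, and understand how it works from the reference implementation and its documentation.

Scenario:

Download the source code of the sample application that performs image classification (inference only) with the Caffe ResNet-50 network, then follow the Readme to build and run it and experience the basic inference flow.

Scoring:

Total: 30 points

Convert the model with the atc tool; provide the conversion command and a screenshot of the successful conversion. (10 points)

Using the converted model, rebuild and run the sample application (image classification, inference only, with the Caffe ResNet-50 network); submit a screenshot of the successful build and run. (10 points)

Summarize the problems encountered during the exercise and how they were solved, and submit the summary. (5 points)

Optimize the sample application, for example by improving the code logic or improving and adding comments; submit the optimized source, an explanation of the approach, and the locations changed. (5 points)

Answer:

Convert the model with the atc tool; provide the conversion command and a screenshot of the successful conversion. (10 points)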

cd ~/samples/cplusplus/level2_simple_inference/1_classification/resnet50_imagenet_classification
mkdir caffe_model
cd caffe_model
wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/003_Atc_Models/AE/ATC%20Model/resnet50/resnet50.prototxt
wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/003_Atc_Models/AE/ATC%20Model/resnet50/resnet50.caffemodel
cd ..
atc --model=caffe_model/resnet50.prototxt --weight=caffe_model/resnet50.caffemodel --framework=0 --output=model/resnet50 --soc_version=Ascend310 --input_format=NCHW --input_fp16_nodes=data --output_type=FP32 --out_nodes=prob:0
cd data
wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/models/aclsample/dog1_1024_683.jpg
wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/models/aclsample/dog2_1024_683.jpg
python3 ../script/transferPic.py

Screenshot of the successful model conversion:
[screenshot]

Running the Python script requires the corresponding packages to be installed.

 which pip
     /home/HwHiAiUser/.local/bin/pip
 sudo ln -s /home/HwHiAiUser/.local/bin/pip /usr/local/python3.7.5/bin/pip 

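This way, packages installed with pip go into Python 3.7.5.

Using the converted model, rebuild and run the sample application (image classification, inference only, with the Caffe ResNet-50 network); submit a screenshot of the successful build and run. (10 points)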

export DDK_PATH=$HOME/Ascend/ascend-toolkit/latest
export NPU_HOST_LIB=$DDK_PATH/runtime/lib64/stub
cd ~/samples/cplusplus/level2_simple_inference/1_classification/resnet50_imagenet_classification
mkdir -p build/intermediates/host
cd build/intermediates/host
cmake ../../../src -DCMAKE_CXX_COMPILER=g++ -DCMAKE_SKIP_RPATH=TRUE
make

I ran into a small error:
fatal error: AclLiteApp.h: No such file or directory
#include "AclLiteApp.h"
^~~~~~~~~~~~~~
compilation terminated.
CMakeFiles/main.dir/build.make:110: recipe for target ‘CMakeFiles/main.dir/sample_process.cpp.o’ failed
make[2]: *** [CMakeFiles/main.dir/sample_process.cpp.o] Error 1
CMakeFiles/Makefile2:67: recipe for target ‘CMakeFiles/main.dir/all’ failed
make[1]: *** [CMakeFiles/main.dir/all] Error 2
Makefile:129: recipe for target ‘all’ failed
make: *** [all] Error 2

find / -name AclLiteApp.h 2> /dev/null

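It turned out that the environment variables were not set correctly and had to be reconfigured. In other words, the first two lines of the command block above (DDK_PATH and NPU_HOST_LIB) must be adjusted to match the actual installation.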
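The check described above can be scripted. The following is a minimal sketch in Python, not part of the sample: `find_header` and `check_env` are hypothetical helper names, and the only facts taken from this post are the header name AclLiteApp.h and the two variables DDK_PATH and NPU_HOST_LIB. It reports whether each variable points at an existing directory and can search a directory tree for the missing header, much like the `find` command.

```python
import os


def find_header(root, header_name):
    """Walk `root` and return every directory that contains `header_name`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if header_name in filenames:
            hits.append(dirpath)
    return hits


def check_env(names=("DDK_PATH", "NPU_HOST_LIB")):
    """Map each variable name to 'unset', 'missing dir', or 'ok'."""
    report = {}
    for name in names:
        value = os.environ.get(name)
        if not value:
            report[name] = "unset"
        elif not os.path.isdir(value):
            report[name] = "missing dir"
        else:
            report[name] = "ok"
    return report


# Show the state of the two variables used by the build commands.
print(check_env())
# To locate the header, search a tree of your choice, for example:
#   find_header(os.path.expanduser("~"), "AclLiteApp.h")
```

If a variable is reported as "unset" or "missing dir", fix the export lines before running cmake again.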

cd $HOME/samples/cplusplus/level2_simple_inference/1_classification/resnet50_imagenet_classification/out
chmod +x main
./main

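The application then ran successfully.

Summarize the problems encountered during the exercise and how they were solved, and submit the summary. (5 points)

The summary is given above.

Optimize the sample application, for example by improving the code logic or improving and adding comments; submit the optimized source, an explanation of the approach, and the locations changed. (5 points)

The optimization downloads the model automatically in one step (model initialization),
then downloads a photo and classifies it (using an external link).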
mkdir data0 && cd data0

It also shows each class index as the word obtained by looking the index up in the label table.
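The index-to-word lookup can be sketched as follows. This is an illustration in Python rather than the C++ of the sample, and the label list, scores, and function names are all hypothetical; the real application would use the 1000-entry ImageNet label table.

```python
def top_k(scores, k):
    """Return (index, score) pairs for the k highest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def index_to_label(index, labels):
    """Look up the word for a class index; fall back to a marker if out of range."""
    if 0 <= index < len(labels):
        return labels[index]
    return "unknown(%d)" % index


# Hypothetical 4-class label table and scores, for illustration only.
labels = ["cat", "dog", "car", "tree"]
scores = [0.05, 0.80, 0.10, 0.05]

for idx, score in top_k(scores, 2):
    print(idx, index_to_label(idx, labels), score)
```

Keeping the lookup in its own function makes it easy to handle an index that falls outside the table instead of crashing.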

Here I wanted to list all the files in a folder. My first attempt used <io.h>, which failed to compile:
a.cpp:5:10: fatal error: io.h: No such file or directory
5 | #include <io.h>
| ^~~~~~
compilation terminated.
On Ubuntu, I wrote the following code to get the names of the files in a folder.

// Headers
#include <cstring>
#include <iostream>
#include <string>
#include <vector>
#include <sys/types.h>
#include <dirent.h>

using namespace std;

// Append the path of every entry in `path` (except "." and "..") to `filenames`.
void GetFileNames(const string& path, vector<string>& filenames)
{
    DIR *pDir = opendir(path.c_str());
    if (pDir == nullptr)
        return;  // the directory could not be opened
    struct dirent* ptr;
    while ((ptr = readdir(pDir)) != nullptr) {
        if (strcmp(ptr->d_name, ".") != 0 && strcmp(ptr->d_name, "..") != 0)
            filenames.push_back(path + "/" + ptr->d_name);
    }
    closedir(pDir);
}

int main() {
    vector<string> file_name;
    string path = "/usr";

    GetFileNames(path, file_name);

    for (size_t i = 0; i < file_name.size(); i++)
    {
        cout << file_name[i] << endl;
    }

    return 0;
}

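Combining this with the sample gives the following.
The part shown here needs to be changed so that the files under data are obtained dynamically.
[screenshot]
The script I wrote is as follows: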

if [ ! -d "data" ];then
    mkdir data
else
    rm -rf data
    mkdir data
    # echo "data folder already exists"
fi

cd data

read -p "input a pic:" pic
wget "${pic}"
python3 ../script/transferPic.py
cd ..

if [ ! -d "caffe_model" ];then
    mkdir caffe_model
else
    echo "caffe_model folder already exists"
fi

cd caffe_model
FILE1=resnet50.prototxt

if test -f "$FILE1"; then
    echo "$FILE1 exist"
else
    wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/003_Atc_Models/AE/ATC%20Model/resnet50/resnet50.prototxt
fi

FILE2=resnet50.caffemodel
if test -f "$FILE2"; then
    echo "$FILE2 exist"
else
    wget https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/003_Atc_Models/AE/ATC%20Model/resnet50/resnet50.caffemodel
fi

echo "Model initialization complete"

cd ..

export DDK_PATH=$HOME/Ascend/ascend-toolkit/latest
export NPU_HOST_LIB=$DDK_PATH/runtime/lib64/stub
cd ~/samples/cplusplus/level2_simple_inference/1_classification/resnet50_imagenet_classification
if [ ! -d "build" ];then
    mkdir -p build/intermediates/host
else
    rm -rf build
    mkdir -p build/intermediates/host
fi

cd build/intermediates/host
cmake ../../../src -DCMAKE_CXX_COMPILER=g++ -DCMAKE_SKIP_RPATH=TRUE
make

cd $HOME/samples/cplusplus/level2_simple_inference/1_classification/resnet50_imagenet_classification/out
chmod +x main
./main

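In the end a few small problems came up, which was quite frustrating.
I eventually traced them to transferPic.py and got the customization working:
Customization: classification can be run on an image given either by a path or by a URL.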
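One way to accept either kind of input is to decide up front whether the string is a URL or a local path. The sketch below is an assumption about how such a check could look, not the code in the repository; `classify_source` and `local_name` are hypothetical names, and only Python standard-library calls are used.

```python
import os
from urllib.parse import urlparse


def classify_source(source):
    """Return 'url' for http(s) links, 'path' for existing local files, else 'invalid'."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if os.path.isfile(source):
        return "path"
    return "invalid"


def local_name(source):
    """File name to save a downloaded image under, taken from the URL path."""
    name = os.path.basename(urlparse(source).path)
    return name or "input.jpg"  # fallback name is an arbitrary choice


url = "https://obs-9be7.obs.cn-east-2.myhuaweicloud.com/models/aclsample/dog1_1024_683.jpg"
print(classify_source(url), local_name(url))
```

A URL would then be downloaded (for example with wget, as in the script) and saved under the name returned by `local_name`, while a local path can be passed to the conversion step directly.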
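Run:
bash test.sh # a script I wrote
For example, copy the link of the image you want to classify from a web page,

paste it at the script's prompt, and press Enter.
Open-source repository: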
https://gitee.com/qmckw/AI-classification-ascend-AUTO-get-URL

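Question 2

Switch to ResNet-101

Objective:

Building on the sample from the previous question, implement the following customization on top of the sample source, adapt the model, and run inference successfully.
Scenario:

Starting from the Caffe ResNet-50 image classification (inference only) sample, switch to the ResNet-101 classification model. When switching to a model of the same kind, the inputs and outputs are similar, so the source code can largely be reused and only the model needs to be replaced.
The customization points are:
1. Model conversion:

  Download the ResNet-101 model (resnet101_tf.pb), place it in the caffe_model directory of the resnet50_imagenet_classification sample (the directory that holds the original model), and run the following commands to convert it. The resulting resnet101_tf.om is written to the model directory given by the output parameter:

Convert the model with the atc tool; provide the conversion command and a screenshot of the successful conversion. (5 points)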

cd samples/cplusplus/level2_simple_inference/1_classification/resnet50_imagenet_classification/
wget "https://ascend-repo-modelzoo.obs.cn-east-2.myhuaweicloud.com/model/ATC%20Resnet101(FP16)%20from%20TensorFlow%20-%20Ascend310/zh/1.1/ATC%20Resnet101(FP16)%20from%20TensorFlow%20-%20Ascend310.zip"

unzip 'ATC Resnet101(FP16) from TensorFlow - Ascend310.zip' -d resnet101-ascend310

mkdir caffe_model
cp resnet101-ascend310/Resnet101_for_TensorFlow/resnet101_tf.pb ./caffe_model/resnet101_tf.pb

 atc --model=caffe_model/resnet101_tf.pb --framework=3 --output=model/resnet101_tf --output_type=FP32 --soc_version=Ascend310 --input_shape="input:1,224,224,3" --log=info

The model is summarized below.

Input data

Input	Size	Data type	Data layout
input	224 x 224	RGB_FP32	NHWC

Output data

Output	Size (batch_size x number of classes)	Data type	Data layout
resnet_v1_101/predictions/Softmax	batch_size x 1000	FLOAT32	ND

The data format is generally either "NHWC" or "NCHW", with the former as the default. "NHWC" means [batch, height, width, channels]; "NCHW" means [batch, channels, height, width].
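To make the two layouts concrete, here is a small pure-Python sketch that reorders a flat NHWC buffer into NCHW. The function names and the tiny 1 x 2 x 2 x 3 example are made up for illustration; real code would normally use a library transpose instead of explicit loops.

```python
def nhwc_index(n, h, w, c, H, W, C):
    """Flat offset of element (n, h, w, c) in an NHWC buffer."""
    return ((n * H + h) * W + w) * C + c


def nchw_index(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) in an NCHW buffer."""
    return ((n * C + c) * H + h) * W + w


def nhwc_to_nchw(flat, N, H, W, C):
    """Return a new flat list holding the same elements in NCHW order."""
    out = [None] * len(flat)
    for n in range(N):
        for h in range(H):
            for w in range(W):
                for c in range(C):
                    src = nhwc_index(n, h, w, c, H, W, C)
                    dst = nchw_index(n, c, h, w, C, H, W)
                    out[dst] = flat[src]
    return out


# One 2 x 2 RGB image: in NHWC the three channels of each pixel sit together.
nhwc = ["R00", "G00", "B00", "R01", "G01", "B01",
        "R10", "G10", "B10", "R11", "G11", "B11"]
nchw = nhwc_to_nchw(nhwc, 1, 2, 2, 3)
print(nchw)  # all R values first, then all G, then all B
```

In NHWC the colour values of one pixel are adjacent in memory, whereas in NCHW each channel forms a contiguous plane, which is why the --input_shape given to atc for this model lists the channel count last.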

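Generate the test data

Go to the script directory of the resnet50_imagenet_classification sample and, in transferPic.py, change float16 to float32 in the following line:

img = img.astype("float16")

After the change:

img = img.astype("float32")

Switch to the data directory under the resnet50_imagenet_classification sample directory and run transferPic.py to convert the *.jpg files to *.bin, scaling the images from 1024 x 683 to 224 x 224 (this requires editing the .py file). Two *.bin test files are generated in that data directory.

python3 ../script/transferPic.py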
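A quick way to confirm that the float16 to float32 change took effect is to compare the size of the generated files with the expected value. The sketch below assumes that each .bin file is a raw 1 x 224 x 224 x 3 tensor with no header, which is an assumption about transferPic.py rather than something shown in this post; `bin_size_bytes` is a hypothetical helper.

```python
from array import array


def bin_size_bytes(height, width, channels, bytes_per_value):
    """Size in bytes of a raw image tensor with batch size 1."""
    return height * width * channels * bytes_per_value


fp16_size = bin_size_bytes(224, 224, 3, 2)  # float16: 2 bytes per value
fp32_size = bin_size_bytes(224, 224, 3, 4)  # float32: 4 bytes per value

# Cross-check against an actual float32 buffer with the same element count.
buf = array("f", [0.0]) * (224 * 224 * 3)
assert len(buf.tobytes()) == fp32_size

print(fp16_size, fp32_size)
```

Under that assumption, a file half the expected float32 size would suggest the script is still writing float16.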

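Call an AscendCL interface (for example aclmdlLoadFromFileWithMem) to load the ResNet-101 model:

Customize the code in src/sample_process.cpp.

Only the model file name needs to change.

Following the readme of the Caffe ResNet-50 image classification (inference only) sample, rebuild and run: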

rm -rf build
mkdir -p build/intermediates/host
cd build/intermediates/host
export DDK_PATH=$HOME/Ascend/ascend-toolkit/latest
export NPU_HOST_LIB=$DDK_PATH/runtime/lib64/stub
cmake ../../../src -DCMAKE_CXX_COMPILER=g++ -DCMAKE_SKIP_RPATH=TRUE
make

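Using the converted model and the customized code, rebuild and run the application; submit a screenshot of the successful build and run. (10 points)

[screenshot]
It ran successfully. Note that the values printed for ResNet-101 are greater than 1, whereas the values for ResNet-50 all appear to be less than 1.

Summarize the problems encountered during the exercise and how they were solved, and submit the summary. (5 points)

The summary is given above.
My code, open-sourced after completing the task: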
https://gitee.com/qmckw/resnet101_imagenet_classification

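Question 3

On January 3, 2023 I noticed that the question had been revised; the original version used TensorFlow. The revised question reads:

3. Implement MNIST handwritten digit recognition with a LeNet network in PyTorch.
Any hardware platform is acceptable, on either Windows or Linux. Provide screenshots of the whole process where possible, give the final loss or accuracy, and include a screenshot of the printed loss and accuracy logs. [20 points]
Reference (GitHub): https://github.com/allegrofb/LeNet.git
Answer: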

git clone https://github.com/allegrofb/LeNet.git 
cd LeNet

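[screenshot]
As the output shows, the script trains for two epochs.

Original question:

Implement MNIST handwritten digit recognition with a LeNet network in TensorFlow 1.15. Any hardware platform is acceptable, on either Windows or Linux. Provide screenshots of the whole process where possible, and give the final loss or accuracy. Reference links: Gitee or GitHub. [15 points]
Submission format: step xx, screenshot xx
Using the manual or automatic migration method taught in the course, migrate the above script to the Ascend AI processor. Training is not required; only the migrated script needs to be submitted. [15 points]
Run the migrated LeNet network on the MNIST dataset on the ModelArts platform.

Scoring details:

a. The PyCharm console shows normal training logs; provide a screenshot. [5 points]

b. Save the final trained model weights (any number of training steps) to OBS; provide a screenshot. [2 points]

c. Provide a screenshot of the CANN run log for this training. [3 points]

Answer:
Reference: https://gitee.com/lai-pengfei/LeNet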

git clone https://gitee.com/lai-pengfei/LeNet
conda activate tensorflow1 # a TF environment I created
pip install tensorflow==1.15

Along the way I hit an error: the tensorflow tutorials module could not be found.
Changing the import as follows fixed it:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

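With that, it ran smoothly on my local machine.

Implement MNIST handwritten digit recognition with a LeNet network in TensorFlow 1.15. Any hardware platform is acceptable, on either Windows or Linux. Provide screenshots of the whole process where possible, and give the final loss or accuracy. Reference links: Gitee or GitHub. [15 points]

Once it runs, migrate the script.
Using the manual or automatic migration method taught in the course, migrate the above script to the Ascend AI processor. Training is not required; only the migrated script needs to be submitted. [15 points]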

cd ..
git clone https://gitee.com/ascend/tensorflow.git
cd tensorflow/convert_tf2npu # enter the tool directory
pip3 install pandas
pip3 install xlrd==1.2.0
pip3 install openpyxl
pip3 install tkintertable
pip3 install google_pasta
python3 main.py -i ../../LeNet

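[screenshot]

The report shows that the migration succeeded,

and the output script was generated.
Next, train on ModelArts using PyCharm, with the configuration shown:
3. Run the migrated LeNet network on the MNIST dataset on the ModelArts platform.

Scoring details:

a. The PyCharm console shows normal training logs; provide a screenshot. [5 points]

[screenshot]

At this point a small error appeared:
[screenshot]
The data path was changed accordingly: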

mnist = input_data.read_data_sets("/home/ma-user/modelarts/user-job-dir/code/MNIST_data/", one_hot = True)

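With the number of iterations set to 5000, the results came back quickly.

The corresponding checkpoint files were saved to OBS.
b. Save the final trained model weights (any number of training steps) to OBS; provide a screenshot. [2 points]

[screenshot]

c. Provide a screenshot of the CANN run log for this training. [3 points]
[screenshot]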

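2. Choose a TensorFlow model that is not in the Ascend community ModelZoo, migrate it, and train it successfully on the ModelArts platform.

Scoring:

Migration succeeds; submit the migrated script. [5 points]
Training runs on the ModelArts platform; provide screenshots of the relevant logs. [5 points]

Reminder:

The list of TensorFlow models in the Ascend community ModelZoo can be found by clicking LINK and applying the following filters:

[screenshot]

If migration or training does not succeed, that is fine: file an issue in the ModelZoo repository and partial credit may still be given.

I chose Retina-VesselNet for the migration: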

git clone https://github.com/DeepTrial/Retina-VesselNet.git
git clone https://gitee.com/ascend/tensorflow.git
cd tensorflow/convert_tf2npu # enter the tool directory
pip3 install pandas
pip3 install xlrd==1.2.0
pip3 install openpyxl
pip3 install tkintertable
pip3 install google_pasta
python3 main.py -i ../../Retina-VesselNet-v2