Ascend Product Lines and a Hands-On Trial

proware

Last modified: 2025-03-29 21:17:25

Category: HiSilicon series. Tags: artificial intelligence, Ascend

First published: 2024-11-16 16:26:28

Article link: https://blog.csdn.net/proware/article/details/143812176

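HiSilicon is informally divided into a "big" and a "small" HiSilicon, based in Shanghai and Shenzhen. As I understand it, their products are, respectively, security and surveillance chips such as the 3559, and the Ascend line for AI compute. This split can confuse people who are new to HiSilicon products. This article records some of the HiSilicon Ascend products and notes from trying them out.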

Training and Inference

Training Servers and Inference Servers

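Atlas 800T A2 Training Server

The Atlas 800T A2 training server offers higher compute density, very high energy efficiency, and high-speed network bandwidth. It is widely used for deep learning model development and training, and suits industries that need large amounts of compute, such as smart cities, smart finance, education and research, and telecom operators.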

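Atlas 800I A2 Inference Server

The Atlas 800I A2 inference server uses an efficient 8-module inference design and provides strong AI inference capability, with advantages in compute, memory bandwidth, and interconnect. It is widely applicable to generative large-model inference in content-generation scenarios such as intelligent customer service, copywriting, and knowledge curation, and it supports NPU interconnect to improve large-model inference efficiency.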

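Atlas 800 Inference Server (Model 3000)

The Atlas 800 inference server (Model 3000) supports up to 8 Atlas 300I/V Pro cards, providing strong real-time inference and video analytics capability. It is widely used in center-side AI inference scenarios.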

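1) Large-model training places relatively high demands on a consistent compute architecture and on high-speed interconnect between nodes.

2) Large memory capacity: training needs to hold large amounts of data and model parameters. As models grow and datasets become more complex, a training server needs enough memory to hold them.
3) High-speed storage: fast storage devices reduce the time spent reading and writing data and improve training efficiency.

Large-model inference, by contrast, can (with optimization) let a single machine handle many tasks.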
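The memory point in item 2 can be made concrete with a little arithmetic. The sketch below only counts the memory needed to hold the weights themselves; training additionally needs room for gradients, optimizer state, and activations, which this sketch does not model. The 7-billion-parameter model is a hypothetical example, not something taken from the Atlas documentation.

```python
# Bytes per parameter for common numeric formats (standard sizes).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # Hypothetical 7-billion-parameter model, chosen only for illustration.
    params = 7_000_000_000
    for dtype in ("fp32", "fp16", "int8"):
        print(f"{dtype}: {weight_memory_gib(params, dtype):.1f} GiB")
```

Even the weights-only figure grows linearly with model size, which is why memory capacity appears as a separate requirement for training servers.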

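Overall comparison

Model	CPUs	Memory	Storage	External network	Internal network	PCIe expansion	Scenario
800T A2	4	32 x DDR4, 16-64 GB	SATA, NVMe	8 x 200GE RDMA	None	3 slots	Deep learning model development and training
800I A2	4	32 x DDR4, 16-64 GB	SATA, NVMe	8 x 200GE RDMA	NPU full interconnect	3 slots	Generative large-model inference
800 (Model 3000)	2	32 x DDR4	SATA	None	None	9 slots, 1120 TOPS compute	Real-time inference and video analytics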
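The 1120 TOPS figure in the last row is consistent with a fully populated Atlas 800 (Model 3000): this article gives a maximum of 8 Atlas 300I/V Pro cards per server and 140 TOPS INT8 per Atlas 300I Pro card. The sketch below only reproduces that multiplication; it assumes peak compute adds up linearly across cards, and the function name is made up for illustration.

```python
# Figures taken from this article: per-card INT8 compute of the Atlas 300I Pro
# and the maximum number of cards an Atlas 800 (Model 3000) can hold.
TOPS_INT8_PER_300I_PRO = 140
MAX_CARDS_PER_SERVER = 8


def server_peak_tops(cards: int, tops_per_card: int = TOPS_INT8_PER_300I_PRO) -> int:
    """Peak INT8 TOPS of one server, assuming compute simply adds up across cards."""
    if not 0 <= cards <= MAX_CARDS_PER_SERVER:
        raise ValueError(f"cards must be between 0 and {MAX_CARDS_PER_SERVER}")
    return cards * tops_per_card


if __name__ == "__main__":
    print(server_peak_tops(MAX_CARDS_PER_SERVER))  # 8 * 140 = 1120
```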

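Open questions:

1) How are the NPUs fully interconnected? How is the 392 GB/s bandwidth figure derived? Once fully interconnected, do they act as a single NPU node?

2) Once fully interconnected, can the RDMA network ports access NPU memory directly?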
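The questions above are left open, and the sketch below does not answer them from any specification. It only works through the arithmetic of a full mesh: with the 8 modules mentioned for the Atlas 800I A2, each NPU has 7 direct peers. The per-link rate of 56 GB/s is purely an assumption, picked because 7 x 56 = 392; whether the hardware actually works this way has not been verified here.

```python
from typing import Tuple


def full_mesh_links(nodes: int) -> Tuple[int, int]:
    """Return (links per node, total links) for a full mesh of `nodes` devices."""
    per_node = nodes - 1
    total = nodes * per_node // 2
    return per_node, total


def per_node_bandwidth(nodes: int, gb_per_s_per_link: float) -> float:
    """Aggregate bandwidth of one node: one link to every other node."""
    per_node, _ = full_mesh_links(nodes)
    return per_node * gb_per_s_per_link


# ASSUMPTION: 56 GB/s per link is a guess that happens to reproduce 392 GB/s.
ASSUMED_LINK_GB_PER_S = 56

if __name__ == "__main__":
    print(full_mesh_links(8))
    print(per_node_bandwidth(8, ASSUMED_LINK_GB_PER_S))
```

If the real per-link rate differs, the same two functions still give the link counts; only the assumed constant would change.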

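Conclusions:

1) The table above shows that generative model inference has the same requirements as a training server for interconnect and for storage capacity and speed.

2) Real-time inference and video analytics rely on a standalone server and, going by the table, that server offers the highest compute of the three.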

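Inference Cards and Training Cards

Inference card

Training card

Oddly, the Ascend website no longer carries any material on the 300T training card; it is all on Huawei's main website.

Block diagram source: Front Panel - Overview - Atlas 300T Training Card User Guide (Model 9000) 14 - Huawei

Atlas 300T Pro Training Card Technical White Paper (Model 9000) - Huawei Enterprise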

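Overall comparison

Card	Compute	External interface	Size	Power	Memory rate
Atlas 300I Pro (inference)	140 TOPS INT8, 70 TFLOPS FP16	None	(not listed)	72 W	4266 Mbps
Atlas 300T Pro (training)	280 TFLOPS FP16	100G	(not listed)	300 W	DDR4, 2400 Mbps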

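1) Compared with the inference card, the training card adds an RDMA network interface, so external data can reach the NPU directly.

2) The training card has higher compute per card.

3) The training card's memory is slower (DDR4 at 2400 Mbps versus 4266 Mbps).

4) The two cards have a similar power-to-compute ratio.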
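Point 4 can be checked directly from the numbers in the comparison table. The sketch below divides each card's FP16 compute by its power; the class and field names are illustrative only, and the figures are the ones quoted in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    fp16_tflops: float
    power_w: float

    @property
    def tflops_per_watt(self) -> float:
        """FP16 TFLOPS delivered per watt of rated power."""
        return self.fp16_tflops / self.power_w


# Values from the comparison table in this article.
CARDS = [
    Card("Atlas 300I Pro", fp16_tflops=70, power_w=72),
    Card("Atlas 300T Pro", fp16_tflops=280, power_w=300),
]

if __name__ == "__main__":
    for card in CARDS:
        print(f"{card.name}: {card.tflops_per_watt:.2f} TFLOPS/W")
```

The two ratios come out at roughly 0.97 and 0.93 TFLOPS/W, so the training card buys four times the FP16 compute at close to the same efficiency.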

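Software Modules

Currently:

1) Driver and firmware.

2) CANN, which includes the image capture interfaces, preprocessing interfaces, memory copy and similar functions, plus the ATC model conversion tool.

3) MindX, which includes the mindxvision development interfaces and the mindx-toolbox tools for testing compute and PCIe bandwidth.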

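Running the Examples

CANN examples

samples: CANN Samples

The official reference samples are listed above. For now the focus is mainly on the image capture part.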

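MindX examples

mindxsdk-referenceapps: MindX SDK Reference Apps

The following page shows what Ascend can be used for:

Ascend AI Application Cases - Ascend Community

PersonCount example

contrib/PersonCount · Ascend/mindxsdk-referenceapps - Gitee - OSChina

Environment variable setup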

export MX_SDK_HOME=/home/HwHiAiUser/mxVision-6.0.RC3/

source  /usr/local/Ascend/ascend-toolkit/set_env.sh
source /home/HwHiAiUser/mxVision/set_env.sh
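In the commands above, MX_SDK_HOME points at mxVision-6.0.RC3 while set_env.sh is sourced from mxVision. That may be harmless (for example if one path is a symlink to the other), but it is worth checking before debugging anything else. The helper below is a hypothetical sanity check, not part of the MindX SDK; it only compares the two paths as strings.

```python
import posixpath
from typing import List, Mapping


def check_sdk_env(env: Mapping[str, str], set_env_script: str) -> List[str]:
    """Return human-readable warnings about the MindX SDK environment."""
    sdk_home = env.get("MX_SDK_HOME", "")
    if not sdk_home:
        return ["MX_SDK_HOME is not set"]
    # normpath drops the trailing slash so the comparison is fair.
    home = posixpath.normpath(sdk_home)
    script_dir = posixpath.normpath(posixpath.dirname(set_env_script))
    if script_dir != home:
        return [f"MX_SDK_HOME is {home} but set_env.sh was sourced from {script_dir}"]
    return []


if __name__ == "__main__":
    # Paths exactly as they appear in this article.
    env = {"MX_SDK_HOME": "/home/HwHiAiUser/mxVision-6.0.RC3/"}
    for warning in check_sdk_env(env, "/home/HwHiAiUser/mxVision/set_env.sh"):
        print("warning:", warning)
```

In a real shell session, `os.environ` could be passed as `env` after sourcing the scripts.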

Model conversion

atc --input_shape="blob1:8,3,800,1408" --weight=model/count_person.caffe.caffemodel --input_format=NCHW --output=model/count_person_8.caffe --soc_version=Ascend310P3 --insert_op_conf=model/insert_op.cfg --framework=0 --model=model/count_person.caffe.prototxt
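The atc command above hard-codes a batch size of 8 in both --input_shape and the output file name. The sketch below rebuilds the same argument list from a batch parameter and parses the shape string, using only the flags that appear in this article. Tying the output suffix to the batch size is an assumption based on the name count_person_8; the helper names are made up for illustration.

```python
from typing import List, Tuple


def parse_input_shape(spec: str) -> Tuple[str, Tuple[int, ...]]:
    """Split a single 'name:n,c,h,w' shape string into (name, dims)."""
    name, dims = spec.rsplit(":", 1)
    return name, tuple(int(d) for d in dims.split(","))


def atc_args(batch: int) -> List[str]:
    """Build the article's atc command line for a given batch size."""
    return [
        "atc",
        f"--input_shape=blob1:{batch},3,800,1408",
        "--weight=model/count_person.caffe.caffemodel",
        "--input_format=NCHW",
        # ASSUMPTION: the numeric suffix in the output name is the batch size.
        f"--output=model/count_person_{batch}.caffe",
        "--soc_version=Ascend310P3",
        "--insert_op_conf=model/insert_op.cfg",
        "--framework=0",  # 0 selects Caffe as the source framework
        "--model=model/count_person.caffe.prototxt",
    ]


if __name__ == "__main__":
    print(" ".join(atc_args(8)))
    print(parse_input_shape("blob1:8,3,800,1408"))
```

On a machine with CANN installed, the list could be handed to `subprocess.run(atc_args(8), check=True)`; passing a list avoids shell quoting issues around the shape string.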

Modify the post-processing compile options

If this option is not changed, the build fails with the following errors:

error: expected primary-expression before '>' token

LogMessageFatal(file, line, std::make_unique<std::string>(names));

error: 'make_unique' is not a member of 'std'

LogMessageFatal(file, line, std::make_unique<std::string>(names));

cat Plugin1/CMakeLists.txt

The default is -std=c++11; change it to -std=c++14, as shown below:

include_directories(${PROJECT_SOURCE_DIR}/opensource/include/glib-2.0)
include_directories(${PROJECT_SOURCE_DIR}/opensource/lib/glib-2.0/include)

link_directories(${PROJECT_SOURCE_DIR}/opensource/lib/)
link_directories(${PROJECT_SOURCE_DIR}/lib)

add_compile_options(-std=c++14 -fPIC -fstack-protector-all -pie -Wno-deprecated-declarations)
add_compile_options("-DPLUGIN_NAME=${PLUGIN_NAME}")
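The error occurs because std::make_unique was only added to the standard library in C++14, so a C++11 build cannot find it. The edit itself is a one-word change; for anyone who rebuilds the sample repeatedly, the sketch below applies it to the CMakeLists text programmatically. The function name is hypothetical and the script is not part of the sample.

```python
import re

# Match the exact flag, but not something like -std=c++110.
_STD_FLAG = re.compile(r"-std=c\+\+11\b")


def bump_cxx_standard(cmake_text: str) -> str:
    """Replace every -std=c++11 flag with -std=c++14 and return the new text."""
    return _STD_FLAG.sub("-std=c++14", cmake_text)


if __name__ == "__main__":
    before = "add_compile_options(-std=c++11 -fPIC -fstack-protector-all -pie)"
    print(bump_cxx_standard(before))
    # To patch the real file, read Plugin1/CMakeLists.txt, pass its text
    # through bump_cxx_standard, and write the result back.
```

The function leaves text that already uses -std=c++14 untouched, so running it twice is safe.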

Build the post-processing library

bash build.sh

-- Build files have been written to: /home/HwHiAiUser/PersonCount/Plugin1/build
[ 50%] Building CXX object CMakeFiles/countpersonpostprocess.dir/CountPersonPostProcessor.cpp.o
[100%] Linking CXX shared library libcountpersonpostprocess.so
[100%] Built target countpersonpostprocess

Run

 ./run.sh
Begin to initialize Log.
The output directory of logs file exist.
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20190903 10:02:50.367968 281465208715200 FileUtils.cpp:339] The input file is empty
I20190903 10:02:50.368009 281465208715200 FileUtils.cpp:495] Check Other group permission: Current permission is 4, but required no greater than 0.
Save logs information to specified directory.

(gst-plugin-scanner:1191): GStreamer-WARNING **: 10:02:50.420: Failed to load plugin '/home/HwHiAiUser/mxVision/opensource/lib/gstreamer-1.0/libgstinsertbin.so': libgstinsertbin-1.0.so.0: cannot open shared object file: No such file or directory

(gst-plugin-scanner:1191): GStreamer-WARNING **: 10:02:50.939: Failed to load plugin '/home/HwHiAiUser/mxVision/opensource/lib/gstreamer-1.0/libgstinsertbin.so': libgstinsertbin-1.0.so.0: cannot open shared object file: No such file or directory





total image number: 316
time cost 50.494364976882935 s
MAE: 21.838607594936708         MSE: 39.39852371455149
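Two things can be read off this output. First, 316 images in about 50.5 s is roughly 6.3 images per second end to end. Second, a plain mean squared error can never be smaller than the square of the MAE, and 21.84 squared is about 477, so the value printed as MSE (39.40) is presumably a root-mean-square error, a convention that is common in crowd-counting code. The sample's evaluation script was not inspected here, so the sketch below shows the standard definitions rather than the script's actual implementation.

```python
import math
from typing import Sequence, Tuple


def mae_rmse(predicted: Sequence[float], actual: Sequence[float]) -> Tuple[float, float]:
    """Mean absolute error and root-mean-square error of per-image counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - a for p, a in zip(predicted, actual)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


def throughput(images: int, seconds: float) -> float:
    """Images processed per second over the whole run."""
    return images / seconds


if __name__ == "__main__":
    # Run statistics copied from the output above.
    print(f"{throughput(316, 50.494364976882935):.2f} images/s")
    # Made-up counts, only to show the call.
    print(mae_rmse([10, 20], [13, 16]))
```

RMSE is always at least as large as MAE, which matches the relationship between the two reported numbers.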