MLPerf Inference Benchmark Steps: Google Colab Edition (ONNX)


datutu_L


Article link: https://blog.csdn.net/weixin_58277783/article/details/135716056


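The steps below record a test run on Google's free hosted service, Google Colab; the whole procedure has been tested and works.
(This series will keep being updated; feel free to follow.)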

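Test environment

Test machine: Google Colab

Benchmark models: mobilenet, resnet50

Model framework: onnx

Test scenario: Vision (image classification)

Dataset: fake_imagenet

Test steps

1. Mount Google Drive

The steps below mount Google Drive in Google Colab, since Colab data is stored on Google Drive. Tutorials on using Google Colab with Google Drive are easy to find online; if there is demand, I can also write a separate article on it.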

from google.colab import drive
drive.mount('/content/drive')

!ls "/content/drive/MyDrive/"

# !cd runs in a subshell, so it does not change the notebook's working directory (use %cd for that)
!ls
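A note on !cd versus %cd: in Colab each ! line runs in its own subshell, so a directory change made there does not carry over to the next line or cell. The sketch below reproduces that behavior in plain Python, using subprocess as a stand-in for ! and os.chdir as a stand-in for %cd. The helper names are made up for illustration, and this is not a description of Colab internals.

```python
import os
import subprocess
import tempfile

def cwd_after_subshell_cd(target):
    """Run `cd target` in a child shell and return this process's cwd afterwards."""
    subprocess.run(f'cd "{target}"', shell=True, check=True)
    return os.getcwd()

def cwd_after_chdir(target):
    """Change directory in this process (roughly what %cd does) and return the new cwd."""
    os.chdir(target)
    return os.getcwd()

start = os.getcwd()
target = tempfile.mkdtemp()
unchanged = cwd_after_subshell_cd(target)  # the child shell changed directory, this process did not
changed = cwd_after_chdir(target)          # this process is now inside the target directory
os.chdir(start)                            # restore the original directory
print(unchanged == start, os.path.samefile(changed, target))
```

This is why the later steps use %cd whenever the working directory needs to stay changed.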

2. Download the MLPerf repository

Create a directory and clone the MLPerf repository from GitHub:

!mkdir github_file
%cd github_file
!git clone https://github.com/mlperf/inference.git

Change into the following directory:

%cd inference/vision/classification_and_detection

3. Build and install the benchmark

import os
root = os.getcwd()

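The official instructions install the load generator (LoadGen) directly with the commands below, but in practice running them as-is produces errors.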

!cd ../../loadgen; CFLAGS="-std=c++14" python setup.py develop; cd {root}
!python setup.py develop

So before running the commands above, install the required packages first:

!apt-get install -y python3-dev cmake
!pip install pytest numpy scipy pybind11

Once they are installed, run the two commands again:

!cd ../../loadgen; CFLAGS="-std=c++14" python setup.py develop; cd {root}
!python setup.py develop

# The run prints a lot of output; the last few lines look like this
'''
running develop
......
Using /usr/local/lib/python3.10/dist-packages
Finished processing dependencies for mlperf-inference==0.1.0
'''

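4. Install the required libraries

This test runs ONNX models, so onnxruntime must be installed (the command also installs pycocotools and opencv-python).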

!pip install onnxruntime pycocotools opencv-python
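Before moving on, it can help to confirm that the packages from steps 3 and 4 are actually importable. A minimal sketch, assuming the import names are mlperf_loadgen, onnxruntime, cv2 (for opencv-python) and pycocotools; the helper name missing_modules is made up for this example:

```python
import importlib.util

# Import names of the packages installed in steps 3 and 4 (assumed, see above).
REQUIRED_MODULES = ["mlperf_loadgen", "onnxruntime", "cv2", "pycocotools"]

def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]

print("missing:", missing_modules(REQUIRED_MODULES))
```

An empty list means everything was found; anything listed should be reinstalled before running the benchmark.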

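5. Download the models

This test uses two models, mobilenet and resnet50. mobilenet is a separate, lightweight architecture designed to reduce model complexity and computational cost compared with larger networks such as resnet50.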

# Download the mobilenet model
!wget -q https://zenodo.org/record/3157894/files/mobilenet_v1_1.0_224.onnx

# Download the resnet50 model
!wget -q https://zenodo.org/record/2592612/files/resnet50_v1.onnx
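Because wget -q prints nothing, a failed download is easy to miss. The sketch below checks that the two model files exist and are not empty. The helper name check_downloads and the min_bytes threshold are illustrative choices, not part of MLPerf.

```python
import os

def check_downloads(paths, min_bytes=1):
    """Return {path: problem} for files that are missing or smaller than min_bytes."""
    problems = {}
    for path in paths:
        if not os.path.isfile(path):
            problems[path] = "missing"
        elif os.path.getsize(path) < min_bytes:
            problems[path] = "too small"
    return problems

# File names as downloaded by the wget commands above.
models = ["mobilenet_v1_1.0_224.onnx", "resnet50_v1.onnx"]
print(check_downloads(models))
```

An empty dict means both files are present; otherwise rerun the corresponding wget command without -q to see the error.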

More supported models are listed in the vision/classification_and_detection directory of the mlcommons/inference repository (master branch) on GitHub.

[Figure: table of downloadable models]

6. Download the dataset

This test uses the tools/make_fake_imagenet.sh script provided by MLPerf to create a small fake dataset that mimics imagenet.

!tools/make_fake_imagenet.sh

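After running the command above, a fake_imagenet folder is created in vision/classification_and_detection. It contains a val folder and a val_map.txt file; the val folder holds 8 fake images.

Normally you would download imagenet2012/validation for image classification or coco2017/validation for object detection. Links and instructions for downloading the datasets can be found in the vision/classification_and_detection directory of the mlcommons/inference repository (master branch) on GitHub.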
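To see what the fake dataset contains, the val_map.txt file can be parsed. The sketch below assumes each line has the form "<image path> <integer label>"; that format is an assumption here, so check the actual file first. The sample file names are placeholders, not the real ones.

```python
def parse_val_map(text):
    """Parse lines of the form '<image path> <integer label>' into (path, label) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        path, label = line.rsplit(None, 1)  # split on the last run of whitespace
        entries.append((path, int(label)))
    return entries

# Hypothetical sample content; the file names are placeholders.
sample = "val/img_a.jpg 1\nval/img_b.jpg 42\n"
print(parse_val_map(sample))
```

On the real data, the text would come from reading fake_imagenet/val_map.txt, and the number of entries should match the 8 images mentioned above.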

7. Add environment variables

The following two environment variables need to be set:

import os
os.environ['MODEL_DIR'] = root
os.environ['DATA_DIR'] = os.path.join(root, "fake_imagenet")

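For MLPerf submissions, the query count, duration, latency, and percentile are normally left at their default settings. Following the official tutorial, this test passes a few extra options to make the run finish faster. run_local.sh looks for the environment variable EXTRA_OPS and appends it to the arguments; other arguments can also be added on the command line. The options below limit the benchmark run to 10 seconds and set the maximum latency to 0.2. They do not enable accuracy mode; for an accuracy report, --accuracy would also need to be passed.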

os.environ['EXTRA_OPS'] ="--time 10 --max-latency 0.2"
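When trying several option sets, building the EXTRA_OPS string from a dict can be less error-prone than editing it by hand. A minimal sketch; the helper name build_extra_ops is made up, and only the flags already mentioned in this article are used:

```python
import shlex

def build_extra_ops(options):
    """Turn {'flag': value} into a command-line string; a value of True means a bare flag."""
    parts = []
    for flag, value in options.items():
        parts.append(f"--{flag}")
        if value is not True:
            parts.append(str(value))
    return shlex.join(parts)

# Reproduces the string assigned to EXTRA_OPS above.
extra_ops = build_extra_ops({"time": 10, "max-latency": 0.2})
print(extra_ops)
```

The result can then be assigned with os.environ['EXTRA_OPS'] = extra_ops before calling run_local.sh.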

8. Run the benchmark

!./run_local.sh onnxruntime mobilenet cpu --scenario SingleStream

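Running the command above as-is may print a large amount of output, which can include messages like the following (the W: prefix marks them as warnings rather than errors):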

'''
......
2024-01-16 06:35:35.747087365 [W:onnxruntime:, graph.cc:1283 Graph] Initializer MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Conv2D_bn_offset:0 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2024-01-16 06:35:35.747100740 [W:onnxruntime:, graph.cc:1283 Graph] Initializer MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Conv2D_bn_offset:0 appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
......
'''

Solution:

Create a file named remove_initializer_from_input.py in the vision/classification_and_detection folder. The file can be downloaded from the onnxruntime repository on GitHub (microsoft/onnxruntime, main branch) at tools/python/remove_initializer_from_input.py; place the downloaded file in the classification_and_detection folder.

Run the following command:

!python remove_initializer_from_input.py --input ./mobilenet_v1_1.0_224.onnx --output ./mobilenet_v1_1.0_224.onnx
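For intuition, the warning says that some initializers (weights) are also listed as graph inputs, so the fix amounts to dropping those names from the input list. The sketch below shows that idea on a stand-in graph made of plain Python objects. It is an illustration of the idea only, not the onnxruntime script itself, which may do more.

```python
from types import SimpleNamespace

def remove_initializers_from_inputs(graph):
    """Drop every entry of graph.input whose name matches an initializer; return the removed names."""
    init_names = {init.name for init in graph.initializer}
    to_remove = [inp for inp in graph.input if inp.name in init_names]
    for inp in to_remove:
        graph.input.remove(inp)
    return [inp.name for inp in to_remove]

# Stand-in graph built from plain objects so the sketch runs without onnx installed.
graph = SimpleNamespace(
    input=[SimpleNamespace(name="image"), SimpleNamespace(name="conv1_weight")],
    initializer=[SimpleNamespace(name="conv1_weight")],
)
removed = remove_initializers_from_inputs(graph)
print(removed, [inp.name for inp in graph.input])
```

With the onnx package installed, the same function should in principle work on onnx.load(path).graph followed by onnx.save, but for the benchmark itself the downloaded script is the safer choice.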

Then run the benchmark again (mobilenet model):

!./run_local.sh onnxruntime mobilenet cpu --scenario SingleStream


--scenario {SingleStream,MultiStream,Server,Offline} selects the scenario to benchmark.

Log files are also written and are stored in the output folder:

!ls ./output/onnxruntime-cpu/mobilenet

# Output:
'''
mlperf_log_accuracy.json  mlperf_log_summary.txt  results.json
mlperf_log_detail.txt	  mlperf_log_trace.json
'''

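The test results are stored in mlperf_log_summary.txt; view that log file to get the results. Note that the summary reproduced below reports Scenario : Offline, so it appears to come from an Offline run rather than the SingleStream command shown above.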

!cat ./output/onnxruntime-cpu/mobilenet/mlperf_log_summary.txt

'''
================================================
MLPerf Results Summary
================================================
SUT name : PySUT
Scenario : Offline
Mode     : PerformanceOnly
Samples per second: 44.5301
Result is : INVALID
  Min duration satisfied : NO
  Min queries satisfied : Yes
  Early stopping satisfied: Yes
Recommendations:
 * Increase expected QPS so the loadgen pre-generates a larger (coalesced) query.

================================================
Additional Stats
================================================
Min latency (ns)                : 1509950925
Max latency (ns)                : 2245673702
Mean latency (ns)               : 1756932138
50.00 percentile latency (ns)   : 1525856934
90.00 percentile latency (ns)   : 2245673702
95.00 percentile latency (ns)   : 2245673702
97.00 percentile latency (ns)   : 2245673702
99.00 percentile latency (ns)   : 2245673702
99.90 percentile latency (ns)   : 2245673702

================================================
Test Parameters Used
================================================
samples_per_query : 100
target_qps : 1
target_latency (ns): 0
max_async_queries : 1
min_duration (ms): 10000
max_duration (ms): 10000
min_query_count : 1
max_query_count : 0
qsl_rng_seed : 148687905518835231
sample_index_rng_seed : 520418551913322573
schedule_rng_seed : 811580660758947900
accuracy_log_rng_seed : 0
accuracy_log_probability : 0
accuracy_log_sampling_target : 0
print_timestamps : 0
performance_issue_unique : 0
performance_issue_same : 0
performance_issue_same_index : 0
performance_sample_count : 8

No warnings encountered during test.

No errors encountered during test.
'''
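To pull individual numbers out of the summary, for example to compare mobilenet and resnet50, the "key : value" lines can be parsed into a dict. A minimal sketch, assuming the layout shown above; the helper name parse_summary is made up, and the sample lines are copied from that output:

```python
def parse_summary(text):
    """Collect 'key : value' lines from an mlperf_log_summary.txt dump into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key and value:  # skip banners, blank lines and headings without a value
            fields[key] = value
    return fields

# A few lines copied from the summary shown above.
sample = """\
Scenario : Offline
Samples per second: 44.5301
Result is : INVALID
  Min duration satisfied : NO
Mean latency (ns)               : 1756932138
"""
summary = parse_summary(sample)
mean_ms = int(summary["Mean latency (ns)"]) / 1e6  # nanoseconds to milliseconds
print(summary["Result is"], summary["Samples per second"], round(mean_ms, 1))
```

In the summary above, the run is marked INVALID and the "Min duration satisfied" check is the one reporting NO, so a parsed dict like this makes it easy to flag such runs automatically.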

Run the benchmark (resnet50 model):

!./run_local.sh onnxruntime resnet50 cpu --scenario SingleStream


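9. Additional notes

The benchmark application uses a shell script to simplify the command-line options; the user can choose the backend, model, and device: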

!./run_local.sh

# Output:
'''
usage: ./run_local.sh tf|onnxruntime|pytorch|tflite|tvm-onnx|tvm-pytorch|tvm-tflite [resnet50|mobilenet|ssd-mobilenet|ssd-resnet34|retinanet] [cpu|gpu]
'''
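When scripting several runs, the arguments can be checked against the usage string above before the command is launched. A minimal sketch; the helper name build_command is made up, the allowed values are copied from the usage output and the --scenario list earlier in this article, and not every backend and model combination is necessarily supported:

```python
BACKENDS = {"tf", "onnxruntime", "pytorch", "tflite", "tvm-onnx", "tvm-pytorch", "tvm-tflite"}
MODELS = {"resnet50", "mobilenet", "ssd-mobilenet", "ssd-resnet34", "retinanet"}
DEVICES = {"cpu", "gpu"}
SCENARIOS = {"SingleStream", "MultiStream", "Server", "Offline"}

def build_command(backend, model, device, scenario):
    """Validate the arguments against the usage string and return the run_local.sh command line."""
    checks = [
        (backend, BACKENDS, "backend"),
        (model, MODELS, "model"),
        (device, DEVICES, "device"),
        (scenario, SCENARIOS, "scenario"),
    ]
    for value, allowed, label in checks:
        if value not in allowed:
            raise ValueError(f"unknown {label}: {value!r}")
    return f"./run_local.sh {backend} {model} {device} --scenario {scenario}"

for model in ("mobilenet", "resnet50"):
    print(build_command("onnxruntime", model, "cpu", "SingleStream"))
```

In a Colab cell, each returned string can be run by prefixing it with !.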

The model is one of [resnet50|retinanet|mobilenet|ssd-mobilenet|ssd-resnet34];