GEM5 x86 PARSEC full-system configuration: x86-parsec-benchmarks.py explained in detail

yz_弘毅道远

Categories: GEM5, Network-on-Chip (NoC). Tags: hardware architecture

Article link: https://blog.csdn.net/qq_34898487/article/details/134172795

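Introduction

An earlier post showed how to get results using the default Python file as the configuration. To modify that configuration, you need to understand what it actually uses. This post goes line by line through configs/example/gem5_library/x86-parsec-benchmarks.py, the official tutorial configuration for PARSEC in full-system mode. It also refers to ruby_random_test.py to analyze how a PARSEC + garnet system could be configured.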

x86-parsec-benchmarks.py

This is the official configuration file. Its own description reads:

Script to run PARSEC benchmarks with gem5.
The script expects a benchmark program name and the simulation
size. The system is fixed with 2 CPU cores, MESI Two Level system
cache and 3 GB DDR4 memory. It uses the x86 board.

This script will count the total number of instructions executed
in the ROI. It also tracks how much wallclock and simulated time.

Usage:
  ./build/X86/gem5.opt \
    configs/example/gem5_library/x86-parsec-benchmarks.py \
    --benchmark <benchmark_name> \
    --size <simulation_size>

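Line-by-line analysis

Importing the libraries

Lines 47-65 of the script import the libraries and can be copied verbatim. Compared with ruby_random_test.py, the PARSEC configuration additionally imports X86Board, DualChannelDDR4_2400, SimpleSwitchableProcessor, and others, so the virtual computer it builds is somewhat more complex than the one in the Ruby random test.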

Checking the gem5 build

The next statement in the official script is:

# We check for the required gem5 build.

requires(
    isa_required=ISA.X86,
    coherence_protocol_required=CoherenceProtocol.MESI_TWO_LEVEL,
    kvm_required=True,
)

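This part involves almost no programming; it only states preconditions. If you followed the tutorial, it will not cause problems. The call checks three things: that gem5 was built for the X86 ISA, that it was built with the MESI_TWO_LEVEL coherence protocol, and that KVM support is available. A gem5 binary built for ARM therefore cannot run this x86 PARSEC script directly.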

Command-line arguments

Strings passed on the command line select one of 13 benchmarks and one of three input sizes.

benchmark_choices = [
    "blackscholes",
    "bodytrack",
    "canneal",
    "dedup",
    "facesim",
    "ferret",
    "fluidanimate",
    "freqmine",
    "raytrace",
    "streamcluster",
    "swaptions",
    "vips",
    "x264",
]

# Following are the input size.
size_choices = ["simsmall", "simmedium", "simlarge"]
parser = argparse.ArgumentParser(
    description="An example configuration script to run the npb benchmarks."
)
# The arguments accepted are the benchmark name and the simulation size.
parser.add_argument(
    "--benchmark",
    type=str,
    required=True,
    help="Input the benchmark program to execute.",
    choices=benchmark_choices,
)

parser.add_argument(
    "--size",
    type=str,
    required=True,
    help="Simulation size the benchmark program.",
    choices=size_choices,
)
args = parser.parse_args()
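
To see how the choices lists constrain the input, here is a minimal standalone sketch that runs without gem5. The benchmark list is shortened for illustration, and the try_parse helper is a hypothetical wrapper that is not part of the official script.

```python
import argparse

# Shortened list for illustration; the real script lists all 13 benchmarks.
benchmark_choices = ["blackscholes", "canneal", "x264"]
size_choices = ["simsmall", "simmedium", "simlarge"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="PARSEC argument sketch")
    parser.add_argument(
        "--benchmark", type=str, required=True, choices=benchmark_choices
    )
    parser.add_argument(
        "--size", type=str, required=True, choices=size_choices
    )
    return parser


def try_parse(argv):
    """Return the parsed namespace, or None if argparse rejects argv."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # On invalid input argparse prints a usage message and exits.
        return None


ok = try_parse(["--benchmark", "canneal", "--size", "simsmall"])
bad = try_parse(["--benchmark", "canneal", "--size", "native"])
print(ok.benchmark, ok.size)
print(bad)
```

Because both arguments are declared with required=True and choices=..., a misspelled benchmark or an unsupported size is rejected before any simulation starts.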

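Configuring the simulated board

In gem5 we simulate a piece of hardware that runs Ubuntu and, inside Ubuntu, the PARSEC benchmarks. The configuration therefore assembles a board.
First it creates the components the board will use: processor, memory, and cache_hierarchy.
The order in the script differs slightly from the logical order: the script creates cache_hierarchy first, then memory, then processor.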

from gem5.components.cachehierarchies.ruby.mesi_two_level_cache_hierarchy import (
    MESITwoLevelCacheHierarchy,
)

cache_hierarchy = MESITwoLevelCacheHierarchy(
    l1d_size="32kB",
    l1d_assoc=8,
    l1i_size="32kB",
    l1i_assoc=8,
    l2_size="256kB",
    l2_assoc=16,
    num_l2_banks=2,
)

# Memory: Dual Channel DDR4 2400 DRAM device.
# The X86 board only supports 3 GB of main memory.

memory = DualChannelDDR4_2400(size="3GB")

# Here we setup the processor. This is a special switchable processor in which
# a starting core type and a switch core type must be specified. Once a
# configuration is instantiated a user may call `processor.switch()` to switch
# from the starting core types to the switch core types. In this simulation
# we start with KVM cores to simulate the OS boot, then switch to the Timing
# cores for the command we wish to run after boot.

processor = SimpleSwitchableProcessor(
    starting_core_type=CPUTypes.KVM,
    switch_core_type=CPUTypes.TIMING,
    isa=ISA.X86,
    num_cores=2,
)

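The script then calls the imported board class to finish configuring the board; the values on the right-hand side of each = are the components just created.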

# Here we setup the board. The X86Board allows for Full-System X86 simulations

board = X86Board(
    clk_freq="3GHz",
    processor=processor,
    memory=memory,
    cache_hierarchy=cache_hierarchy,
)

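Specifying the post-boot command and the exit-event handlers

Nothing here needs to change; treat it as boilerplate and copy it as is.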

command = (
    "cd /home/gem5/parsec-benchmark;".format(args.benchmark)
    + "source env.sh;"
    + f"parsecmgmt -a run -p {args.benchmark} -c gcc-hooks -i {args.size}         -n 2;"
    + "sleep 5;"
    + "m5 exit;"
)
board.set_kernel_disk_workload(
    # The x86 linux kernel will be automatically downloaded to the
    # `~/.cache/gem5` directory if not already present.
    # PARSEC benchamarks were tested with kernel version 4.19.83
    kernel=Resource("x86-linux-kernel-4.19.83"),
    # The x86-parsec image will be automatically downloaded to the
    # `~/.cache/gem5` directory if not already present.
    disk_image=Resource("x86-parsec"),
    readfile_contents=command,
)

# functions to handle different exit events during the simuation
def handle_workbegin():
    print("Done booting Linux")
    print("Resetting stats at the start of ROI!")
    m5.stats.reset()
    processor.switch()
    yield False


def handle_workend():
    print("Dump stats at the end of the ROI!")
    m5.stats.dump()
    yield True
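
The command string at the top of the listing above hard-codes -n 2, which matches num_cores=2 in the processor. If you ever change the core count, one way to keep the two in sync is to build the string from a variable. The following is only a sketch: build_parsec_command is a hypothetical helper, not something the official script defines.

```python
def build_parsec_command(benchmark: str, size: str, num_threads: int) -> str:
    """Assemble the shell command that the guest runs after boot."""
    return (
        "cd /home/gem5/parsec-benchmark;"
        + "source env.sh;"
        + f"parsecmgmt -a run -p {benchmark} -c gcc-hooks -i {size} -n {num_threads};"
        + "sleep 5;"
        + "m5 exit;"
    )


# Assumption: the same value is also passed as num_cores to the processor.
num_cores = 2
command = build_parsec_command("blackscholes", "simsmall", num_cores)
print(command)
```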

Next, a Simulator object is created; it takes our board as the system to simulate.

simulator = Simulator(
    board=board,
    on_exit_event={
        ExitEvent.WORKBEGIN: handle_workbegin(),
        ExitEvent.WORKEND: handle_workend(),
    },
)
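
Note that the dictionary stores handle_workbegin() and handle_workend() with parentheses, so the values are generator objects rather than plain functions. The plain-Python sketch below (no gem5 needed; the print messages are placeholders) shows what that means.

```python
def handle_workbegin():
    print("ROI begins: reset stats and switch cores here")
    yield False  # False: the simulation should continue


def handle_workend():
    print("ROI ends: dump stats here")
    yield True  # True: the simulation should stop


# Calling the function does NOT run its body; it returns a generator object.
gen = handle_workbegin()
first = next(gen)  # the body runs now, up to the first yield

# The generator has a single yield, so a second next() raises StopIteration.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

stop_flag = next(handle_workend())
print(first, exhausted, stop_flag)
```

Each handler body therefore runs only when something calls next() on it, and each of these single-yield handlers can serve exactly one exit event.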

Launching and timing

This part is also boilerplate and can be used as is: it records the start and end times and prints them.

# We maintain the wall clock time.

globalStart = time.time()

print("Running the simulation")
print("Using KVM cpu")

m5.stats.reset()

# We start the simulation
simulator.run()

print("All simulation events were successful.")

# We print the final simulation statistics.

print("Done with the simulation")
print()
print("Performance statistics:")

print("Simulated time in ROI: " + ((str(simulator.get_roi_ticks()[0]))))
print(
    "Ran a total of", simulator.get_current_tick() / 1e12, "simulated seconds"
)
print(
    "Total wallclock time: %.2fs, %.2f min"
    % (time.time() - globalStart, (time.time() - globalStart) / 60)
)
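
The division by 1e12 in the listing above converts ticks to seconds, which works if one tick is one picosecond (gem5's default, as far as I know). A small sketch of the two conversions, with hypothetical helper names and made-up input values:

```python
TICKS_PER_SECOND = 1e12  # assumption: one tick equals one picosecond


def ticks_to_seconds(ticks: int) -> float:
    """Convert a tick count to simulated seconds."""
    return ticks / TICKS_PER_SECOND


def format_wallclock(elapsed_s: float) -> str:
    """Format host wall-clock time the same way the script does."""
    return "Total wallclock time: %.2fs, %.2f min" % (elapsed_s, elapsed_s / 60)


print(ticks_to_seconds(2_500_000_000_000))
print(format_wallclock(90.0))
```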

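Code walkthrough:
The simulation is started by simulator.run().
gem5/src/python/gem5/simulate/simulator.py first decides which exit-event handlers to use: if the user supplied on_exit_event, that dictionary is used; otherwise the defaults are used. In the tutorial's x86-parsec-benchmarks.py we did supply one: on_exit_event={ExitEvent.WORKBEGIN: handle_workbegin(), ExitEvent.WORKEND: handle_workend()}.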

if on_exit_event:
    self._on_exit_event = on_exit_event
else:
    self._on_exit_event = self._default_on_exit_dict

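At this point the simulator knows about on_exit_event but has not used it yet; that happens only when x86-parsec-benchmarks.py calls simulator.run().
simulator.run() is a function already written in gem5/src/python/gem5/simulate/simulator.py. It keeps the simulation running and, at each exit event, checks whether to stop; once the exit condition is met, run returns.
From the point of view of x86-parsec-benchmarks.py, all the variables declared earlier exist to feed simulator.run(); after run finishes, the script prints how long it took.
This completes the walkthrough of x86-parsec-benchmarks.py itself.
[Figure: the run function in simulator.py]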
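
The interplay between on_exit_event and run() can be imitated in plain Python. This is a simplified mock under stated assumptions, not gem5's actual implementation: MockSimulator, always, and the three-member ExitEvent enum are stand-ins, the list of pending events replaces the real simulation, and the mock's defaults stop at every event, which is not necessarily what gem5's defaults do.

```python
from enum import Enum


class ExitEvent(Enum):
    """Stand-in for gem5's ExitEvent; only three members are modeled."""
    WORKBEGIN = "workbegin"
    WORKEND = "workend"
    EXIT = "exit"


def always(value):
    """Generator that yields the same stop/continue flag forever."""
    while True:
        yield value


class MockSimulator:
    def __init__(self, pending_events, on_exit_event=None):
        # Events the pretend simulation will raise, in order.
        self._pending = list(pending_events)
        self._default_on_exit_dict = {e: always(True) for e in ExitEvent}
        # Same selection logic as the excerpt from simulator.py.
        if on_exit_event:
            self._on_exit_event = on_exit_event
        else:
            self._on_exit_event = self._default_on_exit_dict
        self.handled = []

    def run(self):
        while self._pending:
            event = self._pending.pop(0)  # stands in for the simulation pausing
            self.handled.append(event)
            # Advance the handler's generator; it yields True to stop.
            if next(self._on_exit_event[event]):
                return


def handle_workbegin():
    yield False  # keep simulating after the ROI starts


def handle_workend():
    yield True  # stop once the ROI ends


events = [ExitEvent.WORKBEGIN, ExitEvent.WORKEND, ExitEvent.EXIT]
custom = MockSimulator(
    events,
    on_exit_event={
        ExitEvent.WORKBEGIN: handle_workbegin(),
        ExitEvent.WORKEND: handle_workend(),
    },
)
custom.run()
default = MockSimulator(events)
default.run()
print([e.name for e in custom.handled])
print([e.name for e in default.handled])
```

With the user-supplied handlers the mock continues past WORKBEGIN and returns at WORKEND, so the trailing EXIT event is never reached.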

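Files other than x86-parsec-benchmarks.py

How m5.stats reads and writes files

The script's own print calls only output a few strings and the time spent in simulator.run().
So where does the ROI data we want to analyze, the pile of stats in m5out, come from?
It is produced by the m5.stats.reset() and m5.stats.dump() calls made from x86-parsec-benchmarks.py.
Both API functions are defined in gem5/src/python/m5/stats/__init__.py.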

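Of the two, dump does more work: gem5/src/python/m5/stats/__init__.py collects a lot of data into outputList (at this point still only in the memory of the Python program), and then uses from .gem5stats import JsonOutputVistor to start writing files.
JsonOutputVistor is defined in gem5/src/python/m5/stats/gem5stats.py. One of its jobs is to open a file and write to it, as follows:
with open(self.file, "w") as fp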

At this point we know how the tutorial's x86-parsec-benchmarks.py calls into the files under src to write its results to disk.
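
The open-and-write pattern described above can be sketched with the standard library alone. JsonStatsWriter is a hypothetical stand-in, not gem5's JsonOutputVistor, and the stat values are made up for illustration.

```python
import json
import os
import tempfile


class JsonStatsWriter:
    """Hypothetical stand-in: holds a target path and writes stats as JSON."""

    def __init__(self, file: str):
        self.file = file

    def dump(self, stats: dict) -> None:
        # Mode "w" creates the file or truncates an existing one.
        with open(self.file, "w") as fp:
            json.dump(stats, fp, indent=4)


# Made-up values; a real run produces far more entries.
stats = {"simTicks": 1000, "simInsts": 250}
path = os.path.join(tempfile.mkdtemp(), "stats.json")
JsonStatsWriter(path).dump(stats)

with open(path) as fp:
    loaded = json.load(fp)
print(loaded)
```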

garnet parser

For my research I need garnet + PARSEC. However, the argparse setup in x86-parsec-benchmarks.py has no garnet options, whereas gem5/configs/example/ruby_random_test.py does.
For example, the tutorial invocation of x86-parsec-benchmarks.py takes no garnet options:

build/X86/gem5.opt configs/example/gem5_library/x86-parsec-benchmarks.py \
    --benchmark blackscholes \
    --size simsmall

Running ruby_random_test.py, by contrast, accepts the garnet arguments:

./build/X86/gem5.opt \
    configs/example/ruby_random_test.py \
    --num-cpus=16 \
    --num-dirs=16 \
    --network=garnet \
    --topology=Mesh_XY \
    --mesh-rows=4
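Comparing x86-parsec-benchmarks.py with ruby_random_test.py shows that ruby_random_test.py uses Ruby, and gem5/src/mem/ruby/network/ contains two folders, garnet and simple, which correspond to --network=garnet and --network=simple.

Configuring the existing garnet

gem5/src/mem/ruby/network/garnet/ contains many files, all of which can be used.
Garnet's own official description is as follows:

Modifying the garnet source code in the future

gem5/src/mem/ruby/network/garnet/SConscript lists the C++ sources that get compiled. To use your own router or NI, you have to write a new one, add it as a source entry in that file, and then rebuild gem5 from scratch.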
