Qualcomm® AI Engine Direct User Manual (5)



4.1.2 HTP - QNN HTP Backend Extensions

QNN HTP Backend Extensions

The qnn-net-run utility is backend agnostic, meaning it can only use the generic QNN API. The backend extensions feature makes it possible to use backend-specific APIs, i.e., custom configurations. More documentation on backend extensions can be found under qnn-net-run. Note that the scope of QNN backend extensions is limited to qnn-net-run.

HTP backend extensions provide an interface for supplying custom options to the HTP backend. They are also required to enable the different performance modes. These options and performance modes can be exercised by providing the extensions shared library libQnnHtpNetRunExtensions.so and, if necessary, a configuration file.

To use the backend-extensions-related parameters with qnn-net-run, use the --config_file argument and provide the path to a JSON file.

$ qnn-net-run --model <qnn_model_name.so> \
              --backend <path_to_model_library>/libQnnHtp.so \
              --output_dir <output_dir_for_result> \
              --input_list <path_to_input_list.txt> \
              --config_file <path to JSON of backend extensions>
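A concrete invocation might look like the following sketch; all file names here are hypothetical placeholders (on an Android target they would typically be pushed to a working directory on the device first):

$ qnn-net-run --model libqnn_model.so \
              --backend libQnnHtp.so \
              --output_dir output \
              --input_list input_list.txt \
              --config_file htp_extensions.json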

The configuration file above contains a minimal set of parameters specified through JSON (e.g., the backend extensions configuration), as shown below:

{
    "backend_extensions" :
    {
        "shared_library_path" : "path_to_shared_library",  // give path to shared extensions library (.so)
        "config_file_path" : "path_to_config_file"         // give path to backend config
    }
}
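As a minimal sketch, the JSON passed to --config_file could therefore look like the following; both paths are placeholders and must point to the extensions library shipped with the SDK and to your own HTP options file:

{
    "backend_extensions" :
    {
        "shared_library_path" : "libQnnHtpNetRunExtensions.so",
        "config_file_path" : "htp_backend_config.json"
    }
}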

Users can set custom options and different performance modes for the HTP backend through the backend config. The various kinds of options available in the config are shown below:

{
   "type": "object", "properties": {
     "graphs": {
         "type": "object", "properties": {

           // Corresponds to the graph name provided to QnnGraph_create
           // Used by qnn-net-run during online prepare and qnn-context-binary-generator uses it during offline preparation
           "graph_names": {"type": "array", "items": {"type": "string"}},

           // Provides performance infrastructure configuration options that are memory specific [optional]
           // Used by qnn-net-run during online prepare and qnn-context-binary-generator uses it during offline preparation
           "vtcm_mb": {"type": "integer"},

           // Used to perform computation with half precision i.e. 16 bits [optional] [default: 0]
           // Used by qnn-net-run during online prepare and qnn-context-binary-generator uses it during offline preparation
           "fp16_relaxed_precision": {"type": "integer"},

           // Corresponds to the number of HVX threads to use for a particular graph during an inference.
           // Used by qnn-net-run during online prepare and qnn-context-binary-generator uses it during offline preparation
           "hvx_threads": {"type": "integer"},

           // Set Graph optimization value in range 1 to 3 [optional] [default: 2]
           // 1 = Faster preparation time, less optimal graph, 2 = Longer preparation time, more optimal graph
           // 3 = Longest preparation time, most likely even more optimal graph
           // Used by qnn-net-run during online prepare and qnn-context-binary-generator uses it during offline preparation
           "O": {"type": "number", "multipleOf": 1},

           // Provide deep learning bandwidth compression value 0 or 1 [optional] [default: 0]
           // Used by qnn-net-run during online prepare and qnn-context-binary-generator uses it during offline preparation
           "dlbc": {"type": "number", "multipleOf": 1}
         }
     },
     "devices": {
       "type": "array", "items": {
         "type": "object", "properties": {

           // Selection of the device [optional] [default: 0]
           // Used by qnn-net-run
           "device_id": {"type": "integer"},

           // Selection of the SoC [optional] [default: 0]
           // Used by qnn-net-run and qnn-context-binary-generator
           "soc_id": {"type": "integer"},

           // Set dsp architecture value [optional] [default: NONE]
           // Used by qnn-net-run and qnn-context-binary-generator
           "dsp_arch": {"type": "string"},

           // Specifies the user pd attribute [optional] [default: "unsigned"]
           // Used by qnn-net-run and qnn-context-binary-generator
           "pd_session": {"type": "string"},

           // Used for linting profiling level [optional] [default: not set]
           // Used by qnn-net-run and qnn-context-binary-generator
           "profiling_level": {"type": "string"},

           // Specifies whether to use null context or not. true means using a unique power context id, and false means using null context.
           // NOTE: This parameter is not supported for v68 onwards
           // Used by qnn-net-run
           "use_client_context": {"type": "boolean"},
           "cores": {
             "type": "array", "items": {
               "type": "object", "properties": {

                 // Select the core [optional] [default: 0]
                 // Used by qnn-net-run
                 "core_id": {"type": "integer"},

                 // Provide performance profile [optional] [default: "high_performance"]
                 // Used by qnn-net-run
                 // NOTE: command line perf profile option is now deprecated.
                 "perf_profile": {"type": "string"},

                 // Rpc control latency value in micro second [optional] [default: 100us]
                 // Used by qnn-net-run
                 "rpc_control_latency": {"type": "integer"},

                 // Rpc polling time value in micro second [optional]
                 // [default: 9999 us for burst, high_performance & sustained_high_performance, 0 us for other perf profiles]
                 // Used by qnn-net-run
                 "rpc_polling_time": {"type": "integer"},

                 // Hmx timeout value in micro second [optional] [default: 300000us]
                 // Used by qnn-net-run
                 "hmx_timeout_us": {"type": "integer"}
               }
             }
           }
         }
       }
     },
     "context": {
       "type": "object", "properties": {

         // Used for enabling Weight Sharing [optional] [default: false]
         // Used by qnn-context-binary-generator during offline preparation
         "weight_sharing_enabled": {"type": "boolean"},

         // Used to associate max spill-fill buffer size across multiple contexts within a group [optional] [default: Not Set]
         // Used by qnn-net-run and throughput-net-run during offline preparation. group_id value must be set to 0 for this option to be used.
         "max_spill_fill_buffer_for_group": {"type": "integer"},

         // Specifies the group id to which contexts can be associated [optional] [default: None]
         // Used by qnn-net-run and throughput-net-run during offline preparation.
         "group_id": {"type": "integer"}
       }
     }
   }
}
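For illustration only, a backend config (the file referenced by config_file_path above) that exercises a few of these fields might look like the following; the graph name, VTCM size, dsp_arch, and perf_profile values are placeholders that must match your model and target:

{
   "graphs": {
       "graph_names": ["qnn_model"],
       "vtcm_mb": 8,
       "fp16_relaxed_precision": 1,
       "O": 3
   },
   "devices": [
      {
         "device_id": 0,
         "soc_id": 0,
         "dsp_arch": "v73",
         "cores": [
            {
               "core_id": 0,
               "perf_profile": "burst",
               "rpc_control_latency": 100
            }
         ]
      }
   ],
   "context": {
      "weight_sharing_enabled": false
   }
}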

Check Qnn_SocModel_t for setting the soc_id parameter. Note that the graphs object here will be deprecated starting with SDK version 2.20 in favor of a graphs array, as shown below:

{
   "graphs": [
     {
        .....
     },
        .....
   ]
}
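Under the new format, the same per-graph options simply move into array entries, one object per graph. A sketch with illustrative values:

{
   "graphs": [
     {
        "graph_names": ["qnn_model"],
        "vtcm_mb": 8,
        "O": 3
     }
   ]
}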

Performance Modes with HTP Backend Extensions
Performance modes for the backend extensions can be enabled through the backend config using the perf_profile parameter, as shown above. Valid settings are low_balanced, balanced, default, high_performance, sustained_high_performance, burst, low_power_saver, power_saver, high_power_saver, extreme_power_saver, and system_settings. These performance modes use different configurations of core clocks, bus clocks, DCVS, and sleep latency. There are three voltage corners, defined as TURBO, NOM, and SVS, each with different minimum and maximum frequency thresholds. In addition to the maximum and minimum voltage corners, the maximum and minimum frequencies supported by the target are set. For more details on performance mode configuration and its parameters, refer to the Hexagon SDK documentation. The settings used by the different performance modes are shown in the table below:

[Table (image in the original): per-mode settings for core clock, bus clock, DCVS, and sleep latency]

The table above is ordered from the highest-performance mode (BURST) to the lowest (EXTREME_POWER_SAVER). BURST and SUSTAINED_HIGH_PERFORMANCE use a timer during execution, which helps keep a high vote in place across all inferences and avoids voting performance up and down in between until the timer expires. They have low sleep latency and disable DCVS during execution. DCVS can both raise and lower core/bus clock speeds, using the min_corner and max_corner votes as lower and upper bounds. BURST has the highest frequencies and the highest votes, and therefore the best performance. POWER_SAVER, LOW_POWER_SAVER, and HIGH_POWER_SAVER use lower frequencies and do not hold the vote; they have higher sleep latency and enable DCVS during execution. EXTREME_POWER_SAVER is the lowest-performing performance mode but saves the most power. For more details on the voltage corners used by these performance modes, refer to the program listing in the file QnnHtpPerfInfrastruct.h.

The following configuration can be used to set the performance profile and RPC polling time:

{
   "graphs": {
       ...
       ...
   },
   "devices": [
      {
         ...
         "cores":[
            {
                "perf_profile": "burst",    // use this to set any of the above performance profile
                "rpc_polling_time": 9999    // use this to set rpc polling, ranges 0-9999 us
                "rpc_control_latency": 100  // use to set rpc control latency
            }
         ]
      }
   ]
}

Note that the above configuration structure will be deprecated starting with SDK version 2.20; the newly supported configuration is shown below:

{
   "graphs": [
     {
       ...
       ...
     }
     ....
   ],
   "devices": [
      {
         ...
         "cores":[
            {
                "perf_profile": "burst",    // use this to set any of the above performance profile
                "rpc_polling_time": 9999    // use this to set rpc polling, ranges 0-9999 us
                "rpc_control_latency": 100  // use to set rpc control latency
            }
         ]
      }
   ]
}