Fixing Nsight Compute CLI reporting n/a for every metric when profiling a program

Environment

GPU: GeForce GTX 1660 Ti

CUDA version: 10.1

CUDA driver version: 455.45.01

Operating system: Ubuntu 18.04

Problem description

To profile a CUDA program with the Nsight Compute CLI, I entered the following in a terminal:

sudo /usr/local/cuda-10.1/NsightCompute-2019.1/nv-nsight-cu-cli --f ./test 

The output was as follows:

==PROF== Connected to process 4835
==PROF== Profiling "matrix_add_2D" - 1: 0%....50%....100% - 3 passes
Success!
==PROF== Disconnected from process 4835
[4835] test@127.0.0.1
matrix_add_2D, 2020-Dec-08 22:44:46, Context 1, Stream 7
Section: GPU Speed Of Light
---------------------------------------------------------------------- --------------- ------------------------------
Memory Frequency                                                                                              (!) n/a
SOL FB                                                                                                        (!) n/a
Elapsed Cycles                                                                                                (!) n/a
SM Frequency                                                                                                  (!) n/a
Memory [%]                                                                                                      (!) n/a
Duration                                                                                                      (!) n/a
SOL L2                                                                                                        (!) n/a
SOL TEX                                                                                                       (!) n/a
SM Active Cycles                                                                                              (!) n/a
SM [%]                                                                                                        (!) n/a
---------------------------------------------------------------------- --------------- ------------------------------

Section: Compute Workload Analysis
---------------------------------------------------------------------- --------------- ------------------------------
Executed Ipc Active                                                                                           (!) n/a
Executed Ipc Elapsed                                                                                          (!) n/a
Issue Slots Max                                                                                               (!) n/a
Issued Ipc Active                                                                                             (!) n/a
Issue Slots Busy                                                                                              (!) n/a
SM Busy                                                                                                       (!) n/a
---------------------------------------------------------------------- --------------- ------------------------------

Section: Memory Workload Analysis
---------------------------------------------------------------------- --------------- ------------------------------
Memory Throughput                                                                                             (!) n/a
Mem Busy                                                                                                      (!) n/a
Max Bandwidth                                                                                                 (!) n/a
L2 Hit Rate                                                                                                   (!) n/a
Mem Pipes Busy                                                                                                (!) n/a
L1 Hit Rate                                                                                                   (!) n/a
---------------------------------------------------------------------- --------------- ------------------------------

Section: Scheduler Statistics
---------------------------------------------------------------------- --------------- ------------------------------
Active Warps Per Scheduler                                                                                    (!) n/a
Eligible Warps Per Scheduler                                                                                  (!) n/a
No Eligible                                                                                                   (!) n/a
Instructions Per Active Issue Slot                                                                            (!) n/a
Issued Warp Per Scheduler                                                                                     (!) n/a
One or More Eligible                                                                                          (!) n/a
---------------------------------------------------------------------- --------------- ------------------------------

Section: Warp State Statistics
---------------------------------------------------------------------- --------------- ------------------------------
Avg. Not Predicated Off Threads Per Warp                                                                      (!) n/a
Avg. Active Threads Per Warp                                                                                  (!) n/a
Warp Cycles Per Executed Instruction                                                                          (!) n/a
Warp Cycles Per Issued Instruction                                                                            (!) n/a
Warp Cycles Per Issue Active                                                                                  (!) n/a
---------------------------------------------------------------------- --------------- ------------------------------

Section: Instruction Statistics
---------------------------------------------------------------------- --------------- ------------------------------
Avg. Executed Instructions Per Scheduler                                                                      (!) n/a
Executed Instructions                                                                                         (!) n/a
Avg. Issued Instructions Per Scheduler                                                                        (!) n/a
Issued Instructions                                                                                           (!) n/a
---------------------------------------------------------------------- --------------- ------------------------------

Section: Launch Statistics
---------------------------------------------------------------------- --------------- ------------------------------
Block Size                                                                                                      1,024
Grid Size                                                                                                       1,024
Registers Per Thread                                                   register/thread                             16
Shared Memory Configuration Size                                                 Kbyte                          49.15
Dynamic Shared Memory Per Block                                             byte/block                              0
Static Shared Memory Per Block                                              byte/block                              0
Threads                                                                         thread                      1,048,576
Waves Per SM                                                                                                    42.67
---------------------------------------------------------------------- --------------- ------------------------------

Section: Occupancy
---------------------------------------------------------------------- --------------- ------------------------------
Block Limit SM                                                                   block                             16
Block Limit Registers                                                         register                              4
Block Limit Shared Mem                                                            byte                            nan
Block Limit Warps                                                                 warp                              1
Achieved Active Warps Per SM                                                                                  (!) n/a
Achieved Occupancy                                                                                            (!) n/a
Theoretical Active Warps per SM                                             warp/cycle                             32
Theoretical Occupancy                                                                %                            100
---------------------------------------------------------------------- --------------- ------------------------------

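As the output shows, most metrics are reported as n/a. Only the Launch Statistics section and the theoretical occupancy limits have values; none of the measured metrics do.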

Solution

Change the CUDA version from 10.1 to 10.1 Update 2. The metrics that previously showed n/a are then displayed correctly.
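
One way to confirm that the fix took effect is to save the profiler output to a text file and count the rows still marked "(!) n/a". The snippet below is a minimal sketch and not part of the original workflow: the function name count_na and the file name report.txt are hypothetical, and it assumes the saved text has the same layout as the output shown above.

```shell
#!/usr/bin/env bash

# count_na FILE
# Print the number of lines in FILE that contain the literal marker "(!) n/a".
# (count_na is a hypothetical helper name used only for this illustration.)
count_na() {
    # -F treats the pattern as a fixed string, -c prints the count of matching lines.
    # grep exits with status 1 when there are no matches (it still prints 0),
    # so "|| true" keeps the function from reporting failure in that case.
    grep -c -F '(!) n/a' "$1" || true
}

# Example usage (assumes the profiler output was saved to report.txt beforehand):
#   count_na report.txt
```

Assuming the report is written to standard output, the profiling command can be redirected to report.txt before and after the upgrade. A count of 0 afterwards means no metric row is marked n/a any more.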

References

https://developer.nvidia.com/blog/using-nsight-compute-to-inspect-your-kernels/
