nvidia-smi - NVIDIA System Management Interface program

1. nvidia-smi

deepnorth@deepnorth-amax:~/software$ nvidia-smi
Tue Jul  9 22:24:29 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.14       Driver Version: 430.14       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:1B:00.0 Off |                  N/A |
| 32%   48C    P0    66W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:1C:00.0 Off |                  N/A |
| 34%   49C    P0    60W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce RTX 208...  Off  | 00000000:1D:00.0 Off |                  N/A |
| 34%   49C    P0    63W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce RTX 208...  Off  | 00000000:1E:00.0 Off |                  N/A |
| 33%   49C    P0    54W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   4  GeForce RTX 208...  Off  | 00000000:89:00.0 Off |                  N/A |
| 32%   48C    P0    78W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   5  GeForce RTX 208...  Off  | 00000000:8A:00.0 Off |                  N/A |
| 34%   49C    P0    65W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  GeForce RTX 208...  Off  | 00000000:8B:00.0 Off |                  N/A |
| 33%   49C    P0    52W / 250W |      0MiB / 11019MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
|   7  GeForce RTX 208...  Off  | 00000000:8C:00.0 Off |                  N/A |
| 33%   50C    P0    55W / 250W |      0MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
deepnorth@deepnorth-amax:~/software$

(base) yongqiang@famu-sys:~$ nvidia-smi
Thu Jan 16 18:55:16 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:02:00.0  On |                  N/A |
| 58%   76C    P2   120W / 250W |   5139MiB / 11177MiB |      3%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
| 60%   76C    P2    95W / 250W |   4845MiB / 11178MiB |     24%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  Off  | 00000000:82:00.0 Off |                  N/A |
| 57%   75C    P2   200W / 250W |   4845MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  Off  | 00000000:83:00.0 Off |                  N/A |
| 60%   78C    P2   206W / 250W |   4845MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1306      G   /usr/lib/xorg/Xorg                            14MiB |
|    0      2865      G   /usr/lib/xorg/Xorg                           155MiB |
|    0     28830      C   ./darknet                                   4957MiB |
|    1     28830      C   ./darknet                                   4833MiB |
|    2     28830      C   ./darknet                                   4833MiB |
|    3     28830      C   ./darknet                                   4833MiB |
+-----------------------------------------------------------------------------+
(base) yongqiang@famu-sys:~$
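
The Processes table above can be totalled per GPU with a few lines of Python. This is only a sketch: the regular expression assumes the column layout printed by the driver versions shown here (390.48 and 430.14), and the names `PROCESS_ROW` and `memory_by_gpu` are made up for illustration. Other driver releases may print a different table.

```python
import re
from collections import defaultdict

# Matches rows such as "|    0     28830      C   ./darknet      4957MiB |".
# Assumption: GPU index, PID, type, process name, then memory in MiB.
PROCESS_ROW = re.compile(r"^\|\s+(\d+)\s+(\d+)\s+(\S+)\s+(.+?)\s+(\d+)MiB\s+\|$")


def memory_by_gpu(smi_text):
    """Sum the per-process GPU memory (MiB) for each GPU index."""
    totals = defaultdict(int)
    for line in smi_text.splitlines():
        match = PROCESS_ROW.match(line.rstrip())
        if match:
            gpu, _pid, _type, _name, mib = match.groups()
            totals[int(gpu)] += int(mib)
    return dict(totals)


# Rows copied from the second transcript above.
sample = """\
|    0      1306      G   /usr/lib/xorg/Xorg                            14MiB |
|    0      2865      G   /usr/lib/xorg/Xorg                           155MiB |
|    0     28830      C   ./darknet                                   4957MiB |
|    1     28830      C   ./darknet                                   4833MiB |
"""
print(memory_by_gpu(sample))  # {0: 5126, 1: 4833}
```

Header rows, separator rows and the "No running processes found" line do not match the pattern, so they are skipped.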

2. NVIDIA System Management Interface

NVIDIA System Management Interface
https://developer.nvidia.com/nvidia-system-management-interface

The NVIDIA System Management Interface (nvidia-smi) is a command line utility, based on top of the NVIDIA Management Library (NVML), intended to aid in the management and monitoring of NVIDIA GPU devices.

This utility allows administrators to query GPU device state and, with the appropriate privileges, to modify GPU device state. It is targeted at the Tesla™, GRID™, Quadro™ and Titan X products, though limited support is also available on other NVIDIA GPUs.

nvidia-smi ships with NVIDIA GPU display drivers on Linux, and with 64-bit Windows Server 2008 R2 and Windows 7. It can report query information as XML or human-readable plain text to either standard output or a file. For more details, please refer to the nvidia-smi documentation.

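intend [ɪnˈtend]: vt. to plan, to want, to mean; vi. to have a plan
aid [eɪd]: n. assistance, help, helper; vt. to assist, to help, to contribute to; vi. to help
appropriate [əˈprəʊprɪət; (for v.) əˈprəʊprɪeɪt]: adj. suitable, fitting, proper; vt. to take for one's own use, to allocate
privilege [ˈprɪvəlɪdʒ]: n. special right, preferential treatment; vt. to grant a privilege to, to exempt
permit [pəˈmɪt]: vi. to allow; vt. to allow, to authorize; n. permit, license
administrator [ədˈmɪnɪstreɪtə(r)]: n. manager, administrative official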

3. nvidia-smi.txt

nvidia-smi.txt
https://developer.download.nvidia.cn/compute/DCGM/docs/nvidia-smi-367.38.pdf

nvidia-smi (also NVSMI) provides monitoring and management capabilities for each of NVIDIA's Tesla, Quadro, GRID and GeForce devices from the Fermi and higher architecture families. GeForce Titan series devices are supported for most functions, with very limited information provided for the remainder of the GeForce brand. NVSMI is a cross-platform tool that supports all standard NVIDIA driver-supported Linux distros, as well as 64-bit versions of Windows starting with Windows Server 2008 R2. Metrics can be consumed directly by users via stdout, or written to a file in CSV or XML format for scripting purposes.

Note that much of the functionality of NVSMI is provided by the underlying NVML C-based library. See the NVIDIA developer website link below for more information about NVML. NVML-based Python bindings are also available.

The output of NVSMI is not guaranteed to be backwards compatible. However, both NVML and the Python bindings are backwards compatible, and should be the first choice when writing any tools that must be maintained across NVIDIA driver releases.

NVML SDK: http://developer.nvidia.com/nvidia-management-library-nvml/
Python bindings: http://pypi.python.org/pypi/nvidia-ml-py/
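
The recommendation to prefer the Python bindings over parsing NVSMI output can be sketched as follows. This is a minimal sketch, assuming the `nvidia-ml-py` package is installed (it provides a module named `pynvml`); the helper name `list_gpus` is made up for illustration. The NVML module is passed in as a parameter so the function can be tried without a GPU.

```python
def list_gpus(nvml):
    """Return [(index, name, used_mib, total_mib)] using an NVML binding module.

    `nvml` is expected to be the `pynvml` module (or an object that offers
    the same functions).
    """
    nvml.nvmlInit()
    try:
        rows = []
        for index in range(nvml.nvmlDeviceGetCount()):
            handle = nvml.nvmlDeviceGetHandleByIndex(index)
            name = nvml.nvmlDeviceGetName(handle)
            if isinstance(name, bytes):  # older bindings return bytes
                name = name.decode()
            memory = nvml.nvmlDeviceGetMemoryInfo(handle)  # values in bytes
            rows.append((index, name, memory.used // 2**20, memory.total // 2**20))
        return rows
    finally:
        # Always release NVML, even if a query above raised an error.
        nvml.nvmlShutdown()
```

On a machine with the driver loaded, the intended call would be `import pynvml` followed by `print(list_gpus(pynvml))`.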

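consume [kənˈsjuːm]: vt. to use up, to spend, to engross, to squander; vi. to be used up, to perish, to waste away
guarantee [ˌɡærənˈtiː]: n. assurance, warranty, guarantor, written guarantee, collateral; vt. to assure, to warrant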

3.1 OPTIONS

GENERAL OPTIONS
-h, --help
Print usage information and exit.

SUMMARY OPTIONS
-L, --list-gpus
List each of the NVIDIA GPUs in the system, along with their UUIDs.

(base) yongqiang@famu-sys:~$ nvidia-smi -L
GPU 0: GeForce GTX 1080 Ti (UUID: GPU-b6764bed-48e1-97ac-e1aa-e94916bbf983)
GPU 1: GeForce GTX 1080 Ti (UUID: GPU-dbec65c2-c9a0-393d-3d3b-2dc2e5bbaaf5)
GPU 2: GeForce GTX 1080 Ti (UUID: GPU-51321451-290e-f3ef-73a9-1571d33166cb)
(base) yongqiang@famu-sys:~$
(base) yongqiang@famu-sys:~$ nvidia-smi --list-gpus
GPU 0: GeForce GTX 1080 Ti (UUID: GPU-b6764bed-48e1-97ac-e1aa-e94916bbf983)
GPU 1: GeForce GTX 1080 Ti (UUID: GPU-dbec65c2-c9a0-393d-3d3b-2dc2e5bbaaf5)
GPU 2: GeForce GTX 1080 Ti (UUID: GPU-51321451-290e-f3ef-73a9-1571d33166cb)
(base) yongqiang@famu-sys:~$
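
The `-L` listing is regular enough to split into index, name and UUID. The following is a sketch only: the pattern assumes the line format shown above, and `GPU_LINE` and `parse_gpu_list` are illustrative names.

```python
import re

# Assumption: each line looks like "GPU <index>: <name> (UUID: GPU-<hex and dashes>)".
GPU_LINE = re.compile(r"^GPU (\d+): (.+) \(UUID: (GPU-[0-9a-fA-F-]+)\)$")


def parse_gpu_list(text):
    """Parse `nvidia-smi -L` output into a list of dicts."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append({
                "index": int(match.group(1)),
                "name": match.group(2),
                "uuid": match.group(3),
            })
    return gpus


# Lines copied from the transcript above.
sample = """\
GPU 0: GeForce GTX 1080 Ti (UUID: GPU-b6764bed-48e1-97ac-e1aa-e94916bbf983)
GPU 1: GeForce GTX 1080 Ti (UUID: GPU-dbec65c2-c9a0-393d-3d3b-2dc2e5bbaaf5)
"""
for gpu in parse_gpu_list(sample):
    print(gpu["index"], gpu["name"], gpu["uuid"])
```

Lines that do not match, such as shell prompts, are ignored.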

QUERY OPTIONS
-q, --query
Display GPU or Unit info. Displayed info includes all data listed in the (GPU ATTRIBUTES) or (UNIT ATTRIBUTES) sections of this document. Some devices and/or environments don't support all possible information. Any unsupported data is indicated by "N/A" in the output. By default, information for all available GPUs or Units is displayed. Use the -i option to restrict the output to a single GPU or Unit.

(base) yongqiang@famu-sys:~$ nvidia-smi -q

==============NVSMI LOG==============

Timestamp                           : Sun Nov 24 13:22:58 2019
Driver Version                      : 390.48

Attached GPUs                       : 3
GPU 00000000:02:00.0
    Product Name                    : GeForce GTX 1080 Ti
    Product Brand                   : GeForce
    Display Mode                    : Disabled
    Display Active                  : Enabled
    Persistence Mode                : Disabled
    Accounting Mode                 : Disabled
    Accounting Mode Buffer Size     : 4000
    Driver Model
        Current                     : N/A
        Pending                     : N/A
    Serial Number                   : N/A
    GPU UUID                        : GPU-b6764bed-48e1-97ac-e1aa-e94916bbf983
    Minor Number                    : 0
    VBIOS Version                   : 86.02.39.00.2A
    MultiGPU Board                  : No
    Board ID                        : 0x200
    GPU Part Number                 : N/A
    Inforom Version
        Image Version               : G001.0000.01.04
        OEM Object                  : 1.1
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    GPU Virtualization Mode
        Virtualization mode         : None
    PCI
        Bus                         : 0x02
        Device                      : 0x00
        Domain                      : 0x0000
        Device Id                   : 0x1B0610DE
        Bus Id                      : 00000000:02:00.0
        Sub System Id               : 0x36091462
        GPU Link Info
            PCIe Generation
                Max                 : 3
                Current             : 1
            Link Width
                Max                 : 16x
                Current             : 16x
        Bridge Chip
            Type                    : N/A
            Firmware                : N/A
        Replays since reset         : 0
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
    Fan Speed                       : 29 %
    Performance State               : P8
    Clocks Throttle Reasons
        Idle                        : Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
            HW Thermal Slowdown     : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost                  : Not Active
        SW Thermal Slowdown         : Not Active
        Display Clock Setting       : Not Active
    FB Memory Usage
        Total                       : 11177 MiB
        Used                        : 156 MiB
        Free                        : 11021 MiB
    BAR1 Memory Usage
        Total                       : 256 MiB
        Used                        : 6 MiB
        Free                        : 250 MiB
    Compute Mode                    : Default
    Utilization
        Gpu                         : 0 %
        Memory                      : 2 %
        Encoder                     : 0 %
        Decoder                     : 0 %
    Encoder Stats
        Active Sessions             : 0
        Average FPS                 : 0
        Average Latency             : 0
    Ecc Mode
        Current                     : N/A
        Pending                     : N/A
    ECC Errors
        Volatile
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
        Aggregate
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
    Retired Pages
        Single Bit ECC              : N/A
        Double Bit ECC              : N/A
        Pending                     : N/A
    Temperature
        GPU Current Temp            : 15 C
        GPU Shutdown Temp           : 96 C
        GPU Slowdown Temp           : 93 C
        GPU Max Operating Temp      : N/A
        Memory Current Temp         : N/A
        Memory Max Operating Temp   : N/A
    Power Readings
        Power Management            : Supported
        Power Draw                  : 9.41 W
        Power Limit                 : 250.00 W
        Default Power Limit         : 250.00 W
        Enforced Power Limit        : 250.00 W
        Min Power Limit             : 125.00 W
        Max Power Limit             : 300.00 W
    Clocks
        Graphics                    : 139 MHz
        SM                          : 139 MHz
        Memory                      : 405 MHz
        Video                       : 544 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 1936 MHz
        SM                          : 1936 MHz
        Memory                      : 5505 MHz
        Video                       : 1620 MHz
    Max Customer Boost Clocks
        Graphics                    : N/A
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A
    Processes
        Process ID                  : 1289
            Type                    : G
            Name                    : /usr/lib/xorg/Xorg
            Used GPU Memory         : 14 MiB
        Process ID                  : 24402
            Type                    : G
            Name                    : /usr/lib/xorg/Xorg
            Used GPU Memory         : 139 MiB

GPU 00000000:03:00.0
    Product Name                    : GeForce GTX 1080 Ti
    Product Brand                   : GeForce
    Display Mode                    : Disabled
    Display Active                  : Disabled
    Persistence Mode                : Disabled
    Accounting Mode                 : Disabled
    Accounting Mode Buffer Size     : 4000
    Driver Model
        Current                     : N/A
        Pending                     : N/A
    Serial Number                   : N/A
    GPU UUID                        : GPU-dbec65c2-c9a0-393d-3d3b-2dc2e5bbaaf5
    Minor Number                    : 1
    VBIOS Version                   : 86.02.39.00.2A
    MultiGPU Board                  : No
    Board ID                        : 0x300
    GPU Part Number                 : N/A
    Inforom Version
        Image Version               : G001.0000.01.04
        OEM Object                  : 1.1
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    GPU Virtualization Mode
        Virtualization mode         : None
    PCI
        Bus                         : 0x03
        Device                      : 0x00
        Domain                      : 0x0000
        Device Id                   : 0x1B0610DE
        Bus Id                      : 00000000:03:00.0
        Sub System Id               : 0x36091462
        GPU Link Info
            PCIe Generation
                Max                 : 3
                Current             : 1
            Link Width
                Max                 : 16x
                Current             : 16x
        Bridge Chip
            Type                    : N/A
            Firmware                : N/A
        Replays since reset         : 0
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
    Fan Speed                       : 29 %
    Performance State               : P8
    Clocks Throttle Reasons
        Idle                        : Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
            HW Thermal Slowdown     : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost                  : Not Active
        SW Thermal Slowdown         : Not Active
        Display Clock Setting       : Not Active
    FB Memory Usage
        Total                       : 11178 MiB
        Used                        : 2 MiB
        Free                        : 11176 MiB
    BAR1 Memory Usage
        Total                       : 256 MiB
        Used                        : 5 MiB
        Free                        : 251 MiB
    Compute Mode                    : Default
    Utilization
        Gpu                         : 0 %
        Memory                      : 0 %
        Encoder                     : 0 %
        Decoder                     : 0 %
    Encoder Stats
        Active Sessions             : 0
        Average FPS                 : 0
        Average Latency             : 0
    Ecc Mode
        Current                     : N/A
        Pending                     : N/A
    ECC Errors
        Volatile
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
        Aggregate
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
    Retired Pages
        Single Bit ECC              : N/A
        Double Bit ECC              : N/A
        Pending                     : N/A
    Temperature
        GPU Current Temp            : 18 C
        GPU Shutdown Temp           : 96 C
        GPU Slowdown Temp           : 93 C
        GPU Max Operating Temp      : N/A
        Memory Current Temp         : N/A
        Memory Max Operating Temp   : N/A
    Power Readings
        Power Management            : Supported
        Power Draw                  : 8.04 W
        Power Limit                 : 250.00 W
        Default Power Limit         : 250.00 W
        Enforced Power Limit        : 250.00 W
        Min Power Limit             : 125.00 W
        Max Power Limit             : 300.00 W
    Clocks
        Graphics                    : 139 MHz
        SM                          : 139 MHz
        Memory                      : 405 MHz
        Video                       : 544 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 1936 MHz
        SM                          : 1936 MHz
        Memory                      : 5505 MHz
        Video                       : 1620 MHz
    Max Customer Boost Clocks
        Graphics                    : N/A
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A
    Processes                       : None

GPU 00000000:82:00.0
    Product Name                    : GeForce GTX 1080 Ti
    Product Brand                   : GeForce
    Display Mode                    : Disabled
    Display Active                  : Disabled
    Persistence Mode                : Disabled
    Accounting Mode                 : Disabled
    Accounting Mode Buffer Size     : 4000
    Driver Model
        Current                     : N/A
        Pending                     : N/A
    Serial Number                   : N/A
    GPU UUID                        : GPU-51321451-290e-f3ef-73a9-1571d33166cb
    Minor Number                    : 2
    VBIOS Version                   : 86.02.39.00.2A
    MultiGPU Board                  : No
    Board ID                        : 0x8200
    GPU Part Number                 : N/A
    Inforom Version
        Image Version               : G001.0000.01.04
        OEM Object                  : 1.1
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    GPU Virtualization Mode
        Virtualization mode         : None
    PCI
        Bus                         : 0x82
        Device                      : 0x00
        Domain                      : 0x0000
        Device Id                   : 0x1B0610DE
        Bus Id                      : 00000000:82:00.0
        Sub System Id               : 0x36091462
        GPU Link Info
            PCIe Generation
                Max                 : 3
                Current             : 1
            Link Width
                Max                 : 16x
                Current             : 16x
        Bridge Chip
            Type                    : N/A
            Firmware                : N/A
        Replays since reset         : 0
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
    Fan Speed                       : 29 %
    Performance State               : P8
    Clocks Throttle Reasons
        Idle                        : Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
            HW Thermal Slowdown     : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost                  : Not Active
        SW Thermal Slowdown         : Not Active
        Display Clock Setting       : Not Active
    FB Memory Usage
        Total                       : 11178 MiB
        Used                        : 2 MiB
        Free                        : 11176 MiB
    BAR1 Memory Usage
        Total                       : 256 MiB
        Used                        : 5 MiB
        Free                        : 251 MiB
    Compute Mode                    : Default
    Utilization
        Gpu                         : 0 %
        Memory                      : 0 %
        Encoder                     : 0 %
        Decoder                     : 0 %
    Encoder Stats
        Active Sessions             : 0
        Average FPS                 : 0
        Average Latency             : 0
    Ecc Mode
        Current                     : N/A
        Pending                     : N/A
    ECC Errors
        Volatile
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
        Aggregate
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
    Retired Pages
        Single Bit ECC              : N/A
        Double Bit ECC              : N/A
        Pending                     : N/A
    Temperature
        GPU Current Temp            : 22 C
        GPU Shutdown Temp           : 96 C
        GPU Slowdown Temp           : 93 C
        GPU Max Operating Temp      : N/A
        Memory Current Temp         : N/A
        Memory Max Operating Temp   : N/A
    Power Readings
        Power Management            : Supported
        Power Draw                  : 8.82 W
        Power Limit                 : 250.00 W
        Default Power Limit         : 250.00 W
        Enforced Power Limit        : 250.00 W
        Min Power Limit             : 125.00 W
        Max Power Limit             : 300.00 W
    Clocks
        Graphics                    : 139 MHz
        SM                          : 139 MHz
        Memory                      : 405 MHz
        Video                       : 544 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 1936 MHz
        SM                          : 1936 MHz
        Memory                      : 5505 MHz
        Video                       : 1620 MHz
    Max Customer Boost Clocks
        Graphics                    : N/A
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A
    Processes                       : None

(base) yongqiang@famu-sys:~$
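
The indented "Key : Value" layout of the `-q` report can be read into nested dictionaries. This is a rough sketch that assumes the text layout shown above; `parse_query_text` is an illustrative name. Repeated keys inside one section (for example several "Process ID" entries) overwrite each other, and the input should contain only the command output, without shell prompts.

```python
def parse_query_text(text):
    """Turn indented 'Key : Value' lines into nested dicts keyed by section name."""
    root = {}
    stack = [(-1, root)]  # (indent of the section header, dict for that section)
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("="):
            continue  # skip blank lines and the "====NVSMI LOG====" banner
        indent = len(line) - len(line.lstrip(" "))
        # Leave every open section that is not shallower than this line.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        key, sep, value = stripped.partition(" : ")
        if sep:
            parent[key.strip()] = value.strip()  # a leaf value
        else:
            section = {}                         # a section header
            parent[stripped] = section
            stack.append((indent, section))
    return root


# A short excerpt of the report above.
sample = """\
==============NVSMI LOG==============

Timestamp                           : Sun Nov 24 13:22:58 2019
Driver Version                      : 390.48

Attached GPUs                       : 3
GPU 00000000:02:00.0
    Product Name                    : GeForce GTX 1080 Ti
    FB Memory Usage
        Total                       : 11177 MiB
        Used                        : 156 MiB
    Compute Mode                    : Default
"""
report = parse_query_text(sample)
print(report["GPU 00000000:02:00.0"]["FB Memory Usage"]["Used"])  # 156 MiB
```

Because the text output is not guaranteed to stay stable across driver releases, a parser like this suits one-off inspection better than long-lived tooling.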

-l SEC, --loop=SEC
Continuously report query data at the specified interval, rather than the default of just once. The application will sleep between queries. Note that on Linux, ECC error or XID error events will print out during the sleep period if the -x flag was not specified. Pressing Ctrl+C at any time will abort the loop, which will otherwise run indefinitely. If no argument is specified for the -l form, a default interval of 5 seconds is used.

# 1 second
(base) yongqiang@famu-sys:~$ nvidia-smi -l 1

-lms ms, --loop-ms=ms
Same as -l, --loop but in milliseconds.

# 1000 milliseconds
(base) yongqiang@famu-sys:~$ nvidia-smi -lms 1000

continuously [kənˈtɪnjuəsli]: adv. without interruption
indefinitely [ɪnˈdefɪnətli]: adv. uncertainly, for an unlimited period, vaguely, unclearly
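
The two loop options can be chosen from a script by building the argument list first. The helper below is a hypothetical convenience (`loop_command` is not part of nvidia-smi); it only uses the `--loop=SEC` and `--loop-ms=ms` spellings listed above and does not start the program itself.

```python
def loop_command(seconds=None, milliseconds=None):
    """Build an nvidia-smi argument list for a looping query.

    Pass exactly one of `seconds` or `milliseconds`; with neither, the
    command runs once, which is the default behaviour of nvidia-smi.
    """
    if seconds is not None and milliseconds is not None:
        raise ValueError("give either seconds or milliseconds, not both")
    command = ["nvidia-smi"]
    if seconds is not None:
        command.append(f"--loop={int(seconds)}")
    elif milliseconds is not None:
        command.append(f"--loop-ms={int(milliseconds)}")
    return command


print(loop_command(seconds=1))          # ['nvidia-smi', '--loop=1']
print(loop_command(milliseconds=1000))  # ['nvidia-smi', '--loop-ms=1000']
```

The returned list can be handed to `subprocess.run`, and the loop then runs until it is interrupted, as with Ctrl+C in the shell.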

3.2 RETURN VALUE

The return code reflects whether the operation succeeded or failed and, on failure, the reason.

Return code 0 - Success
Return code 2 - A supplied argument or flag is invalid
Return code 3 - The requested operation is not available on target device
Return code 4 - The current user does not have permission to access this device or perform this operation
Return code 6 - A query to find an object was unsuccessful
Return code 8 - A device’s external power cables are not properly attached
Return code 9 - NVIDIA driver is not loaded
Return code 10 - NVIDIA Kernel detected an interrupt issue with a GPU
Return code 12 - NVML Shared Library couldn’t be found or loaded
Return code 13 - Local version of NVML doesn’t implement this function
Return code 14 - infoROM is corrupted
Return code 15 - The GPU has fallen off the bus or has otherwise become inaccessible
Return code 255 - Other error or internal driver error occurred
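
A script that wraps nvidia-smi can turn these codes into messages with a lookup table. The table below is copied from the list above; the names `RETURN_CODES` and `describe_return_code` are illustrative, and codes missing from the list are reported as undocumented rather than guessed.

```python
# Return codes and meanings, as listed in this section.
RETURN_CODES = {
    0: "Success",
    2: "A supplied argument or flag is invalid",
    3: "The requested operation is not available on target device",
    4: "The current user does not have permission to access this device or perform this operation",
    6: "A query to find an object was unsuccessful",
    8: "A device's external power cables are not properly attached",
    9: "NVIDIA driver is not loaded",
    10: "NVIDIA Kernel detected an interrupt issue with a GPU",
    12: "NVML Shared Library couldn't be found or loaded",
    13: "Local version of NVML doesn't implement this function",
    14: "infoROM is corrupted",
    15: "The GPU has fallen off the bus or has otherwise become inaccessible",
    255: "Other error or internal driver error occurred",
}


def describe_return_code(code):
    """Return the documented meaning of an nvidia-smi exit status."""
    return RETURN_CODES.get(code, f"Undocumented return code {code}")


print(describe_return_code(9))  # NVIDIA driver is not loaded
```

In practice the code would come from something like `subprocess.run(["nvidia-smi"]).returncode`.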

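corrupt [kəˈrʌpt]: adj. corrupt, venal, depraved; vt. to spoil, to deprave, to make worse; vi. to become depraved, to decay, to rot
occur [əˈkɜː(r)]: vi. to happen, to appear, to exist
invoke [ɪnˈvəʊk]: vt. to call, to pray for, to bring about, to implore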

3.3 GPU ATTRIBUTES

The following list describes all possible data returned by the -q device query option. Unless otherwise noted, all numerical results are base 10 and unitless.

Timestamp
The current system timestamp at the time nvidia-smi was invoked. Format is Day-of-week Month Day HH:MM:SS Year.
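
The timestamp format described here maps onto a `strptime` pattern. This is a small sketch; `parse_smi_timestamp` is an illustrative name, and it assumes an English locale, since `%a` and `%b` match locale-dependent day and month names.

```python
from datetime import datetime

# Day-of-week Month Day HH:MM:SS Year, e.g. "Sun Nov 24 13:22:58 2019".
SMI_TIMESTAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_smi_timestamp(text):
    """Parse an nvidia-smi timestamp string into a naive datetime."""
    return datetime.strptime(text.strip(), SMI_TIMESTAMP_FORMAT)


print(parse_smi_timestamp("Sun Nov 24 13:22:58 2019"))  # 2019-11-24 13:22:58
```

Single-digit days padded with an extra space, as in "Tue Jul  9 22:24:29 2019", are accepted as well, because whitespace in the pattern matches one or more whitespace characters.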

Driver Version
The version of the installed NVIDIA display driver. This is an alphanumeric string.

alphanumeric [ˌælfənjuːˈmerɪk]: adj. consisting of letters and digits

Attached GPUs
The number of NVIDIA GPUs in the system.

Product Name (Name)
The official product name of the GPU. This is an alphanumeric string. For all products.

Display Mode
A flag that indicates whether a physical display (e.g. a monitor) is currently connected to any of the GPU’s connectors. Enabled indicates an attached display. Disabled indicates otherwise.

Display Active (Disp.A)
A flag that indicates whether a display is initialized on the GPU (e.g. memory is allocated on the device for display). A display can be active even when no monitor is physically attached. Enabled indicates an active display. Disabled indicates otherwise. In the sample output below, only GPU 0, which hosts the Xorg processes, reports On in the Disp.A column.

(base) yongqiang@famu-sys:~$ nvidia-smi
Mon Nov 25 09:03:32 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:02:00.0  On |                  N/A |
| 29%   15C    P8     8W / 250W |    156MiB / 11177MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
| 29%   17C    P8     8W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  Off  | 00000000:82:00.0 Off |                  N/A |
| 29%   22C    P8     8W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1289      G   /usr/lib/xorg/Xorg                            14MiB |
|    0     24402      G   /usr/lib/xorg/Xorg                           139MiB |
+-----------------------------------------------------------------------------+
(base) yongqiang@famu-sys:~$
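
The first row of each GPU entry in the table above carries the index, the truncated name, Persistence-M, Bus-Id and Disp.A. The following is a minimal sketch of pulling those fields out of the text table with a regular expression. The pattern is written against the layout shown here and is an assumption; other driver versions may format the table differently, and the names `GPU_ROW` and `parse_gpu_rows` are hypothetical.

```python
import re
from typing import Dict, List

# Matches the first row of a GPU entry, e.g.
# |   0  GeForce GTX 108...  Off  | 00000000:02:00.0  On |        N/A |
GPU_ROW = re.compile(
    r"^\|\s+(?P<index>\d+)\s+(?P<name>.+?)\s+(?P<persistence>On|Off)\s+\|"
    r"\s+(?P<bus_id>[0-9A-Fa-f:.]+)\s+(?P<display_active>On|Off)\s+\|"
)


def parse_gpu_rows(table: str) -> List[Dict[str, str]]:
    """Return one dict per GPU row found in the nvidia-smi text table."""
    rows = []
    for line in table.splitlines():
        match = GPU_ROW.match(line.strip())
        if match:  # header, separator and statistics rows do not match
            rows.append(match.groupdict())
    return rows


SAMPLE = """\
|   0  GeForce GTX 108...  Off  | 00000000:02:00.0  On |                  N/A |
| 29%   15C    P8     8W / 250W |    156MiB / 11177MiB |      0%      Default |
|   1  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
"""

for row in parse_gpu_rows(SAMPLE):
    print(row["index"], row["bus_id"], row["display_active"])
```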

Persistence Mode (Persistence-M)
A flag that indicates whether persistence mode is enabled for the GPU. The value is either Enabled or Disabled. When persistence mode is enabled, the NVIDIA driver remains loaded even when no active clients, such as X11 or nvidia-smi, exist. This minimizes the driver load latency associated with running dependent apps, such as CUDA programs. For all CUDA-capable products. Linux only.

Accounting Mode
A flag that indicates whether accounting mode is enabled for the GPU. The value is either Enabled or Disabled. When accounting is enabled, statistics are calculated for each compute process running on the GPU. Statistics can be queried during the lifetime of the process or after it has terminated. The execution time of a process is reported as 0 while the process is running and is updated to the actual execution time after the process has terminated. See --help-query-accounted-apps for more info.

Accounting Mode Buffer Size
Returns the size of the circular buffer that holds the list of processes that can be queried for accounting stats. This is the maximum number of processes for which accounting information is stored before information about the oldest processes is overwritten by information about new processes.

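account [əˈkaʊnt]: n. an account, an explanation, a record of transactions, a bill, a reason, a description. vi. to explain, to cause, to render an account. vt. to consider, to regard as.
statistics [stə'tɪstɪks], stats: n. statistics, statistical data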

Driver Model
On Windows, the TCC and WDDM driver models are supported. The driver model can be changed with the (-dm) or (-fdm) flags. The TCC driver model is optimized for compute applications, i.e. kernel launch times are shorter with TCC. The WDDM driver model is designed for graphics applications and is not recommended for compute applications. Linux does not support multiple driver models, and always reports the value “N/A”.

Current - The driver model currently in use. Always “N/A” on Linux.
Pending - The driver model that will be used on the next reboot. Always “N/A” on Linux.

Serial Number
This number matches the serial number physically printed on each board. It is a globally unique immutable alphanumeric value.

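immutable [ɪˈmjuːtəbl]: adj. unchanging, unable to be changed
physically [ˈfɪzɪkli]: adv. bodily; in accordance with the laws of nature; in a material, tangible way
pend [pend]: v. to hang, to remain undecided, to await a decision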

Fan Speed (Fan)
The fan speed value is the percentage of maximum speed at which the device’s fan is currently intended to run. It ranges from 0 to 100%. Note: the reported speed is the intended fan speed. If the fan is physically blocked and unable to spin, this output will not match the actual fan speed. Many parts do not report fan speeds because they rely on cooling via fans in the surrounding enclosure. For all discrete products with dedicated fans.

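spin [spɪn]: vi. to rotate, to spin yarn, to feel dizzy. vt. to make rotate, to spin (yarn, a web), to make up (a story). n. a rotation, a fast ride.
enclosure [ɪnˈkləʊʒə(r)]: n. an attachment, a fence, an enclosed area, a case or housing
dedicate [ˈdedɪkeɪt]: vt. to devote, to commit oneself, to inscribe (a work) to someone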

Performance State (Perf)
The current performance state for the GPU. States range from P0 (maximum performance) to P12 (minimum performance), so a lower number means a higher performance state. For example, the GPUs in the sample output above, which show 0% utilization, report P8.
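
Because the performance state is reported as a string such as P8, a script usually converts it to an integer before comparing states. The following is a minimal sketch that assumes the P0 to P12 range given above; `parse_performance_state` is a hypothetical helper name.

```python
import re


def parse_performance_state(text: str) -> int:
    """Convert a Perf value such as 'P8' to its integer level (0 = fastest)."""
    match = re.fullmatch(r"P(\d{1,2})", text.strip())
    if match is None:
        raise ValueError(f"not a performance state: {text!r}")
    level = int(match.group(1))
    if not 0 <= level <= 12:
        raise ValueError(f"performance state out of range: {text!r}")
    return level


# A lower level means higher performance, so P0 compares as "faster" than P8.
print(parse_performance_state("P0") < parse_performance_state("P8"))
```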

Replay counter
This is the internal counter that records various errors on the PCIe bus.

Tx Throughput
The GPU-centric transmission throughput across the PCIe bus in MB/s over the past 20 ms. Only supported on Maxwell architectures and newer.

Rx Throughput
The GPU-centric receive throughput across the PCIe bus in MB/s over the past 20 ms. Only supported on Maxwell architectures and newer.

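replay [ˈriːpleɪ]: v. to play back (a tape, video, or film), to reproduce, to re-enact, to replay (a match that ended without a winner), to go over repeatedly in one's mind. n. a replayed match, a playback, a repetition.
throughput [ˈθruːpʊt]: n. the amount produced, or the number of people served, in a given period; throughput
transmission [trænzˈmɪʃn; trænsˈmɪʃn]: n. a gearbox; the act of passing on, sending, or broadcasting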

Compute Mode (Compute M.)
The compute mode flag indicates whether individual or multiple compute applications may run on the GPU.

Default means multiple contexts are allowed per device.

Exclusive Process means only one context is allowed per device, usable from multiple threads at a time.

Prohibited means no contexts are allowed per device (no compute apps).

EXCLUSIVE_PROCESS was added in CUDA 4.0. Prior CUDA releases supported only one exclusive mode, which is equivalent to EXCLUSIVE_THREAD in CUDA 4.0 and beyond.

For all CUDA-capable products.
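
The three modes differ only in how many contexts a device accepts. The sketch below encodes that rule so a script can check whether another compute application could start. The mode strings follow the wording above and may not match the exact text nvidia-smi prints; `MAX_CONTEXTS` and `can_create_context` are hypothetical names.

```python
from typing import Dict, Optional

# Maximum number of contexts per device; None means no limit from this flag.
MAX_CONTEXTS: Dict[str, Optional[int]] = {
    "Default": None,         # multiple contexts allowed
    "Exclusive Process": 1,  # one context, usable from multiple threads
    "Prohibited": 0,         # no compute apps at all
}


def can_create_context(mode: str, existing_contexts: int) -> bool:
    """Tell whether one more context fits under the given compute mode."""
    limit = MAX_CONTEXTS[mode]  # raises KeyError for an unknown mode
    return limit is None or existing_contexts < limit


print(can_create_context("Exclusive Process", existing_contexts=1))
```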

exclusive [ɪkˈskluːsɪv]: adj. sole, excluding others, single-minded. n. an exclusive news story, an exclusively run business, one who excludes others.

Utilization
Utilization rates report how busy each GPU is over time, and can be used to determine how much an application is using the GPUs in the system.

Note: During driver initialization with ECC enabled, GPU and Memory Utilization readings can be high. This is caused by the ECC memory scrubbing mechanism that runs during driver initialization.

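scrub [skrʌb]: n. stunted trees, a scrubbing, one who scrubs, a small person. vt. to scrub hard, to purify. vi. to scrub, to scrub up (disinfect hands and arms). adj. undersized, makeshift, inferior.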

GPU - Percent of time over the past sample period during which one or more kernels were executing on the GPU. The sample period may be between 1 second and 1/6 second depending on the product.

Memory - Percent of time over the past sample period during which global (device) memory was being read or written. The sample period may be between 1 second and 1/6 second depending on the product.

Encoder - Percent of time over the past sample period during which the GPU’s video encoder was being used. The sampling rate is variable and can be obtained directly via the nvmlDeviceGetEncoderUtilization() API.

Decoder - Percent of time over the past sample period during which the GPU’s video decoder was being used. The sampling rate is variable and can be obtained directly via the nvmlDeviceGetDecoderUtilization() API.
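
The GPU figure is a share of time, not a measure of how many kernels ran: overlapping kernels count once, and a single small kernel that runs for the whole sample period yields 100%. The sketch below illustrates the definition with made-up kernel intervals. It is not how the driver measures utilization, and `busy_percent` is a hypothetical helper name.

```python
from typing import Iterable, Tuple


def busy_percent(intervals: Iterable[Tuple[float, float]],
                 period_start: float, period_end: float) -> float:
    """Percent of the sample period covered by at least one interval."""
    if period_end <= period_start:
        raise ValueError("sample period must have positive length")
    # Clip every (start, end) interval to the sample period, then sort by start.
    clipped = sorted(
        (max(start, period_start), min(end, period_end))
        for start, end in intervals
    )
    busy = 0.0
    cursor = period_start  # everything before the cursor is already counted
    for start, end in clipped:
        start = max(start, cursor)  # skip the part that overlaps earlier work
        if end > start:
            busy += end - start
            cursor = end
    return 100.0 * busy / (period_end - period_start)


# Two overlapping kernels in a 1 second period: busy from 0.0 s to 0.75 s.
print(busy_percent([(0.0, 0.5), (0.25, 0.75)], 0.0, 1.0))
```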

FB Memory Usage
On-board frame buffer memory information. The reported total memory is affected by the ECC state. If ECC is enabled, the total available memory is decreased by several percent, due to the requisite parity bits. The driver may also reserve a small amount of memory for internal use, even without active work on the GPU. For all products.

Total - Total size of FB memory.
Used - Used size of FB memory.
Free - Available size of FB memory.
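
In the default table these values appear in the Memory-Usage column as used and total, for example 156MiB / 11177MiB. The following is a minimal sketch of parsing such a cell and computing the used share; it assumes both values are printed in MiB, as in the sample output, and `parse_memory_usage` is a hypothetical helper name.

```python
import re
from typing import Tuple


def parse_memory_usage(cell: str) -> Tuple[int, int, float]:
    """Parse 'USEDMiB / TOTALMiB' into (used, total, percent used)."""
    match = re.fullmatch(r"\s*(\d+)MiB\s*/\s*(\d+)MiB\s*", cell)
    if match is None:
        raise ValueError(f"unexpected Memory-Usage cell: {cell!r}")
    used, total = int(match.group(1)), int(match.group(2))
    if total == 0:
        raise ValueError("total memory must be positive")
    return used, total, 100.0 * used / total


used, total, percent = parse_memory_usage("    156MiB / 11177MiB ")
print(f"{used} of {total} MiB in use ({percent:.2f}%)")
```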

on-board [ˌɑːn ˈbɔːrd]: adj. aboard a ship (or aircraft, vehicle); built into the main board
parity [ˈpærəti]: n. par value, equality, equivalence; (medicine) the number of births