1. nvidia-smi
Displays the current status of all GPUs.
nvidia-smi
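For scripting, the same status information can be requested in machine-readable form. The sketch below is only one possible approach: it assumes the `--query-gpu` and `--format=csv,noheader,nounits` options of nvidia-smi, and the field list and function names (`FIELDS`, `parse_query_csv`, `query_gpus`) are illustrative choices, not part of the original article. The sample line is built from the values in the `nvidia-smi -q` listing in this article rather than captured from a real run.

```python
import subprocess

# Fields passed to --query-gpu (an illustrative selection);
# `nvidia-smi --help-query-gpu` lists what a given driver supports.
FIELDS = ["name", "utilization.gpu", "memory.used", "memory.total", "temperature.gpu"]


def parse_query_csv(text):
    """Turn `--format=csv,noheader,nounits` output into one dict per GPU."""
    gpus = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        gpus.append(dict(zip(FIELDS, values)))
    return gpus


def query_gpus():
    """Run nvidia-smi and parse its output (requires an NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_query_csv(out)


# Illustrative input only: values copied from the -q listing in this article.
sample = "NVIDIA GeForce RTX 3060, 46, 2031, 12288, 45\n"
print(parse_query_csv(sample))
```

On a machine with an NVIDIA driver installed, calling `query_gpus()` returns the same structure for the live system. Values are kept as strings because some fields can come back as non-numeric text on hardware that does not support them.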

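2. nvidia-smi dmon
Device monitoring command: prints GPU statistics in a scrolling format, one line per GPU per sampling interval.
nvidia-smi dmon
The output columns are: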

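- pwr: Power usage. The GPU's current power draw, in watts (W).
- gtemp: GPU temperature. The temperature of the GPU die, in degrees Celsius (°C).
- mtemp: Memory temperature. The temperature of the GPU memory, in degrees Celsius. GPUs that do not report this value show "-" instead.
- sm: SM utilization. The percentage of time during the sample period in which the streaming multiprocessors were busy; this reflects the GPU's compute load.
- mem: Memory utilization. The percentage of time during the sample period in which GPU memory was being read or written. This measures memory bandwidth activity, not how much memory is allocated.
- enc: Encoder utilization. The utilization of the GPU's video encoder, in percent.
- dec: Decoder utilization. The utilization of the GPU's video decoder, in percent.
- mclk: Memory clock. The current operating frequency of the GPU memory, in MHz.
- pclk: Processor (graphics) clock. The current operating frequency of the GPU core, in MHz.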
3. watch -n 0.5 nvidia-smi
Monitors GPU memory usage in near real time by re-running nvidia-smi every 0.5 seconds:
watch -n 0.5 nvidia-smi
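The same polling idea can be written as a small loop when the readings need to be logged or post-processed instead of just displayed. This is a sketch under stated assumptions: it relies on the `--query-gpu=memory.used,memory.total` query of nvidia-smi, and the names `read_memory_mib` and `watch_memory` are hypothetical helpers introduced here. The reader function is injectable so the loop can be exercised without a GPU.

```python
import subprocess
import time


def read_memory_mib():
    """Return [(used, total), ...] in MiB, one tuple per GPU (needs an NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total",
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return [tuple(int(v) for v in line.split(",")) for line in out.strip().splitlines()]


def watch_memory(interval_s=0.5, samples=10, read=read_memory_mib):
    """Poll `read` every `interval_s` seconds; return the peak used MiB per GPU."""
    peaks = []
    for i in range(samples):
        reading = read()
        if not peaks:
            peaks = [0] * len(reading)
        for idx, (used, total) in enumerate(reading):
            peaks[idx] = max(peaks[idx], used)
            print(f"GPU {idx}: {used} / {total} MiB")
        if i < samples - 1:
            time.sleep(interval_s)  # same 0.5 s cadence as `watch -n 0.5`
    return peaks
```

Calling `watch_memory()` on a machine with an NVIDIA driver prints ten readings half a second apart and returns the peak usage seen. nvidia-smi also has a built-in loop option (`-l <seconds>`), which may be enough when no post-processing is needed.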

4. nvidia-smi -q
Queries detailed current information for all GPUs.
nvidia-smi -q
Output:
==============NVSMI LOG==============
Timestamp : Wed Mar 6 14:28:58 2024
Driver Version : 525.147.05
CUDA Version : 12.0
Attached GPUs : 1
GPU 00000000:01:00.0
Product Name : NVIDIA GeForce RTX 3060
Product Brand : GeForce
Product Architecture : Ampere
Display Mode : Enabled
Display Active : Enabled
Persistence Mode : Disabled
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-ceafd414-247a-959f-1399-7849b3a16e13
Minor Number : 0
VBIOS Version : 94.06.2F.40.11
MultiGPU Board : No
Board ID : 0x100
Board Part Number : N/A
GPU Part Number : 2504-302-A1
Module ID : 1
Inforom Version
Image Version : G001.0000.03.03
OEM Object : 2.0
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GSP Firmware Version : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Device Id : 0x250410DE
Bus Id : 00000000:01:00.0
Sub System Id : 0x209017AA
GPU Link Info
PCIe Generation
Max : 4
Current : 4
Device Current : 4
Device Max : 4
Host Max : 5
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 49000 KB/s
Rx Throughput : 292000 KB/s
Atomic Caps Inbound : N/A
Atomic Caps Outbound : N/A
Fan Speed : 31 %
Performance State : P2
Clocks Throttle Reasons
Idle : Not Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
FB Memory Usage
Total : 12288 MiB
Reserved : 251 MiB
Used : 2031 MiB
Free : 10005 MiB
BAR1 Memory Usage
Total : 16384 MiB
Used : 56 MiB
Free : 16328 MiB
Compute Mode : Default
Utilization
Gpu : 46 %
Memory : 10 %
Encoder : 0 %
Decoder : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows : N/A
Temperature
GPU Current Temp : 45 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 93 C
GPU Target Temperature : 83 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : Supported
Power Draw : 55.13 W
Power Limit : 170.00 W
Default Power Limit : 170.00 W
Enforced Power Limit : 170.00 W
Min Power Limit : 100.00 W
Max Power Limit : 170.00 W
Clocks
Graphics : 1942 MHz
SM : 1942 MHz
Memory : 7300 MHz
Video : 1695 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 2100 MHz
SM : 2100 MHz
Memory : 7501 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 1081.250 mV
Fabric
State : N/A
Status : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 1229
Type : G
Name : /usr/lib/xorg/Xorg
Used GPU Memory : 35 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 1793
Type : G
Name : /usr/lib/xorg/Xorg
Used GPU Memory : 181 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 1923
Type : G
Name : /usr/bin/gnome-shell
Used GPU Memory : 51 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 61835
Type : C
Name : python
Used GPU Memory : 494 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 144693
Type : G
Name : /opt/microsoft/msedge-beta/msedge --type=gpu-process --crashpad-handler-pid=139704 --enable-crash-reporter=,beta --change-stack-guard-on-fork=enable --gpu-preferences=WAAAAAAAAAAgAAAEAAAAAAAAAAAAAAAAAABgAAAAAAA4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAGAAAAAAAAAAYAAAAAAAAAAgAAAAAAAAACAAAAAAAAAAIAAAAAAAAAA== --shared-files --field-trial-handle=0,i,14181863785507988006,12357294950646784628,262144 --variations-seed-version
Used GPU Memory : 80 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 144847
Type : G
Name : /usr/share/code/code --type=gpu-process --crashpad-handler-pid=144830 --enable-crash-reporter=98ca0db7-241d-4178-8ad5-82f5b3c223a3,no_channel --user-data-dir=/home/tianyu/.config/Code --gpu-preferences=WAAAAAAAAAAgAAAEAAAAAAAAAAAAAAAAAABgAAAAAAA4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAGAAAAAAAAAAYAAAAAAAAAAgAAAAAAAAACAAAAAAAAAAIAAAAAAAAAA== --shared-files --field-trial-handle=0,i,14574705392034819031,14847682280654467979,262144 --disable-features=CalculateNativeWinOcclusion,SpareRendererForSitePerProcess
Used GPU Memory : 37 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 145130
Type : C
Name : python3
Used GPU Memory : 838 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 145184
Type : C+G
Name : /home/tianyu/.conda/envs/dexart/bin/python3
Used GPU Memory : 296 MiB
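The Processes section at the end of the listing above is often the part of interest, for example to see which process holds the most GPU memory. The sketch below extracts it from the text output by matching the `Key : Value` lines shown above. The function name `parse_processes` and the dictionary keys are illustrative, and the sample is a three-process excerpt of the listing above.

```python
# Map the keys printed by `nvidia-smi -q` to the dict fields used here.
FIELD_KEYS = {"Type": "type", "Name": "name", "Used GPU Memory": "used_mib"}


def parse_processes(q_text):
    """Extract per-process records from `nvidia-smi -q` text output."""
    procs = []
    current = None
    for line in q_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # section header such as "Processes"
        key, value = key.strip(), value.strip()
        if key == "Process ID":
            current = {"pid": int(value)}
            procs.append(current)
        elif current is not None and key in FIELD_KEYS:
            field = FIELD_KEYS[key]
            if field in current:
                continue  # same-named key from a later section; ignore it
            if field == "used_mib":
                number = value.split()[0] if value else ""
                current[field] = int(number) if number.isdigit() else None
            else:
                current[field] = value
    return procs


# Excerpt of the listing above (three of the processes).
SAMPLE = """\
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 1229
Type : G
Name : /usr/lib/xorg/Xorg
Used GPU Memory : 35 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 61835
Type : C
Name : python
Used GPU Memory : 494 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 145184
Type : C+G
Name : /home/tianyu/.conda/envs/dexart/bin/python3
Used GPU Memory : 296 MiB
"""

procs = parse_processes(SAMPLE)
largest = max(procs, key=lambda p: p["used_mib"] or 0)
print(largest["pid"], largest["name"], largest["used_mib"])
```

Lines are stripped before matching, so the function works on both the flattened text shown above and the indented output that nvidia-smi prints in a terminal. In the Type field, C marks a compute process, G a graphics process, and C+G a process that uses both contexts.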
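This article covered what the NVIDIA-SMI tool can show, such as the current GPU state, power draw, temperature, and utilization, along with two related commands: nvidia-smi dmon for real-time monitoring and nvidia-smi -q for querying detailed information.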