DEVICE MODIFICATION OPTIONS
[any one of]
-pm, --persistence-mode=MODE
Set the persistence mode for the target GPUs. See the (GPU ATTRIBUTES)
section for a description of persistence mode. Requires root. Will
impact all GPUs unless a single GPU is specified using the -i argument.
The effect of this operation is immediate. However, it does not
persist across reboots. After each reboot persistence mode will default
to "Disabled". Available on Linux only.
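For example, assuming root privileges and that GPU index 0 is the
intended target (both are illustrative assumptions), persistence mode
could be enabled with:

    # nvidia-smi -i 0 -pm 1

Omitting the -i argument would apply the setting to all GPUs in the
system.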
-pl, --power-limit=POWER_LIMIT
Specifies the maximum power limit in watts. Accepts integer and
floating point numbers. Only on supported devices from the Kepler
family. Requires administrator privileges. The value needs to be
between the Min and Max Power Limit as reported by nvidia-smi.
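As an illustration, assuming a supported device at GPU index 0 and a
desired cap of 200 watts (both values are arbitrary examples), the
limit could be applied and then checked against the reported Min and
Max Power Limit values:

    # nvidia-smi -i 0 -pl 200
    # nvidia-smi -i 0 -q -d POWER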
-r, --gpu-reset
Trigger a reset of the GPU. Can be used to clear GPU HW and SW state
in situations that would otherwise require a machine reboot. Typically
useful if a double bit ECC error has occurred. Requires the -i switch
to target a specific device. Requires root. There can't be any
applications using this particular device (e.g. a CUDA application, a
graphics application like the X server, or a monitoring application
like another instance of nvidia-smi). There also can't be any compute
applications running on any other GPU in the system. Only on supported
devices from the Fermi and Kepler families running on Linux.
GPU reset is not guaranteed to work in all cases. It is not recommended
for production environments at this time. In some situations there may
be HW components on the board that fail to revert to an initial state
following the reset request. This is more likely to be seen on
Fermi-generation products than on Kepler, and more likely to be seen if
the reset is being performed on a hung GPU.
Following a reset, it is recommended that the health of the GPU be
verified before further use. The nvidia-healthmon tool is a good choice
for this test. If the GPU is not healthy, a complete reset should be
initiated by power cycling the node.
Visit http://developer.nvidia.com/gpu-deployment-kit to download the
GDK and nvidia-healthmon.
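As a sketch of the workflow described above (GPU index 0 is assumed,
the device must be idle, and the exact nvidia-healthmon invocation
depends on the installed GDK), a reset followed by a health check might
look like:

    # nvidia-smi -i 0 -r
    # nvidia-healthmon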