Viewing GPUs on Linux
The lspci command
https://linux.die.net/man/8/lspci
lspci is a utility for displaying information about PCI buses in the system and devices connected to them.
lspci | grep -i vga
For an NVIDIA GPU, use: lspci | grep -i nvidia
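The same filter can be applied from a script. This is a minimal sketch, assuming `lspci` is on the PATH; the helper names `find_gpu_lines` and `list_gpus` are made up for illustration. It also matches "3D controller" entries, which is how some NVIDIA compute cards are listed instead of VGA.

```python
import subprocess

def find_gpu_lines(lspci_output, keywords=("vga", "3d controller", "nvidia")):
    """Return the lspci lines that mention any keyword, case-insensitively (like grep -i)."""
    return [
        line for line in lspci_output.splitlines()
        if any(k in line.lower() for k in keywords)
    ]

def list_gpus():
    """Run lspci and keep only the GPU-looking lines. Requires lspci to be installed."""
    output = subprocess.run(["lspci"], capture_output=True, text=True, check=True).stdout
    return find_gpu_lines(output)
```

Calling `list_gpus()` returns the matching lines as a list of strings; `find_gpu_lines` can be tried on saved lspci output without touching the hardware.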
NVIDIA's bundled tool
NVIDIA ships a command-line tool that shows GPU memory usage:
NVIDIA System Management Interface (NVIDIA Developer)
watch -n 0.1 nvidia-smi    # refresh every 0.1 s
https://developer.download.nvidia.com/compute/DCGM/docs/nvidia-smi-367.38.pdf
Using this tool to read the GPU memory utilization and the GPU utilization:
import os

def getUsageRate(gpuId):
    # Read the "Gpu : NN %" line from the Utilization section of nvidia-smi -q
    bashCmd = 'nvidia-smi -q -i ' + str(gpuId) + ' -d UTILIZATION | grep Gpu'
    status = os.popen(bashCmd).read().replace(' ', '')
    # Take the number between the first ':' and the first '%'
    gpuUsageRate = float(status[status.find(':') + 1 : status.find('%')])
    print('gpuUsageRate=', gpuUsageRate)
    return gpuUsageRate

def getMemoryRate(gpuId):
    # Read the "Memory : NN %" line; nvidia-smi defines this as the share of time
    # device memory was being read or written, not the fraction of memory occupied
    bashCmd = 'nvidia-smi -q -i ' + str(gpuId) + ' -d UTILIZATION | grep Memory'
    memoryStatus = os.popen(bashCmd).read().replace(' ', '')
    memoryUsageRate = float(memoryStatus[memoryStatus.find(':') + 1 : memoryStatus.find('%')])
    print('memoryUsageRate=', memoryUsageRate)
    return memoryUsageRate

if __name__ == "__main__":
    getUsageRate(2)
    getMemoryRate(2)
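nvidia-smi can also emit CSV directly, which avoids slicing the -q text by hand. The following is a sketch, assuming the installed driver supports the `--query-gpu` and `--format=csv,noheader,nounits` options; `parse_gpu_csv` and `query_gpus` are illustrative names. Memory usage here is computed as used/total, which is a different quantity from the "Memory" line in the Utilization section.

```python
import subprocess

FIELDS = "index,utilization.gpu,memory.used,memory.total"

def parse_gpu_csv(text):
    """Parse 'index, util, used, total' CSV rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        index, util, used, total = [part.strip() for part in line.split(",")]
        rows.append({
            "index": int(index),
            "gpu_util_percent": float(util),
            "memory_used_mib": float(used),
            "memory_total_mib": float(total),
            "memory_used_percent": 100.0 * float(used) / float(total),
        })
    return rows

def query_gpus():
    """Call nvidia-smi once and return one dict per GPU. Needs an NVIDIA driver."""
    cmd = ["nvidia-smi", "--query-gpu=" + FIELDS, "--format=csv,noheader,nounits"]
    output = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(output)
```

Keeping the parsing separate from the subprocess call means the parser can be checked against a saved line of output on a machine without a GPU.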
Ready-made software tools
GPU stress test (C++, open source)
https://github.com/wilicc/gpu-burn
// runLength : in seconds, how long to burn the GPU.
// useDoubles: true: use double as the array element type.
// useBytes: -80: the program takes 80% of the GPU's currently free memory;
// 1300: the program takes 1300 MB of GPU memory.
template<class T> void launch(int runLength, bool useDoubles, bool useTensorCores, ssize_t useBytes)
Note: with three GPUs, the program starts three processes, one to burn each GPU, and the main process communicates with them through pipes.
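The process layout in the note above can be imitated in a few lines. This is only a sketch of the one-process-per-device-plus-pipe pattern, not gpu-burn's actual code: the workers are plain Python child processes that do no GPU work, and `WORKER_CODE`, `run_workers`, and the message format are invented for illustration.

```python
import subprocess
import sys

# Stand-in for the per-GPU burn loop: report progress on stdout, then exit.
WORKER_CODE = (
    "import sys\n"
    "dev = sys.argv[1]\n"
    "for step in range(3):\n"
    "    print(f'gpu {dev} step {step}', flush=True)\n"
)

def run_workers(device_ids):
    """Start one child process per device and collect what each wrote to its pipe."""
    # All children are started first, so they run concurrently.
    workers = {
        dev: subprocess.Popen(
            [sys.executable, "-c", WORKER_CODE, str(dev)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for dev in device_ids
    }
    reports = {}
    for dev, proc in workers.items():
        out, _ = proc.communicate()  # read the pipe until the child exits
        reports[dev] = out.splitlines()
    return reports
```

`run_workers([0, 1, 2])` returns a dict mapping each device id to the lines its worker reported, which is the role the pipes play for the main process.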
Monitoring GPU status (Python, open source)
https://github.com/wookayin/gpustat
PyTorch
https://pytorch.org/
OpenGL/Vulkan
Others
Choosing which GPU to use (this did not seem to work): "How to run a program on a specific GPU (Python)", Diana_Z's CSDN blog, https://blog.csdn.net/Diana_Z/article/details/89449186
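One common way to pin a program to a GPU is the `CUDA_VISIBLE_DEVICES` environment variable; whether the linked post uses exactly this is an assumption. The variable is read when CUDA initializes, so setting it after the framework has already touched the GPU has no effect, which is one possible reason a setting like this appears not to work. The sketch below sidesteps the ordering problem by placing the variable in the child process's environment before it starts; `run_on_gpu` is a hypothetical helper.

```python
import os
import subprocess
import sys

def run_on_gpu(gpu_id, argv):
    """Run argv as a child process that can only see the given GPU."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)  # must be set before CUDA initializes
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)

# Example (train.py is a placeholder script name):
# run_on_gpu(2, [sys.executable, "train.py"])
```

Inside the child process the selected card is the only visible device, so it is addressed as device 0 there.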