GPU memory still occupied after the program is killed
fuser -v /dev/nvidia*
(sc_yolov5) [zqchen@gpurtx02 ultralytics]$ gpustat
gpurtx02 Thu Aug 24 09:18:31 2023 470.74
[0] Quadro RTX 6000 | 41'C, 0 % | 0 / 24220 MB |
[1] Quadro RTX 6000 | 72'C, 0 % | 17718 / 24220 MB |
(sc_yolov5) [zqchen@gpurtx02 ultralytics]$ fuser -v /dev/nvidia*
USER PID ACCESS COMMAND
/dev/nvidia1: zqchen 80525 F...m yolo
zqchen 80589 F...m yolo
zqchen 80653 F...m yolo
zqchen 80781 F...m yolo
zqchen 80909 F...m yolo
/dev/nvidiactl: zqchen 80525 F...m yolo
zqchen 80589 F...m yolo
zqchen 80653 F...m yolo
zqchen 80781 F...m yolo
zqchen 80909 F...m yolo
/dev/nvidia-uvm: zqchen 80525 F...m yolo
zqchen 80589 F...m yolo
zqchen 80653 F...m yolo
zqchen 80781 F...m yolo
zqchen 80909 F...m yolo
(sc_yolov5) [zqchen@gpurtx02 ultralytics]$ kill 80525
(sc_yolov5) [zqchen@gpurtx02 ultralytics]$ gpustat
gpurtx02 Thu Aug 24 09:19:30 2023 470.74
[0] Quadro RTX 6000 | 41'C, 0 % | 0 / 24220 MB |
[1] Quadro RTX 6000 | 65'C, 0 % | 17718 / 24220 MB |
(sc_yolov5) [zqchen@gpurtx02 ultralytics]$ kill 80589
(sc_yolov5) [zqchen@gpurtx02 ultralytics]$ kill 80653
(sc_yolov5) [zqchen@gpurtx02 ultralytics]$ gpustat
gpurtx02 Thu Aug 24 09:19:49 2023 470.74
[0] Quadro RTX 6000 | 41'C, 0 % | 0 / 24220 MB |
[1] Quadro RTX 6000 | 64'C, 0 % | 17718 / 24220 MB |
(sc_yolov5) [zqchen@gpurtx02 ultralytics]$ kill 80781
(sc_yolov5) [zqchen@gpurtx02 ultralytics]$ kill 80909
(sc_yolov5) [zqchen@gpurtx02 ultralytics]$ gpustat
gpurtx02 Thu Aug 24 09:20:05 2023 470.74
[0] Quadro RTX 6000 | 41'C, 0 % | 0 / 24220 MB |
[1] Quadro RTX 6000 | 63'C, 0 % | 0 / 24220 MB |
(sc_yolov5) [zqchen@gpurtx02 ultralytics]$
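In the session above the memory on GPU 1 is only released after the last of the five yolo PIDs is killed. Instead of copying the PIDs by hand, they can be collected from fuser. The following is a minimal sketch, assuming the psmisc version of fuser, which prints the bare PIDs on stdout and everything else on stderr. The helper names unique_pids and gpu_pids are hypothetical, not part of any tool.

```shell
# Sketch: collect the unique PIDs that hold NVIDIA device files open.
# Assumption: psmisc fuser writes bare PIDs to stdout and the
# file names / access flags to stderr.

# Read whitespace-separated PIDs on stdin, print each PID once, sorted.
unique_pids() {
    tr -s ' \t' '\n\n' | grep -E '^[0-9]+$' | sort -un
}

# Print the PIDs holding the given device files, e.g. /dev/nvidia1.
# Without root, fuser typically reports only your own processes.
gpu_pids() {
    fuser "$@" 2>/dev/null | unique_pids
}
```

Usage, limited to one GPU so that jobs on other devices are left alone: for pid in $(gpu_pids /dev/nvidia1); do kill "$pid"; done. Check the list printed by gpu_pids before killing anything.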
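Process cannot be killed
1. Find the main (parent) process, then kill it.
2. Use kill -9 pid to force-kill the process.
cd /proc/<pid>
cat status          (look up the PPid field)
kill -9 <ppid>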
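The lookup in the steps above can be scripted rather than read off the full status dump. This is a minimal sketch, assuming the Linux /proc/<pid>/status layout of "Field:<whitespace>value". The helper name status_field is hypothetical.

```shell
# Sketch: print the first value of one field from a /proc/<pid>/status file.
# $1 = path to the status file, $2 = field name without the trailing colon.
status_field() {
    awk -v key="$2:" '$1 == key { print $2; exit }' "$1"
}
```

For example, status_field /proc/196023/status PPid prints the parent PID, and status_field /proc/196023/status State prints the one-letter state code (T means stopped).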
(base) [zqchen@gpurtx02 196023]$ nvidia-smi
Wed Nov 8 14:50:04 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.74 Driver Version: 470.74 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Quadro RTX 6000 Off | 00000000:3B:00.0 Off | Off |
| 44% 67C P2 147W / 260W | 12340MiB / 24220MiB | 41% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Quadro RTX 6000 Off | 00000000:D8:00.0 Off | Off |
| 35% 61C P2 65W / 260W | 21142MiB / 24220MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 210860 C python 12337MiB |
| 1 N/A N/A 196023 C python 21135MiB |
+-----------------------------------------------------------------------------+
(base) [zqchen@gpurtx02 196023]$ kill 196023
(base) [zqchen@gpurtx02 196023]$ fuser -v /dev/nvidia*
USER PID ACCESS COMMAND
/dev/nvidia0: zqchen 196023 F.... python
zqchen 196150 F.... python
zqchen 196214 F.... python
zqchen 196278 F.... python
zqchen 196342 F.... python
zqchen 196406 F.... python
zqchen 196470 F.... python
zqchen 196534 F.... python
zqchen 196598 F.... python
zqchen 196734 F.... python
zqchen 196735 F.... python
zqchen 196736 F.... python
zqchen 196737 F.... python
zqchen 196738 F.... python
zqchen 196739 F.... python
zqchen 196740 F.... python
zqchen 196741 F.... python
zqchen 210860 F...m python
zqchen 210979 F...m python
zqchen 211043 F...m python
zqchen 211107 F...m python
zqchen 211171 F...m python
zqchen 211303 F...m python
zqchen 211304 F...m python
zqchen 211305 F...m python
zqchen 211306 F...m python
zqchen 211307 F...m python
zqchen 211308 F...m python
zqchen 211309 F...m python
zqchen 211310 F...m python
/dev/nvidia1: zqchen 196023 F...m python
zqchen 196150 F...m python
zqchen 196214 F...m python
zqchen 196278 F...m python
zqchen 196342 F...m python
zqchen 196406 F...m python
zqchen 196470 F...m python
zqchen 196534 F...m python
zqchen 196598 F...m python
zqchen 196734 F...m python
zqchen 196735 F...m python
zqchen 196736 F...m python
zqchen 196737 F...m python
zqchen 196738 F...m python
zqchen 196739 F...m python
zqchen 196740 F...m python
zqchen 196741 F...m python
zqchen 210860 F.... python
zqchen 210979 F.... python
zqchen 211043 F.... python
zqchen 211107 F.... python
zqchen 211171 F.... python
zqchen 211303 F.... python
zqchen 211304 F.... python
zqchen 211305 F.... python
zqchen 211306 F.... python
zqchen 211307 F.... python
zqchen 211308 F.... python
zqchen 211309 F.... python
zqchen 211310 F.... python
/dev/nvidiactl: zqchen 196023 F...m python
zqchen 196150 F...m python
zqchen 196214 F...m python
zqchen 196278 F...m python
zqchen 196342 F...m python
zqchen 196406 F...m python
zqchen 196470 F...m python
zqchen 196534 F...m python
zqchen 196598 F...m python
zqchen 196734 F...m python
zqchen 196735 F...m python
zqchen 196736 F...m python
zqchen 196737 F...m python
zqchen 196738 F...m python
zqchen 196739 F...m python
zqchen 196740 F...m python
zqchen 196741 F...m python
zqchen 210860 F...m python
zqchen 210979 F...m python
zqchen 211043 F...m python
zqchen 211107 F...m python
zqchen 211171 F...m python
zqchen 211303 F...m python
zqchen 211304 F...m python
zqchen 211305 F...m python
zqchen 211306 F...m python
zqchen 211307 F...m python
zqchen 211308 F...m python
zqchen 211309 F...m python
zqchen 211310 F...m python
/dev/nvidia-uvm: zqchen 196023 F...m python
zqchen 196150 F...m python
zqchen 196214 F...m python
zqchen 196278 F...m python
zqchen 196342 F...m python
zqchen 196406 F...m python
zqchen 196470 F...m python
zqchen 196534 F...m python
zqchen 196598 F...m python
zqchen 196734 F...m python
zqchen 196735 F...m python
zqchen 196736 F...m python
zqchen 196737 F...m python
zqchen 196738 F...m python
zqchen 196739 F...m python
zqchen 196740 F...m python
zqchen 196741 F...m python
zqchen 210860 F...m python
zqchen 210979 F...m python
zqchen 211043 F...m python
zqchen 211107 F...m python
zqchen 211171 F...m python
zqchen 211303 F...m python
zqchen 211304 F...m python
zqchen 211305 F...m python
zqchen 211306 F...m python
zqchen 211307 F...m python
zqchen 211308 F...m python
zqchen 211309 F...m python
zqchen 211310 F...m python
(base) [zqchen@gpurtx02 196023]$ cd /proc/196023
(base) [zqchen@gpurtx02 196023]$ cat status
Name: python
Umask: 0002
State: T (stopped)
Tgid: 196023
Ngid: 196023
Pid: 196023
PPid: 279238
TracerPid: 0
Uid: 1009 1009 1009 1009
Gid: 1009 1009 1009 1009
FDSize: 256
Groups: 1009
VmPeak: 46481776 kB
VmSize: 41804152 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 5610268 kB
VmRSS: 5456768 kB
RssAnon: 2810948 kB
RssFile: 495072 kB
RssShmem: 2150748 kB
VmData: 35882532 kB
VmStk: 132 kB
VmExe: 1816 kB
VmLib: 2125912 kB
VmPTE: 16184 kB
VmSwap: 0 kB
Threads: 119
SigQ: 11/512190
SigPnd: 0000000000000000
ShdPnd: 0000000000014000
SigBlk: 0000000000000000
SigIgn: 0000000001001000
SigCgt: 0000000180010002
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000001fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,ffffffff,ffffffff
Cpus_allowed_list: 0-63
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
Mems_allowed_list: 0-1
voluntary_ctxt_switches: 1968
nonvoluntary_ctxt_switches: 605
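In the status output above, State is T (stopped) and ShdPnd is non-zero. This suggests that the earlier plain kill (SIGTERM) was queued but never delivered: a stopped process does not act on SIGTERM until it is continued, whereas SIGKILL takes effect regardless. The masks in this file are hexadecimal bit fields in which the lowest bit stands for signal 1. The following sketch decodes such a mask into signal numbers. The helper name pending_signals is hypothetical, and the code assumes the mask fits in 63 bits.

```shell
# Sketch: decode a hex signal mask (SigPnd, ShdPnd, SigIgn, ...) from
# /proc/<pid>/status into a space-separated list of signal numbers.
# Assumption: the top (64th) bit of the mask is not set.
pending_signals() {
    mask=$((0x$1))
    out=""
    sig=1
    while [ "$sig" -le 63 ]; do
        # Bit (sig - 1) set means signal number sig is in the mask.
        if [ $(( (mask >> (sig - 1)) & 1 )) -eq 1 ]; then
            out="$out${out:+ }$sig"
        fi
        sig=$((sig + 1))
    done
    printf '%s\n' "$out"
}
```

Running pending_signals 0000000000014000 on the ShdPnd value above prints 15 17, which on x86-64 Linux are SIGTERM and SIGCHLD. The same call on the SigIgn and SigCgt values shows that signal 15 is neither ignored nor caught by this process.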
(base) [zqchen@gpurtx02 196023]$ kill 279238
(base) [zqchen@gpurtx02 196023]$ kill 196023
(base) [zqchen@gpurtx02 196023]$ nvidia-smi
Wed Nov 8 14:50:59 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.74 Driver Version: 470.74 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Quadro RTX 6000 Off | 00000000:3B:00.0 Off | Off |
| 44% 68C P2 118W / 260W | 12340MiB / 24220MiB | 43% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Quadro RTX 6000 Off | 00000000:D8:00.0 Off | Off |
| 35% 61C P2 65W / 260W | 21142MiB / 24220MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 210860 C python 12337MiB |
| 1 N/A N/A 196023 C python 21135MiB |
+-----------------------------------------------------------------------------+
(base) [zqchen@gpurtx02 196023]$ kill -9 196023
(base) [zqchen@gpurtx02 196023]$ kill -9 279238
(base) [zqchen@gpurtx02 196023]$ nvidia-smi
Wed Nov 8 14:51:53 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.74 Driver Version: 470.74 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Quadro RTX 6000 Off | 00000000:3B:00.0 Off | Off |
| 47% 64C P0 74W / 260W | 0MiB / 24220MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Quadro RTX 6000 Off | 00000000:D8:00.0 Off | Off |
| 34% 57C P0 58W / 260W | 0MiB / 24220MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+