Switching focus between kernels
- Show the current focus
(cuda-gdb) cuda device sm warp lane block thread
block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0
(cuda-gdb) cuda kernel block thread
kernel 1, block (0,0,0), thread (0,0,0)
(cuda-gdb) cuda kernel
kernel 1
- Switch focus
(cuda-gdb) cuda device 0 sm 1 warp 2 lane 3
[Switching focus to CUDA kernel 1, grid 2, block (8,0,0), thread
(67,0,0), device 0, sm 1, warp 2, lane 3]
374 int totalThreads = gridDim.x * blockDim.x;
(cuda-gdb) cuda thread (15)
[Switching focus to CUDA kernel 1, grid 2, block (8,0,0), thread
(15,0,0), device 0, sm 1, warp 0, lane 15]
374 int totalThreads = gridDim.x * blockDim.x;
(cuda-gdb) cuda block 1 thread 3
[Switching focus to CUDA kernel 1, grid 2, block (1,0,0), thread (3,0,0),
device 0, sm 3, warp 0, lane 3]
374 int totalThreads = gridDim.x * blockDim.x;
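The warp and lane numbers in the focus messages above match the usual mapping from a thread index to a warp position: thread (67,0,0) lands in warp 2, lane 3, and thread (15,0,0) in warp 0, lane 15. The sketch below reproduces that arithmetic. It assumes a warp size of 32 and a one-dimensional block, and the helper names `warp_of` and `lane_of` are hypothetical (they are not part of cuda-gdb or the CUDA API). The warp number cuda-gdb reports is a hardware warp slot on the SM, so it may not always equal the thread index divided by 32.

```c
/* Assumption: a warp holds 32 threads. */
enum { WARP_SIZE = 32 };

/* Hypothetical helper: warp index of a thread within a 1-D block. */
int warp_of(int thread_x)
{
    return thread_x / WARP_SIZE;
}

/* Hypothetical helper: lane of a thread within its warp. */
int lane_of(int thread_x)
{
    return thread_x % WARP_SIZE;
}
```

With these definitions, `warp_of(67)` is 2 and `lane_of(67)` is 3, which agrees with the `cuda device 0 sm 1 warp 2 lane 3` focus shown above.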
Setting breakpoints
1. By symbol name
(cuda-gdb) break my_function
(cuda-gdb) break my_class::my_method
For template functions, the type information must be provided:
(cuda-gdb) break int my_templatized_function<int>(int)
The exact function name can be looked up as follows:
(cuda-gdb) set demangle-style none
(cuda-gdb) info function my_function_name
(cuda-gdb) set demangle-style auto
2. By file name and line number
(cuda-gdb) break my_file.cu:185
3. Break when an application kernel is launched
(cuda-gdb) set cuda break_on_launch application
Inspecting program state
Types and contents of variables
(cuda-gdb) print &array
$1 = (@shared int (*)[0]) 0x20
(cuda-gdb) print array[0]@4
$2 = {0, 128, 64, 192}
The contents can also be printed by dereferencing a pointer at the base address plus an offset:
(cuda-gdb) print *(@shared int*)0x20
$3 = 0
(cuda-gdb) print *(@shared int*)0x24
$4 = 128
(cuda-gdb) print *(@shared int*)0x28
$5 = 64
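The addresses 0x20, 0x24, and 0x28 used above follow from the base address reported by `print &array` plus the element index times the element size. A minimal sketch of that calculation, assuming 4-byte `int` elements; the helper name `element_address` is hypothetical and only illustrates the arithmetic:

```c
#include <stdint.h>

/* Assumption: the array holds 4-byte ints. */
enum { ELEM_SIZE = 4 };

/* Hypothetical helper: shared-memory address of array[index],
   given the base address printed by `print &array`. */
uint32_t element_address(uint32_t base, uint32_t index)
{
    return base + index * ELEM_SIZE;
}
```

By the same rule, `element_address(0x20, 3)` gives 0x2c, which is where the fourth element printed by `print array[0]@4` would be read from.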
Inspecting device state
info cuda devices or …
info cuda sms SM
info cuda threads
info cuda threads breakpoint all
info cuda threads breakpoint 2 lane 1
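devices
All devices
sms
All active SMs on the current device
warps
All active warps on the current SM
lanes
All active lanes in the current warp
kernels
All currently active kernels
blocks
All active blocks in the current kernel
threads
All active threads in the current kernel
launch trace
The parent kernels of the current kernel
launch children
The child kernels launched by the current kernel
contexts
All contexts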
Viewing disassembly
(cuda-gdb) x/4i $pc-32
0xa689a8 <acos_main(acosParams)+824>: MOV R0, c[0x0][0x34]
0xa689b8 <acos_main(acosParams)+840>: MOV R3, c[0x0][0x28]
0xa689c0 <acos_main(acosParams)+848>: IMUL R2, R0, R3
=> 0xa689c8 <acos_main(acosParams)+856>: MOV R0, c[0x0][0x28]
Viewing registers
(cuda-gdb) info registers $R0 $R1 $R2 $R3
R0 0xf0 240
R1 0xfffc48 16776264
R2 0x7800 30720
R3 0x80 128
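If the register dump and the disassembly above come from the same stop, the values can be cross-checked: the `=>` marker sits on the instruction after `IMUL R2, R0, R3`, so R2 should already hold the product of R0 and R3. The sketch below mirrors that multiply. The helper name `imul32` is hypothetical, and treating the two constant-bank loads as `gridDim.x` and `blockDim.x` is an assumption based on the source line `totalThreads = gridDim.x * blockDim.x`.

```c
#include <stdint.h>

/* Hypothetical helper mirroring `IMUL R2, R0, R3`:
   a 32-bit integer multiply that keeps the low 32 bits. */
uint32_t imul32(uint32_t a, uint32_t b)
{
    return a * b;
}
```

Here `imul32(0xf0, 0x80)` is 0x7800 (240 * 128 = 30720), matching the value shown for R2.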
Debugging MPI programs
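Add the following at the start of the code (it needs <stdio.h> and <unistd.h>):
{
    volatile int i = 0;               /* volatile so the wait loop is not optimized away */
    char host[256];
    gethostname(host, sizeof(host));  /* fill in the node name before printing it */
    printf("PID %d on node %s is ready for attach\n",
           (int)getpid(), host);
    fflush(stdout);
    /* Spin until a debugger attaches and sets i to a nonzero value. */
    while (0 == i) {
        sleep(5);
    }
}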
Run the MPI program:
mpirun -np 2 -host nv1,nv2 a.out
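Attach cuda-gdb to the PID printed by the process. After attaching, move up the stack to the frame containing the wait loop and set i to a nonzero value (for example, set var i = 1) so the program continues: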
cuda-gdb --pid 5488