Locating a Deadlock with WinDbg
[Repost] Original link: http://blog.csdn.net/hgy413/article/details/7572097#comments
First, some code that I wrote off the cuff:
    #include <windows.h>
    #include <stdlib.h>   /* system */

    CRITICAL_SECTION cs1;
    CRITICAL_SECTION cs2;

    /* Takes cs1 first, then cs2. */
    DWORD __stdcall thread1(LPVOID lp)
    {
        EnterCriticalSection(&cs1);
        Sleep(10);                    /* give thread2 time to take cs2 */
        EnterCriticalSection(&cs2);   /* blocks: thread2 owns cs2 */

        return 0;
    }

    /* Takes cs2 first, then cs1: the opposite order to thread1. */
    DWORD __stdcall thread2(LPVOID lp)
    {
        EnterCriticalSection(&cs2);
        Sleep(10);                    /* give thread1 time to take cs1 */
        EnterCriticalSection(&cs1);   /* blocks: thread1 owns cs1 */

        return 0;
    }

    int main()
    {
        InitializeCriticalSection(&cs1);
        InitializeCriticalSection(&cs2);

        CreateThread(NULL, 0, thread1, 0, 0, NULL);
        CreateThread(NULL, 0, thread2, 0, 0, NULL);

        system("pause");
        return 0;
    }
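Build the release version, delete the pdb, and run it. The program hangs. Attach WinDbg to the process and start by listing the call stacks of all threads with ~*kb: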
    0:003> ~*kb

       0  Id: 1a98.24c Suspend: 1 Teb: 7ffdf000 Unfrozen
    ChildEBP RetAddr  Args to Child
    0012fddc 7c92df5a 7c8025db 00000044 00000000 ntdll!KiFastSystemCallRet
    0012fde0 7c8025db 00000044 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
    0012fe44 7c802542 00000044 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xa8
    0012fe58 7854bd40 00000044 ffffffff 00000000 kernel32!WaitForSingleObject+0x12
    0012fedc 7854c702 00000000 00392b98 00392de0 MSVCR90!_dospawn+0x1d1 [f:\dd\vctools\crt_bld\self_x86\crt\src\dospawn.c @ 215]
    0012ff00 7854c84b 00000000 00392b98 0012ff5c MSVCR90!comexecmd+0x60 [f:\dd\vctools\crt_bld\self_x86\crt\src\spawnve.c @ 137]
    0012ff38 7854cc71 00000000 00392b98 0012ff5c MSVCR90!_spawnve+0x12a [f:\dd\vctools\crt_bld\self_x86\crt\src\spawnve.c @ 273]
    0012ff70 004010a8 004020f4 00000001 00401218 MSVCR90!system+0x8e [f:\dd\vctools\crt_bld\self_x86\crt\src\system.c @ 87]
    WARNING: Stack unwind information not available. Following frames may be wrong.
    0012ffc0 7c817077 00300031 0032002d 7ffdc000 test2+0x10a8
    0012fff0 00000000 00401360 00000000 78746341 kernel32!BaseProcessStart+0x23

       1  Id: 1a98.1588 Suspend: 1 Teb: 7ffde000 Unfrozen
    ChildEBP RetAddr  Args to Child
    0050ff14 7c92df5a 7c939b23 0000002c 00000000 ntdll!KiFastSystemCallRet
    0050ff18 7c939b23 0000002c 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
    0050ffa0 7c921046 00403370 0040101d 00403370 ntdll!RtlpWaitForCriticalSection+0x132
    0050ffa8 0040101d 00403370 000203a8 7c80b729 ntdll!RtlEnterCriticalSection+0x46
    WARNING: Stack unwind information not available. Following frames may be wrong.
    0050ffec 00000000 00401000 00000000 00000000 test2+0x101d

       2  Id: 1a98.185c Suspend: 1 Teb: 7ffdd000 Unfrozen
    ChildEBP RetAddr  Args to Child
    0060ff14 7c92df5a 7c939b23 00000034 00000000 ntdll!KiFastSystemCallRet
    0060ff18 7c939b23 00000034 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
    0060ffa0 7c921046 00403388 0040104d 00403388 ntdll!RtlpWaitForCriticalSection+0x132
    0060ffa8 0040104d 00403388 000203a8 7c80b729 ntdll!RtlEnterCriticalSection+0x46
    WARNING: Stack unwind information not available. Following frames may be wrong.
    0060ffec 00000000 00401030 00000000 00000000 test2+0x104d

    #  3  Id: 1a98.159c Suspend: 1 Teb: 7ffdb000 Unfrozen
    ChildEBP RetAddr  Args to Child
    003dffc8 7c972119 00000005 00000004 00000001 ntdll!DbgBreakPoint
    003dfff4 00000000 00000000 00000000 00000000 ntdll!DbgUiRemoteBreakin+0x2d
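Notice that the call stack of thread 1 runs from the application into ntdll!RtlEnterCriticalSection. Which API is ntdll!RtlEnterCriticalSection the entry point for? The obvious guess is EnterCriticalSection, which lives in kernel32.dll. To verify the guess, dump the export table of kernel32.dll.
Sure enough: kernel32.dll exports EnterCriticalSection as a forwarder to NTDLL.RtlEnterCriticalSection.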
1. !cs
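The !cs extension displays one or more critical sections, or the entire critical section tree.
The first argument of ntdll!RtlEnterCriticalSection, seen in the stacks above, is the address of the critical section. In fact, disassembling the function with uf shows that it returns with ret 4, that is, it pops 4 bytes of arguments on 32-bit x86, so it takes exactly one argument.
That argument can be passed straight to !cs:
!cs Address specifies the address of the critical section to display. If the argument is omitted, the debugger displays all critical sections in the current process.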
    0:003> ~1kb
    ChildEBP RetAddr  Args to Child
    0050ff14 7c92df5a 7c939b23 0000002c 00000000 ntdll!KiFastSystemCallRet
    0050ff18 7c939b23 0000002c 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
    0050ffa0 7c921046 00403370 0040101d 00403370 ntdll!RtlpWaitForCriticalSection+0x132
    0050ffa8 0040101d 00403370 000203a8 7c80b729 ntdll!RtlEnterCriticalSection+0x46
    WARNING: Stack unwind information not available. Following frames may be wrong.
    0050ffec 00000000 00401000 00000000 00000000 test2+0x101d
    0:003> !cs 00403370
    -----------------------------------------
    Critical section   = 0x00403370 (test2+0x3370)
    DebugInfo          = 0x7c99e9e0
    LOCKED
    LockCount          = 0x1
    OwningThread       = 0x0000185c
    RecursionCount     = 0x1
    LockSemaphore      = 0x2C
    SpinCount          = 0x00000000
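To make the column layout of the kb output concrete, here is a minimal sketch in portable C that pulls the critical section address out of a single frame line. The function name cs_address_from_kb_line is hypothetical (it is not part of WinDbg), and the sketch assumes the 32-bit layout shown above: ChildEBP, RetAddr, three argument columns, then the symbol.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a WinDbg API. Given one frame line of 32-bit
 * `kb` output, return the first "Args to Child" value if the frame's symbol
 * starts with ntdll!RtlEnterCriticalSection, or 0 for any other line. */
unsigned long cs_address_from_kb_line(const char *line)
{
    static const char target[] = "ntdll!RtlEnterCriticalSection";
    unsigned long child_ebp, ret_addr, arg1, arg2, arg3;
    char symbol[128];

    assert(line != NULL);

    /* Columns: ChildEBP RetAddr Arg1 Arg2 Arg3 Symbol */
    if (sscanf(line, "%lx %lx %lx %lx %lx %127s",
               &child_ebp, &ret_addr, &arg1, &arg2, &arg3, symbol) != 6)
        return 0;

    /* Prefix match, so the "+0x46" offset suffix is ignored. */
    if (strncmp(symbol, target, sizeof target - 1) != 0)
        return 0;

    return arg1;   /* first argument = address of the critical section */
}
```

Fed the RtlEnterCriticalSection frame of thread 1, the helper returns 0x00403370, which is the value passed to !cs above; for the other frames it returns 0.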
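In the !cs output, a LockCount of 1 means that besides the one thread that owns the critical section, one more thread is waiting for it. LockCount is incremented by EnterCriticalSection and decremented by LeaveCriticalSection. For example, add a little more code: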
    /* Same lock order as thread2, so it also ends up waiting. */
    DWORD __stdcall thread3(LPVOID lp)
    {
        EnterCriticalSection(&cs2);
        Sleep(10);
        EnterCriticalSection(&cs1);

        return 0;
    }

    int main()
    {
        InitializeCriticalSection(&cs1);
        InitializeCriticalSection(&cs2);

        CreateThread(NULL, 0, thread1, 0, 0, NULL);
        CreateThread(NULL, 0, thread2, 0, 0, NULL);
        CreateThread(NULL, 0, thread3, 0, 0, NULL);

        system("pause");
        return 0;
    }
Now attach WinDbg again:
    0:004> ~1kb
    ChildEBP RetAddr  Args to Child
    0051fe48 7c92df5a 7c939b23 00000034 00000000 ntdll!KiFastSystemCallRet
    0051fe4c 7c939b23 00000034 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
    0051fed4 7c921046 00417140 00411420 00417140 ntdll!RtlpWaitForCriticalSection+0x132
    *** WARNING: Unable to verify checksum for D:\Project1\test2\Debug\test2.exe
    0051fedc 00411420 00417140 00000000 00000000 ntdll!RtlEnterCriticalSection+0x46
    0051ffb4 7c80b729 00000000 00000000 00000000 test2!thread1+0x50 [d:\project1\test2\test2\test2.cpp @ 10]
    0051ffec 00000000 00411122 00000000 00000000 kernel32!BaseThreadStart+0x37
    0:004> !cs 00417140
    -----------------------------------------
    Critical section   = 0x00417140 (test2!cs2+0x0)
    DebugInfo          = 0x7c99ea00
    LOCKED
    LockCount          = 0x2
    OwningThread       = 0x00001f60
    RecursionCount     = 0x1
    LockSemaphore      = 0x34
    SpinCount          = 0x00000000
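LockCount is now 2. If the critical section is free (signaled), !cs shows NOT LOCKED instead, and the value is -1.
OwningThread is the ID of the thread that owns the critical section. RecursionCount is the number of times the owning thread has called EnterCriticalSection. This also affects LockCount: every additional EnterCriticalSection call made by the owner increments LockCount by 1, because LockCount is the number of EnterCriticalSection requests made for this lock by any thread, minus 1 (so 0 - 1 = -1 means NOT LOCKED). Likewise, every LeaveCriticalSection call decrements LockCount by 1.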
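The bookkeeping described above can be sketched as a small single-threaded model in portable C. This is an illustration of the counter rules as stated in this post, not the real ntdll implementation: the type ModelCs and the functions model_init, model_enter and model_leave are hypothetical names, nothing here blocks or wakes a thread, and newer Windows versions may encode LockCount differently.

```c
#include <assert.h>

/* Simplified model of the counters that !cs prints. Field names mirror the
 * !cs output; the logic only tracks the numbers and never blocks. */
typedef struct {
    long lock_count;             /* -1 means NOT LOCKED */
    long recursion_count;        /* enters made by the current owner */
    unsigned long owning_thread; /* 0 means no owner */
} ModelCs;

void model_init(ModelCs *cs)
{
    cs->lock_count = -1;
    cs->recursion_count = 0;
    cs->owning_thread = 0;
}

/* Returns 1 if `tid` owns the section after the call, 0 if it would wait. */
int model_enter(ModelCs *cs, unsigned long tid)
{
    assert(tid != 0);
    cs->lock_count++;                /* every enter request bumps LockCount */
    if (cs->lock_count == 0) {       /* section was free: caller owns it */
        cs->owning_thread = tid;
        cs->recursion_count = 1;
        return 1;
    }
    if (cs->owning_thread == tid) {  /* recursive enter by the owner */
        cs->recursion_count++;
        return 1;
    }
    return 0;                        /* owned by another thread: would wait */
}

/* One leave by the owner. Handing the section over to a waiter is
 * deliberately not modelled. */
void model_leave(ModelCs *cs)
{
    assert(cs->recursion_count > 0);
    cs->recursion_count--;
    cs->lock_count--;
    if (cs->recursion_count == 0)
        cs->owning_thread = 0;
}
```

In this model, one owner plus one waiter leaves lock_count at 1 and one owner plus two waiters leaves it at 2, matching the two !cs listings in this post.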
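Now back to the original program.
~~[TID] selects the thread whose thread ID is TID. (The square brackets are required, and there must be no space between the second ~ and the opening bracket.)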
    0:003> ~~[0x0000185c]
       2  Id: 1a98.185c Suspend: 1 Teb: 7ffdd000 Unfrozen
          Start: test2+0x1030 (00401030)
          Priority: 0  Priority class: 32  Affinity: f
    0:003> ~2kb
    ChildEBP RetAddr  Args to Child
    0060ff14 7c92df5a 7c939b23 00000034 00000000 ntdll!KiFastSystemCallRet
    0060ff18 7c939b23 00000034 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
    0060ffa0 7c921046 00403388 0040104d 00403388 ntdll!RtlpWaitForCriticalSection+0x132
    0060ffa8 0040104d 00403388 000203a8 7c80b729 ntdll!RtlEnterCriticalSection+0x46
    WARNING: Stack unwind information not available. Following frames may be wrong.
    0060ffec 00000000 00401030 00000000 00000000 test2+0x104d
    0:003> !cs 00403388
    -----------------------------------------
    Critical section   = 0x00403388 (test2+0x3388)
    DebugInfo          = 0x7c99e9c0
    LOCKED
    LockCount          = 0x1
    OwningThread       = 0x00001588
    RecursionCount     = 0x1
    LockSemaphore      = 0x34
    SpinCount          = 0x00000000
    0:003> ~~[0x00001588]
       1  Id: 1a98.1588 Suspend: 1 Teb: 7ffde000 Unfrozen
          Start: test2+0x1000 (00401000)
          Priority: 0  Priority class: 32  Affinity: f
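So the critical section that thread 2 is waiting for (0x00403388) is owned by thread 1, while the critical section that thread 1 is waiting for (0x00403370) is owned by thread 2 (thread ID 0x185c). This is the classic deadlock.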
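The manual walk from a waiting thread to the owner of its critical section, and from that owner onward, amounts to looking for a cycle in a wait-for chain. A minimal sketch in portable C follows; WaitEdge, blocked_on and is_deadlocked are hypothetical names, and the table is assumed to be filled in by hand from the ~Nkb and !cs output rather than read from the debugger.

```c
#include <assert.h>
#include <stddef.h>

/* One row per blocked thread: its thread ID, and the OwningThread of the
 * critical section it is waiting for. */
typedef struct {
    unsigned long waiter_tid;
    unsigned long owner_tid;
} WaitEdge;

/* Returns the thread that `tid` is blocked on, or 0 if `tid` is not waiting. */
static unsigned long blocked_on(const WaitEdge *edges, size_t n,
                                unsigned long tid)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (edges[i].waiter_tid == tid)
            return edges[i].owner_tid;
    return 0;
}

/* Follows waiter -> owner links starting at `start`. Returns 1 if the chain
 * leads back to `start` (a deadlock cycle through it), 0 otherwise. */
int is_deadlocked(const WaitEdge *edges, size_t n, unsigned long start)
{
    unsigned long tid = start;
    size_t steps;

    assert(edges != NULL || n == 0);
    for (steps = 0; steps < n; steps++) {   /* a cycle needs at most n hops */
        tid = blocked_on(edges, n, tid);
        if (tid == 0)
            return 0;   /* chain ends at a thread that is not waiting */
        if (tid == start)
            return 1;   /* back at the starting thread: cycle */
    }
    return 0;
}
```

With the two rows from this session, {0x1588, 0x185c} and {0x185c, 0x1588}, the function reports a cycle for either thread; with only the first row it reports none.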
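The remaining !cs options are as follows.
!cs -s displays the initialization stack trace of each critical section, if it is available.
!cs -l displays only the locked critical sections.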
    0:003> !cs -l
    -----------------------------------------
    DebugInfo          = 0x7c99e9c0
    Critical section   = 0x00403388 (test2+0x3388)
    LOCKED
    LockCount          = 0x1
    OwningThread       = 0x00001588
    RecursionCount     = 0x1
    LockSemaphore      = 0x34
    SpinCount          = 0x00000000
    -----------------------------------------
    DebugInfo          = 0x7c99e9e0
    Critical section   = 0x00403370 (test2+0x3370)
    LOCKED
    LockCount          = 0x1
    OwningThread       = 0x0000185c
    RecursionCount     = 0x1
    LockSemaphore      = 0x2C
    SpinCount          = 0x00000000
!cs StartAddress EndAddress specifies the address range in which to search for critical sections.
    0:003> !cs 0x00400000 0x00500000
    -----------------------------------------
    DebugInfo          = 0x7c99e9c0
    Critical section   = 0x00403388 (test2+0x3388)
    LOCKED
    LockCount          = 0x1
    OwningThread       = 0x00001588
    RecursionCount     = 0x1
    LockSemaphore      = 0x34
    SpinCount          = 0x00000000
    -----------------------------------------
    DebugInfo          = 0x7c99e9e0
    Critical section   = 0x00403370 (test2+0x3370)
    LOCKED
    LockCount          = 0x1
    OwningThread       = 0x0000185c
    RecursionCount     = 0x1
    LockSemaphore      = 0x2C
    SpinCount          = 0x00000000
!cs -o displays the owner's stack for every locked critical section that is shown.
!cs -? displays the help text for the command.
    0:003> !cs -?
    !cs [-s] [-l] [-o]                    - dump all the active critical sections in the current process.
    !cs [-s] [-o] address                 - dump critical section at this address.
    !cs [-s] [-l] [-o] address1 address2  - dump all the active critical sections in this range.
    !cs [-s] [-o] -d address              - dump critical section corresponding to DebugInfo at this address.

    "-s" will dump the critical section initialization stack trace if it is available.

    "-l" will dump only the locked critical sections.

    "-o" will dump the critical section owner's stack.