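Method 1
1. First identify which process appears deadlocked (for example, the UI is frozen or clicks get no response), then capture a dump of that process.
2. Check whether a thread's call stack is blocked waiting on a critical section: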
0:000> kv
ChildEBP RetAddr Args to Child
0019f3b8 77736b0c 77722253 00000108 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
0019f3bc 77722253 00000108 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc (FPO: [3,0,0])
0019f420 77722137 00000000 00000000 00000001 ntdll!RtlpWaitOnCriticalSection+0x13e (FPO: [Non-Fpo])
0019f448 777504a0 777c8340 774893be 00000000 ntdll!RtlEnterCriticalSection+0x150 (FPO: [Non-Fpo])
0019f480 759e7b43 00000001 00000000 0019f4a8 ntdll!LdrLockLoaderLock+0xe4 (FPO: [Non-Fpo])
0019f4cc 6e723027 6e6f0000 0019f4e4 00000104 KERNELBASE!GetModuleFileNameW+0x75 (FPO: [Non-Fpo])
...
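3. The stack shows the thread waiting on a critical section. Its address, 777c8340, is the first argument to RtlEnterCriticalSection; inspect it with !cs: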
0:000> !cs 777c8340
-----------------------------------------
Critical section = 0x777c8340 (ntdll!LdrpLoaderLock+0x0)
DebugInfo = 0x777c8540
LOCKED
LockCount = 0x15
WaiterWoken = No
OwningThread = 0x00001a40
RecursionCount = 0x1
LockSemaphore = 0x108
SpinCount = 0x00000000
4. Thread 0 is waiting on the ntdll!LdrpLoaderLock lock. Use the OwningThread value to find out which thread holds it:
0:000> ~~[0x00001a40]
30 Id: 1c14.1a40 Suspend: 1 Teb: 7ff8d000 Unfrozen
Start: CloudEngine!CreateCloudEngineLocker+0x2561 (702db4f0)
Priority: -2 Priority class: 32 Affinity: f
5. The lock is held by thread 30, so examine the state of thread 30 next:
0:000> ~30 kv
ChildEBP RetAddr Args to Child
0c8ed9f8 77736b0c 77722253 00000bf4 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
0c8ed9fc 77722253 00000bf4 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc (FPO: [3,0,0])
0c8eda60 77722137 00000000 00000000 00000000 ntdll!RtlpWaitOnCriticalSection+0x13e (FPO: [Non-Fpo])
0c8eda88 753b2fa0 753c4060 00000000 0bcf70bc ntdll!RtlEnterCriticalSection+0x150 (FPO: [Non-Fpo])
0c8edab0 753b2de2 0b8a5b84 00000000 0bcf70b8 bcrypt!LoadProvider+0x32 (FPO: [Non-Fpo])
0c8edb04 6dc79472 0c8edb24 6dc794a0 00000000 bcrypt!BCryptOpenAlgorithmProvider+0x12c (FPO: [Non-Fpo])
0c8edb28 7764c167 6dcc055c 0c8edb50 6dc1ce76 msxml3!AutoInitSalt::AutoInitSalt+0x1f (FPO: [Non-Fpo])
0c8edb34 6dc1ce76 6dc1cea4 6dc1cf10 00000001 msvcrt!_initterm+0x13 (FPO: [Non-Fpo])
0c8edb50 6dbf1594 6dbf0000 00000000 00000000 msxml3!_CRT_INIT+0xc3 (FPO: [Non-Fpo])
0c8edbb0 6dc19709 6dbf0000 00000001 00000000 msxml3!_CRT_INIT+0x22a (FPO: [Non-Fpo])
0c8edbcc 77748cc8 6dbf0000 00000001 00000000 msxml3!InitDllMain+0x8e (FPO: [Non-Fpo])
...
6. Thread 30 is also waiting on a critical section, and that critical section is held by thread 41:
0:030> !cs 753c4060
-----------------------------------------
Critical section = 0x753c4060 (bcrypt!g_csLoaderLock+0x0)
DebugInfo = 0x0b8c1b20
LOCKED
LockCount = 0x2
WaiterWoken = No
OwningThread = 0x00000398
RecursionCount = 0x1
LockSemaphore = 0xBF4
SpinCount = 0x00000000
0:030> ~~[0x00000398]
41 Id: 1c14.398 Suspend: 1 Teb: 7ff82000 Unfrozen
Start: cloudsec3!C360EngTaskMgr::_threadProc (6dfafb90)
Priority: 0 Priority class: 32 Affinity: f
7. Examine thread 41:
0:030> ~41 kv
ChildEBP RetAddr Args to Child
0df3bfac 77736b0c 77722253 00000108 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
0df3bfb0 77722253 00000108 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc (FPO: [3,0,0])
0df3c014 77722137 00000000 00000000 0df3c07c ntdll!RtlpWaitOnCriticalSection+0x13e (FPO: [Non-Fpo])
0df3c03c 7774fd17 777c8340 7aa2a7e6 753b2738 ntdll!RtlEnterCriticalSection+0x150 (FPO: [Non-Fpo])
0df3c0d8 77752563 74f50000 0df3c114 00000000 ntdll!LdrGetProcedureAddressEx+0x159 (FPO: [Non-Fpo])
0df3c0f4 759e6d18 74f50000 0df3c114 00000000 ntdll!LdrGetProcedureAddress+0x18 (FPO: [Non-Fpo])
0df3c11c 753b26c5 74f50000 753b2738 092e1dac KERNELBASE!GetProcAddress+0x44 (FPO: [Non-Fpo])
0df3c130 753b2fce 74f50000 092e1dac 00000000 bcrypt!_LoadInterface+0x77 (FPO: [Non-Fpo])
0df3c164 753b2de2 092e1dac 00000000 0bcf7018 bcrypt!LoadProvider+0x76 (FPO: [Non-Fpo])
0df3c1b8 75b1ada1 0df3c214 758cc800 00000000 bcrypt!BCryptOpenAlgorithmProvider+0x12c (FPO: [Non-Fpo])
0df3c234 75b184e4 0b8093c0 00000006 75b180fc wintrust!SIPObjectPE_::DigestFile+0x6a (FPO: [Non-Fpo])
0df3c264 75b1b471 75b161c8 0df3c3b4 08cdb278 wintrust!SIPObject_::CreateIndirectData+0x11a (FPO: [Non-Fpo])
0df3c2a4 75b18158 0df3c338 0df3c3b4 08cdb278 wintrust!SIPObjectPE_::CreateIndirectData+0x5d (FPO: [Non-Fpo])
0df3c2c4 758db43a 0df3c338 0df3c3b4 08cdb278 wintrust!InboxCryptSIPCreateIndirectData+0x4e (FPO: [Non-Fpo])
0df3c318 75b32f33 0df3c338 0df3c3b4 08cdb278 crypt32!CryptSIPCreateIndirectData+0x63 (FPO: [Non-Fpo])
0df3c3cc 75b18719 75b18728 00000770 0df3c61c wintrust!_CatAdminCalcHashFromFileHandle+0x14f (FPO: [Non-Fpo])
0df3c3e8 73d5087a 00000770 0df3c61c 0df3c408 wintrust!CryptCATAdminCalcHashFromFileHandle+0x1c (FPO: [Non-Fpo])
...
8. Thread 41 is also waiting on a critical section:
0:041> !cs 777c8340
-----------------------------------------
Critical section = 0x777c8340 (ntdll!LdrpLoaderLock+0x0)
DebugInfo = 0x777c8540
LOCKED
LockCount = 0x15
WaiterWoken = No
OwningThread = 0x00001a40
RecursionCount = 0x1
LockSemaphore = 0x108
SpinCount = 0x00000000
0:041> ~~[0x00001a40]
30 Id: 1c14.1a40 Suspend: 1 Teb: 7ff8d000 Unfrozen
Start: CloudEngine!CreateCloudEngineLocker+0x2561 (702db4f0)
Priority: -2 Priority class: 32 Affinity: f
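The critical section that thread 41 needs is held by thread 30, the critical section that thread 30 needs is held by thread 41, and the critical section that the main thread (thread 0) needs is also held by thread 30.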
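The deadlock occurs inside system API calls: thread 30 holds the loader lock while msxml3 initializes (msxml3!InitDllMain) and requests the bcrypt lock, whereas thread 41 holds the bcrypt lock and requests the loader lock through GetProcAddress. Since both paths run in system DLLs, this is presumably a system bug.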
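The chain traced in Method 1 (thread, the lock it waits on, that lock's owner) can be written down as a small wait-for graph. The Python sketch below is only an illustration, not a WinDbg feature: the two dictionaries are filled in by hand from the !cs output, and find_cycle is a hypothetical helper name.

```python
# Facts read off the !cs output (debugger thread numbers).
# waits_on: thread -> lock it is blocked on
# owner:    lock   -> thread that currently owns it
waits_on = {
    0: "ntdll!LdrpLoaderLock",
    30: "bcrypt!g_csLoaderLock",
    41: "ntdll!LdrpLoaderLock",
}
owner = {
    "ntdll!LdrpLoaderLock": 30,
    "bcrypt!g_csLoaderLock": 41,
}


def find_cycle(start, waits_on, owner):
    """Follow thread -> lock -> owner links from `start`.

    Returns the list of threads forming a cycle, or None if the chain
    ends at a thread that is not blocked on a known lock.
    """
    path = []
    thread = start
    while thread is not None and thread not in path:
        path.append(thread)
        lock = waits_on.get(thread)
        thread = owner.get(lock) if lock is not None else None
    if thread is None:
        return None
    # Everything from the first repeated thread onward is the cycle.
    return path[path.index(thread):]


print(find_cycle(0, waits_on, owner))  # prints [30, 41]
```

Starting from thread 0, the walk reaches thread 30, then thread 41, then thread 30 again. Thread 0 is not part of the cycle; it is merely blocked behind it.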
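Method 2
1. First identify which process appears deadlocked (for example, the UI is frozen or clicks get no response), then capture a dump of that process.
2. Run the !locks command to list the locks held by all threads: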
CritSec xxx!AtqActiveContextList+a8 at 68629100
LockCount 2 (the number of threads waiting on this critical section)
RecursionCount 1
OwningThread 5b0
EntryCount 2
ContentionCount 2
*** Locked
The lock being waited on is at 68629100, and it is held by thread 5b0.
3. Run ~ to list information about all threads:
0 Id: 1054.f54 Suspend: 1 Teb: 7efdb000 Unfrozen
1 Id: 1054.20c Suspend: 1 Teb: 7efd8000 Unfrozen
2 Id: 1054.1048 Suspend: 1 Teb: 7efd5000 Unfrozen
3 Id: 1054.104c Suspend: 1 Teb: 7ef9d000 Unfrozen
4 Id: 1054.a58 Suspend: 1 Teb: 7ef9a000 Unfrozen
5 Id: 1054.a60 Suspend: 1 Teb: 7ef97000 Unfrozen
6 Id: 1054.ea4 Suspend: 1 Teb: 7ef94000 Unfrozen
7 Id: 1054.3b4 Suspend: 1 Teb: 7ef91000 Unfrozen
8 Id: 1054.494 Suspend: 1 Teb: 7ef8e000 Unfrozen
9 Id: 1054.5b0 Suspend: 1 Teb: 7ef8b000 Unfrozen
10 Id: 1054.1010 Suspend: 1 Teb: 7ef82000 Unfrozen
11 Id: 1054.13f0 Suspend: 1 Teb: 7ef7f000 Unfrozen
12 Id: 1054.13e8 Suspend: 1 Teb: 7ef7c000 Unfrozen
13 Id: 1054.13cc Suspend: 1 Teb: 7ef79000 Unfrozen
14 Id: 1054.13b8 Suspend: 1 Teb: 7ef76000 Unfrozen
15 Id: 1054.a80 Suspend: 1 Teb: 7ef88000 Unfrozen
16 Id: 1054.fcc Suspend: 1 Teb: 7ef85000 Unfrozen
17 Id: 1054.e10 Suspend: 1 Teb: 7ef73000 Unfrozen
18 Id: 1054.edc Suspend: 1 Teb: 7ef70000 Unfrozen
19 Id: 1054.cbc Suspend: 1 Teb: 7ef6d000 Unfrozen
20 Id: 1054.c2c Suspend: 1 Teb: 7ef6a000 Unfrozen
21 Id: 1054.cc4 Suspend: 1 Teb: 7ef67000 Unfrozen
22 Id: 1054.544 Suspend: 1 Teb: 7ef64000 Unfrozen
23 Id: 1054.794 Suspend: 1 Teb: 7ef61000 Unfrozen
24 Id: 1054.950 Suspend: 1 Teb: 7ef5e000 Unfrozen
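The number before Id: is the debugger's thread number. After Id: come the process ID (1054) and the OS thread ID, both in hexadecimal. From this listing, OS thread ID 5b0 corresponds to debugger thread 9.
4. Inspect the call stack of thread 9: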
9 id: 1054.5b0 Suspend: 0 Teb 7ef8b000 Unfrozen
ChildEBP RetAddr Args to Child
014cfe64 77f6cc7b 00000460 00000000 00000000 ntdll!NtWaitForSingleObject+0xb
014cfed8 77f67456 0024e750 6833adb8 0024e750 ntdll!RtlpWaitForCriticalSection+0xaa
014cfee0 6833adb8 0024e750 80000000 01f21cb8 ntdll!RtlEnterCriticalSection+0x46
014cfef4 6833ad8f 01f21cb8 000a41f0 014cff20 xxx!DereferenceUserDataAndKill+0x24
014cff04 6833324a 01f21cb8 00000000 00000079 xxx!ProcessUserAsyncIoCompletion+0x2a
014cff20 68627260 01f21e0c 00000000 00000079 xxx!ProcessAtqCompletion+0x32
014cff40 686249a5 000a41f0 00000001 686290e8 xxx!I_TimeOutContext+0x87
014cff5c 68621ea7 00000000 00000001 0000001e xxx!AtqProcessTimeoutOfRequests_33+0x4f
014cff70 68621e66 68629148 000ad1b8 686230c0 xxx!I_AtqTimeOutWorker+0x30
014cff7c 686230c0 00000000 00000001 000c000a xxx!I_AtqTimeoutCompletion+0x38
014cffb8 77f04f2c 00000000 00000001 000c000a xxx!SchedulerThread_297+0x2f
00000001 000003e6 00000000 00000001 000c000a kernel32!BaseThreadStart+0x51
The same mapping can also be looked up in WinDbg under View > Processes and Threads.
5. From here, repeat the steps of Method 1 to follow the chain of lock owners.
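When the thread list is long, the lookup from OwningThread to the debugger thread number in Method 2 can be done mechanically on text saved from the debugger. The Python sketch below assumes the !locks and ~ output formats shown in Method 2 (it does not handle the "OwningThread = 0x..." form printed by !cs); the function names are hypothetical.

```python
import re

# Excerpts copied from the !locks and ~ output in Method 2.
locks_output = """\
CritSec xxx!AtqActiveContextList+a8 at 68629100
LockCount 2
RecursionCount 1
OwningThread 5b0
"""
threads_output = """\
 8 Id: 1054.494 Suspend: 1 Teb: 7ef8e000 Unfrozen
 9 Id: 1054.5b0 Suspend: 1 Teb: 7ef8b000 Unfrozen
10 Id: 1054.1010 Suspend: 1 Teb: 7ef82000 Unfrozen
"""

# "OwningThread 5b0": the value is a hexadecimal OS thread ID.
OWNER = re.compile(r"^\s*OwningThread\s+([0-9a-fA-F]+)", re.MULTILINE)
# "9 Id: 1054.5b0 ...": debugger thread number, then pid.tid in hex.
# A leading "." or "#" (current-thread markers) is skipped.
THREAD = re.compile(
    r"^[ \t.#]*(\d+)\s+Id:\s*[0-9a-fA-F]+\.([0-9a-fA-F]+)", re.MULTILINE
)


def owning_thread_ids(locks_text):
    """Return the OS thread IDs (as ints) listed as OwningThread."""
    return [int(tid, 16) for tid in OWNER.findall(locks_text)]


def thread_numbers(threads_text):
    """Map OS thread ID -> debugger thread number from a ~ listing."""
    return {int(tid, 16): int(num) for num, tid in THREAD.findall(threads_text)}


numbers = thread_numbers(threads_output)
for tid in owning_thread_ids(locks_output):
    print(f"OS thread {tid:x} is debugger thread {numbers.get(tid)}")
```

For the excerpts above this reports that OS thread 5b0 is debugger thread 9, matching the manual lookup.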
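Method 3
1. Identify the process that appears deadlocked.
2. Open Windows Task Manager.
3. Select the Performance tab and open Resource Monitor from the bottom left.
4. Find the deadlocked process, right-click it, and choose "Analyze Wait Chain". If a deadlock has occurred, this shows the IDs of the threads involved, which you can then analyze together with the dump.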