Dump Analysis Pattern 1: Multiple Exceptions

The first pattern I’m going to introduce today is Multiple Exceptions. This pattern captures the known fact that a process can have as many exceptions ("crashes") as it has threads. The following UML diagram depicts the relationship between the Process, Thread and Exception entities:

[Figure: UML diagram of the Process, Thread and Exception entities]

Every process in Windows has at least one execution thread, so if things go wrong there can be at least one exception per thread (for example, an invalid memory reference). The same thread can raise a second exception if the exception handling code itself hits another exception, or if the first exception was handled and another one follows, and so on.

So what is the general solution to this common recurrent problem, where an application or service crashes and you have a crash dump file from a customer (the specific context)? The general solution is to look at all threads and their stacks rather than relying only on what the tools say.

Here is a concrete example from one of the dumps I got today. Internet Explorer crashed; I opened the dump in WinDbg and ran the ‘!analyze -v’ command. This is what I got in the WinDbg output:

ExceptionAddress: 7c822583 (ntdll!DbgBreakPoint)
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 3
   Parameter[0]: 00000000
   Parameter[1]: 8fb834b8
   Parameter[2]: 00000003
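As a side note, the ExceptionCode field in output like this can be decoded mechanically. The following Python sketch is only an illustration: the function names are hypothetical, the table covers just a handful of well-known NTSTATUS values, and the severity is read from the top two bits of the code. It is not part of WinDbg.

```python
import re

# A few well-known NTSTATUS exception codes (a deliberately partial table).
KNOWN_CODES = {
    0x80000003: "STATUS_BREAKPOINT",
    0x80000004: "STATUS_SINGLE_STEP",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}

# The top two bits of an NTSTATUS value encode its severity.
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def describe_exception_code(line):
    """Parse an 'ExceptionCode: xxxxxxxx' line (hypothetical helper).

    Returns (code, name, severity), or None if the line has no such field.
    """
    match = re.search(r"ExceptionCode:\s*([0-9a-fA-F]{8})", line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    name = KNOWN_CODES.get(code, "unknown")
    severity = SEVERITY[code >> 30]
    return code, name, severity
```

A breakpoint code here only says which exception the debugger recorded last; it says nothing about exceptions pending on other threads.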

A break instruction, you might think, shows that the dump was taken manually from a running application and there was no crash: the customer sent the wrong dump or misunderstood the instructions. However, I looked at all threads and noticed the following two stacks (threads 15 and 16):

0:016> ~*kL
...
15  Id: 1734.8f4 Suspend: 1 Teb: 7ffab000 Unfrozen
ntdll!KiFastSystemCallRet
ntdll!NtRaiseHardError+0xc
kernel32!UnhandledExceptionFilter+0x54b
kernel32!BaseThreadStart+0x4a
kernel32!_except_handler3+0x61
ntdll!ExecuteHandler2+0x26
ntdll!ExecuteHandler+0x24
ntdll!KiUserExceptionDispatcher+0xe
componentA!xxx
componentB!xxx
mshtml!xxx
kernel32!BaseThreadStart+0x34

# 16  Id: 1734.11a4 Suspend: 1 Teb: 7ffaa000 Unfrozen
ntdll!DbgBreakPoint
ntdll!DbgUiRemoteBreakin+0x36

So we see here that the real crash happened in componentA.dll, and that componentB.dll or mshtml.dll might have influenced it. Why did this happen? The customer might have dumped Internet Explorer manually while it was displaying an exception message box. The following reference says that ZwRaiseHardError (in user mode, the same ntdll routine as the NtRaiseHardError frame on the stack of thread 15) displays a message box containing an error message:

Windows NT/2000 Native API Reference

 

Or perhaps something else happened. Many of the cases where we see multiple thread exceptions in one process dump arise because crashed threads displayed message boxes, such as the Visual C++ debug message box, which kept the process from terminating. In the dump under discussion, the WinDbg automatic analysis command recognized only the last breakpoint exception (in thread 16, marked with #). In conclusion, we shouldn’t rely too much on "automatic analysis" anyway, and we probably should write our own extension to list possible multiple exceptions (based on some heuristics I will talk about later).
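To make the idea of listing possible multiple exceptions concrete, here is a minimal Python sketch that scans saved ‘~*kL’ output for threads whose stacks contain exception-related frames. It is not a WinDbg extension and not the author's heuristics: the frame list is an assumption drawn only from the stacks shown in this post, and the function names are hypothetical.

```python
import re

# Thread header as printed by '~*kL', for example:
#   '  15  Id: 1734.8f4 Suspend: 1 Teb: 7ffab000 Unfrozen'
# An optional leading '.' or '#' marks the current or event thread.
THREAD_HEADER = re.compile(r"^\s*[.#]?\s*(\d+)\s+Id:\s*[0-9a-fA-F]+\.[0-9a-fA-F]+")

# Assumed heuristic: frame names taken from the stacks in this post.
SUSPICIOUS_FRAMES = (
    "KiUserExceptionDispatcher",
    "UnhandledExceptionFilter",
    "NtRaiseHardError",
    "DbgBreakPoint",
)


def split_threads(text):
    """Group 'module!function' frame lines under their thread number."""
    threads = {}
    current = None
    for raw in text.splitlines():
        header = THREAD_HEADER.match(raw)
        if header:
            current = int(header.group(1))
            threads[current] = []
        elif current is not None and "!" in raw:
            threads[current].append(raw.strip())
    return threads


def find_exception_threads(text):
    """Return {thread number: [matched frame names]} for suspicious threads."""
    hits = {}
    for number, frames in split_threads(text).items():
        matched = [name for name in SUSPICIOUS_FRAMES
                   if any(name in frame for frame in frames)]
        if matched:
            hits[number] = matched
    return hits
```

Run against the listing above, this would flag both thread 15 (exception dispatch and hard error frames) and thread 16 (the remote break-in), which is the point of the pattern: more than one thread deserves a look. Substring matching is crude and would need tuning on real dumps.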

- Dmitry Vostokov @ DumpAnalysis.org -

 
