Move another blog here

Thursday, August 12, 2010

Case study: crash in unloaded module

 

Some time ago our application crashed in an unloaded module. The development team spent a lot of time analyzing the fault before we finally noticed that the faulting address lay in an UNLOADED module.

This happens when a DLL is unloaded while resources that belong to it, for example threads running its code or memory it allocated, are still in use. After the DLL is unloaded, the process still tries to access those resources, which are no longer available. Such a crash is hard to understand because the code at the crash location looks perfectly fine and should not crash there. In fact the crash has nothing to do with the code at that location: remember, the complete module is no longer mapped into the process.

For this kind of fault, WinDbg reports an access violation in an unloaded module. The crash dump also records the list of unloaded modules with their address ranges, which lets WinDbg match the exception address to the range of a specific module.

 

Tool used: WinDbg.
Problem: the application crashes several seconds after the system starts up.
Debugger output (for confidentiality, the module name has been replaced with XXX and the image name with MyApp):


0:019>!analyze -v
FAULTING_IP:
XXX+9d2f
017a9d2f 0000            add byte ptr [eax],al
EXCEPTION_RECORD:  ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 017a9d2f (<Unloaded_XXX.DLL>+0x00009d2f)
  ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
  Parameter[0]: 00000000
  Parameter[1]: 0x17a9d2f
Attempt to read from address 017a9d2f
DEFAULT_BUCKET_ID:  WRONG_SYMBOLS
PROCESS_NAME: MyAPP.exe
MODULE_NAME: XXX
FAULTING_MODULE: 7c900000 ntdll
DEBUG_FLR_IMAGE_TIMESTAMP: 48a9711f
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
READ_ADDRESS:   017a9d2f
BUGCHECK_STR: ACCESS_VIOLATION
LAST_CONTROL_TRANSFER:    from 00000000 to 00000000
STACK_TEXT:
0018a168  00000000 00000000 00000000 00000000 0x0
FAULTING_THREAD:    000006c4
FAILED_INSTRUCTION_ADDRESS:
XXX+9d2f
017a9d2f    0000        add byte ptr [eax],al
FOLLOWING_IP:
XXX+9d2f
017a9d2f    0000       add byte ptr [eax],al
SYMBOL_NAME:    XXX+9d2f
FOLLOWUP_NAME:   MachineOwner
IMAGE_NAME: XXX.DLL
STACK_COMMAND:    ~19s;  .ecxr ; kb
BUCKET_ID:    WRONG_SYMBOLS
FAILURE_BUCKET_ID:   XXX.DLL!base_address_c0000005_WRONG_SYMBOLS
Followup: MachineOwner
-------------
0:019>lm
start         end                  module name
00340000 0035b000        MODULE1  (private pdb symbols) D:/symbol/Module1.pdb
... ...
01680000 0168e000        MODULE2   (private pdb symbols) D:/symbol/Module2.pdb
01aa0000 01abd000       MODULE3 (private pdb symbols) D:/symbol/Module3.pdb
... ...
Unloaded modules:
017a0000 017c6000        XXX.DLL
017a0000 017c6000        XXX.DLL
02c50000 02c76000        XXX.DLL
... ...

From the ExceptionAddress line (<Unloaded_XXX.DLL>+0x00009d2f) and the "Unloaded modules" list, we know that the exception occurred in a module that had already been unloaded from the process.
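The same check can be done by hand: take the faulting address and see which module range it falls into. The sketch below does this for the addresses shown in the dump. The names ModuleRange, unloaded_from_dump and find_module are made up for this illustration, and it assumes the "end" column of lm is the first address past the module.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One row of the "Unloaded modules:" list printed by lm.
// Assumption: `end` is exclusive (first address after the module image).
struct ModuleRange {
    std::uint32_t start;
    std::uint32_t end;
    std::string name;
};

// Ranges copied from the lm output above (the duplicate row is listed once).
inline std::vector<ModuleRange> unloaded_from_dump() {
    return {
        {0x017a0000u, 0x017c6000u, "XXX.DLL"},
        {0x02c50000u, 0x02c76000u, "XXX.DLL"},
    };
}

// Returns the range that contains addr, or nullptr if no range does.
inline const ModuleRange* find_module(const std::vector<ModuleRange>& mods,
                                      std::uint32_t addr) {
    for (const ModuleRange& m : mods) {
        assert(m.start < m.end);  // sanity check on the input rows
        if (addr >= m.start && addr < m.end) {
            return &m;
        }
    }
    return nullptr;
}
```

For the faulting address 017a9d2f this finds the range 017a0000 to 017c6000, and 017a9d2f minus 017a0000 gives 9d2f, the same offset WinDbg prints as XXX+9d2f.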

The cause is a thread running code from the DLL that had not yet fully stopped when the DLL was unloaded. This is especially likely when the CPU is heavily loaded: the time slice given to that thread is not long enough for it to finish its stop code before the process unloads the DLL (through a call to the function CloseModule).

To fix the problem, CloseModule must block until all threads in this DLL have stopped, and only then let the DLL be unloaded.
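A minimal sketch of that fix, written in portable C++ rather than taken from the real module: the class Module and its members Open, Close and WorkerRunning are hypothetical stand-ins for the DLL's state and for the CloseModule call. The point it illustrates is that the close call signals the worker and then waits for it to exit before returning.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the state owned by the DLL.
class Module {
public:
    // Starts the worker thread (the counterpart of the call that opens the module).
    void Open() {
        assert(!worker_.joinable());  // Open must not be called twice in a row
        stop_.store(false);
        worker_ = std::thread([this] { Run(); });
    }

    // Counterpart of CloseModule: request the stop, then block until the
    // worker has really finished. Only after this returns is it safe to unload.
    void Close() {
        stop_.store(true);
        if (worker_.joinable()) {
            worker_.join();
        }
    }

    bool WorkerRunning() const { return running_.load(); }

private:
    void Run() {
        running_.store(true);
        while (!stop_.load()) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        // Simulated slow stop code, as on a heavily loaded CPU.
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        running_.store(false);
    }

    std::atomic<bool> stop_{false};
    std::atomic<bool> running_{false};
    std::thread worker_;
};
```

Without the join in Close, the caller could go on to unload the code while Run is still inside its stop code, which is the crash described above. On Windows the wait belongs in the explicit close function and not in DllMain, because waiting for a thread while the loader lock is held can deadlock.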

Note: In this application the DLL is loaded as a side effect of one function call into the library and unloaded by another, here called CloseModule. Every explicit call to ::LoadLibrary increments the module's reference count, and the module stays mapped until matching ::FreeLibrary calls bring the count back to zero. If the code running in the DLL keeps its own reference in this way until it has finished, the situation in this article does not occur.

For guidance on developing DLLs, refer to the MSDN DLL best practices:
 
http://www.microsoft.com/whdc/driver/kernel/DLL_bestproc.mspx

 
