Debugging custom filters for unhandled exceptions

xiaoqiangvs007

Published on 2007-06-21 16:52:00


Reposted from http://www.debuginfo.com/articles/easywindbg.html#startupmessages

Updated: 09.10.2005

Introduction
Debugging custom filters
Somebody is overwriting my filter!
Enforcing your own filter
Sample code
Conclusion

Introduction

When our application crashes at a customer's site, we need as much information about the problem as possible. System tools like Dr. Watson can help to collect the necessary information, but their effectiveness depends on the configuration of the target system. If we do not want to depend on the system configuration, a custom filter for unhandled exceptions is a good solution. With the help of a custom filter, we can get notified about unhandled exceptions in the application, create a detailed crash report, and sometimes even send it automatically to the developers for investigation.

Unfortunately, custom filters for unhandled exceptions are not easy to debug. In addition, sometimes it can be difficult to ensure that our filter is properly registered, because other components of the application (including some system DLLs) might want to register their own filters. In this article, I will show how to overcome these problems and make our filters debuggable and reliable.

Debugging custom filters

So we have implemented a custom filter for unhandled exceptions (here is an example) and registered it using the SetUnhandledExceptionFilter function. We simulate an unhandled exception, and our filter is invoked, but it looks like it has a bug: say, it does not create the minidump properly. We want to debug the filter, so we run the application under a debugger and set a breakpoint at the beginning of the filter function. The exception is raised again, but there is a surprise: the breakpoint in our filter is not hit, and instead the debugger reports a second chance exception.

Second chance exception dialog

Pressing “Continue” does not help. What happened? The answer is simple: custom filters for unhandled exceptions are not called at all when the application is running under a debugger. A bit later we will determine exactly why this happens, but for now let's look for alternative ways of debugging our filter.

KB173652 offers some help, describing two possible approaches. The first approach is pretty simple, and can be used in nearly all situations where we cannot run the application under a debugger: use tracing. It means that our good old friends (OutputDebugString, TRACE, ATLTRACE and other members of this family) can help us again.

Another approach is to modify our application (for debugging purposes only!) to execute the filter from an __except clause:

__try
{
    FaultyFunc();
}
__except( MyCustomFilter( GetExceptionInformation() ) )
{
    _tprintf( _T("Exception handled.\n") );
}

(see complete example here)

Filters called inside __except clauses are not skipped when the application is running under a debugger. When FaultyFunc raises an exception, the breakpoint in our filter will be hit and we will be able to debug the filter. An obvious limitation of this approach is that it is intrusive: we have to write additional code only to debug the filter.

There is also another intrusive approach to debugging filters (much less intrusive, though): modify the filter to show a message box (or print a console message and then sleep) when it is called. Then we can run the application (this time not under a debugger), simulate an unhandled exception, and wait until the filter gets called and displays the message box. After that we can attach a debugger to the application, set a breakpoint in the filter, and dismiss the message box. Our breakpoint will be hit, and we will be able to debug the filter. An example of this approach can be found here.

But is it really necessary to write additional code only to make debugging of the filter possible? Maybe there is a non-intrusive way of debugging a custom filter? Yes, there is one, and I am going to describe it now.

First, let's explore how the operating system calls custom filters for unhandled exceptions. If we look at the call stack of the first thread of any Win32 application, we can see that execution always starts at the kernel32!BaseProcessStart function. This function receives a pointer to the application's main entry point (usually mainCRTStartup or WinMainCRTStartup in C++ applications) and calls it inside a __try..__except block:

VOID BaseProcessStart( PPROCESS_START_ROUTINE pfnStartAddr )
{
    __try
    {
        ExitThread( (pfnStartAddr)() );
    }
    __except( UnhandledExceptionFilter( GetExceptionInformation()) )
    {
        ExitProcess( GetExceptionCode() );
    }
}

(this code is a bit simplified; more complete code can be found in Figure 7 of this article)

Execution of all other threads in the process starts with a similar function – kernel32!BaseThreadStart:

VOID BaseThreadStart( PTHREAD_START_ROUTINE pfnStartAddr, PVOID pParam )
{
    __try
    {
        ExitThread( (pfnStartAddr)(pParam) );
    }
    __except( UnhandledExceptionFilter(GetExceptionInformation()) )
    {
        ExitProcess( GetExceptionCode() );
    }
}

As we can see, if any of the application's threads raises an exception and does not handle it, control will be passed to the kernel32!UnhandledExceptionFilter function. This function offers last-chance processing for all unhandled exceptions. I will not provide the whole list of actions performed by kernel32!UnhandledExceptionFilter (it deserves an article of its own; many details can be found here), but here are the most important steps:

1. Check if the exception was raised because of an attempt to write into a read-only memory page inside the .rsrc section, and correct the problem by making the memory page writeable.
2. If the application is running under a debugger, do not handle the exception and pass control to the debugger.
3. Call the registered custom filter for unhandled exceptions.
4. If the filter chose not to handle the exception, launch the registered just-in-time debugger to debug the application.

Here is the pseudocode of the relevant parts of kernel32!UnhandledExceptionFilter:

LONG UnhandledExceptionFilter( EXCEPTION_POINTERS* pep )
{
    DWORD rv;

    EXCEPTION_RECORD* per = pep->ExceptionRecord;

    
    // Check for read-only resource access

    if( ( per->ExceptionCode == EXCEPTION_ACCESS_VIOLATION ) && 
         ( per->ExceptionInformation[0] != 0 ) )
    {
        rv = BasepCheckForReadOnlyResource( per->ExceptionInformation[1] );

        if( rv == EXCEPTION_CONTINUE_EXECUTION )
            return EXCEPTION_CONTINUE_EXECUTION;
    }


    // Is the process running under debugger ? 

    DWORD DebugPort = 0;

    rv = NtQueryInformationProcess( GetCurrentProcess(), ProcessDebugPort, 
                                    &DebugPort, sizeof( DebugPort ), 0 );

    if( ( rv >= 0 ) && ( DebugPort != 0 ) )
    {
        // Yes, it is -> Pass exception to the debugger
        return EXCEPTION_CONTINUE_SEARCH; 
    }


    // Is custom filter for unhandled exceptions registered ?

    if( BasepCurrentTopLevelFilter != 0 )
    {
        // Yes, it is -> Call the custom filter

        rv = (BasepCurrentTopLevelFilter)(pep);

        if( rv == EXCEPTION_EXECUTE_HANDLER )
            return EXCEPTION_EXECUTE_HANDLER;

        if( rv == EXCEPTION_CONTINUE_EXECUTION )
            return EXCEPTION_CONTINUE_EXECUTION;
    }


    // Proceed to other tasks (check error mode, start JIT debugger, etc.)
    ...

}

Look! The function uses NtQueryInformationProcess to determine whether the application is running under a debugger, and therefore whether it should skip calling the custom filter. If we modify the value returned by NtQueryInformationProcess (DebugPort), we can “ask” the function to continue and call our filter, even when the application is running under a debugger.

So the solution to our problem looks simple: run the application under a debugger as usual, but instead of setting a breakpoint in our custom filter, set the breakpoint in the kernel32!UnhandledExceptionFilter function. When the breakpoint is hit, we should step through the function until the call to NtQueryInformationProcess returns, and set the value it returned through its third parameter (DebugPort) to zero, to make it look like the application is not running under a debugger.

The only problem is that we don't have the source code of kernel32.dll, so we have to step through the disassembly. Now let's take a look at the disassembly of the kernel32!UnhandledExceptionFilter function, to familiarize ourselves with its layout and see where we should go and what we should modify.

In Windows 2000 SP4, the relevant parts of the disassembly look like this:

// Setup the stack frame 

7c51bd2b   push    ebp
7c51bd2c   mov     ebp,esp
7c51bd2e   push    0xff
7c51bd30   push    0x7c505788
7c51bd35   push    0x7c4ff0b4
7c51bd3a   mov     eax,fs:[00000000]
7c51bd40   push    eax
7c51bd41   mov     fs:[00000000],esp
7c51bd48   push    ecx
7c51bd49   push    ecx
7c51bd4a   sub     esp,0x2d8
7c51bd50   push    ebx
7c51bd51   push    esi
7c51bd52   push    edi
7c51bd53   mov     [ebp-0x18],esp

// Call BasepCheckForReadOnlyResource, if necessary

7c51bd56   mov     esi,[ebp+0x8]
7c51bd59   mov     eax,[esi]
7c51bd5b   cmp     dword ptr [eax],0xc0000005
7c51bd61   jnz KERNEL32!UnhandledExceptionFilter+0x53 (7c51bd8a)
7c51bd63   xor     ebx,ebx
7c51bd65   cmp     [eax+0x14],ebx
7c51bd68   jz KERNEL32!UnhandledExceptionFilter+0x55 (7c51bd8c)
7c51bd6a   push    dword ptr [eax+0x18]
7c51bd6d   call KERNEL32!BasepCheckForReadOnlyResource (7c51bc52)
7c51bd72   cmp     eax,0xffffffff
7c51bd75   jnz KERNEL32!UnhandledExceptionFilter+0x55 (7c51bd8c)
7c51bd77   or      eax,eax
7c51bd79   mov     ecx,[ebp-0x10]
7c51bd7c   mov     fs:[00000000],ecx
7c51bd83   pop     edi
7c51bd84   pop     esi
7c51bd85   pop     ebx
7c51bd86   leave
7c51bd87   ret     0x4

// Call NtQueryInformationProcess to determine whether the process is being debugged

7c51bd8a   xor     ebx,ebx
7c51bd8c   mov     [ebp-0x38],ebx
7c51bd8f   push    ebx
7c51bd90   push    0x4
7c51bd92   lea     eax,[ebp-0x38]
7c51bd95   push    eax
7c51bd96   push    0x7
7c51bd98   call    KERNEL32!GetCurrentProcess (7c4fe0c8)
7c51bd9d   push    eax
7c51bd9e   call dword ptr [KERNEL32!_imp__NtQueryInformationProcess (7c4e10b8)]
7c51bda4   cmp     eax,ebx
7c51bda6   jl KERNEL32!UnhandledExceptionFilter+0x7a (7c51bdb1)

// Check the value returned by NtQueryInformationProcess (DebugPort) 
// 
// If you want to call the custom filter even when the application is running 
// under debugger, set [ebp-0x38] to zero
// 

1--> 7c51bda8   cmp     [ebp-0x38],ebx
7c51bdab   jne KERNEL32!UnhandledExceptionFilter+0x2c3 (7c51bfda)

// Call the custom filter

7c51bdb1   mov eax,[KERNEL32!BasepCurrentTopLevelFilter (7c54144c)]
7c51bdb6   cmp     eax,ebx
7c51bdb8   jz KERNEL32!UnhandledExceptionFilter+0x98 (7c51bdc7)
7c51bdba   push    esi
2--> 7c51bdbb   call    eax
7c51bdbd   cmp     eax,0x1
7c51bdc0   jz KERNEL32!UnhandledExceptionFilter+0x2e1 (7c51bd79)
7c51bdc2   cmp     eax,0xffffffff
7c51bdc5   jz KERNEL32!UnhandledExceptionFilter+0x2e1 (7c51bd79)

We should set our breakpoint at the beginning of the function. Then we should step through the function's disassembly until we reach the line marked with “1-->”, and set the value stored at EBP-0x38 to zero (in the VS debugger, you can enter “EBP-0x38, x” into the Watch window to obtain the address). After we have set the value, we should step again until the line marked with “2-->”, where our filter is about to be called. Press F11, and we are inside our filter! (Of course, nothing prevents us from optimizing this process with a couple of additional breakpoints.)
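To relate the offsets in the disassembly to the fields used in the pseudocode, it can help to write down the 32-bit layout of EXCEPTION_RECORD. The following is only a sketch: ExceptionRecord32, kAccessViolation and the two helper functions are hypothetical names introduced here, and pointer members are declared as 32-bit integers so that the offsets match x86 on any host. It shows why [eax] is the exception code, [eax+0x14] is ExceptionInformation[0] and [eax+0x18] is ExceptionInformation[1].

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the 32-bit EXCEPTION_RECORD. Pointer members are
// declared as 32-bit integers so the offsets match x86 on any host.
struct ExceptionRecord32 {
    std::uint32_t ExceptionCode;             // +0x00: cmp dword ptr [eax],0xc0000005
    std::uint32_t ExceptionFlags;            // +0x04
    std::uint32_t ExceptionRecord;           // +0x08: pointer to a nested record
    std::uint32_t ExceptionAddress;          // +0x0c
    std::uint32_t NumberParameters;          // +0x10
    std::uint32_t ExceptionInformation[15];  // +0x14: cmp [eax+0x14],ebx
};

static_assert(offsetof(ExceptionRecord32, ExceptionInformation) == 0x14,
              "ExceptionInformation[0] is read at [eax+0x14]");
static_assert(offsetof(ExceptionRecord32, ExceptionInformation) + sizeof(std::uint32_t) == 0x18,
              "ExceptionInformation[1] is pushed from [eax+0x18]");

constexpr std::uint32_t kAccessViolation = 0xC0000005u;

// Mirrors the first check in the pseudocode: an access violation whose first
// parameter is non-zero (0 means read, 1 means write).
bool IsWriteAccessViolation(const ExceptionRecord32& rec) {
    return rec.ExceptionCode == kAccessViolation && rec.ExceptionInformation[0] != 0;
}

// For access violations, the second parameter is the address that was accessed.
std::uint32_t FaultAddress(const ExceptionRecord32& rec) {
    assert(rec.ExceptionCode == kAccessViolation);
    return rec.ExceptionInformation[1];
}
```

With this layout in mind, the instruction `mov eax,[esi]` loads the ExceptionRecord pointer (the first member of EXCEPTION_POINTERS), and the three comparisons that follow correspond directly to the condition in the pseudocode.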

What about Windows XP and Windows Server 2003? The disassembly of the kernel32!UnhandledExceptionFilter function looks very similar to the one on Windows 2000. Recent service packs add some steps to the logic of this function (we will talk about some of them later in this article), but our approach remains the same: modify the value returned by NtQueryInformationProcess, then step until we reach the call to our filter, and step into it.

Somebody is overwriting my filter!

Now we know how to debug our filter. But sometimes we can end up in a situation where our filter is not called at all. Why could this happen? Of course, the first suspect is the call to the SetUnhandledExceptionFilter function: if it is not called properly, our filter is not registered. But it is difficult to miss this call, right?

After we have verified that our application calls SetUnhandledExceptionFilter properly, we start suspecting that somebody else overwrites our filter. Yes, there can be only one registered filter for the whole process, so any DLL or, for example, a 3rd party control could call SetUnhandledExceptionFilter and register its own filter, disabling ours.

Is there a way to determine whether our filter is registered or not? Yes, there is. The pointer to the currently registered filter is stored in the BasepCurrentTopLevelFilter global variable, which is located in kernel32.dll. If we have symbols for kernel32.dll (a symbol server can help us to obtain them), we can use the debugger to see which filter is registered. In the Visual Studio debugger, we can use the following expression in the Watch window to see the address of the currently registered custom filter:

*(unsigned long*){,,kernel32.dll}_BasepCurrentTopLevelFilter, x

In WinDbg, the 'dds' command is very convenient:

> dds kernel32!BasepCurrentTopLevelFilter L1
7c54144c  00401050 MyApp!MyCustomFilter [c:\test\myapp.cpp @ 63]

(note that if incremental linking is enabled, BasepCurrentTopLevelFilter can point to a thunk, but even in that case it is easy to identify the real filter)

Unfortunately, this method no longer works as of Windows XP SP2 and Windows Server 2003 SP1, because the pointer to the filter function is stored in encoded form. If we look at the disassembly of the SetUnhandledExceptionFilter function in Windows XP SP2, we can see that the pointer is passed to the EncodePointer function, and the resulting encoded value is stored in the BasepCurrentTopLevelFilter variable. Fortunately, after exploring the disassembly of the ntdll!RtlEncodePointer function (kernel32!EncodePointer is forwarded to ntdll!RtlEncodePointer), it becomes clear that the encoding simply XORs the pointer value with a process-wide cookie. Further examination reveals that the cookie is stored as part of the _EPROCESS structure in kernel memory (_EPROCESS.Cookie), and therefore we can use our favourite kernel debugger (I like WinDbg with LiveKd) to display this value. For example:

> !process <pid> 0
> dt nt!_EPROCESS <address> Cookie

(<address> is the address of _EPROCESS structure, obtained from the output of !process command)
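Once we have both values, decoding is a single XOR. The sketch below assumes exactly what is described above: 32-bit values and a plain XOR with the process cookie, as observed on Windows XP SP2 and Windows Server 2003 SP1. Other Windows versions may encode pointers differently. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Model of the encoding described above: the pointer XORed with the cookie.
// Assumption: 32-bit values and a plain XOR, nothing else.
std::uint32_t EncodeFilterPointer(std::uint32_t pointer, std::uint32_t cookie) {
    return pointer ^ cookie;
}

// Recovers the filter address from the value stored in
// BasepCurrentTopLevelFilter and the value of _EPROCESS.Cookie.
std::uint32_t DecodeFilterPointer(std::uint32_t encoded, std::uint32_t cookie) {
    const std::uint32_t decoded = encoded ^ cookie;
    // XOR is its own inverse, so re-encoding must reproduce the stored value.
    assert(EncodeFilterPointer(decoded, cookie) == encoded);
    return decoded;
}
```

In practice we would read the encoded value with a command such as `dd kernel32!BasepCurrentTopLevelFilter L1`, read the cookie in the kernel debugger as shown above, and XOR the two numbers by hand or with a helper like this one.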

If we don't want to resort to a kernel debugger to decode the pointer, we can simply run the application under a debugger and set a breakpoint at SetUnhandledExceptionFilter to see who is registering custom filters for unhandled exceptions (of course, a data breakpoint at kernel32!BasepCurrentTopLevelFilter could be an even more reliable source of information, in case somebody attempts to modify the variable directly).

Here is the output shown by WinDbg when debugging this sample application, after I set a breakpoint at SetUnhandledExceptionFilter using the following command:

0:000> bp kernel32!SetUnhandledExceptionFilter "k;g"
0:000> g
ChildEBP RetAddr  
0012f820 780020fd KERNEL32!SetUnhandledExceptionFilter
0012f828 780011b3 msvcrt!__CxxSetUnhandledExceptionFilter+0xb
0012f830 78001e29 msvcrt!_initterm+0xf
0012f83c 780010ec msvcrt!_cinit+0x1a
0012f8dc 77f86215 msvcrt!_CRTDLL_INIT+0xec
0012f8fc 77f86f17 ntdll!LdrpCallInitRoutine+0x14
0012f97c 77f8b845 ntdll!LdrpRunInitializeRoutines+0x1df
0012fc98 77f8c295 ntdll!LdrpInitializeProcess+0x802
0012fd1c 77fa15d3 ntdll!LdrpInitialize+0x207
00000000 00000000 ntdll!KiUserApcDispatcher+0x7
ChildEBP RetAddr  
0012feb8 0041563e KERNEL32!SetUnhandledExceptionFilter
0012fec4 00415946 MyApp!__CxxSetUnhandledExceptionFilter+0xe
0012fed0 00415699 MyApp!_initterm_e+0x26
0012fee4 00412343 MyApp!_cinit+0x29
0012ffc0 7c4e87f5 MyApp!mainCRTStartup+0x133
0012fff0 00000000 KERNEL32!BaseProcessStart+0x3d
ChildEBP RetAddr  
0012fe04 00411b7b KERNEL32!SetUnhandledExceptionFilter
0012fedc 00412380 MyApp!main+0x2b
0012ffc0 7c4e87f5 MyApp!mainCRTStartup+0x170
0012fff0 00000000 KERNEL32!BaseProcessStart+0x3d

The output looks interesting: we can see no fewer than three attempts to register a custom filter, and two of them originate from ... the CRT library! This brings us to the next part of our discussion: we will try to determine which components of our application might want to install a custom filter. For now, I can identify three categories:

CRT library
.NET runtime
3rd party components

Let's approach them one by one, and start with the CRT library. It turns out that the CRT library relies on a custom filter for unhandled exceptions to implement support for terminate() and related functionality (remember that C++ exceptions are implemented with the help of SEH exceptions). Whenever an unhandled C++ exception is thrown by the application, the custom filter installed by the CRT library will catch it and call terminate(). The application can use the set_terminate() function to register a terminate handler and get notified about unhandled C++ exceptions. If there is no registered handler, or if the handler returns control, the application is terminated by abort().
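For reference, registering a terminate handler takes only a few lines of standard C++. This is a minimal sketch rather than code from the article's samples; ReportAndAbort and InstallTerminateHandler are hypothetical names.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <exception>

// Hypothetical handler: report the problem, then end the process ourselves.
// A terminate handler is not supposed to return to its caller.
[[noreturn]] void ReportAndAbort() {
    std::fputs("Unhandled C++ exception, terminating.\n", stderr);
    std::abort();
}

// Registers the handler and returns the previously registered one, so the
// caller can restore it later if necessary.
std::terminate_handler InstallTerminateHandler() {
    std::terminate_handler previous = std::set_terminate(ReportAndAbort);
    assert(std::get_terminate() == ReportAndAbort);
    return previous;
}
```

Like SetUnhandledExceptionFilter, std::set_terminate returns the previous handler, so the same questions about who registered last apply here too.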

Fortunately, the custom filter installed by the CRT library is a good citizen (by the way, its name is _CxxUnhandledExceptionFilter), and it does not attempt to handle all possible types of exceptions. If it catches an exception other than a “Microsoft C++ Exception” (its code is 0xE06D7363), it calls the previously registered filter, and lets it process the exception.

Filter chaining

When we register a custom filter using SetUnhandledExceptionFilter, the function returns a pointer to the previously registered filter. This pointer gives us the option to call the previous filter from ours, if necessary. Why would we want to do it? For example, if our filter only cares about some specific kinds of exceptions, and does not want to take responsibility for the others. A good example of this is the filter installed by the CRT library, which handles the Microsoft C++ exception (exception code 0xE06D7363) and passes all other exceptions to the previously registered filter. Also, if we want to unregister our filter, we can call SetUnhandledExceptionFilter again and pass it the pointer to the previous filter, thus re-registering it.

In theory, this feature allows an application and its components to implement a chain of custom filters for unhandled exceptions. As long as every filter in the chain plays by the rules and calls the previous filter, every interested component can be notified about unhandled exceptions and react properly. You can find an example of filter chaining here.
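The chaining rule can be illustrated with a small portable model. It is not Win32 code: RegisterFilter stands in for SetUnhandledExceptionFilter, g_currentFilter for BasepCurrentTopLevelFilter, and filters receive only the exception code instead of EXCEPTION_POINTERS. All names are hypothetical; the return values match EXCEPTION_CONTINUE_SEARCH (0) and EXCEPTION_EXECUTE_HANDLER (1).

```cpp
#include <cassert>
#include <cstdint>

constexpr long kContinueSearch = 0;  // EXCEPTION_CONTINUE_SEARCH
constexpr long kExecuteHandler = 1;  // EXCEPTION_EXECUTE_HANDLER
constexpr std::uint32_t kCppExceptionCode = 0xE06D7363u;

using Filter = long (*)(std::uint32_t exceptionCode);

Filter g_currentFilter = nullptr;   // the single process-wide slot
Filter g_previousFilter = nullptr;  // what the CRT-like filter chains to
int g_appFilterCalls = 0;           // lets us observe the forwarding

// Installs newFilter and returns the filter that was registered before it.
Filter RegisterFilter(Filter newFilter) {
    Filter previous = g_currentFilter;
    g_currentFilter = newFilter;
    return previous;
}

// The application's filter: handles whatever reaches it.
long AppFilter(std::uint32_t /*exceptionCode*/) {
    ++g_appFilterCalls;
    return kExecuteHandler;
}

// A CRT-like filter: deals with C++ exceptions itself (the real CRT filter
// calls terminate() at this point) and forwards everything else.
long CrtLikeFilter(std::uint32_t exceptionCode) {
    if (exceptionCode == kCppExceptionCode)
        return kExecuteHandler;
    return g_previousFilter ? g_previousFilter(exceptionCode) : kContinueSearch;
}

void InstallCrtLikeFilter() {
    // Installing twice would make the filter its own "previous" filter and
    // forward to itself forever.
    assert(g_currentFilter != CrtLikeFilter);
    g_previousFilter = RegisterFilter(CrtLikeFilter);
}
```

If AppFilter is registered first and the CRT-like filter second, an access violation still reaches AppFilter through the chain, while a C++ exception stops at the CRT-like filter. The chain works only as long as every filter remembers and calls its predecessor.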

In practice, as usual, there are some difficulties. After we have got a pointer to the previous filter, it is not always possible to ensure that the filter can be called safely. What if the filter was registered by a DLL which is now unloaded? If it is still loaded, is it ready to process the exception? If we don't have intimate knowledge of, and control over, the lifetimes of the application's components, it is probably better not to rely on filter chains, and to have only one filter which is responsible for the application-wide error handling policy.

Another interesting issue with filters installed by the CRT library is that there can be several instances of the CRT inside our process. This is because one CRT library is linked with our main executable, and other instances of the CRT can be linked with DLLs loaded by our application. Each of them will try to install its own filter for unhandled exceptions, potentially overwriting our filter.

This situation gets especially tricky in the following scenario:

1. Our application registers its custom filter at startup.

2. After a while, the application loads a DLL linked with its own version of CRT, which registers its own filter.

It means that even if we registered our filter successfully, it can be overridden any time the application loads a DLL.

The .NET runtime also uses a custom filter (mscorwks!ComUnhandledExceptionFilter). I don't know .NET as well as Win32, but it is clear that at least the following aspects of .NET functionality are implemented with the help of this filter:

Managed debugger support (the filter notifies the debugger about unhandled managed exceptions)
AppDomain.UnhandledException event
Just-in-time debugging (managed)

And finally, third party components can also install their own filters for unhandled exceptions. Is it good or bad? As always, it depends on the situation. In general, I think that it is not a desirable behavior, because usually the application itself, and not one of its components, should define the policy for error handling and reporting.

Enforcing your own filter

I will reiterate my last sentence: in most cases the application, and not one of its components, should be responsible for the application-wide error handling and reporting policy. In particular, it means that if the application registers a filter for unhandled exceptions, nobody else should override it by registering its own filter. How can we achieve this? I don't know of a 100% reliable solution, but there are some good options. Let's discuss them.

It's obvious that if we want to make sure that our filter is always registered, we should prevent other parties from registering their filters. API hooking looks like a possible solution: we can hook the SetUnhandledExceptionFilter function and reject all attempts to use it after we have registered our filter.

Nowadays, there are two main approaches to API hooking:

Import Address Table (IAT) hooking
Detours-like hooking

The IAT approach has a problem which makes it unreliable: if the caller obtained the address of the target function (SetUnhandledExceptionFilter in our case) using the GetProcAddress function, the call will not go through the IAT, and our hook function will not be called.
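The weakness of an IAT hook is easy to see in a small portable model. Nothing here touches a real import table: g_importSlot stands in for one module's IAT entry, LookupExport for GetProcAddress, and filters are represented by plain integer IDs. All names are hypothetical.

```cpp
#include <cassert>

using SetFilterFn = int (*)(int newFilterId);

int g_registeredFilter = 0;  // ID of the currently registered filter

// Stands in for the real exported function: registers and returns the old ID.
int RealSetFilter(int newFilterId) {
    int previous = g_registeredFilter;
    g_registeredFilter = newFilterId;
    return previous;
}

// The hook: rejects the request and leaves the registered filter untouched.
int HookedSetFilter(int /*newFilterId*/) {
    return 0;
}

// Stands in for one module's import table entry. Calls compiled against the
// import go through this slot.
SetFilterFn g_importSlot = RealSetFilter;

// Stands in for GetProcAddress: it resolves the export directly and never
// looks at any module's import slot.
SetFilterFn LookupExport() {
    return RealSetFilter;
}

void InstallIatHook() {
    assert(g_importSlot == RealSetFilter);  // hook only once
    g_importSlot = HookedSetFilter;
}
```

After InstallIatHook(), a call through g_importSlot is rejected, but a call through the pointer returned by LookupExport() still replaces the registered filter. This is the gap that patching the target function itself closes.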

Detours-like hooking relies on patching the beginning of the target function itself, and therefore it is more reliable: it will not miss calls that can be missed by IAT hooks. The problem with Detours is that it is not freely available for all uses (commercial use requires a license), and implementing a similar, home-grown solution just for hooking SetUnhandledExceptionFilter is not always feasible.

There is one more possible approach, and it is much easier to implement than the previous two. After we have registered our own filter, we can patch the beginning of the SetUnhandledExceptionFilter function so that it will not be able to register filters anymore. For example, the following sequence of assembly instructions can be used:

xor eax, eax
ret 0x4

The EnforceFilter example demonstrates this approach.
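As a rough illustration of the patch itself (and not the code of the EnforceFilter sample), the two instructions assemble to five bytes on 32-bit x86. The sketch below writes them into an ordinary buffer; kRejectPatch and ApplyPatch are hypothetical names. Patching the real function would additionally require making the code page writable first, for example with VirtualProtect, and the patch is valid only for 32-bit stdcall code, where `ret 0x4` removes the single argument from the stack.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Machine code for 32-bit x86:
//   33 C0      xor eax, eax   ; return NULL, i.e. "no previous filter"
//   C2 04 00   ret 0x4        ; stdcall return, pops the one 4-byte argument
constexpr std::array<std::uint8_t, 5> kRejectPatch = {0x33, 0xC0, 0xC2, 0x04, 0x00};

// Overwrites the first bytes at target with the patch and returns the bytes
// that were there before, so that the original code can be restored.
std::array<std::uint8_t, 5> ApplyPatch(std::uint8_t* target, std::size_t size) {
    assert(target != nullptr && size >= kRejectPatch.size());
    std::array<std::uint8_t, 5> saved{};
    std::memcpy(saved.data(), target, saved.size());
    std::memcpy(target, kRejectPatch.data(), kRejectPatch.size());
    return saved;
}
```

Keeping the saved bytes makes it possible to undo the patch, which is useful if the application later needs to register a different filter of its own.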

Sample code

Complete sample code for the article can be found here.

Conclusion

Custom filters for unhandled exceptions are difficult to debug because they are not called when the application is running under a debugger. But if we know how the operating system processes unhandled exceptions, we can change the default behavior and make our filter debuggable.

If our application consists of a large set of components, it can be difficult to implement a reliable error handling policy, because multiple components may want to be notified about unhandled exceptions. But if we know how custom filters for unhandled exceptions are registered, we can ensure that our filter always prevails.