Classic article: API Hook Revealed - 1

Reposted from: http://hi.baidu.com/techicey/blog/item/3a7d46c8f58fe5107f3e6fac.html

2008-11-18 19:35
Original text:

Intercepting Win32 API calls has always been a challenging subject among Windows developers, and I have to admit it's been one of my favorite topics. The term hooking represents a fundamental technique for getting control over the execution of a particular piece of code. It provides a straightforward mechanism that can easily alter the behavior of the operating system as well as of 3rd party products, without having their source code available.

Many modern systems draw attention to their ability to utilize existing Windows applications by employing spying techniques. A key motivation for hooking is not only to contribute advanced functionality, but also to inject user-supplied code for debugging purposes.

Unlike some relatively "old" operating systems such as DOS and Windows 3.xx, present Windows systems such as NT/2K and 9x provide sophisticated mechanisms to separate the address spaces of processes. This architecture offers real memory protection, so no application is able to corrupt the address space of another process or, in the worst case, crash the operating system itself. This makes the development of system-aware hooks a lot harder.

My motivation for writing this article was the need for a really simple hooking framework that offers an easy-to-use interface and the ability to capture different APIs. The article intends to reveal some of the tricks that can help you write your own spying system. It suggests a single solution for building a tool set for hooking Win32 API functions on the NT/2K as well as the 98/Me (referred to in this article as 9x) families of Windows. For the sake of simplicity I decided not to add support for UNICODE. However, with some minor modifications of the code you could easily accomplish this task.

Spying of applications provides many advantages:

  1. API function monitoring
    The ability to control API function calls is extremely helpful and enables developers to track down specific "invisible" actions that occur during the API call. It contributes to comprehensive validation of parameters and reports problems that usually remain overlooked behind the scenes. For instance, it can be very helpful to monitor memory-related API functions to catch resource leaks.
  2. Debugging and reverse engineering
    Besides the standard methods for debugging, API hooking has a deserved reputation for being one of the most popular debugging mechanisms. Many developers employ the API hooking technique in order to identify different component implementations and their relationships. API interception is a very powerful way of getting information about a binary executable.
  3. Peering inside the operating system
    Developers are often keen to know the operating system in depth and are inspired by the role of being a "debugger". Hooking is also quite a useful technique for decoding undocumented or poorly documented APIs.
  4. Extending originally offered functionalities by embedding custom modules into external Windows applications
    Re-routing the normal code execution by injecting hooks can provide an easy way to change and extend existing module functionalities. For example, many 3rd party products don't meet specific security requirements and have to be adjusted to your specific needs. Spying of applications allows developers to add sophisticated pre- and post-processing around the original API functions. This ability is extremely useful for altering the behavior of already compiled code.

Functional requirements of a hooking system

There are a few important decisions that have to be made before you start implementing any kind of API hooking system. First of all, you should determine whether to hook a single application or to install a system-aware engine. For instance, if you would like to monitor just one application, you don't need to install a system-wide hook, but if your job is to track down all calls to TerminateProcess() or WriteProcessMemory(), the only way to do so is to have a system-aware hook. Which approach you choose depends on the particular situation and the specific problems it has to address.

General design of an API spying framework

Usually a Hook system is composed of at least two parts - a Hook Server and a Driver. The Hook Server is responsible for injecting the Driver into targeted processes at the appropriate moment. It also administers the Driver and optionally can receive information from the Driver about its activities, whereas the Driver module performs the actual interception.
This design is rough and beyond doubt doesn't cover all possible implementations. However, it outlines the boundaries of a hook framework.

Once you have the requirement specification of a hook framework, there are a few design points you should take into account:

  • What applications do you need to hook
  • How to inject the DLL into targeted processes or which implanting technique to follow
  • Which interception mechanism to use

I hope the next few sections will provide answers to those issues.

Injecting techniques

  1. Registry
    In order to inject a DLL into processes that link with USER32.DLL, you can simply add the DLL name to the value of the following registry key:

    HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs

    Its value contains a single DLL name or a group of DLLs separated either by commas or spaces. According to MSDN documentation [7], all DLLs specified by the value of that key are loaded by each Windows-based application running within the current logon session. It is interesting that the actual loading of these DLLs occurs as a part of USER32's initialization. USER32 reads the value of the mentioned registry key and calls LoadLibrary() for these DLLs in its DllMain code. However, this trick applies only to applications that use USER32.DLL. Another restriction is that this built-in mechanism is supported only by the NT and 2K operating systems. Although it is a harmless way to inject a DLL into Windows processes, there are a few shortcomings:

    • In order to activate/deactivate the injection process you have to reboot Windows.
    • The DLL you want to inject will be mapped only into those processes that use USER32.DLL, thus you cannot expect to get your hook injected into console applications, since they usually don't import functions from USER32.DLL.
    • On the other hand, you don't have any control over the injection process. The DLL is implanted into every single GUI application, regardless of whether you want it or not. This is redundant overhead, especially if you intend to hook only a few applications. For more details see [2] "Injecting a DLL Using the Registry"
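As a rough illustration of the value format described above, the following C++ sketch splits an AppInit_DLLs-style string on commas and spaces. The function name SplitAppInitDlls is hypothetical, and this is not the code USER32 itself runs; it only models the separator rule stated in the text.

```cpp
#include <string>
#include <vector>

// Splits an AppInit_DLLs-style value into individual DLL names.
// Separators are commas and spaces, as the article describes; empty tokens
// produced by runs of separators are skipped.
// NOTE: hypothetical helper for illustration, not USER32's implementation.
std::vector<std::string> SplitAppInitDlls(const std::string& value)
{
    std::vector<std::string> names;
    std::string current;
    for (char ch : value) {
        if (ch == ',' || ch == ' ') {
            if (!current.empty()) {
                names.push_back(current);
                current.clear();
            }
        } else {
            current.push_back(ch);
        }
    }
    if (!current.empty()) {
        names.push_back(current);
    }
    return names;
}
```

For example, the value "HookTool.dll, Other.dll" yields the two names HookTool.dll and Other.dll. Under this model a path that itself contains a space would be split in two, which suggests such paths need a different spelling in the registry value.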
  2. System-wide Windows Hooks
    Certainly a very popular technique for injecting a DLL into a targeted process relies on the hooks provided by Windows. As pointed out in MSDN, a hook is a trap in the system message-handling mechanism. An application can install a custom filter function to monitor the message traffic in the system and process certain types of messages before they reach the target window procedure.

    A hook is normally implemented in a DLL in order to meet the basic requirement for system-wide hooks. The basic concept of this sort of hook is that the hook callback procedure is executed in the address space of each hooked process in the system. To install a hook you call SetWindowsHookEx() with the appropriate parameters. Once the application installs a system-wide hook, the operating system maps the DLL into the address space of each of its client processes. Therefore global variables within the DLL will be "per-process" and cannot be shared among the processes that have loaded the hook DLL. All variables that contain shared data must be placed in a shared data section. The diagram below shows an example of a hook registered by the Hook Server and injected into the address spaces named "Application one" and "Application two".

    Figure 1

    A system-wide hook is registered just once, when SetWindowsHookEx() is executed. If no error occurs, a handle to the hook is returned. The returned value is required at the end of the custom hook function, when a call to CallNextHookEx() has to be made. After a successful call to SetWindowsHookEx(), the operating system injects the DLL automatically (but not necessarily immediately) into all processes that meet the requirements for this particular hook filter. Let's have a closer look at the following dummy WH_GETMESSAGE filter function:

    // ---------------------------------------------------------------------------
    // GetMsgProc
    //
    // Filter function for the WH_GETMESSAGE - it's just a dummy function
    // ---------------------------------------------------------------------------
    LRESULT CALLBACK GetMsgProc(
        int code,       // hook code
        WPARAM wParam,  // removal option
        LPARAM lParam   // message
        )
    {
        // We must pass all messages on to CallNextHookEx.
        return ::CallNextHookEx(sg_hGetMsgHook, code, wParam, lParam);
    }

    A system-wide hook is loaded by multiple processes that don't share the same address space.

    For instance, the hook handle sg_hGetMsgHook, which is obtained by SetWindowsHookEx() and used as a parameter in CallNextHookEx(), must be usable in virtually all address spaces. It means that its value must be shared among the hooked processes as well as the Hook Server application. In order to make this variable "visible" to all processes we should store it in the shared data section.

    The following is an example of employing #pragma data_seg(). Here I would like to mention that the data within the shared section must be initialized, otherwise the variables will be assigned to the default data segment and #pragma data_seg() will have no effect.

    // ---------------------------------------------------------------------------
    // Variables shared by all processes
    // ---------------------------------------------------------------------------
    #pragma data_seg(".HKT")
    HHOOK sg_hGetMsgHook = NULL;
    BOOL sg_bHookInstalled = FALSE;
    // We get this from the application that calls SetWindowsHookEx()'s wrapper
    HWND sg_hwndServer = NULL;
    #pragma data_seg()

    You should add a SECTIONS statement to the DLL's DEF file as well:

    SECTIONS
        .HKT Read Write Shared

    or use

    #pragma comment(linker, "/section:.HKT,rws")

    Once a hook DLL is loaded into the address space of the targeted process, there is no way to unload it unless the Hook Server calls UnhookWindowsHookEx() or the hooked application shuts down. When the Hook Server calls UnhookWindowsHookEx() the operating system loops through an internal list with all processes which have been forced to load the hook DLL. The operating system decrements the DLL's lock count and when it becomes 0, the DLL is automatically unmapped from the process's address space.

    Here are some of the advantages of this approach:

    • This mechanism is supported by the NT/2K and 9x Windows families and hopefully will be maintained by future Windows versions as well.
    • Unlike the registry mechanism of injecting DLLs, this method allows the DLL to be unloaded when the Hook Server decides that the DLL is no longer needed and makes a call to UnhookWindowsHookEx()

    Although I consider Windows Hooks a very handy injection technique, it comes with its own disadvantages:

    • Windows Hooks can significantly degrade the performance of the entire system, because they increase the amount of processing the system must perform for each message.
    • It requires a lot of effort to debug system-wide Windows Hooks. However, running more than one instance of VC++ at the same time simplifies the debugging process for more complex scenarios.
    • Last but not least, this kind of hook affects the processing of the whole system, and under certain circumstances (say a bug) you must reboot your machine in order to recover it.
  3. Injecting DLL by using CreateRemoteThread() API function
    Well, this is my favorite one. Unfortunately it is supported only by the NT and Windows 2K operating systems. It is bizarre that you are allowed to call (link with) this API on Win 9x as well, but it just returns NULL without doing anything.

    Injecting DLLs by remote threads is Jeffrey Richter's idea and is well documented in his article [9] "Load Your 32-bit DLL into Another Process's Address Space Using INJLIB".

    The basic concept is quite simple, but very elegant. Any process can load a DLL dynamically using the LoadLibrary() API. The issue is how to force an external process to call LoadLibrary() on our behalf if we don't have any access to the process's threads. Well, there is a function called CreateRemoteThread() that creates a thread in another process. Here comes the trick - have a look at the signature of the thread function, whose pointer is passed as a parameter (i.e. LPTHREAD_START_ROUTINE) to CreateRemoteThread():

    DWORD WINAPI ThreadProc(LPVOID lpParameter);

    And here is the prototype of the LoadLibrary API:

    HMODULE WINAPI LoadLibrary(LPCTSTR lpFileName);

    Yes, they do have an "identical" pattern. They use the same calling convention WINAPI, they both accept one parameter, and the size of the returned value is the same. This match gives us a hint that we can use LoadLibrary() as the thread function, which will be executed after the remote thread has been created. Let's have a look at the following sample code:
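
    // pRemoteDllPath (a name introduced here for illustration) must point to
    // the DLL path string, e.g. "C:\\HookTool.dll", stored in the address space
    // of the target process; a pointer valid only in the injector won't work.
    hThread = ::CreateRemoteThread(
        hProcessForHooking,  // handle of the target process
        NULL,                // default security attributes
        0,                   // default stack size
        pfnLoadLibrary,      // thread function: address of LoadLibrary()
        pRemoteDllPath,      // thread parameter: full path name of the DLL
        0,                   // start running immediately
        NULL);               // thread ID is not needed
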
    We obtain the address of the LoadLibrary() API by calling GetProcAddress(). The tricky part here is that Kernel32.DLL is always mapped at the same address in each process, so the address of the LoadLibrary() function has the same value in the address space of any running process. This ensures that we pass a valid pointer (i.e. pfnLoadLibrary) as a parameter of CreateRemoteThread().

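    As the parameter of the thread function we pass the full path name of the DLL, cast to LPVOID. The string has to reside in the address space of the target process (for example, copied there with VirtualAllocEx() and WriteProcessMemory()), because the remote thread cannot read the injector's memory. When the remote thread starts running, it passes the name of the DLL to the thread function (i.e. LoadLibrary). That's the whole trick with regard to using remote threads for injection purposes.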

    There is an important thing to consider when implanting through the CreateRemoteThread() API. Before the injector application operates on the virtual memory of the targeted process and calls CreateRemoteThread(), it first opens the process using the OpenProcess() API, passing the PROCESS_ALL_ACCESS flag to request maximum access rights to that process. In this scenario OpenProcess() will return NULL for some of the processes with low ID numbers. This error (although we use a valid process ID) occurs because the injector is not running under a security context that has enough permissions. If you think about it for a moment, you will realize that it makes perfect sense. All those restricted processes are part of the operating system, and a normal application shouldn't be allowed to operate on them. What would happen if some application had a bug and accidentally attempted to terminate an operating system process? To protect the operating system from that kind of crash, an application must have sufficient privileges to execute APIs that might alter operating system behavior. To get access to system processes (e.g. smss.exe, winlogon.exe, services.exe, etc.) through OpenProcess(), you must be granted the debug privilege. This ability is extremely powerful and offers a way to access system resources that are normally restricted. Adjusting the process privileges is a trivial task and can be described with the following logical operations:

    • Open the process token with permissions needed to adjust privileges
    • Given the privilege name "SeDebugPrivilege", locate its local LUID mapping. The privileges are specified by name and can be found in the Platform SDK file winnt.h
    • Adjust the token in order to enable the "SeDebugPrivilege" privilege by calling the AdjustTokenPrivileges() API
    • Close the process token handle obtained by OpenProcessToken()
    For more details about changing privileges see [10] "Using privilege".
  4. Implanting through BHO add-ins
    Sometimes you will need to inject custom code inside Internet Explorer only. Fortunately Microsoft provides an easy and well documented way for this purpose - Browser Helper Objects. A BHO is implemented as a COM DLL, and once it is properly registered, each time IE is launched it loads all COM components that implement the IObjectWithSite interface.
  5. MS Office add-ins
    Similarly to BHOs, if you need to implant code of your own in MS Office applications, you can take advantage of the standard mechanism provided for implementing MS Office add-ins. There are many available samples that show how to implement this kind of add-in.

Interception mechanisms

Injecting a DLL into the address space of an external process is a key element of a spying system. It provides an excellent opportunity to control the activities of the process's threads. However, having the DLL injected is not sufficient if you want to intercept API function calls within the process.

This part of the article intends to make a brief review of several available real-world hooking aspects. It focuses on the basic outline for each one of them, exposing their advantages and disadvantages.

In terms of the level where the hook is applied, there are two mechanisms for API spying - Kernel level and User level spying. To get a better understanding of these two levels you must be aware of the relationship between the Win32 subsystem API and the Native API. The following figure demonstrates where the different hooks are set and illustrates the module relationships and their dependencies on Windows 2K:

Figure 2

The major implementation difference between them is that the interceptor engine for kernel-level hooking is wrapped up as a kernel-mode driver, whereas user-level hooking usually employs a user-mode DLL.

  1. NT Kernel level hooking
    There are several methods for achieving hooking of NT system services in kernel mode. The most popular interception mechanism was originally demonstrated by Mark Russinovich and Bryce Cogswell in their article [3] "Windows NT System-Call Hooking". Their basic idea is to inject an interception mechanism for monitoring NT system calls just below the user mode. This technique is very powerful and provides an extremely flexible method for hooking the point that all user-mode threads pass through before they are serviced by the OS kernel.

    You can find an excellent design and implementation in "Undocumented Windows 2000 Secrets" as well. In his great book Sven Schreiber explains how to build a kernel-level hooking framework from scratch [5].

    Another comprehensive analysis and brilliant implementation has been provided by Prasad Dabak in his book "Undocumented Windows NT" [17].

    However, all these hooking strategies remain out of the scope of this article.

  2. Win32 User level hooking
    1. Windows subclassing.
      This method is suitable for situations where the application's behavior can be changed by a new implementation of the window procedure. To accomplish this task you simply call SetWindowLongPtr() with the GWLP_WNDPROC parameter and pass the pointer to your own window procedure. Once you have the new subclass procedure set up, every time Windows dispatches a message to the specified window, it looks up the address of the window procedure associated with that window and calls your procedure instead of the original one.

      The drawback of this mechanism is that subclassing is available only within the boundaries of a specific process. In other words an application should not subclass a window class created by another process.

      Usually this approach is applicable when you hook an application through an add-in (i.e. DLL / In-Proc COM component) and you can obtain the handle to the window whose procedure you would like to replace.

      For example, some time ago I wrote a simple add-in for IE (Browser Helper Object) that replaces the original pop-up menu provided by IE using subclassing.

    2. Proxy DLL (Trojan DLL)
      An easy way to hack an API is just to replace a DLL with one that has the same name and exports all the symbols of the original one. This technique can be effortlessly implemented using function forwarders. A function forwarder is basically an entry in the DLL's export section that delegates a function call to another DLL's function.

      You can accomplish this task by simply using #pragma comment:

      #pragma comment(linker, "/export:DoSomething=DllImpl.ActuallyDoSomething")

      However, if you decide to employ this method, you take on the responsibility of staying compatible with newer versions of the original library. For more details see [13a] section "Export forwarding" and [2] "Function Forwarders".

    3. Code overwriting
      There are several methods that are based on code overwriting. One of them changes the address of the function used by the CALL instruction. This method is difficult and error prone. The basic idea is to track down all CALL instructions in memory and replace the address of the original function with that of the user-supplied one.

      Another method of code overwriting requires a more complicated implementation. Briefly, the concept of this approach is to locate the address of the original API function and to overwrite the first few bytes of this function with a JMP instruction that redirects the call to the custom-supplied API function. This method is extremely tricky and involves a sequence of restoring and hooking operations for each individual call. It's important to point out that if the function is in unhooked mode and another call is made during that stage, the system won't be able to capture that second call.
      The major problem is that this conflicts with the rules of a multithreaded environment.
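To make the "first few bytes" concrete, here is a minimal sketch of how the bytes of such a redirecting instruction could be computed. It assumes the 5-byte x86 near jump (opcode 0xE9 followed by a 32-bit relative displacement) and 32-bit addresses; the function name MakeJmpRel32 is hypothetical, and the sketch only builds the bytes, it does not write them into any process.

```cpp
#include <array>
#include <cstdint>

// Builds the 5-byte x86 "JMP rel32" instruction (opcode 0xE9) that, when
// placed at address `from`, transfers control to address `to`.
// The displacement is measured from the end of the 5-byte instruction.
// NOTE: hypothetical helper; 32-bit addresses are an assumption of this sketch.
std::array<std::uint8_t, 5> MakeJmpRel32(std::uint32_t from, std::uint32_t to)
{
    // Unsigned arithmetic wraps modulo 2^32, which also covers backward jumps.
    const std::uint32_t rel = to - (from + 5u);
    return {{
        0xE9,
        static_cast<std::uint8_t>(rel & 0xFF),          // little-endian
        static_cast<std::uint8_t>((rel >> 8) & 0xFF),
        static_cast<std::uint8_t>((rel >> 16) & 0xFF),
        static_cast<std::uint8_t>((rel >> 24) & 0xFF),
    }};
}
```

Since five bytes of the original function are replaced, they have to be saved first and written back whenever the original function needs to run, which is where the restore-and-rehook cycle described above comes from.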

      However, there is a smart solution that solves some of the issues and provides a sophisticated way for achieving most of the goals of an API interceptor. In case you are interested you might peek at [12] Detours implementation.

    4. Spying by a debugger
      An alternative to hooking API functions is to place a debugging breakpoint into the target function. However, there are several drawbacks to this method. The major issue with this approach is that debugging exceptions suspend all application threads. It also requires a debugger process that will handle this exception. Another problem is caused by the fact that when the debugger terminates, the debugged process is automatically shut down by Windows.
    5. Spying by altering of the Import Address Table
      This technique was originally published by Matt Pietrek and then elaborated by Jeffrey Richter ([2] "API Hooking by Manipulating a Module's Import Section") and John Robbins ([4] "Hooking Imported Functions"). It is very robust, simple and quite easy to implement. It also meets most of the requirements of a hooking framework that targets Windows NT/2K and 9x operating systems. The concept of this technique relies on the elegant structure of the Portable Executable (PE) Windows file format. To understand how this method works, you should be familiar with some of the basics behind the PE file format, which is an extension of the Common Object File Format (COFF). Matt Pietrek reveals the PE format in detail in his wonderful articles - [6] "Peering Inside the PE.", and [13a/b] "An In-Depth Look into the Win32 PE file format". I will give you a brief overview of the PE specification, just enough to get the idea of hooking by manipulation of the Import Address Table.

      In general a PE binary file is organized so that all its code and data sections are laid out in a way that conforms to the virtual memory representation of an executable. The PE file format is composed of several logical sections. Each of them maintains a specific type of data and addresses particular needs of the OS loader.

      The section I would like to focus your attention on, .idata, contains information about the Import Address Table. This part of the PE structure is crucial for building a spy program based on altering the IAT.
      Each executable that conforms to the PE format has a layout roughly described by the figure below.

      Figure 3
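As a small, hedged illustration of how a tool begins to navigate this layout, the sketch below locates the PE signature inside a raw image buffer. It relies on two facts from the published PE/COFF layout: the file starts with a DOS header beginning with "MZ", and the 32-bit little-endian field at offset 0x3C (e_lfanew) holds the offset of the "PE\0\0" signature. The function name FindPeSignature is hypothetical, and the sketch stops at the signature; it does not parse sections or imports.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the offset of the "PE\0\0" signature inside `image`, or -1 if the
// buffer does not look like a PE image.
// NOTE: hypothetical helper for illustration; it validates only the two
// signatures and the bounds of the offset stored at 0x3C (e_lfanew).
long FindPeSignature(const std::vector<std::uint8_t>& image)
{
    if (image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z') {
        return -1;  // too small for a DOS header, or no "MZ" magic
    }
    const std::uint32_t peOffset =
        static_cast<std::uint32_t>(image[0x3C]) |
        (static_cast<std::uint32_t>(image[0x3D]) << 8) |
        (static_cast<std::uint32_t>(image[0x3E]) << 16) |
        (static_cast<std::uint32_t>(image[0x3F]) << 24);
    if (peOffset > image.size() || image.size() - peOffset < 4) {
        return -1;  // e_lfanew points outside the buffer
    }
    if (image[peOffset] != 'P' || image[peOffset + 1] != 'E' ||
        image[peOffset + 2] != 0 || image[peOffset + 3] != 0) {
        return -1;  // no "PE\0\0" signature at the stored offset
    }
    return static_cast<long>(peOffset);
}
```

The headers that follow this signature are what lead a hooking tool to the section table and, from there, to the import information discussed next.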

      The program loader is responsible for loading an application, along with all its linked DLLs, into memory. Since the address where each DLL is loaded cannot be known in advance, the address of each imported function is not known until load time. The loader must perform some extra work to ensure that the program will successfully call each imported function. But going through each executable image in memory and fixing up the addresses of all imported functions one by one would take an unreasonable amount of processing time and cause huge performance degradation. So how does the loader resolve this challenge? The key point is that each call to an imported function must be dispatched to the address where the function code resides in memory. Each call to an imported function is in fact an indirect call, routed through the IAT by an indirect JMP instruction. The benefit of this design is that the loader doesn't have to search through the whole image of the file. The solution appears to be quite simple: it just fixes up the addresses of all imports inside the IAT. Here is an example snapshot of the PE file structure of a simple Win32 application, taken with the help of the [8] PEView utility. As you can see, the TestApp import table contains two functions imported from GDI32.DLL - TextOutA() and GetStockObject().

      Figure 4

      Actually, the hooking process of an imported function is not as complex as it looks at first sight. In a nutshell, an interception system that uses IAT patching has to discover the location that holds the address of the imported function and overwrite it with the address of a user-supplied function. An important requirement is that the newly provided function must have exactly the same signature as the original one. Here are the logical steps of a replacing cycle:
      • Locate the import section of each DLL module loaded by the process, as well as of the process executable itself
      • Find the IMAGE_IMPORT_DESCRIPTOR chunk of the DLL that exports that function. Practically speaking, we usually search for this entry by the name of the DLL
      • Locate the IMAGE_THUNK_DATA which holds the original address of the imported function
      • Replace the function address with the user-supplied one

      By changing the address of the imported function inside the IAT, we ensure that all calls to the hooked function will be re-routed to the function interceptor.
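The replacing cycle above can be modeled in portable C++. This is only a sketch under loud assumptions: ImportDescriptorModel and ThunkModel are hypothetical, simplified stand-ins for IMAGE_IMPORT_DESCRIPTOR and IMAGE_THUNK_DATA, every imported function is given the same signature for simplicity, and no real PE image is walked or modified.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Simplification: every imported function shares one signature in this model.
using FunctionAddress = int (*)(int);

struct ThunkModel {                 // stands in for IMAGE_THUNK_DATA
    FunctionAddress function;       // slot the caller jumps through
};
struct ImportDescriptorModel {      // stands in for IMAGE_IMPORT_DESCRIPTOR
    std::string dllName;            // name of the DLL the functions come from
    std::vector<ThunkModel> thunks;
};

// DLL names are compared without regard to case in this model.
static bool EqualsIgnoreCase(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i]))) return false;
    }
    return true;
}

// Finds the descriptor of `dllName`, then every slot that holds `original`,
// and overwrites it with `replacement`. Returns the number of slots patched.
int ReplaceImportedFunction(std::vector<ImportDescriptorModel>& imports,
                            const std::string& dllName,
                            FunctionAddress original,
                            FunctionAddress replacement)
{
    int patched = 0;
    for (ImportDescriptorModel& descriptor : imports) {
        if (!EqualsIgnoreCase(descriptor.dllName, dllName)) continue;
        for (ThunkModel& thunk : descriptor.thunks) {
            if (thunk.function == original) {
                thunk.function = replacement;  // calls via this slot are re-routed
                ++patched;
            }
        }
    }
    return patched;
}
```

Callers that go through the table pick up the replacement without being modified themselves, which mirrors why patching one IAT slot is enough to intercept every call a module makes to that import.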

      One complication of replacing the pointer inside the IAT is that the .idata section is not necessarily a writable section. We must therefore make sure that the .idata section can be modified before overwriting the pointer. This task can be accomplished by using the VirtualProtect() API.

      Another issue that deserves attention is related to the behavior of the GetProcAddress() API on Windows 9x. When an application calls this API outside the debugger, it returns a pointer to the function. However, when the call is made from within the debugger, it returns a different address. The reason is that inside the debugger each call to GetProcAddress() returns a wrapper around the real pointer: the returned value points to a PUSH instruction followed by the actual address. This means that on Windows 9x, when we loop through the thunks, we must check whether the examined function address points to a PUSH instruction (0x68 on x86 platforms) and, if so, extract the proper function address.

      Windows 9x doesn't implement copy-on-write, so the operating system tries to keep debuggers from stepping into functions above the 2-GB boundary. That is the reason why GetProcAddress() returns a debug thunk instead of the actual address. John Robbins discusses this problem in [4] "Hooking Imported Functions".
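The PUSH check described above can be sketched as a small decoding routine over raw bytes. The function name ResolveDebugThunk is hypothetical, and the sketch works on a byte buffer supplied by the caller rather than on live process memory; it assumes the thunk is a PUSH opcode (0x68) immediately followed by the 32-bit address in x86 little-endian order, as the text states.

```cpp
#include <cstddef>
#include <cstdint>

// If `code` starts with an x86 "PUSH imm32" instruction (opcode 0x68),
// returns the 32-bit immediate that follows it, which per the article is the
// real function address behind the Windows 9x debug thunk.
// Otherwise returns `reportedAddress` unchanged.
// NOTE: hypothetical helper; it inspects only the bytes it is given.
std::uint32_t ResolveDebugThunk(const std::uint8_t* code, std::size_t size,
                                std::uint32_t reportedAddress)
{
    const std::uint8_t kPushImm32 = 0x68;
    if (code == nullptr || size < 5 || code[0] != kPushImm32) {
        return reportedAddress;
    }
    // x86 stores the immediate operand in little-endian byte order.
    return static_cast<std::uint32_t>(code[1]) |
           (static_cast<std::uint32_t>(code[2]) << 8) |
           (static_cast<std::uint32_t>(code[3]) << 16) |
           (static_cast<std::uint32_t>(code[4]) << 24);
}
```

An ordinary function could in principle also begin with a PUSH instruction, so a real implementation would presumably apply this check only on Windows 9x and only to addresses that can actually be debug thunks.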
