Analyzing a Visual Basic (VB) script, or Visual Basic paired with the .NET Framework, is an exercise in source code analysis. Unfortunately, when Visual Basic is compiled to a Windows Portable Executable (PE) file, it can become a nightmare for many malware analysts and reverse engineers.
Why is it used by malware?
Visual Basic binaries have a reputation for making an analyst's job difficult because many aspects of their compilation differ from standard C/C++ binaries. To analyze a VB PE binary it helps to be familiar with VB syntax and semantics, since the language's constructs will appear throughout the binary’s disassembly. VB binaries have their own API, interpreted by Microsoft’s VB virtual machine (VB 6.0 uses msvbvm60.dll). Many of these APIs are wrappers for more commonly used Win32 APIs exported by other system DLLs.
Reverse engineering VB binaries often involves reverse engineering the VB internals behind various VB APIs, a task dreaded by many. The entry point of a VB program differs from that of a typical C/C++ or even Borland Delphi binary. There is no mainCRTStartup or WinMainCRTStartup function that initializes the C runtime and calls the developer-defined main or WinMain function. Instead, the Entry Point (EP) looks like this:
004014A4 start:
004014A4 push offset dword_40159C
004014A9 call ThunRTMain
004014A9 ; -----------------------------------------------------------------
004014AE dw 0
004014B0 dd 0
004014B4 dd 30h, 40h, 0
004014C0 dd 0E8235672h, 403451C6h, 0AAF1D6B9h, 88BB31A6h, 0
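The two instructions at the EP can be decoded by hand. The following is a minimal Python sketch, assuming the usual x86 encodings (68 imm32 for push, E8 rel32 for call). The function name decode_vb_entry_point and the call displacement in the sample bytes are hypothetical and chosen only for illustration; only the pushed address 0x40159C and the EP address 0x4014A4 come from the listing above.

```python
import struct

def decode_vb_entry_point(code: bytes, ep_va: int):
    """Decode 'push imm32; call rel32' at the entry point.

    Returns (pushed_va, call_target_va), or None if the bytes do not match.
    """
    # 0x68 = push imm32, 0xE8 = call rel32 (each instruction is 5 bytes long)
    if len(code) < 10 or code[0] != 0x68 or code[5] != 0xE8:
        return None
    pushed_va, = struct.unpack_from("<I", code, 1)
    rel32, = struct.unpack_from("<i", code, 6)
    # A relative call is measured from the address of the next instruction.
    call_target = (ep_va + 10 + rel32) & 0xFFFFFFFF
    return pushed_va, call_target

# push offset dword_40159C, followed by a call whose displacement is made up.
ep_bytes = bytes.fromhex("689c154000" "e8f0ffffff")
result = decode_vb_entry_point(ep_bytes, 0x004014A4)
print([hex(v) for v in result])
```

The first value is the address handed to ThunRTMain; the second is wherever the call lands, which in a real binary would be resolved through the import table rather than guessed.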
Piecing this together, the argument passed to DllFunctionCall is the structure defined below:
typedef struct _DynamicHandles {
0x00 DWORD dwUnknown;
0x04 HANDLE hModule;
0x08 VOID * fnAddress;
0x0C
} DynamicHandles;
typedef struct _DllFunctionCallStruct {
0x00 LPCSTR lpDllName;
0x04 LPTSTR lpExportName;
0x08
0x09
// 4 bytes means it is a LPTSTR *
// 2 bytes means it is a WORD (the export's Ordinal)
0x0A char sizeOfExportName;
0x0B
0x0C DynamicHandles * sHandleData;
0x10
} DllFunctionCallStruct;
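To make the layout concrete, here is a minimal Python sketch that unpacks both structures from a raw memory image using the offsets listed above. Several things are assumptions rather than facts from the analysis: the helper names, the field name unknown08 for the undocumented bytes at 0x08, the synthetic image and its addresses, and the reading that sHandleData is a 4-byte pointer to a DynamicHandles structure (it occupies only 0x0C through 0x0F). The sample value 4 in sizeOfExportName follows the struct comments and is likewise an interpretation.

```python
import struct
from collections import namedtuple

DllFunctionCallStruct = namedtuple(
    "DllFunctionCallStruct",
    "lpDllName lpExportName unknown08 sizeOfExportName sHandleData")
DynamicHandles = namedtuple("DynamicHandles", "dwUnknown hModule fnAddress")

# Little-endian layouts matching the offsets in the listing (16 and 12 bytes).
DLL_CALL_FMT = "<IIHBxI"   # 0x00, 0x04, 0x08 (2 bytes), 0x0A, pad 0x0B, 0x0C
HANDLES_FMT = "<III"       # 0x00, 0x04, 0x08

def parse_dll_function_call(image: bytes, base: int, va: int):
    """Unpack the structure at 'va' and the DynamicHandles it points to."""
    call = DllFunctionCallStruct._make(
        struct.unpack_from(DLL_CALL_FMT, image, va - base))
    handles = DynamicHandles._make(
        struct.unpack_from(HANDLES_FMT, image, call.sHandleData - base))
    return call, handles

def read_cstring(image: bytes, base: int, va: int) -> str:
    """Read a NUL-terminated ASCII string located at 'va'."""
    start = va - base
    return image[start:image.index(b"\x00", start)].decode("ascii")

# Synthetic image for illustration only; every address here is hypothetical.
base = 0x401000
image = bytearray(0x100)
image[0x00:0x0D] = b"kernel32.dll\x00"
image[0x10:0x16] = b"Sleep\x00"
struct.pack_into(HANDLES_FMT, image, 0x20, 0, 0, 0)   # not yet resolved
struct.pack_into(DLL_CALL_FMT, image, 0x30,
                 base + 0x00, base + 0x10, 0, 4, base + 0x20)

call, handles = parse_dll_function_call(bytes(image), base, base + 0x30)
dll_name = read_cstring(bytes(image), base, call.lpDllName)
export_name = read_cstring(bytes(image), base, call.lpExportName)
print(dll_name, export_name, hex(handles.fnAddress))
```

A zero fnAddress in this sample corresponds to an export that has not been loaded yet.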
Putting it all Together
Great, we understand enough of the structure passed into DllFunctionCall, but how does this benefit us? It will aid us in locating dynamically loaded API functions in a VB binary. Most VB binaries making use of DllFunctionCall will have wrapper functions that follow this format:
mov eax, dword_ZZZZZZZZ
or eax, eax
jz short loc_XXXXXXXX
jmp eax
loc_XXXXXXXX:
push YYYYYYYYh
mov eax, offset DllFunctionCall
call eax ; DllFunctionCall
jmp eax
The memory address 0xYYYYYYYY is the address of the DllFunctionCallStruct, which is usually stored as a global variable. The sHandleData field within the DllFunctionCallStruct points to another global variable in memory. The fnAddress field within that DynamicHandles structure is accessed directly via the address dword_ZZZZZZZZ. If the exported function has not yet been loaded into memory, DllFunctionCall is invoked, which populates the value stored at dword_ZZZZZZZZ; any subsequent calls jump directly to the exported function.
In malware, dozens or even hundreds of these wrapper functions can be found. Going through each reference to DllFunctionCall, applying the DllFunctionCallStruct and DynamicHandles structures, labelling the structure and the direct address of the fnAddress field, and defining and renaming the function is a lot of work. To get around this cumbersome task I’ve created an IDAPython script that performs these monotonous steps and prints a listing of all the dynamically loaded APIs used by the binary.
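Because the wrapper has a fixed shape, it can also be located by byte pattern outside of a disassembler. The sketch below is only an illustration of that idea and is not the IDAPython script mentioned in this post. It assumes the common encodings for each instruction (A1 for mov eax, [addr]; 0B C0 for or eax, eax; 74 02 for the short jz; FF E0 for jmp eax; 68 for push; B8 for mov eax, imm32; FF D0 for call eax), and the function name find_wrappers and all sample addresses are hypothetical.

```python
import re
import struct

# One wrapper stub, with the three 32-bit operands captured as groups.
WRAPPER_RE = re.compile(
    rb"\xA1(.{4})"   # mov  eax, dword_ZZZZZZZZ   (cached fnAddress)
    rb"\x0B\xC0"     # or   eax, eax
    rb"\x74\x02"     # jz   short loc_XXXXXXXX
    rb"\xFF\xE0"     # jmp  eax
    rb"\x68(.{4})"   # push YYYYYYYYh             (DllFunctionCallStruct)
    rb"\xB8(.{4})"   # mov  eax, offset DllFunctionCall
    rb"\xFF\xD0"     # call eax
    rb"\xFF\xE0",    # jmp  eax
    re.DOTALL,
)

def find_wrappers(code: bytes, base: int):
    """Yield (wrapper_va, fn_slot_va, struct_va, dllfunctioncall_va)."""
    for m in WRAPPER_RE.finditer(code):
        fn_slot, struct_va, target = (
            struct.unpack("<I", g)[0] for g in m.groups())
        yield base + m.start(), fn_slot, struct_va, target

# Synthetic code section for illustration; the addresses are made up.
stub = (b"\xA1" + struct.pack("<I", 0x403028) + b"\x0B\xC0\x74\x02\xFF\xE0"
        + b"\x68" + struct.pack("<I", 0x403010)
        + b"\xB8" + struct.pack("<I", 0x401090) + b"\xFF\xD0\xFF\xE0")
code = b"\x90" * 4 + stub + b"\xCC" * 4

wrappers = list(find_wrappers(code, 0x401000))
for entry in wrappers:
    print(", ".join(hex(v) for v in entry))
```

Each hit gives the structure address to parse for the DLL and export names, plus the fnAddress slot that can be labelled after the export. A compiler that encodes any of these instructions differently would need a looser pattern.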