Reference: Advanced Windows Debugging by Mario Hewardt and Daniel Pravat
If you guys have done a lot of unmanaged programming in C/C++, you have probably suffered a lot from memory corruption. As the book referenced above (written by Microsoft engineers) points out, memory corruption is one of the most intractable forms of programming error, for two reasons. First, the source of the corruption and its manifestation may be far apart, making it difficult to correlate cause and effect. Second, symptoms often appear only under unusual conditions, making it hard to reproduce the error consistently.
Fundamentally, memory corruption occurs when one or both of the following are true.
1. The executing thread writes to a block of memory that it does not own.
2. The executing thread writes to a block of memory that it does own, but corrupts the state of that memory block.
Today, I am going to demonstrate troubleshooting the first scenario with a sample. I won't go through every detail; I just want to share how I think about this kind of problem.
Suppose we have written an application that does things like making SQL connections, and unfortunately it crashes intermittently. We don't have the source code at hand and can only get the following call stack from a crash dump captured with ADPlus:
0:000> kv
ChildEBP RetAddr Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
0012ff80 6e6f4374 7463656e 536e6f69 6e697274 0x6e616369
0012ffc0 77e6f23b 00000000 00000000 7ffd4000 0x6e6f4374
0012fff0 00000000 0040130b 00000000 78746341 kernel32!BaseProcessStart+0x23 (FPO: [Non-Fpo]) (CONV: stdcall) [d:/nt/base/win32/client/support.c @ 838]
0:000> r
eax=00000000 ebx=00000000 ecx=781425fb edx=00250000 esi=00000001 edi=00403380
eip=6e616369 esp=0012ff84 ebp=0012ffc0 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010202
6e616369 ?? ???
We don’t know what happened, but we notice that EIP looks wrong: it does not point into any known module. What should we do? Suppose we then find the symbol file for this application from the last release, and the crash seems to be somewhere in “SOC__!HelperFunction”. The next step is to attach WinDbg to the process for live debugging, set a breakpoint on HelperFunction, and see whether we can find any clues. We set the breakpoint with bp, and bl shows that it succeeded.
0:000> bp SOC__!HelperFunction
0:000> bl
1 e 00401052 0001 (0001) 0:**** SOC__!HelperFunction+0x12
We can use uf to look at the assembly code, since we don’t have the source code at hand.
0:000> uf SOC__!HelperFunction
SOC__!HelperFunction [z:/work/books/iisdebug/training/proj/soc++/soc++/soc++.cpp @ 26]:
26 00401040 83ec24 sub esp,0x24
26 00401043 a100304000 mov eax,[SOC__!__security_cookie (00403000)]
26 00401048 33c4 xor eax,esp
26 0040104a 89442420 mov [esp+0x20],eax
26 0040104e 8b442428 mov eax,[esp+0x28]
29 00401052 8d1424 lea edx,[esp]
29 00401055 2bd0 sub edx,eax
SOC__!HelperFunction+0x17 [z:/work/books/iisdebug/training/proj/soc++/soc++/soc++.cpp @ 29]:
29 00401057 8a08 mov cl,[eax]
29 00401059 880c02 mov [edx+eax],cl
29 0040105c 83c001 add eax,0x1
29 0040105f 84c9 test cl,cl
29 00401061 75f4 jnz SOC__!HelperFunction+0x17 (00401057)
SOC__!HelperFunction+0x23 [z:/work/books/iisdebug/training/proj/soc++/soc++/soc++.cpp @ 35]:
35 00401063 8d0424 lea eax,[esp]
35 00401066 50 push eax
35 00401067 684c214000 push 0x40214c
35 0040106c ff15a0204000 call dword ptr [SOC__!_imp__printf (004020a0)]
36 00401072 8b4c2428 mov ecx,[esp+0x28]
36 00401076 83c408 add esp,0x8
36 00401079 33cc xor ecx,esp
36 0040107b e804000000 call SOC__!__security_check_cookie (00401084)
36 00401080 83c424 add esp,0x24
36 00401083 c3 ret
After reading the assembly carefully, we can see a loop that moves one byte at a time from [eax] to [edx+eax] until it has copied a zero byte. Once the breakpoint is hit, we can step through the code and use dds to dump the stack:
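The loop at HelperFunction+0x17 can be read as a plain byte copy. The following is a minimal sketch of that reading in C, not the application's actual source; the function name copy_until_nul is made up for illustration. In the disassembly, edx holds the difference between the destination (the buffer at esp) and the source, so [edx+eax] addresses the destination byte that corresponds to the source byte at [eax].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a C rendering of the loop at HelperFunction+0x17.
   It copies bytes from src to dst up to and including the terminating
   NUL and returns the number of bytes written. Note that nothing in the
   loop limits how many bytes are copied. */
size_t copy_until_nul(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t i = 0;
    char c;
    do {
        c = src[i];       /* mov cl,[eax]      */
        dst[i] = c;       /* mov [edx+eax],cl  */
        i++;              /* add eax,0x1       */
    } while (c != '\0');  /* test cl,cl / jnz  */
    return i;
}
```

The only exit condition is the zero byte in the source, so the size of the destination never enters the picture. That is exactly the kind of loop worth checking when a stack looks corrupted.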
0:000> g
Breakpoint 1 hit
eax=003a2831 ebx=00000000 ecx=781c37e4 edx=00000000 esi=003a27e8 edi=0040337c
eip=00401052 esp=0012ff50 ebp=0012ffc0 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
SOC__!HelperFunction+0x12:
00401052 8d1424 lea edx,[esp] ss:0023:0012ff50=00000003
0012ff30 0012ffb0
0012ff34 78138cd9 MSVCR80!_except_handler4
0012ff38 5cef4f46
0012ff3c fffffffe
0012ff40 781425fb MSVCR80!printf+0x9b [f:/sp/vctools/crt_bld/self_x86/crt/src/printf.c @ 73]
0012ff44 00401072 SOC__!HelperFunction+0x32 [z:/work/books/iisdebug/training/proj/soc++/soc++/soc++.cpp @ 36]
0012ff48 0040214c SOC__!`string'
0012ff4c 0012ff50
0012ff50 73696854
0012ff54 794d7349
0012ff58 79726556
0012ff5c 72747845
0012ff60 6c656d65
0012ff64 70755379
0012ff68 614d7265
0012ff6c 66696e67
0012ff70 6e616369
0012ff74 6e6f4374
0012ff78 7463656e
0012ff7c 536e6f69
0012ff80 6e697274
0012ff84 726f4667
0012ff88 6144794d
0012ff8c 6f536174
0012ff90 65637275
0012ff94 00000000
0012ff98 00000000
0012ff9c 7ffdf000
0012ffa0 00000000
0012ffa4 00000000
We can see the value 6e616369 on the stack, which is exactly the value of EIP. Let’s go ahead and check what is stored starting at ESP.
0:000> dc 0012ff50
0012ff50 73696854 794d7349 79726556 72747845 ThisIsMyVeryExtr
0012ff60 6c656d65 70755379 614d7265 66696e67 emelySuperMagnif
0012ff70 6e616369 6e6f4374 7463656e 536e6f69 icantConnectionS
0012ff80 6e697274 726f4667 6144794d 6f536174 tringForMyDataSo
0012ff90 65637275 00000000 00000000 7ffdf000 urce............
0012ffa0 00000000 00000000 0012ff90 74fd542c ............,T.t
0012ffb0 0012ffe0 00401741 24223a8a 00000000 ....A.@..:"$....
0012ffc0 0012fff0 77e6f23b 00000000 00000000 ....;..w........
Here we go! The dump shows that “ican” has overwritten the return address of HelperFunction on the stack, which in turn makes EIP invalid. This is a very common cause of a bad EIP, especially when string copy operations are involved.
With the source code, we can see what happened more clearly.
Code example:
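#include <stdio.h>
#include <string.h>

#define MAX_CONN_LEN 30

void HelperFunction(char* pszConnectionString);

int main(int argc, char* argv[])
{
    // Wait for a keypress; this leaves time to attach a debugger.
    getchar();
    if (argc == 2)
    {
        HelperFunction(argv[1]);
        printf("Connection to %s established\n", argv[1]);
    }
    else
    {
        printf("Please specify connection string on the command line\n");
    }
    return 0;
}

void HelperFunction(char* pszConnectionString)
{
    // pszCopy holds only MAX_CONN_LEN bytes, but strcpy copies until the
    // terminating NUL with no bounds check. A longer connection string
    // runs past the buffer and overwrites the stack above it.
    char pszCopy[MAX_CONN_LEN];
    strcpy(pszCopy, pszConnectionString);
    //
    // ...
    // Establish connection
    // ...
    //
    printf("****%s****", pszCopy);
}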
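The connection string seen in the dump is 68 characters long, while the buffer holds 30 bytes. One possible way to avoid the overflow is to check the length before copying. This is only a sketch under that assumption, not the book's or the application's fix, and the function name copy_connection_string is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CONN_LEN 30

/* Hypothetical variant of the copy in HelperFunction: copies src into
   dst only if it fits, including the terminating NUL. Returns 1 on
   success; returns 0 and leaves dst as an empty string if src is too
   long for the buffer. */
int copy_connection_string(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';
        return 0;
    }
    memcpy(dst, src, len + 1);  /* len + 1 also copies the NUL */
    return 1;
}
```

With a buffer of MAX_CONN_LEN bytes, the 68-character string from the dump is rejected instead of being written past the end of the buffer, and the caller can report the error rather than crash later at a bogus return address.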
This walkthrough is a small sample that demonstrates basic techniques for troubleshooting a stack corruption. It’s simple, but really helpful in some cases. Hopefully it gives you guys some ideas when you face a similar issue. :)