What is your favourite debugger? If you asked me this question, I would probably reply that it is “Visual Studio + WinDbg”. I like Visual Studio for its natural and productive interface. I like that it allows me to get the necessary information quickly and, well, visually. But unfortunately, some kinds of information cannot be easily obtained with the Visual Studio debugger. For example, what if I need to know which thread is holding a particular critical section? Or which function occupies most of the space on the stack? This is where WinDbg comes in. Its commands can provide answers to these and many other interesting questions that arise during debugging sessions. And I do not even have to close Visual Studio to attach WinDbg to the target application: thanks to WinDbg's support for noninvasive debugging (discussed later in this article), we can take advantage of the Visual Studio GUI and WinDbg commands at the same time.
The only problem is that WinDbg is not easy to use. It takes time to adapt to its user interface, and even more time to master the commands. But what if you need it today, right now, to debug an urgent problem? Is there a quick and easy way? Yes, there is. CDB, the little brother of WinDbg, exposes the same functionality through a simple, command-line interface. In this article, I will show you how to take advantage of CDB and start using it right now to complement the Visual Studio debugger. You will see how to set up and configure CDB, and how to use it to solve real-life problems. In addition, I will provide you with a set of batch files, which will hide most of the remaining complexities of CDB's command line interface and save you a lot of typing.
Of course, before we can start using CDB, we have to install and configure it. WinDbg and CDB are distributed as part of the Debugging Tools for Windows package, which can be downloaded from Microsoft's website. The installation is simple, and unless you are going to develop applications with the help of the WinDbg SDK, you can simply accept the default settings. (But if you are going to use the SDK, you have to select custom setup and enable SDK installation; it is also recommended to use an installation directory whose name does not contain spaces.) After the installation has been completed, the installation directory should contain all the necessary files, including WinDbg (windbg.exe) and CDB (cdb.exe).
Debugging Tools also support “xcopy” style of installation. After you have installed it on one machine, you do not necessarily have to run the setup again to install it on other machines. It is enough to collect all the files in the installation directory and copy them onto the target machine or a network share.
Some important WinDbg commands cannot function properly without access to up-to-date symbols for operating system DLLs. In the past, we could obtain the necessary symbols by downloading a large package from Microsoft's FTP server. It was time consuming, and symbols could easily become outdated (and therefore useless) after installing an update for the operating system. Fortunately, nowadays there is a much simpler way to obtain symbols: the symbol server. This technology, supported by the WinDbg and Visual Studio debuggers, allows the debugger to download up-to-date symbols on demand from a server maintained by Microsoft. With the symbol server, we do not have to download the complete symbol package, because the debugger knows which DLLs it is going to inspect, and therefore can download symbols only for those DLLs. If symbols become outdated after an operating system update has been installed, the debugger will notice it and download the necessary symbols again.
To make symbol server feature work, we should let the debugger know the path to the symbol server. The simplest way to do it is to specify the symbol server path in _NT_SYMBOL_PATH environment variable. The following path should be used: "srv*c:/symbolcache*http://msdl.microsoft.com/download/symbols" (c:/symbolcache directory will be used as a cache for symbol files downloaded from the symbol server; you can use any local or network directory path that is valid on your system). For example:
set _NT_SYMBOL_PATH=srv*c:/symbolcache*http://msdl.microsoft.com/download/symbols
After you have set _NT_SYMBOL_PATH environment variable to the proper value, symbol server feature is ready for use. More information about symbol server technology, related settings and, if necessary, troubleshooting tips can be found in WinDbg documentation (Debuggers | Symbols section).
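If you drive CDB from a script rather than from a console window, the same symbol path can be composed programmatically. The following is only a minimal Python sketch of the `srv*<cache>*<server>` format described above; the function names `symbol_server_path` and `environment_with_symbols` are hypothetical helpers introduced for illustration and are not part of Debugging Tools for Windows.

```python
import os

# Public Microsoft symbol server URL, as given in the article.
MS_SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def symbol_server_path(cache_dir, server_url=MS_SYMBOL_SERVER):
    """Return a symbol path of the form srv*<cache>*<server>."""
    if "*" in cache_dir:
        # '*' is the field separator, so it cannot appear in the directory.
        raise ValueError("cache directory must not contain '*'")
    return "srv*{}*{}".format(cache_dir, server_url)


def environment_with_symbols(cache_dir, base_env=None):
    """Copy an environment and set _NT_SYMBOL_PATH in the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env["_NT_SYMBOL_PATH"] = symbol_server_path(cache_dir)
    return env


if __name__ == "__main__":
    print(symbol_server_path("c:/symbolcache"))
```

The dictionary returned by `environment_with_symbols` can be passed as the `env` argument of `subprocess.run` when the script launches cdb.exe, so the variable does not have to be set system-wide.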
Additional configuration steps are needed to access the symbol server from behind a proxy server that requires you to log in. See the “CDB and proxy servers” section of this article for more information.
When we start learning a new debugger, the first question usually is: how to start the debugging session? As most debuggers do, CDB allows us to start a new instance of the application or attach to an already running process. Starting a new instance is as simple as the following:
cdb c:/myapp.exe
If we want to attach to an already running process, one of the following options can be used:
| Option | Description | Example |
|---|---|---|
| -p Pid | Attaches to the process with the specified process id. The process id can be obtained from Task Manager or a similar tool. | cdb -p 1034 |
| -pn ExeName | Attaches to the process with the specified name of its main executable file (.exe). This option is usually more convenient than “-p Pid”, because we usually know the name of our application's main executable and do not have to look up its process id in Task Manager. But this option cannot be used if more than one process with the given executable name is currently running (CDB will report an error). | cdb -pn myapp.exe |
| -psn ServiceName | Attaches to the process that contains the specified service. For example, if you want to attach to, say, the Windows Management Instrumentation service, you should use WinMgmt as the service name. | cdb -psn MyService |
CDB can also be used to analyze crash dumps. To open a crash dump, use -z option:
cdb -z DumpFile
For example:
cdb -z c:/myapp.dmp
After we have started a new debugging session, CDB displays its own command prompt. You can use this prompt to enter and execute any command supported by CDB.
'q' command ends debugging session and exits CDB:
0:000> q
quit:
>
Warning: When you end the debugging session and exit CDB, the debuggee will also be terminated by the operating system. If you want to exit CDB and keep the debuggee running, you can use the .detach command (supported only in Windows XP and newer operating systems), or use CDB in noninvasive mode (discussed below).
While it is possible to use CDB command prompt to execute debugger commands, it is often faster to specify the necessary commands on the command line, using -c option.
cdb -pn myapp.exe -c "command1;command2"
(commands are separated with semicolons)
For example, the following command line will attach CDB to our application, display the list of loaded modules, and exit:
cdb -pn myapp.exe -c "lm;q"
Note the use of 'q' command at the end of the command list – it allows us to automatically close CDB after all debugger commands have been executed.
By default, when we use CDB to debug an already running process, it attaches as a fully functional debugger (using Win32 Debugging API). It is possible to set breakpoints, step through the code, get notified about various debugging events (such as exceptions, module load/unload, thread start/exit, and so on). But Visual Studio debugger allows us to do the same, and offers a much better user interface. In addition, only one debugger can be attached to the process at a time. Does it mean that if we are already debugging an application with Visual Studio debugger, we cannot use CDB to obtain additional information about the same application? No, it doesn't, because CDB also supports noninvasive debugging mode.
When CDB attaches to the target process in noninvasive mode, it does not use Win32 Debugging API. Instead, it simply suspends all threads in the target process and starts executing commands specified by the user. After all commands have been executed, just before CDB itself terminates, it resumes the suspended threads. As a result, the target process can continue running as if it wasn't debugged at all. Even if the target process is already being debugged by a fully functional debugger like Visual Studio, CDB still can attach to it in noninvasive mode and obtain the necessary information. After CDB has finished its work and detached, we can continue debugging the application in Visual Studio debugger.
How do we use CDB in noninvasive mode? With the -pv command line option. For example, the following command will attach to our application noninvasively, display the list of loaded modules, and exit. The application will continue running.
cdb -pv -pn myapp.exe -c "lm;q"
The output of CDB commands can be long, and it can be inconvenient to read it from the console window. It would be much better to save the output to a log file, and CDB allows us to do it with the help of -loga and -logo options ('-loga <filename>' appends the output to the end of the specified file, while '-logo <filename>' overwrites the file if it already exists).
Let's enhance our sample command (that lists modules in the target process) with logging capability and save the output to the out.txt file in the current directory:
cdb -pv -pn myapp.exe -logo out.txt -c "lm;q"
Another important command line option exposed by CDB is -lines. This option turns on the source line information support, which, for example, allows CDB to display source file names and line numbers when reporting call stacks. (By default, source line support is turned off, and CDB does not display source file/line information).
If you are going to use CDB from behind a proxy server that requires you to log in, symbol server access will not work by default. The reason is that in the default configuration CDB is not allowed to show proxy server's login prompt when it is trying to connect to the symbol server. To change this behavior and make symbol server access work, two commands should be added to the beginning of the command list:
!sym prompts;.reload
For example:
cdb -pv -pn myapp.exe -logo out.txt -c "!sym prompts;.reload;lm;q"
When CDB launches a new application, attaches to an existing process, or opens a crash dump, it shows a sequence of startup messages. These messages are followed by the output of CDB commands (which can be specified using -c option, or entered manually). Usually the startup messages serve only informational purposes; but if something goes wrong, they will contain the description of the problem, sometimes followed by recommendations on how to solve it.
For example, the following output contains a message that informs us that symbol path is not set, and as a result some debugger commands may not work:
D:/Progs/DbgTools>cdb myapp.exe
Microsoft (R) Windows Debugger Version 6.5.0003.7
Copyright (c) Microsoft Corporation. All rights reserved.
CommandLine: myapp.exe
Symbol search path is: *** Invalid ***
****************************************************************************
* Symbol loading may be unreliable without a symbol search path. *
* Use .symfix to have the debugger choose a symbol path. *
* After setting your symbol path, use .reload to refresh symbol locations. *
****************************************************************************
Here is the list of CDB command line templates that we will use throughout the remainder of this article (we will always use the same templates, and usually only the list of commands inside -c option will change, depending on the problem we are trying to solve).
Attach to a running process (by process id) in noninvasive mode, execute a set of commands and save the output in out.txt file:
cdb -pv -p <processid> -logo out.txt -lines -c "command1;command2;...;commandN;q"
Attach to a running process (by executable name) in noninvasive mode, execute a set of commands and save the output in out.txt file:
cdb -pv -pn <exename> -logo out.txt -lines -c "command1;command2;...;commandN;q"
Attach to a running process (by service name) in noninvasive mode, execute a set of commands and save the output in out.txt file:
cdb -pv -psn <servicename> -logo out.txt -lines -c "command1;command2;...;commandN;q"
Open a crash dump file, execute a set of commands and save the output to out.txt file:
cdb -z <dumpfile> -logo out.txt -lines -c "command1;command2;...;commandN;q"
If we are going to use CDB from behind a proxy server that requires us to login, two additional commands should be added to make symbol server access work. For example:
cdb -pv -pn <exename> -logo out.txt -lines -c "!sym prompts;.reload;command1;command2;...;commandN;q"
Looks like a lot of typing? Not really. Later in this article I will present a set of batch files, which will hide the repeating command line options and minimize the amount of information you have to enter manually.
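In the meantime, the templates above can also be captured in a few lines of script. The following is a minimal Python sketch (it is not one of the batch files mentioned above); the function name `build_cdb_args` and its parameters are hypothetical, and it uses only the command line options already introduced in this article (-pv, -p, -pn, -psn, -z, -logo, -lines, -c).

```python
def build_cdb_args(target_flag, target, commands, log_file="out.txt",
                   noninvasive=True, lines=True, proxy_login=False):
    """Build the argument list for one of the article's cdb templates.

    target_flag is "-p", "-pn" or "-psn" for a live process,
    or "-z" for a crash dump file.
    """
    if target_flag not in ("-p", "-pn", "-psn", "-z"):
        raise ValueError("unsupported target flag: " + target_flag)
    args = ["cdb"]
    # The crash dump template does not use -pv, so add it for live targets only.
    if noninvasive and target_flag != "-z":
        args.append("-pv")
    args += [target_flag, str(target)]
    if log_file:
        args += ["-logo", log_file]
    if lines:
        args.append("-lines")
    command_list = list(commands)
    if proxy_login:
        # Allow the proxy login prompt, then reload symbols.
        command_list = ["!sym prompts", ".reload"] + command_list
    command_list.append("q")  # always exit once the commands have run
    args += ["-c", ";".join(command_list)]
    return args


if __name__ == "__main__":
    print(build_cdb_args("-pn", "myapp.exe", ["~*kb"]))
```

Returning a list rather than a single string means the result can be handed directly to `subprocess.run`, which avoids quoting problems with the semicolon-separated command list.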
When our application appears hung or unresponsive, the natural question is: what is it currently doing? Where did it get stuck? Of course, we can attach the Visual Studio debugger to the application and inspect the call stacks of all threads. But we can do the same with CDB, and much quicker. The following command attaches CDB to the application noninvasively, prints all call stacks to the console and to the log file, and exits:
cdb -pv -pn myapp.exe -logo out.txt -lines -c "~*kb;q"
('kb' command asks CDB to print the call stack of the current thread; '~*' prefix asks the debugger to repeat 'kb' for all existing threads in the process).
The DeadLockDemo.cpp file contains a sample application that demonstrates a typical deadlock scenario. If you compile and run it, its worker threads will get stuck very soon. If we run the command shown above to see what the application's threads are doing, we will see something like the following (here, and from now on, startup messages are omitted):
. 0 Id: 6fc.4fc Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr Args to Child
0012fdf8 7c90d85c 7c8023ed 00000000 0012fe2c ntdll!KiFastSystemCallRet
0012fdfc 7c8023ed 00000000 0012fe2c 0012ff54 ntdll!NtDelayExecution+0xc
0012fe54 7c802451 0036ee80 00000000 0012ff54 kernel32!SleepEx+0x61
0012fe64 004308a9 0036ee80 a0f63080 01c63442 kernel32!Sleep+0xf
0012ff54 00432342 00000001 003336e8 003337c8 DeadLockDemo!wmain+0xd9
[c:/tests/deadlockdemo/deadlockdemo.cpp @ 154]
0012ffb8 004320fd 0012fff0 7c816d4f a0f63080 DeadLockDemo!__tmainCRTStartup+0x232
[f:/rtm/vctools/crt_bld/self_x86/crt/src/crt0.c @ 318]
0012ffc0 7c816d4f a0f63080 01c63442 7ffdd000 DeadLockDemo!wmainCRTStartup+0xd
[f:/rtm/vctools/crt_bld/self_x86/crt/src/crt0.c @ 187]
0012fff0 00000000 0042e5aa 00000000 78746341 kernel32!BaseProcessStart+0x23
1 Id: 6fc.3d8 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr Args to Child
005afc14 7c90e9c0 7c91901b 000007d4 00000000 ntdll!KiFastSystemCallRet
005afc18 7c91901b 000007d4 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc
005afca0 7c90104b 004a0638 00430b7f 004a0638 ntdll!RtlpWaitForCriticalSection+0x132
005afca8 00430b7f 004a0638 005afe6c 005afe78 ntdll!RtlEnterCriticalSection+0x46
005afd8c 00430b15 005aff60 005afe78 003330a0 DeadLockDemo!CCriticalSection::Lock+0x2f
[c:/tests/deadlockdemo/deadlockdemo.cpp @ 62]
005afe6c 004309f1 004a0638 f3d065d5 00334fc8 DeadLockDemo!CCritSecLock::CCritSecLock+0x35
[c:/tests/deadlockdemo/deadlockdemo.cpp @ 90]
005aff6c 004311b1 00000000 f3d06511 00334fc8 DeadLockDemo!ThreadOne+0xa1
[c:/tests/deadlockdemo/deadlockdemo.cpp @ 182]
005affa8 00431122 00000000 005affec 7c80b50b DeadLockDemo!_callthreadstartex+0x51
[f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 348]
005affb4 7c80b50b 003330a0 00334fc8 00330001 DeadLockDemo!_threadstartex+0xa2
[f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 331]
005affec 00000000 00431080 003330a0 00000000 kernel32!BaseThreadStart+0x37
2 Id: 6fc.284 Suspend: 1 Teb: 7ffdc000 Unfrozen
ChildEBP RetAddr Args to Child
006afc14 7c90e9c0 7c91901b 000007d8 00000000 ntdll!KiFastSystemCallRet
006afc18 7c91901b 000007d8 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc
006afca0 7c90104b 004a0620 00430b7f 004a0620 ntdll!RtlpWaitForCriticalSection+0x132
006afca8 00430b7f 004a0620 006afe6c 006afe78 ntdll!RtlEnterCriticalSection+0x46
006afd8c 00430b15 006aff60 006afe78 003332e0 DeadLockDemo!CCriticalSection::Lock+0x2f
[c:/tests/deadlockdemo/deadlockdemo.cpp @ 62]
006afe6c 00430d11 004a0620 f3e065d5 00334fc8 DeadLockDemo!CCritSecLock::CCritSecLock+0x35
[c:/tests/deadlockdemo/deadlockdemo.cpp @ 90]
006aff6c 004311b1 00000000 f3e06511 00334fc8 DeadLockDemo!ThreadTwo+0xa1
[c:/tests/deadlockdemo/deadlockdemo.cpp @ 202]
006affa8 00431122 00000000 006affec 7c80b50b DeadLockDemo!_callthreadstartex+0x51
[f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 348]
006affb4 7c80b50b 003332e0 00334fc8 00330001 DeadLockDemo!_threadstartex+0xa2
[f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 331]
006affec 00000000 00431080 003332e0 00000000 kernel32!BaseThreadStart+0x37
The call stack (and source line numbers) suggest that ThreadOne is holding critical section CritSecOne and is waiting for critical section CritSecTwo, while ThreadTwo is holding critical section CritSecTwo and is waiting for critical section CritSecOne. This is an example of the classical “lock acquisition order” deadlock, where two threads need to acquire the same set of synchronization objects and do it in different order. If you want to avoid deadlocks of this kind, make sure that all threads acquire the necessary synchronization objects in the same order (in the sample, both ThreadOne and ThreadTwo could agree to acquire CritSecOne first and CritSecTwo next to avoid the deadlock).
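The “same order” rule is easy to demonstrate outside of the sample. The following is a minimal Python sketch, not a port of DeadLockDemo.cpp: it uses `threading.Lock` objects instead of Win32 critical sections, and the helper `acquire_in_order` and the choice of `id()` as the global ordering key are assumptions made for illustration.

```python
import threading
from contextlib import ExitStack


def acquire_in_order(stack, *locks):
    """Acquire locks in one global order (here: by id()) via an ExitStack."""
    for lock in sorted(locks, key=id):
        stack.enter_context(lock)  # released automatically when the stack exits


def worker(first, second, done, name):
    # Each thread names the locks in a different order, but the helper
    # always acquires them in the same order, so no wait cycle can form.
    for _ in range(1000):
        with ExitStack() as stack:
            acquire_in_order(stack, first, second)
    done.append(name)


if __name__ == "__main__":
    lock_one, lock_two = threading.Lock(), threading.Lock()
    done = []
    threads = [
        threading.Thread(target=worker, args=(lock_one, lock_two, done, "one")),
        threading.Thread(target=worker, args=(lock_two, lock_one, done, "two")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(sorted(done))
```

If the helper is replaced with plain nested `with first: with second:` blocks, the two workers can block each other in the same way ThreadOne and ThreadTwo do in the sample.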
By default, the 'kb' command displays only the first 20 frames of the call stack. If you want to see a larger number of stack frames, you can explicitly override this limit (e.g., the 'kb100' command asks the debugger to display up to 100 stack frames). In a live WinDbg session, it is also possible to use the .kframes command to change the default limit for all subsequent commands.
Our sample application contained only three simple threads, and it wasn't difficult to identify the ones responsible for the deadlock. In large applications, it can be more difficult to identify the suspicious threads and prove their guilt. How should we approach it? In most cases, we already know a thread that isn't functioning properly (otherwise, how could we notice that the application is misbehaving?). Usually this thread is waiting on a synchronization object that is not available for some reason. Why is this object not available? Very often we can answer this question if we know which thread is currently holding this object (owns it, in other words). If the object happens to be a critical section, the !locks command can help us identify its current owner. When used without parameters, this command displays the list of critical sections that are currently held by the application's threads. Free critical sections are not included in the output.
Let's see !locks command in action:
cdb -pv -pn myapp.exe -logo out.txt -lines -c "!locks;q"
Here is the output of this command (also for DeadLockDemo.cpp sample):
CritSec DeadLockDemo!CritSecOne+0 at 004A0620
LockCount 1
RecursionCount 1
OwningThread 3d8
EntryCount 1
ContentionCount 1
*** Locked
CritSec DeadLockDemo!CritSecTwo+0 at 004A0638
LockCount 1
RecursionCount 1
OwningThread 284
EntryCount 1
ContentionCount 1
*** Locked
Scanned 40 critical sections
Looking at the output of the !locks command (the OwningThread field in particular), we can conclude that critical section CritSecOne is held by the thread whose id is 0x3d8, and critical section CritSecTwo is held by thread 0x284. The output of the 'kb' command (shown earlier) allows us to identify the threads with these ids: in each thread header, the value after the dot in the “Id:” field (for example, 6fc.3d8) is the thread id.
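When there are many threads, matching the two outputs by eye becomes tedious. The following is a minimal Python sketch that combines the 'kb' and !locks outputs into a “who is blocked by whom” table. It assumes the 32-bit output layout shown in this article, where the first value after RetAddr on the `ntdll!RtlEnterCriticalSection` line is the address of the critical section being entered; the function names are hypothetical helpers, not CDB features.

```python
import re

# Thread header, e.g. ".  0  Id: 6fc.4fc Suspend: 1 ..." (pid.tid in hex).
THREAD_RE = re.compile(r"^\W*(\d+)\s+Id:\s*[0-9a-f]+\.([0-9a-f]+)", re.I)
# !locks entry, e.g. "CritSec DeadLockDemo!CritSecOne+0 at 004A0620".
CRITSEC_RE = re.compile(r"^CritSec\s+(\S+)\s+at\s+([0-9a-f]+)", re.I)
OWNER_RE = re.compile(r"^OwningThread\s+([0-9a-f]+)", re.I)


def parse_waits(kb_output):
    """Map thread id -> address of the critical section it is trying to enter."""
    waits, current_tid = {}, None
    for line in kb_output.splitlines():
        header = THREAD_RE.match(line)
        if header:
            current_tid = int(header.group(2), 16)
            continue
        if current_tid is not None and "ntdll!RtlEnterCriticalSection" in line:
            fields = line.split()
            # Columns: ChildEBP, RetAddr, then the first three arguments.
            waits[current_tid] = int(fields[2], 16)
    return waits


def parse_owners(locks_output):
    """Map critical section address -> (name, owning thread id)."""
    owners, current = {}, None
    for line in locks_output.splitlines():
        line = line.strip()
        section = CRITSEC_RE.match(line)
        if section:
            current = (section.group(1), int(section.group(2), 16))
            continue
        owner = OWNER_RE.match(line)
        if owner and current:
            owners[current[1]] = (current[0], int(owner.group(1), 16))
    return owners


def blocked_by(kb_output, locks_output):
    """Return {waiting thread id: (critical section name, owner thread id)}."""
    owners = parse_owners(locks_output)
    return {tid: owners[addr]
            for tid, addr in parse_waits(kb_output).items() if addr in owners}
```

For the DeadLockDemo outputs shown above, the result says that thread 0x3d8 is blocked on CritSecTwo, which is owned by 0x284, and that thread 0x284 is blocked on CritSecOne, which is owned by 0x3d8. Such a cycle is exactly the deadlock we are looking for.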
If the application uses other kinds of synchronization objects (e.g. mutexes), it is more difficult to identify their owners (kernel debugger is required), and I will reserve it for a future article.
For most kinds of software applications, too high CPU consumption (up to 100% on a single-CPU system, according to Task Manager) is a clear sign of a bug. Usually it means that one of the application's threads has entered an infinite loop. Of course, the natural way to debug this problem is to attach Visual Studio debugger to the process and check what the offending thread is doing. But how can we determine which thread to check? CDB offers us an easy and convenient solution - !runaway command. When used without parameters, this command displays the times spent by each of the application's threads executing user mode code (additional parameters can also show the times spent in kernel mode, and the times elapsed since the moment when a thread was started).
Here is how to use this command with CDB:
cdb -pv -pn myapp.exe -logo out.txt -c "!runaway;q"
Here is a sample output of !runaway command:
0:000> !runaway
User Mode Time
Thread Time
1:358 0 days 0:00:47.408
2:150 0 days 0:00:03.495
0:d8 0 days 0:00:00.000
It looks like the thread with id 0x358 utilizes most of the CPU time. But this information is not yet enough to prove that thread 0x358 is guilty, because the command displays the CPU time spent by the thread during its whole lifetime. What we need is to see how the threads' CPU times change. Let's run the same command again. This time, we could see something like the following:
0:000> !runaway
User Mode Time
Thread Time
1:358 0 days 0:00:47.408
2:150 0 days 0:00:06.859
0:d8 0 days 0:00:00.000
Now we should compare this output with the output from the previous run, and find the thread whose CPU time has increased the most. In the sample application, it definitely is the thread 0x150. Now we can attach Visual Studio debugger to the application, switch to this thread and check why it is spinning.
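The comparison of two !runaway outputs is easy to automate. The following is a minimal Python sketch; it assumes the “thread:id, days, h:mm:ss.mmm” layout shown above (with a three-digit millisecond field), and the function names `parse_runaway` and `busiest_thread` are hypothetical helpers.

```python
import re

# Matches lines such as "1:358 0 days 0:00:47.408" (thread number:thread id).
RUNAWAY_RE = re.compile(
    r"^\s*\d+:([0-9a-f]+)\s+(\d+) days (\d+):(\d+):(\d+)\.(\d+)", re.I)


def parse_runaway(output):
    """Map thread id -> user-mode CPU time in milliseconds."""
    times = {}
    for line in output.splitlines():
        m = RUNAWAY_RE.match(line)
        if m:
            days, hours, minutes, seconds, millis = map(int, m.groups()[1:])
            total = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
            times[int(m.group(1), 16)] = total * 1000 + millis
    return times


def busiest_thread(first, second):
    """Return (thread id, CPU time gained in ms) for the largest increase."""
    before, after = parse_runaway(first), parse_runaway(second)
    deltas = {tid: after[tid] - before.get(tid, 0) for tid in after}
    tid = max(deltas, key=deltas.get)
    return tid, deltas[tid]
```

Applied to the two outputs above, `busiest_thread` returns thread 0x150 with an increase of 3364 milliseconds, while the time of thread 0x358 does not change at all between the runs.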
CDB can also be very useful when we want to find the cause of a stack overflow exception. Of course, uncontrolled recursion is the most typical cause of stack overflows, and it is usually enough to look at the call stack of the offending thread to find the place where it went out of control. Visual Studio can do it just fine, so why use CDB? Let's think about more complicated cases. For example, what if our application contains an algorithm that relies on recursion? We put significant effort into designing the algorithm and keeping recursion under control in all possible situations, but sometimes the stack still overflows. Why? Probably because some functions used by the algorithm occupy too much space on the stack under certain conditions. How can we determine the amount of stack space occupied by a function? Unfortunately, the Visual Studio debugger does not offer an easy way to do it.
It is also possible that an application raises a stack overflow exception even when the call stack does not show any signs of recursion. For example, take a look at StackOvfDemo.cpp sample. If you compile it and run under debugger, stack overflow will soon occur. But the call stack at the moment of the exception looks innocent:
StackOvfDemo.exe!_woutput
StackOvfDemo.exe!wprintf
StackOvfDemo.exe!ProcessStringW
StackOvfDemo.exe!ProcessStrings
StackOvfDemo.exe!main
StackOvfDemo.exe!mainCRTStartup
KERNEL32.DLL!_BaseProcessStart@4
Obviously, one of the functions on the call stack is using too much stack space. But how can we find this function? Of course, with the help of CDB: its 'kf' command displays the number of bytes occupied by every function on the call stack. While the application is still stopped in the Visual Studio debugger, let's run the following command:
cdb -pv -pn stackovfdemo.exe -logo out.txt -c "~*kf;q"
(Be aware that by default 'kf' reports only the first 20 frames of the call stack, as we have already discussed in the Debugging Deadlocks section. If you want to display more than 20 frames, change ~*kf to, for example, ~*kf1000. Also note that ~*kf will report the call stacks of all threads. If the application contains lots of threads, this can be undesirable, and the command can be changed to '~~[tid]kf', where 'tid' is the thread id of the target thread (for example, '~~[0x3a8]kf').)
This command would display something like this:
. 0 Id: 210.3a8 Suspend: 1 Teb: 7ffde000 Unfrozen
Memory ChildEBP RetAddr
00033440 0041aca5 StackOvfDemo!_woutput+0x22
44 00033484 00415eed StackOvfDemo!wprintf+0x85
d8 0003355c 00415cc5 StackOvfDemo!ProcessStringW+0x2d
fc878 0012fdd4 00415a44 StackOvfDemo!ProcessStrings+0xe5
108 0012fedc 0041c043 StackOvfDemo!main+0x64
e4 0012ffc0 7c4e87f5 StackOvfDemo!mainCRTStartup+0x183
30 0012fff0 00000000 KERNEL32!BaseProcessStart+0x3d
Pay attention to the first column – it reports the number of bytes occupied by the corresponding function on the stack. Obviously, ProcessStrings function is using the lion's share of the available stack space, and is therefore responsible for stack overflow.
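On a deep call stack, finding the largest value in the first column by eye is error-prone. The following is a minimal Python sketch that picks it out of the 'kf' output; it assumes the 32-bit layout shown above (an optional hexadecimal Memory value, followed by eight-digit ChildEBP and RetAddr values and a module!function symbol), and the function names are hypothetical helpers.

```python
import re

# Optional Memory column, then ChildEBP, RetAddr and the symbol.
FRAME_RE = re.compile(
    r"^\s*(?:([0-9a-f]+)\s+)?([0-9a-f]{8})\s+([0-9a-f]{8})\s+(\S+!\S+)", re.I)


def parse_kf(output):
    """Return [(bytes used, symbol)] for frames that have a Memory value."""
    frames = []
    for line in output.splitlines():
        m = FRAME_RE.match(line)
        if m and m.group(1):
            # The Memory column is printed in hexadecimal.
            frames.append((int(m.group(1), 16), m.group(4)))
    return frames


def largest_frame(output):
    """Return the (bytes used, symbol) pair with the largest byte count."""
    return max(parse_kf(output))
```

For the output above, `largest_frame` reports StackOvfDemo!ProcessStrings+0xe5 with 0xfc878 bytes, which is 1,034,360 bytes in decimal, or roughly the whole of a default 1 MB stack.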
If you wonder why ProcessStrings function requires so much space on the stack, here is the explanation. This function uses ATL's A2W macro to convert strings from ANSI to Unicode, and this macro uses _alloca function internally to allocate memory on the stack. The memory allocated with _alloca is released only when its caller (ProcessStrings in this case) returns. Until ProcessStrings returns control, every subsequent call to A2W (and therefore _alloca) will allocate additional space on the stack, quickly exhausting it.
Bottom line: avoid using _alloca in a loop.
windbg the easy way (step 2)
2008-10-08 09:19
When debugging a problem that is not easy to reproduce, I sometimes want to make a snapshot of the application's state (memory contents, the list of open handles, and so on) and save it in a file for further analysis. It can be useful when, for example, I suspect that the current state contains the key to the problem I am trying to solve, but want to continue running the application to see how the situation develops. Sometimes I make a series of snapshots, one after another, so that I can compare them later and see how some data structures change while the application is running. And I always create a snapshot when I have finally managed to reproduce the problem, to make sure that I don't lose valuable information if, for example, I close the debugging session by mistake. It is probably not difficult to guess that when I say “snapshot” I actually mean “minidump”, because minidumps have proved to be very convenient for saving the application state at any moment.
Here is the command line that can be used to create a minidump:
cdb -pv -pn myapp.exe -c ".dump /m c:/myapp.dmp;q"
Let's take a closer look at .dump command. In the example above, this command receives only one option (/m) and is followed by the name of the minidump file. /m option is used to specify what kinds of information should be included into the minidump. The most important (in my opinion) variants of /m option are listed in the following table:
| Option | Description | Example |
|---|---|---|
| /m | This option is used by default. It creates a standard minidump, equivalent to the MiniDumpNormal minidump type. The resulting minidump is usually very small, and therefore this option is useful if you want to transfer the minidump over a slow network. But unfortunately, the small size of the minidump also means that in most cases it does not contain enough information for serious analysis (you can find more information about minidump contents in this article). | .dump /m c:/myapp.dmp |
| /ma | Minidump with all possible options (complete memory contents, handles, unloaded modules, etc.). The resulting minidump can be huge. This option is the best for local debugging, if disk space is not limited. | .dump /ma c:/myapp.dmp |
| /mFhutwd | Minidump with data sections, non-shared read/write memory pages and other useful information. This option can be used if you want to collect as much information as possible, but still need to keep the minidump relatively small (and compressible). | .dump /mFhutwd c:/myapp.dmp |
The following command creates a minidump that includes all possible kinds of information:
cdb -pv -pn myapp.exe -c ".dump /ma c:/myapp.dmp;q"
What if we want to create a new minidump and overwrite the existing one? By default, the .dump command does not allow us to do it; it complains that the file with the given name already exists. To change the default behavior and overwrite the existing minidump file, we can use the /o option:
cdb -pv -pn myapp.exe -c ".dump /ma /o c:/myapp.dmp;q"
If we want to create a series of minidumps, one after another, it can be handy to name the minidump files so that the names reflect the time when the minidumps were created. The good news is that the .dump command can do it automatically if we specify the /u option. For example, the following command could produce a minidump called myapp_02CC_2006-01-28_04-11-18-171_0158.dmp (0158 is the process id):
cdb -pv -pn myapp.exe -c ".dump /m /u c:/myapp.dmp;q"
.dump command also supports other interesting options (you can find them in the documentation).
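When snapshots are taken from a script, the .dump command string can be assembled from these options. The following is a minimal Python sketch restricted to the options described in this section (/m, /ma, /mFhutwd, /o and /u); the function name `dump_command` and its parameters are hypothetical.

```python
def dump_command(path, kind="/ma", overwrite=False, unique_name=False):
    """Compose a .dump command using only the options described above."""
    if kind not in ("/m", "/ma", "/mFhutwd"):
        raise ValueError("unsupported minidump option: " + kind)
    parts = [".dump", kind]
    if overwrite:
        parts.append("/o")  # replace an existing file with the same name
    if unique_name:
        parts.append("/u")  # let the debugger add date, time and process id
    parts.append(path)
    return " ".join(parts)


if __name__ == "__main__":
    # Command list for cdb's -c option: create the dump, then quit.
    print(dump_command("c:/myapp.dmp", overwrite=True) + ";q")
```

The resulting string is meant to be placed inside the -c command list, followed by 'q', exactly as in the command lines shown above.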
If you want to create a minidump of a process that is running under the Visual Studio debugger, I recommend temporarily disabling all breakpoints in Visual Studio before creating the dump. If breakpoints are not disabled, the minidump will contain breakpoint instructions (int 3) inserted by the Visual Studio debugger into the code of the target process.
CDB can also be used to automate crash dump analysis. The automation is possible because we usually have to perform the same set of operations when we start analysing a crash dump. What operations? It depends on the kind of the crash dump. I would separate all crash dumps into two main categories:
- crash dumps with exception information
- crash dumps without exception information
Crash dumps with exception information are usually created when the application raises an unhandled exception and invokes a just-in-time debugger (Dr. Watson, NTSD, or other) or creates a minidump in the custom filter for unhandled exceptions. Exception information is written into the crash dump so that we could determine the type of the exception and the place in the code where it occurred. Crash dumps without exception information are usually created manually, when we want to create a snapshot of the process for further analysis (for example, with the help of techniques described in the previous chapter of this article).
When we start debugging a crash dump with exception information, we usually want to know the following:
- the place in the code where the exception occurred (address, source file and line number)
- call stack at the moment when the exception was raised
- values of function parameters and local variables, for some or all functions on the call stack
WinDbg and CDB support a very useful command for crash dump debugging: !analyze. This command analyzes exception information in the crash dump, determines the place where the exception occurred and the call stack, and displays a detailed report. Here is how to use this command:
cdb -z c:/myapp.dmp -logo out.txt -lines -c "!analyze -v;q"
(-v option asks !analyze to display verbose output)
CrashDemo.cpp sample demonstrates how to use a custom filter to catch unhandled exceptions and create minidumps. If you compile it and run, and then use the abovementioned CDB command to analyze the resulting minidump, you will get an output similar to the following:
0:001> !analyze -v
*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************
FAULTING_IP:
CrashDemo!TestFunc+2e [c:/tests/crashdemo/crashdemo.cpp @ 124]
004309de c70000000000 mov dword ptr [eax],0x0
EXCEPTION_RECORD: ffffffff -- (.exr ffffffffffffffff)
.exr ffffffffffffffff
ExceptionAddress: 004309de (CrashDemo!TestFunc+0x0000002e)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000001
Parameter[1]: 00000000
Attempt to write to address 00000000
DEFAULT_BUCKET_ID: APPLICATION_FAULT
PROCESS_NAME: CrashDemo.exe
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
WRITE_ADDRESS: 00000000
BUGCHECK_STR: ACCESS_VIOLATION
LAST_CONTROL_TRANSFER: from 0043096e to 004309de
STACK_TEXT:
006afe88 0043096e 00000000 00354130 00350001 CrashDemo!TestFunc+0x2e [c:/tests/crashdemo/crashdemo.cpp @ 124]
006aff6c 00430f31 00000000 52319518 00354130 CrashDemo!WorkerThread+0x5e [c:/tests/crashdemo/crashdemo.cpp @ 115]
006affa8 00430ea2 00000000 006affec 7c80b50b CrashDemo!_callthreadstartex+0x51 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 348]
006affb4 7c80b50b 00355188 00354130 00350001 CrashDemo!_threadstartex+0xa2 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 331]
006affec 00000000 00430e00 00355188 00000000 kernel32!BaseThreadStart+0x37
FOLLOWUP_IP:
CrashDemo!TestFunc+2e [c:/tests/crashdemo/crashdemo.cpp @ 124]
004309de c70000000000 mov dword ptr [eax],0x0
SYMBOL_STACK_INDEX: 0
FOLLOWUP_NAME: MachineOwner
SYMBOL_NAME: CrashDemo!TestFunc+2e
MODULE_NAME: CrashDemo
IMAGE_NAME: CrashDemo.exe
DEBUG_FLR_IMAGE_TIMESTAMP: 43dc6ee7
STACK_COMMAND: .ecxr ; kb
FAILURE_BUCKET_ID: ACCESS_VIOLATION_CrashDemo!TestFunc+2e
BUCKET_ID: ACCESS_VIOLATION_CrashDemo!TestFunc+2e
Followup: MachineOwner
---------
Pay attention to three blocks of this output. The first (FAULTING_IP and EXCEPTION_RECORD) reports the address and the type of the exception. The second (STACK_TEXT) reports the call stack. The third (STACK_COMMAND) gives us additional information on how to access the exception information stored in the crash dump.
Now we know the place where the exception occurred, and can even see the call stack. It's time to get the values of function parameters and local variables. Before we start, let's pay attention to the third selected block of information reported by !analyze. To repeat, the block contains the following:
STACK_COMMAND: .ecxr ; kb
We already know the 'kb' command (it displays the call stack). But what is .ecxr? This command asks the debugger to switch the current context to the one stored in the crash dump's exception information. Only after we have executed .ecxr can we reliably access the call stack and the values of local variables as they were at the moment when the exception was raised.
After we have asked the debugger to use the exception context, we can use the 'dv' command to display the values of function parameters and local variables. Since we usually want to see this information for every function on the call stack, we should actually use the '!for_each_frame dv /t' command (the /t option asks 'dv' to show type information, which is also useful). Of course, we have to remember that in optimized builds some local variables can be optimized away, kept in registers, or reused to store other data during the lifetime of the function, so the values reported by 'dv' can be incorrect.
Here is the final command line for analysis of crash dumps with exception information:
cdb -z c:/myapp.dmp -logo out.txt -lines -c "!analyze -v;.ecxr;!for_each_frame dv /t;q"
And here is the sample output of '!for_each_frame dv /t' command:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
00 006afe88 0043096e CrashDemo!TestFunc+0x2e [c:/tests/crashdemo/crashdemo.cpp @ 124]
int * pParam = 0x00000000
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
01 006aff6c 00430f31 CrashDemo!WorkerThread+0x5e [c:/tests/crashdemo/crashdemo.cpp @ 115]
void * lpParam = 0x00000000
int * TempPtr = 0x00000000
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
02 006affa8 00430ea2 CrashDemo!_callthreadstartex+0x51 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 348]
struct _tiddata * ptd = 0x00355188
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
03 006affb4 7c80b50b CrashDemo!_threadstartex+0xa2 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 331]
void * ptd = 0x00355188
struct _tiddata * _ptd = 0x00000000
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
04 006affec 00000000 kernel32!BaseThreadStart+0x37
Unable to enumerate locals, HRESULT 0x80004005
Private symbols (symbols.pri) are required for locals.
Type ".hh dbgerr005" for details.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
00 006afe88 0043096e CrashDemo!TestFunc+0x2e [c:/tests/crashdemo/crashdemo.cpp @ 124]
If the minidump does not include the complete contents of the target process's memory, the debugger can analyze the dump only if it can find exactly the same versions of the executable modules that were loaded by the target process. In some cases, you will have to help the debugger locate those modules by specifying the module search path. Detailed information about the module search path and related issues can be found in this article.
Now let's proceed to crash dumps without exception information. When we are starting to analyze such a dump, we usually want to know the call stacks of all threads. Here is how to get this information:
cdb -z c:/myapp.dmp -logo out.txt -lines -c "~*kb;q"
What to do if we don't know whether the crash dump contains exception information or not? For minidumps, it is possible to use MiniDumpView to print the contents of the dump and see if it contains exception information. For old-style 'full user dumps', probably the only option is to start as if the dump contains exception information, and see if !analyze reports something meaningful.
There is an interesting special case: the crash dump may have been created because of an unhandled exception and yet, for some reason, contain no exception information. It is still possible to find the place where the exception occurred, using the following procedure:
- Print the call stacks of all threads (using CDB command shown above).
- Find out the thread whose call stack contains kernel32!UnhandledExceptionFilter function.
- Use the fact that the first parameter of UnhandledExceptionFilter function contains a pointer to EXCEPTION_POINTERS structure.
Here is the declaration of EXCEPTION_POINTERS structure:
typedef struct _EXCEPTION_POINTERS {
    PEXCEPTION_RECORD ExceptionRecord;
    PCONTEXT ContextRecord;
} EXCEPTION_POINTERS, *PEXCEPTION_POINTERS;
If we know the address of this structure, we can take the pointer to the exception context (stored in ContextRecord field), pass it to .cxr command and thus switch the debugger context to the place where the exception occurred. After .cxr command has been executed, we can use, for example, 'kb' command to get the call stack at the moment of the exception. Here is an example:
1. Print call stacks of all threads.
cdb -z c:/myapp.dmp -logo out.txt -c "~*kb;q"
0:000> ~*kb
. 0 Id: 6c4.73c Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr Args to Child
0012fdf8 7c90d85c 7c8023ed 00000000 0012fe2c ntdll!KiFastSystemCallRet
0012fdfc 7c8023ed 00000000 0012fe2c 0012ff54 ntdll!NtDelayExecution+0xc
0012fe54 7c802451 0036ee80 00000000 0012ff54 kernel32!SleepEx+0x61
0012fe64 00430856 0036ee80 00330033 00300037 kernel32!Sleep+0xf
0012ff54 00431702 00000001 00352ed0 00352fb0 CrashDemo!wmain+0x96
0012ffb8 004314bd 0012fff0 7c816d4f 00330033 CrashDemo!__tmainCRTStartup+0x232
0012ffc0 7c816d4f 00330033 00300037 7ffd9000 CrashDemo!wmainCRTStartup+0xd
0012fff0 00000000 0042e5a5 00000000 00000000 kernel32!BaseProcessStart+0x23
1 Id: 6c4.5cc Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr Args to Child
006af6e4 7c90e273 7c863130 d0000144 00000004 ntdll!KiFastSystemCallRet
006af6e8 7c863130 d0000144 00000004 00000000 ntdll!NtRaiseHardError+0xc
006af96c 00438951 006af9e0 5d343834 00000000 kernel32!UnhandledExceptionFilter+0x59c
006af990 00430f2a c0000005 006af9e0 0044ad30 CrashDemo!_XcptFilter+0x61
006af99c 0044ad30 00000000 00000000 00000000 CrashDemo!_callthreadstartex+0x7a
006af9b0 00438c67 00430f13 0049a230 00000000 CrashDemo!_EH4_CallFilterFunc+0x12
006af9e8 7c9037bf 006afad4 006aff98 006afaf0 CrashDemo!_except_handler4+0xb7
006afa0c 7c90378b 006afad4 006aff98 006afaf0 ntdll!ExecuteHandler2+0x26
006afabc 7c90eafa 00000000 006afaf0 006afad4 ntdll!ExecuteHandler+0x24
006afabc 004309be 00000000 006afaf0 006afad4 ntdll!KiUserExceptionDispatcher+0xe
006afe88 0043094e 00000000 00354130 00350001 CrashDemo!TestFunc+0x2e
006aff6c 00430f01 00000000 647bff58 00354130 CrashDemo!WorkerThread+0x5e
006affa8 00430e72 00000000 006affec 7c80b50b CrashDemo!_callthreadstartex+0x51
006affb4 7c80b50b 00355188 00354130 00350001 CrashDemo!_threadstartex+0xa2
006affec 00000000 00430dd0 00355188 00000000 kernel32!BaseThreadStart+0x37
2. Change debugger context and get the call stack for the exception.
cdb -z c:/myapp.dmp -logo out.txt -lines -c ".cxr dwo(0x006af9e0+4);kb;q"
('dwo' operator returns the double word stored at the specified address and passes it to .cxr command)
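Reading the first parameter of UnhandledExceptionFilter from the 'kb' output by hand can also be scripted. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes the 32-bit 'kb' line format shown above (ChildEBP, RetAddr, three arguments, symbol) and that ContextRecord is the second 4-byte pointer of EXCEPTION_POINTERS, as in the declaration above.

```python
import re

# One 'kb' frame line on 32-bit x86: ChildEBP RetAddr Arg1 Arg2 Arg3 Symbol
FRAME_RE = re.compile(
    r"^\s*([0-9a-f]{8})\s+([0-9a-f]{8})\s+([0-9a-f]{8})\s+"
    r"([0-9a-f]{8})\s+([0-9a-f]{8})\s+(\S+)",
    re.IGNORECASE,
)


def find_exception_pointers(kb_output):
    """Return the first argument of kernel32!UnhandledExceptionFilter, or None."""
    for line in kb_output.splitlines():
        m = FRAME_RE.match(line)
        if m and m.group(6).startswith("kernel32!UnhandledExceptionFilter"):
            return int(m.group(3), 16)
    return None


def cxr_command(exception_pointers):
    """Build the CDB command string that switches to the exception context."""
    # ContextRecord is stored 4 bytes into EXCEPTION_POINTERS (32-bit layout).
    return ".cxr dwo(0x%08x+4);kb;q" % exception_pointers
```

The resulting string can be passed to the -c option of the CDB command line shown in step 2.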
The batch files presented later in this article (DumpStackCtx.bat in particular) will simplify this task significantly.
There is also an alternative approach to this problem; you can find more information about it here.
Another situation where CDB can significantly complement Visual Studio debugger is when we want to inspect virtual memory layout of the debuggee process. The following command will display the complete virtual memory map of the process:
cdb -pv -pn myapp.exe -logo out.txt -c "!vadump -v;q"
(!vadump command is responsible for printing the virtual memory map, and -v option, as usual, asks it to show detailed output)
Here is an example of !vadump output:
BaseAddress: 00040000
AllocationBase: 00040000
AllocationProtect: 00000004 PAGE_READWRITE
RegionSize: 0002e000
State: 00002000 MEM_RESERVE
Type: 00020000 MEM_PRIVATE

BaseAddress: 0006e000
AllocationBase: 00040000
AllocationProtect: 00000004 PAGE_READWRITE
RegionSize: 00001000
State: 00001000 MEM_COMMIT
Protect: 00000104 PAGE_READWRITE + PAGE_GUARD
Type: 00020000 MEM_PRIVATE

BaseAddress: 0006f000
AllocationBase: 00040000
AllocationProtect: 00000004 PAGE_READWRITE
RegionSize: 00011000
State: 00001000 MEM_COMMIT
Protect: 00000004 PAGE_READWRITE
Type: 00020000 MEM_PRIVATE
On Windows XP and Windows Server 2003, CDB offers a better command for inspecting the virtual memory layout: !address. This command can perform the following tasks:
- Display virtual memory map of the process (in my opinion, in a more readable format than !vadump)
- Display useful statistics about virtual memory usage
- Determine the kind of virtual memory region the specified address belongs to (for example, does it belong to a stack, heap or an executable image?)
Here is how to use !address to report the virtual memory map:
cdb -pv -pn myapp.exe -logo out.txt -c "!address;q"
Here is a sample output that shows a memory region occupied by a thread's stack:
00040000 : 00040000 - 0002e000
           Type 00020000 MEM_PRIVATE
           Protect 00000000
           State 00002000 MEM_RESERVE
           Usage RegionUsageStack
           Pid.Tid 658.644
           0006e000 - 00001000
           Type 00020000 MEM_PRIVATE
           Protect 00000104 PAGE_READWRITE | PAGE_GUARD
           State 00001000 MEM_COMMIT
           Usage RegionUsageStack
           Pid.Tid 658.644
           0006f000 - 00011000
           Type 00020000 MEM_PRIVATE
           Protect 00000004 PAGE_READWRITE
           State 00001000 MEM_COMMIT
           Usage RegionUsageStack
           Pid.Tid 658.644
Note that !address is smart enough to report the thread id of the thread the stack belongs to.
After !address has finished reporting virtual memory regions, it also reports interesting statistics about virtual memory usage:
-------------------- Usage SUMMARY --------------------------
TotSize Pct(Tots) Pct(Busy) Usage
00838000 : 0.40% 27.96% : RegionUsageIsVAD
7e28c000 : 98.56% 0.00% : RegionUsageFree
01348000 : 0.94% 65.60% : RegionUsageImage
00040000 : 0.01% 0.85% : RegionUsageStack
00001000 : 0.00% 0.01% : RegionUsageTeb
001a0000 : 0.08% 5.53% : RegionUsageHeap
00000000 : 0.00% 0.00% : RegionUsagePageHeap
00001000 : 0.00% 0.01% : RegionUsagePeb
00001000 : 0.00% 0.01% : RegionUsageProcessParametrs
00001000 : 0.00% 0.01% : RegionUsageEnvironmentBlock
Tot: 7fff0000 Busy: 01d64000

-------------------- Type SUMMARY --------------------------
TotSize Pct(Tots) Usage
7e28c000 : 98.56% : <free>
01348000 : 0.94% : MEM_IMAGE
007b6000 : 0.38% : MEM_MAPPED
00266000 : 0.12% : MEM_PRIVATE

-------------------- State SUMMARY --------------------------
TotSize Pct(Tots) Usage
01647000 : 1.09% : MEM_COMMIT
7e28c000 : 98.56% : MEM_FREE
0071d000 : 0.35% : MEM_RESERVE
Largest free region: Base 01014000 - Size 59d5c000
The statistics can be especially useful when we are debugging a memory leak and want to determine what kind of memory is leaked (heap, stack, raw virtual memory, and so on). The last item reports the size of the largest free region of virtual memory, which can be helpful when we have to design an application with high memory demands.
If you only want to see the statistics and do not need the virtual memory map, you can use the -summary parameter:
cdb -pv -pn myapp.exe -logo out.txt -c "!address -summary;q"
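To see how the columns of the usage summary relate to each other, the numbers can be recomputed from the sizes alone. The Python sketch below is illustrative only (the function names are hypothetical): it assumes the "TotSize : percentages : Usage" row format of the sample output above, and checks that Busy equals the total minus the free space and that Pct(Busy) is each size divided by Busy.

```python
def parse_usage_summary(text):
    """Parse 'TotSize : Pct(Tots) Pct(Busy) : Usage' rows into {usage: size}."""
    sizes = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(":")]
        # Only the RegionUsage* rows have exactly this three-part shape.
        if len(parts) == 3 and parts[2].startswith("RegionUsage"):
            sizes[parts[2]] = int(parts[0], 16)
    return sizes


def busy_percentages(sizes):
    """Return (total, busy, {usage: percent of busy}) for non-free regions."""
    total = sum(sizes.values())
    busy = total - sizes.get("RegionUsageFree", 0)
    percent = {
        usage: round(100.0 * size / busy, 2)
        for usage, size in sizes.items()
        if usage != "RegionUsageFree"
    }
    return total, busy, percent
```

For the sample output above this gives a total of 0x7fff0000 and a busy size of 0x01d64000, matching the "Tot" and "Busy" values printed by !address. Comparing the percentages from two runs taken some time apart is one way to see which kind of memory is growing.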
If we need to determine what kind of virtual memory the given address belongs to, we can pass this address as a parameter to !address command. Here is an example:
0:000> !address 0x000a2480;q
000a0000 : 000a0000 - 000d7000
           Type 00020000 MEM_PRIVATE
           Protect 00000004 PAGE_READWRITE
           State 00001000 MEM_COMMIT
           Usage RegionUsageHeap
           Handle 000a0000
Sometimes we might need to determine the address of a symbol (function or variable). If we know the exact name of the symbol, we can enter it into Disassembly window of Visual Studio debugger and get the address. But what if we don't remember the exact name? Or want to find the addresses of a set of symbols with the same pattern in the name (for example, all member functions of a class)? CDB can easily solve this problem – it offers 'x' command, which can list all symbols whose names match the specified mask:
x Module!Symbol
The following command tries to locate the address of UnhandledExceptionFilter function, located in kernel32.dll:
cdb -pv -pn notepad.exe -logo out.txt -c "x kernel32!UnhandledExceptionFilter;q"
Here is the output:
0:000> x kernel32!UnhandledExceptionFilter;q
7c862b8a kernel32!UnhandledExceptionFilter = <no type information>
'x' command accepts a large number of possible wildcards, and offers some useful options for sorting the output and for additional information about symbols - you can find more information about it in WinDbg documentation. For example, the following command lists all member functions and static data members of CMainFrame class defined in our application's main executable:
0:000> x myapp!*CMainFrame*
004542f8 MyApp!CMainFrame::classCMainFrame = struct CRuntimeClass
00401100 MyApp!CMainFrame::`scalar deleting destructor' (void)
004011a0 MyApp!CMainFrame::OnCreate (struct tagCREATESTRUCTW *)
00401000 MyApp!CMainFrame::CreateObject (void)
00401280 MyApp!CMainFrame::PreCreateWindow (struct tagCREATESTRUCTW *)
00401070 MyApp!CMainFrame::GetRuntimeClass (void)
00401120 MyApp!CMainFrame::~CMainFrame (void)
00401090 MyApp!CMainFrame::CMainFrame (void)
00401080 MyApp!CMainFrame::GetMessageMap (void)
004578ec MyApp!CMainFrame::`RTTI Base Class Array' = <no type information>
004578dc MyApp!CMainFrame::`RTTI Class Hierarchy Descriptor' = <no type information>
004578c8 MyApp!CMainFrame::`RTTI Complete Object Locator' = <no type information>
004579ec MyApp!CMainFrame::`RTTI Base Class Descriptor at (0,-1,0,64)' = <no type information>
00461e94 MyApp!CMainFrame `RTTI Type Descriptor' = <no type information>
00454354 MyApp!CMainFrame::`vftable' = <no type information>
CDB can also do just the opposite – find symbol by address, using 'ln' command:
ln Address
Here is how to use it:
cdb -pv -pn notepad.exe -logo out.txt -c "ln 0x77d491c8;q"
Here is the output:
0:000> ln 0x77d491c8;q
(77d491c6) USER32!GetMessageW+0x2 | (77d49216) USER32!CharUpperBuffW
Note that we do not have to specify the start address of the symbol (a function in this case), but can use any address inside the address range occupied by the symbol. 'ln' will find the symbol, report its address, and in addition report the address and the name of the symbol that follows the specified one.
If we want to explore the contents of a data structure, we usually use Visual Studio's Watch, QuickWatch or other similar window. These windows allow us to see the types and values of the structure's member variables. But what if we also need to know the exact layout of the structure, including the offsets of its members? Visual Studio does not offer an easy-to-use solution, but fortunately CDB does. With the help of 'dt' command, we can display the exact layout of a data structure or a class.
If we simply want to know the layout of a data type, we can use this command as follows:
dt -b TypeName
(-b option enables recursive display of embedded data structures for members whose type is also a structure or a class).
Here is a sample CDB command line:
cdb -pv -pn myapp.exe -logo out.txt -c "dt -b CSymbolInfoPackage;q"
Here is the output (obtained while running SymFromAddr application):
0:000> dt /b CSymbolInfoPackage;q
+0x000 si : _SYMBOL_INFO
   +0x000 SizeOfStruct : Uint4B
   +0x004 TypeIndex : Uint4B
   +0x008 Reserved : Uint8B
   +0x018 Index : Uint4B
   +0x01c Size : Uint4B
   +0x020 ModBase : Uint8B
   +0x028 Flags : Uint4B
   +0x030 Value : Uint8B
   +0x038 Address : Uint8B
   +0x040 Register : Uint4B
   +0x044 Scope : Uint4B
   +0x048 Tag : Uint4B
   +0x04c NameLen : Uint4B
   +0x050 MaxNameLen : Uint4B
   +0x054 Name : Char
+0x058 name : Char
If you want to display the layout of a particular variable, you can pass its address to 'dt' command:
dt -b TypeName Address
Here is a sample:
cdb -pv -pn myapp.exe -logo out.txt -c "dt -b CSymbolInfoPackage 0x0012f6d0;q"
0:000> dt /b CSymbolInfoPackage 0x0012f6d0;q
+0x000 si : _SYMBOL_INFO
   +0x000 SizeOfStruct : 0x58
   +0x004 TypeIndex : 2
   +0x008 Reserved :
    [00] 0
    [01] 0
   +0x018 Index : 1
   +0x01c Size : 0x428
   +0x020 ModBase : 0x400000
   +0x028 Flags : 0
   +0x030 Value : 0
   +0x038 Address : 0x411d30
   +0x040 Register : 0
   +0x044 Scope : 0
   +0x048 Tag : 5
   +0x04c NameLen : 0xe
   +0x050 MaxNameLen : 0x7d1
   +0x054 Name : "S"
    [00] 83 'S'
+0x058 name : "SymbolInfo"
 [00] 83 'S'
 [01] 121 'y'
 [02] 109 'm'
 [03] 98 'b'
 [04] 111 'o'
 [05] 108 'l'
 [06] 73 'I'
 [07] 110 'n'
 [08] 102 'f'
 [09] 111 'o'
 [10] 0 ''
 [11] 0 ''
 [12] 0 ''
 [13] 0 ''
 [14] 0 ''
 [15] 0 ''
 [16] 0 ''
 [17] 0 ''
... part of the output omitted
 [1990] 0 ''
 [1991] 0 ''
 [1992] 0 ''
 [1993] 0 ''
 [1994] 0 ''
 [1995] 0 ''
 [1996] 0 ''
 [1997] -52 ''
 [1998] -52 ''
 [1999] -52 ''
 [2000] -52 ''
Note that now 'dt' also shows the values of the structure's member variables.
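The offsets reported by 'dt' follow from the sizes and alignment of the members. As a rough cross-check, the Python sketch below models the _SYMBOL_INFO part of the output with ctypes and prints the same offsets. It is only an illustration: the class and helper names are hypothetical, the field types are taken from the 'dt' output above (Uint4B, Uint8B, Char), and it assumes a platform where 8-byte integers are aligned on 8-byte boundaries.

```python
import ctypes


class SymbolInfo(ctypes.Structure):
    # Field types follow the 'dt' output above (Uint4B, Uint8B, Char).
    _fields_ = [
        ("SizeOfStruct", ctypes.c_uint32),
        ("TypeIndex", ctypes.c_uint32),
        ("Reserved", ctypes.c_uint64 * 2),
        ("Index", ctypes.c_uint32),
        ("Size", ctypes.c_uint32),
        ("ModBase", ctypes.c_uint64),
        ("Flags", ctypes.c_uint32),
        ("Value", ctypes.c_uint64),
        ("Address", ctypes.c_uint64),
        ("Register", ctypes.c_uint32),
        ("Scope", ctypes.c_uint32),
        ("Tag", ctypes.c_uint32),
        ("NameLen", ctypes.c_uint32),
        ("MaxNameLen", ctypes.c_uint32),
        ("Name", ctypes.c_char * 1),
    ]


def layout(struct_type):
    """Return (offset, field name) pairs, like the left column of 'dt'."""
    return [(getattr(struct_type, name).offset, name)
            for name, _ in struct_type._fields_]


if __name__ == "__main__":
    for offset, name in layout(SymbolInfo):
        print("+0x%03x %s" % (offset, name))
```

Note the 4-byte gap between Flags (+0x028) and Value (+0x030): padding is inserted so that the 8-byte Value member is properly aligned. This is exactly the kind of detail that is hard to see in a Watch window and easy to see in the 'dt' output.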
Now we know how to use CDB to solve some interesting debugging problems. It's time to solve one more problem – replace long CDB command lines with easy-to-use batch files. Consider the command we used as a sample at the beginning of the article:
cdb -pv -pn myapp.exe -logo out.txt -c "lm;q"
Most parts of this command are static and do not change. The only variable part is the target information (-pn myapp.exe), where we might need another executable name, or even another way of attaching (for example, by process id).
Here is how this command can be represented in a batch file:
rem lm.bat
cdb -pv %1 %2 -logo out.txt -c "lm;q"
If we want to run this batch file to get the list of modules loaded by a process, we can use either one of the following commands:
Attach by executable name:
lm -pn myapp.exe
Attach by process id:
lm -p 1234
Attach by service name:
lm -psn MyService
Open a crash dump file:
lm -z c:/myapp.dmp
Regardless of the target specified, the command still does the same – prints the list of loaded modules.
If we need to specify additional parameters to a CDB command, we can do it using the same approach. Consider the following command, which can be used to display the layout of a data structure:
cdb -pv -pn myapp.exe -logo out.txt -c "dt /b MyStruct;q"
Of course, we want to use this command with any data type, not only with MyStruct. Here is how we can do it:
rem dt.bat
cdb -pv %1 %2 -logo out.txt -c "dt /b %3;q"
Now we can run the command like this:
dt -pn myapp.exe CTestClass
Or like this:
dt -p 1234 SYMBOL_INFO
Or, for example, like this:
dt -z c:/myapp.dmp EXCEPTION_POINTERS
We can use the same approach with many other commands. Here you can find the list of batch files that wrap the commands discussed in this article. In the future, I am going to extend it with other useful commands.
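If a scripting language is more convenient than batch files, the same idea carries over directly. The following Python sketch is not part of the batch file set; the function names are hypothetical, and it uses only the CDB options already shown in this article (-pv, -logo, -c and the target options).

```python
import subprocess


def cdb_args(target, commands, log="out.txt"):
    """Build the argument list for a noninvasive CDB run.

    target   -- e.g. ["-pn", "myapp.exe"], ["-p", "1234"] or ["-z", "c:/myapp.dmp"]
    commands -- debugger commands to run before quitting
    """
    script = ";".join(list(commands) + ["q"])
    return ["cdb", "-pv"] + list(target) + ["-logo", log, "-c", script]


def run_cdb(target, commands):
    """Run CDB with the given target and commands (requires cdb.exe on PATH)."""
    return subprocess.call(cdb_args(target, commands))
```

For example, run_cdb(["-pn", "myapp.exe"], ["lm"]) corresponds to "lm -pn myapp.exe", and run_cdb(["-z", "c:/myapp.dmp"], ["dt /b EXCEPTION_POINTERS"]) corresponds to the last dt.bat example above.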
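WinDbg the easy way (Chinese translation)
2008-10-08 09:43
Abridged translation of "WinDbg the easy way" by Oleg Starodumov
Source code download: http://files.cnblogs.com/itrust/WindbgEasyWayDemo.rar
What is the best debugger? The answer has to be Visual Studio + WinDbg. Visual Studio is intuitive and simple; WinDbg is powerful and complex. When Visual Studio leaves you stuck during a debugging session, it is time to consider WinDbg. But WinDbg is a specialist's tool, and getting started with it is hard. Is there an easier way? Consider starting with CDB, WinDbg's lightweight console sibling. CDB and WinDbg share the same commands, so once you are familiar with CDB, WinDbg is easy to pick up.
1 Introduction
1.1 Preparing the environment
First, configure the location of symbol files through the _NT_SYMBOL_PATH environment variable. You can download the symbols from Microsoft's website, or specify the symbol server address directly and let CDB and WinDbg fetch symbols when they need them. The two settings look like this, respectively:
set _NT_SYMBOL_PATH=D:/debug/symbols;D:/debug/WindowsXP-KB835935-SP2-symbols
set _NT_SYMBOL_PATH=srv*c:/symbols*http://msdl.microsoft.com/download/symbols
Note: besides the location of the operating system symbol files, you also need to specify the location of your own program's debug information (*.pdb files), such as D:/debug/symbols in the example above.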
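1.2 Basic CDB command line usage
Option | Description | Example |
-p Pid | Tells CDB to attach to a process by its process id | cdb -p 1034 |
-pn ExeName | Tells CDB to attach to a process by the name of its executable. If several processes with the same name are running, this option cannot be used (CDB reports an error) | cdb -pn myapp.exe |
-psn ServiceName | Tells CDB to attach to the process of a Windows service by the service name | cdb -psn MyService |
1.3 Summary of command lines
Here are the command lines used in this article; they are also the main ways of using CDB.
Attach to a process noninvasively by process id, execute some commands (command1;command2;...;commandN), then quit (q), writing the output to a log file:
cdb -pv -p <processid> -logo out.txt -lines -c "command1;command2;...;commandN;q"
Open a dump file, execute some commands, and write the output to a log file:
cdb -z <dumpfile> -logo out.txt -lines -c "command1;command2;...;commandN;q"
A word about noninvasive mode: it can be understood as attaching to the process without interfering with it. Of course, in this mode the debugger cannot control the execution of the program.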
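2 Practical examples
2.1 Debugging deadlocks
The following shows how to find a deadlock with CDB. Prepare as follows:
1. Compile DeadLockDemo.cpp
2. Run the resulting executable; the program deadlocks immediately
First we use the ~*kb command (which displays the call stacks of all threads) to see which threads are running:
cdb -pv -pn myapp.exe -logo out.txt -lines -c "~*kb;q"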
. 0 Id: 6fc.4fc Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr Args to Child
0012fdf8 7c90d85c 7c8023ed 00000000 0012fe2c ntdll!KiFastSystemCallRet
0012fdfc 7c8023ed 00000000 0012fe2c 0012ff54 ntdll!NtDelayExecution+0xc
0012fe54 7c802451 0036ee80 00000000 0012ff54 kernel32!SleepEx+0x61
0012fe64 004308a9 0036ee80 a0f63080 01c63442 kernel32!Sleep+0xf
0012ff54 00432342 00000001 003336e8 003337c8 DeadLockDemo!wmain+0xd9 [c:/tests/deadlockdemo/deadlockdemo.cpp @ 154]
0012ffb8 004320fd 0012fff0 7c816d4f a0f63080 DeadLockDemo!__tmainCRTStartup+0x232 [f:/rtm/vctools/crt_bld/self_x86/crt/src/crt0.c @ 318]
0012ffc0 7c816d4f a0f63080 01c63442 7ffdd000 DeadLockDemo!wmainCRTStartup+0xd [f:/rtm/vctools/crt_bld/self_x86/crt/src/crt0.c @ 187]
0012fff0 00000000 0042e5aa 00000000 78746341 kernel32!BaseProcessStart+0x23
1 Id: 6fc.3d8 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr Args to Child
005afc14 7c90e9c0 7c91901b 000007d4 00000000 ntdll!KiFastSystemCallRet
005afc18 7c91901b 000007d4 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc
005afca0 7c90104b 004a0638 00430b7f 004a0638 ntdll!RtlpWaitForCriticalSection+0x132
005afca8 00430b7f 004a0638 005afe6c 005afe78 ntdll!RtlEnterCriticalSection+0x46
005afd8c 00430b15 005aff60 005afe78 003330a0 DeadLockDemo!CCriticalSection::Lock+0x2f [c:/tests/deadlockdemo/deadlockdemo.cpp @ 62]
005afe6c 004309f1 004a0638 f3d065d5 00334fc8 DeadLockDemo!CCritSecLock::CCritSecLock+0x35 [c:/tests/deadlockdemo/deadlockdemo.cpp @ 90]
005aff6c 004311b1 00000000 f3d06511 00334fc8 DeadLockDemo!ThreadOne+0xa1 [c:/tests/deadlockdemo/deadlockdemo.cpp @ 182]
005affa8 00431122 00000000 005affec 7c80b50b DeadLockDemo!_callthreadstartex+0x51 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 348]
005affb4 7c80b50b 003330a0 00334fc8 00330001 DeadLockDemo!_threadstartex+0xa2 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 331]
005affec 00000000 00431080 003330a0 00000000 kernel32!BaseThreadStart+0x37
2 Id: 6fc.284 Suspend: 1 Teb: 7ffdc000 Unfrozen
ChildEBP RetAddr Args to Child
006afc14 7c90e9c0 7c91901b 000007d8 00000000 ntdll!KiFastSystemCallRet
006afc18 7c91901b 000007d8 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc
006afca0 7c90104b 004a0620 00430b7f 004a0620 ntdll!RtlpWaitForCriticalSection+0x132
006afca8 00430b7f 004a0620 006afe6c 006afe78 ntdll!RtlEnterCriticalSection+0x46
006afd8c 00430b15 006aff60 006afe78 003332e0 DeadLockDemo!CCriticalSection::Lock+0x2f [c:/tests/deadlockdemo/deadlockdemo.cpp @ 62]
006afe6c 00430d11 004a0620 f3e065d5 00334fc8 DeadLockDemo!CCritSecLock::CCritSecLock+0x35 [c:/tests/deadlockdemo/deadlockdemo.cpp @ 90]
006aff6c 004311b1 00000000 f3e06511 00334fc8 DeadLockDemo!ThreadTwo+0xa1 [c:/tests/deadlockdemo/deadlockdemo.cpp @ 202]
006affa8 00431122 00000000 006affec 7c80b50b DeadLockDemo!_callthreadstartex+0x51 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 348]
006affb4 7c80b50b 003332e0 00334fc8 00330001 DeadLockDemo!_threadstartex+0xa2 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 331]
006affec 00000000 00431080 003332e0 00000000 kernel32!BaseThreadStart+0x37
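We can see three threads: the main thread 4fc and the worker threads 3d8 and 284. Both worker threads are inside RtlpWaitForCriticalSection, waiting for a synchronization object to become available.
Next, let's look at the list of locks:
cdb -pv -pn myapp.exe -logo out.txt -lines -c "!locks;q"
CritSec DeadLockDemo!CritSecOne+0 at 004A0620
LockCount 1
RecursionCount 1
OwningThread 3d8
EntryCount 1
ContentionCount 1
*** Locked
CritSec DeadLockDemo!CritSecTwo+0 at 004A0638
LockCount 1
RecursionCount 1
OwningThread 284
EntryCount 1
ContentionCount 1
*** Locked
The problem is now clear: while threads 3d8 and 284 are each waiting (in ZwWaitForSingleObject) for a synchronization object to become available, each of them also holds a locked synchronization object itself. Each waits for the other, which is a deadlock.
This is a simple example; real applications are more complicated, but the basic method stays the same. First find a blocked thread, use kb to find the synchronization object it is waiting for, then use !locks to find the thread that owns that object. Repeat these steps and see whether the chain eventually leads back to the first thread.
If the application uses more complex synchronization objects (such as mutexes), debugging is more involved; that will be discussed in a later article.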
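The procedure of following the chain from a waiting thread to the lock owner and back can be written down as a small cycle search. The Python sketch below is only an illustration with hypothetical names; it assumes the two tables have already been filled in by hand from the ~*kb output (which critical section each thread waits for) and the !locks output (which thread owns each critical section).

```python
def find_deadlock(waits_for, owned_by):
    """Follow thread -> lock -> owner links; return the cycle of threads, if any.

    waits_for -- {thread id: address of the critical section it waits for}
    owned_by  -- {critical section address: id of the owning thread}
    """
    for start in waits_for:
        chain = [start]
        current = start
        while current in waits_for:
            owner = owned_by.get(waits_for[current])
            if owner is None:
                break  # nobody owns the lock, so this chain is not stuck
            if owner in chain:
                # We came back to a thread already on the chain: a cycle.
                return chain[chain.index(owner):]
            chain.append(owner)
            current = owner
    return None
```

With the data from the example (thread 3d8 waits for 004a0638, owned by 284; thread 284 waits for 004a0620, owned by 3d8), the function returns the cycle of the two threads.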
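2.2 Debugging high CPU usage
To find the threads that consume the most CPU time:
cdb -pv -pn myapp.exe -logo out.txt -c "!runaway;q"
0:000> !runaway
User Mode Time
Thread Time
1:358 0 days 0:00:47.408
2:150 0 days 0:00:03.495
0:d8 0 days 0:00:00.000
The times shown are the total time each thread has consumed since it was created, so we cannot conclude that thread 358 is the one consuming the most CPU right now. Run the command again and look at the time increments:
0:000> !runaway
User Mode Time
Thread Time
1:358 0 days 0:00:47.408
2:150 0 days 0:00:06.859
0:d8 0 days 0:00:00.000
After repeating this several times, we can see that the thread currently consuming the most CPU is thread 150.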
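Comparing two !runaway snapshots by eye works for three threads but gets tedious for many. The Python sketch below computes the increments; it is an illustration with hypothetical function names and assumes the "index:id N days h:mm:ss.mmm" row format shown above, with a three-digit millisecond part.

```python
import re

# One row of '!runaway' output: "<index>:<thread id> <d> days <h>:<mm>:<ss>.<mmm>"
ROW_RE = re.compile(r"(\d+):([0-9a-f]+)\s+(\d+) days (\d+):(\d+):(\d+)\.(\d+)")


def parse_runaway(text):
    """Return {thread id: user-mode time in seconds} from !runaway output."""
    times = {}
    for m in ROW_RE.finditer(text):
        days, hours, minutes, seconds, millis = (int(g) for g in m.groups()[2:])
        total = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
        times[m.group(2)] = total + millis / 1000.0
    return times


def busiest_thread(first, second):
    """Return the id of the thread whose time grew most between two snapshots."""
    deltas = {tid: second[tid] - first[tid] for tid in second if tid in first}
    return max(deltas, key=deltas.get)
```

For the two snapshots above, thread 358 shows no increase while thread 150 gained about 3.4 seconds, so busiest_thread reports 150.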
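2.3 Debugging stack overflows
Generally speaking, a stack overflow is caused by poorly controlled nesting of function calls. The IDE handles such stack overflows well. But sometimes we have taken care to limit call nesting and stack overflows still occur occasionally. Why? Because some functions, under particular conditions, use too much stack space. So we need to know how much stack space each function on the call stack uses, and the IDE offers no simple way to find that out.
Procedure:
1. Compile StackOvfDemo.cpp in the Debug configuration (for this sample, the Release build does not show the individual function frames)
2. Run it under the debugger in Visual C++
3. Once the exception is caught by the Visual C++ debugger, run the following command line:
cdb -pv -pn stackovfdemo.exe -logo out.txt -c "~*kf;q"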
. 0 Id: 210.3a8 Suspend: 1 Teb: 7ffde000 Unfrozen
Memory ChildEBP RetAddr
00033440 0041aca5 StackOvfDemo!_woutput+0x22
44 00033484 00415eed StackOvfDemo!wprintf+0x85
d8 0003355c 00415cc5 StackOvfDemo!ProcessStringW+0x2d
fc878 0012fdd4 00415a44 StackOvfDemo!ProcessStrings+0xe5
108 0012fedc 0041c043 StackOvfDemo!main+0x64
e4 0012ffc0 7c4e87f5 StackOvfDemo!mainCRTStartup+0x183
30 0012fff0 00000000 KERNEL32!BaseProcessStart+0x3d
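As we can see, the ProcessStrings function uses a large amount of stack memory and is the most likely culprit of the stack overflow.
With this sample you may wonder how ProcessStrings can take up so much stack memory. The reason lies in the ATL macro A2W: A2W calls _alloca to allocate memory on the stack, and that memory is released only when ProcessStrings returns and its stack frame is cleaned up. Therefore, avoid calling A2W in a loop.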
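The Memory column printed by 'kf' is simply the distance between the ChildEBP values of adjacent frames, so it can be recomputed from an ordinary stack listing. The Python sketch below shows the arithmetic; the function name is hypothetical, and the frame list is assumed to have been copied by hand from the output above (innermost frame first).

```python
def frame_sizes(frames):
    """frames: list of (ChildEBP, symbol), innermost first, as printed by 'kf'.

    Returns (bytes, symbol) pairs; the bytes are the distance from the
    previous (inner) frame's ChildEBP, which is what the Memory column shows.
    """
    sizes = []
    for (inner_ebp, _), (ebp, symbol) in zip(frames, frames[1:]):
        sizes.append((ebp - inner_ebp, symbol))
    return sizes
```

Taking max() of the result for the frames above gives 0xfc878 bytes for ProcessStrings, the same value that 'kf' prints on that line.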
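2.4 Creating dump files
What if a program ships with an unknown bug? How do we debug it? In such cases we need to create a dump file and analyze the problem from it. Dump files can be created with CDB, Dr. Watson, the dbghelp API, WinDbg, the Windows XP Task Manager, and other tools.
With CDB:
cdb -pv -pn myapp.exe -c ".dump /m c:/myapp.dmp;q"
Option | Description | Example |
/m | Default option. Creates a standard minidump. The dump file is usually small, which makes it easy to send by e-mail or otherwise over the network, but it contains little information: only system information, loaded module (DLL) information, process information and thread information. | .dump /m c:/myapp.dmp |
/ma | Minidump with as many options as possible (complete memory contents, handles, unloaded modules, and so on). The file is very large; suitable for local debugging. | .dump /ma c:/myapp.dmp |
/mFhutwd | Minidump with data segments, non-shared read/write memory pages and other useful information. Collects as much information as a minidump can hold short of the complete memory contents. | .dump /mFhutwd c:/myapp.dm |
If you create a dump of a process that is being debugged in the IDE, remember to disable all breakpoints first. Otherwise the dump will contain all the breakpoint instructions (int 3).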
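2.5 Analyzing dump files
When analyzing a dump file, we usually want the following information:
- the place where the exception occurred (address, source file and line number)
- the call stack at the moment of the exception
- the values of function parameters and local variables on the call stack
WinDbg and CDB both provide a powerful command for analyzing dump files, !analyze -v:
cdb -z c:/myapp.dmp -logo out.txt -lines -c "!analyze -v;q"
CrashDemo.cpp demonstrates how to implement a custom filter that creates a dump file for an exception using the dbghelp API. A more complete wrapper around the dbghelp API will be discussed in other articles by me (the translator).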
0:001> !analyze -v
*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************
FAULTING_IP:
CrashDemo!TestFunc+2e [c:/tests/crashdemo/crashdemo.cpp @ 124]
004309de c70000000000 mov dword ptr [eax],0x0
EXCEPTION_RECORD: ffffffff -- (.exr ffffffffffffffff)
.exr ffffffffffffffff
ExceptionAddress: 004309de (CrashDemo!TestFunc+0x0000002e)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000001
Parameter[1]: 00000000
Attempt to write to address 00000000
DEFAULT_BUCKET_ID: APPLICATION_FAULT
PROCESS_NAME: CrashDemo.exe
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
WRITE_ADDRESS: 00000000
BUGCHECK_STR: ACCESS_VIOLATION
LAST_CONTROL_TRANSFER: from 0043096e to 004309de
STACK_TEXT:
006afe88 0043096e 00000000 00354130 00350001 CrashDemo!TestFunc+0x2e [c:/tests/crashdemo/crashdemo.cpp @ 124]
006aff6c 00430f31 00000000 52319518 00354130 CrashDemo!WorkerThread+0x5e [c:/tests/crashdemo/crashdemo.cpp @ 115]
006affa8 00430ea2 00000000 006affec 7c80b50b CrashDemo!_callthreadstartex+0x51 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 348]
006affb4 7c80b50b 00355188 00354130 00350001 CrashDemo!_threadstartex+0xa2 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 331]
006affec 00000000 00430e00 00355188 00000000 kernel32!BaseThreadStart+0x37
FOLLOWUP_IP:
CrashDemo!TestFunc+2e [c:/tests/crashdemo/crashdemo.cpp @ 124]
004309de c70000000000 mov dword ptr [eax],0x0
SYMBOL_STACK_INDEX: 0
FOLLOWUP_NAME: MachineOwner
SYMBOL_NAME: CrashDemo!TestFunc+2e
MODULE_NAME: CrashDemo
IMAGE_NAME: CrashDemo.exe
DEBUG_FLR_IMAGE_TIMESTAMP: 43dc6ee7
STACK_COMMAND: .ecxr ; kb
FAILURE_BUCKET_ID: ACCESS_VIOLATION_CrashDemo!TestFunc+2e
BUCKET_ID: ACCESS_VIOLATION_CrashDemo!TestFunc+2e
Followup: MachineOwner
Pay attention to three parts of this output: the address of the exception (FAULTING_IP and EXCEPTION_RECORD), the call stack (STACK_TEXT), and the commands for further analysis of the exception, .ecxr and kb (STACK_COMMAND).
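With .ecxr we switch to the context stored with the exception information, so that we can access the call stack and the values of local variables as they were when the exception occurred. We can then use the dv command to display the values of function parameters and local variables.
cdb -z c:/myapp.dmp -logo out.txt -lines -c "!analyze -v;.ecxr;!for_each_frame dv /t;q"
The /t option tells dv to display type information for the variables. Sample output: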
00 006afe88 0043096e CrashDemo!TestFunc+0x2e [c:/tests/crashdemo/crashdemo.cpp @ 124]
int * pParam = 0x00000000
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
01 006aff6c 00430f31 CrashDemo!WorkerThread+0x5e [c:/tests/crashdemo/crashdemo.cpp @ 115]
void * lpParam = 0x00000000
int * TempPtr = 0x00000000
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
02 006affa8 00430ea2 CrashDemo!_callthreadstartex+0x51 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 348]
struct _tiddata * ptd = 0x00355188
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
03 006affb4 7c80b50b CrashDemo!_threadstartex+0xa2 [f:/rtm/vctools/crt_bld/self_x86/crt/src/threadex.c @ 331]
void * ptd = 0x00355188
struct _tiddata * _ptd = 0x00000000
2.6 Analyzing virtual memory
The following command displays the complete virtual memory map of the process:
cdb -pv -pn myapp.exe -logo out.txt -c "!vadump -v;q"
BaseAddress: 00040000
AllocationBase: 00040000
AllocationProtect: 00000004 PAGE_READWRITE
RegionSize: 0002e000
State: 00002000 MEM_RESERVE
Type: 00020000 MEM_PRIVATE
BaseAddress: 0006e000
AllocationBase: 00040000
AllocationProtect: 00000004 PAGE_READWRITE
RegionSize: 00001000
State: 00001000 MEM_COMMIT
Protect: 00000104 PAGE_READWRITE + PAGE_GUARD
Type: 00020000 MEM_PRIVATE
BaseAddress: 0006f000
AllocationBase: 00040000
AllocationProtect: 00000004 PAGE_READWRITE
RegionSize: 00011000
State: 00001000 MEM_COMMIT
Protect: 00000004 PAGE_READWRITE
Type: 00020000 MEM_PRIVATE
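On Windows XP and Windows Server 2003 there is a better command, !address, which can perform the following tasks:
* Display the complete virtual memory map of the process, in a more readable format than !vadump
* Display statistics about virtual memory usage
* Determine which kind of virtual memory region an address belongs to (for example, a stack, a heap or an executable image)
Displaying the virtual memory map with !address: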
cdb -pv -pn myapp.exe -logo out.txt -c "!address;q"
00040000 : 00040000 - 0002e000
           Type 00020000 MEM_PRIVATE
           Protect 00000000
           State 00002000 MEM_RESERVE
           Usage RegionUsageStack
           Pid.Tid 658.644
           0006e000 - 00001000
           Type 00020000 MEM_PRIVATE
           Protect 00000104 PAGE_READWRITE | PAGE_GUARD
           State 00001000 MEM_COMMIT
           Usage RegionUsageStack
           Pid.Tid 658.644
           0006f000 - 00011000
           Type 00020000 MEM_PRIVATE
           Protect 00000004 PAGE_READWRITE
           State 00001000 MEM_COMMIT
           Usage RegionUsageStack
           Pid.Tid 658.644
The command also displays statistics about virtual memory usage:
-------------------- Usage SUMMARY --------------------------
TotSize Pct(Tots) Pct(Busy) Usage
00838000 : 0.40% 27.96% : RegionUsageIsVAD
7e28c000 : 98.56% 0.00% : RegionUsageFree
01348000 : 0.94% 65.60% : RegionUsageImage
00040000 : 0.01% 0.85% : RegionUsageStack
00001000 : 0.00% 0.01% : RegionUsageTeb
001a0000 : 0.08% 5.53% : RegionUsageHeap
00000000 : 0.00% 0.00% : RegionUsagePageHeap
00001000 : 0.00% 0.01% : RegionUsagePeb
00001000 : 0.00% 0.01% : RegionUsageProcessParametrs
00001000 : 0.00% 0.01% : RegionUsageEnvironmentBlock
Tot: 7fff0000 Busy: 01d64000
-------------------- Type SUMMARY --------------------------
TotSize Pct(Tots) Usage
7e28c000 : 98.56% : <free>
01348000 : 0.94% : MEM_IMAGE
007b6000 : 0.38% : MEM_MAPPED
00266000 : 0.12% : MEM_PRIVATE
-------------------- State SUMMARY --------------------------
TotSize Pct(Tots) Usage
01647000 : 1.09% : MEM_COMMIT
7e28c000 : 98.56% : MEM_FREE
0071d000 : 0.35% : MEM_RESERVE
Largest free region: Base 01014000 - Size 59d5c000
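The memory usage statistics are useful when there is a memory leak: they help determine whether it is stack, heap, or other virtual memory that is leaking. The "Largest free region" value is also helpful when developing applications that consume large amounts of memory.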
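The columns of the usage summary can be reproduced with simple arithmetic, which is handy when comparing two logs taken at different times. The sketch below assumes that "Busy" is the total minus the free space and that Pct(Busy) is each size relative to that busy total; the figures in the sample output are consistent with this reading. The summarize helper is hypothetical.

```python
# Sizes copied from the "Usage SUMMARY" in the sample output (hexadecimal bytes).
USAGE_SUMMARY = {
    "RegionUsageIsVAD": 0x00838000,
    "RegionUsageFree": 0x7e28c000,
    "RegionUsageImage": 0x01348000,
    "RegionUsageStack": 0x00040000,
    "RegionUsageTeb": 0x00001000,
    "RegionUsageHeap": 0x001a0000,
    "RegionUsagePageHeap": 0x00000000,
    "RegionUsagePeb": 0x00001000,
    "RegionUsageProcessParametrs": 0x00001000,  # spelled as in the debugger output
    "RegionUsageEnvironmentBlock": 0x00001000,
}


def summarize(usage, free_key="RegionUsageFree"):
    """Return (total, busy, rows); each row is (name, size, pct_of_total, pct_of_busy)."""
    total = sum(usage.values())
    busy = total - usage[free_key]
    rows = []
    for name, size in usage.items():
        pct_total = 100.0 * size / total
        # Free space is not part of the busy total, so its busy share is 0.
        pct_busy = 0.0 if name == free_key else 100.0 * size / busy
        rows.append((name, size, pct_total, pct_busy))
    return total, busy, rows


total, busy, rows = summarize(USAGE_SUMMARY)
for name, size, pct_total, pct_busy in rows:
    print(f"{size:08x} : {pct_total:6.2f}% {pct_busy:6.2f}% : {name}")
print(f"Tot: {total:08x} Busy: {busy:08x}")
```

Running the same calculation on logs captured before and after a suspected leak shows which usage category is growing.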
Determining which virtual memory region an address belongs to:
0:000> !address 0x000a2480;q
000a0000 : 000a0000 - 000d7000
Type 00020000 MEM_PRIVATE
Protect 00000004 PAGE_READWRITE
State 00001000 MEM_COMMIT
Usage RegionUsageHeap
Handle 000a0000
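Conceptually, this lookup is a range check against the memory map. The sketch below assumes that each "base - size" pair in the !address output gives a region's start address and its length (this matches the RegionSize values reported by !vadump for the same regions); the find_region helper and the REGIONS table are only illustrative.

```python
# (base, size, usage) tuples taken from the sample !address output.
REGIONS = [
    (0x00040000, 0x0002e000, "RegionUsageStack"),
    (0x0006e000, 0x00001000, "RegionUsageStack"),
    (0x0006f000, 0x00011000, "RegionUsageStack"),
    (0x000a0000, 0x000d7000, "RegionUsageHeap"),
]


def find_region(address, regions):
    """Return the (base, size, usage) tuple containing address, or None."""
    for base, size, usage in regions:
        if base <= address < base + size:
            return base, size, usage
    return None


hit = find_region(0x000a2480, REGIONS)
if hit is not None:
    base, size, usage = hit
    print(f"{base:08x} - {size:08x} {usage}")
```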
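2.7 Searching for symbols
Sometimes we need to find the address of a function or variable by its name, and CDB can help with that.
For example, to locate the UnhandledExceptionFilter function in the kernel32 module: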
0:000> x kernel32!UnhandledExceptionFilter;q
7c862b8a kernel32!UnhandledExceptionFilter = <no type information>
You can also search with wildcards, for example:
0:000> x myapp!*CMainFrame*
004542f8 MyApp!CMainFrame::classCMainFrame = struct CRuntimeClass
00401100 MyApp!CMainFrame::`scalar deleting destructor' (void)
00401090 MyApp!CMainFrame::CMainFrame (void)
You can also go the other way and get the name from an address with the ln command, for example:
0:000> ln 0x77d491c8;q
(77d491c6) USER32!GetMessageW+0x2 | (77d49216) USER32!CharUpperBuffW
Note: the address you pass does not have to be the start address of the symbol.
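The idea behind ln ("list nearest") can be illustrated with a sorted symbol table and a binary search: find the closest symbol at or below the address and report the offset from it. This is only a sketch of the concept, not how the debugger is implemented; the SYMBOLS table holds just the two entries from the sample output, and describe_address is a hypothetical helper.

```python
import bisect

# Symbol start addresses sorted in ascending order (from the sample ln output).
SYMBOLS = [
    (0x77d491c6, "USER32!GetMessageW"),
    (0x77d49216, "USER32!CharUpperBuffW"),
]


def describe_address(address, symbols):
    """Return 'symbol+0xoffset' for the nearest symbol at or below address."""
    starts = [start for start, _name in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None  # address lies below the first known symbol
    start, name = symbols[index]
    offset = address - start
    return name if offset == 0 else f"{name}+{offset:#x}"


print(describe_address(0x77d491c8, SYMBOLS))
```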
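2.8 Displaying data structures
The Watch window in Visual Studio can show the contents of data structures, but CDB and WinDbg can show more, including member offsets and layout. This is done with the dt command, for example: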
cdb -pv -pn myapp.exe -logo out.txt -c "dt -b CSymbolInfoPackage;q"
0:000> dt /b CSymbolInfoPackage;q
+0x000 si : _SYMBOL_INFO
+0x000 SizeOfStruct : Uint4B
+0x004 TypeIndex : Uint4B
+0x04c NameLen : Uint4B
+0x050 MaxNameLen : Uint4B
+0x054 Name : Char
+0x058 name : Char
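The offsets that dt prints follow from ordinary alignment rules, so they can be reproduced by hand. The sketch below assumes the usual rule that each member is aligned to its own element size and the structure is padded to its largest alignment. The member list is the SYMBOL_INFO layout as I recall it from dbghelp.h; the members that do not appear in the output above (Index, Size, ModBase, Flags, Value, Register, Scope, Tag) are an assumption and should be checked against your own headers.

```python
# (member name, element size in bytes, element count); assumed SYMBOL_INFO layout.
SYMBOL_INFO_FIELDS = [
    ("SizeOfStruct", 4, 1),
    ("TypeIndex", 4, 1),
    ("Reserved", 8, 2),
    ("Index", 4, 1),       # assumption: not shown in the dt excerpt
    ("Size", 4, 1),        # assumption
    ("ModBase", 8, 1),     # assumption
    ("Flags", 4, 1),       # assumption
    ("Value", 8, 1),       # assumption
    ("Address", 8, 1),
    ("Register", 4, 1),    # assumption
    ("Scope", 4, 1),       # assumption
    ("Tag", 4, 1),         # assumption
    ("NameLen", 4, 1),
    ("MaxNameLen", 4, 1),
    ("Name", 1, 1),
]


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def compute_layout(fields):
    """Return ({member: offset}, total_size) using natural alignment."""
    offsets = {}
    offset = 0
    max_alignment = 1
    for name, element_size, count in fields:
        offset = align_up(offset, element_size)
        offsets[name] = offset
        offset += element_size * count
        max_alignment = max(max_alignment, element_size)
    return offsets, align_up(offset, max_alignment)


offsets, total_size = compute_layout(SYMBOL_INFO_FIELDS)
for name, offset in offsets.items():
    print(f"+{offset:#05x} {name}")
print(f"size: {total_size:#x}")
```

Under these assumptions the computed offsets agree with the dt output (Address at +0x038, NameLen at +0x04c, Name at +0x054), and the total size of 0x58 explains why the next member of CSymbolInfoPackage, name, starts at +0x058.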
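You can also display the layout and contents of a particular instance of a data structure by passing the address of the variable, for example: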
cdb -pv -pn myapp.exe -logo out.txt -c "dt -b CSymbolInfoPackage 0x0012f6d0;q"
0:000> dt /b CSymbolInfoPackage 0x0012f6d0;q
+0x000 si : _SYMBOL_INFO
+0x000 SizeOfStruct : 0x58
+0x004 TypeIndex : 2
+0x008 Reserved :
[00] 0
[01] 0
+0x038 Address : 0x411d30
[00] 83 'S'
+0x058 name : "SymbolInfo"
[00] 83 'S'
[01] 121 'y'
[02] 109 'm'
[03] 98 'b'
[04] 111 'o'
[05] 108 'l'
[06] 73 'I'
[07] 110 'n'
[17] 0 ''
...
[1998] -52 ''
[1999] -52 ''
[2000] -52 ''