原文地址:http://mtaulty.com/communityserver/blogs/mike_taultys_blog/archive/2004/08/03/4671.aspx
Continuing on from theprevious post on using WinDbg let’s take what we learnt in that previous post and apply it to managed code.
WinDbg supports the debugging of managed code through an extension namedSOS.DLL. This is named for esoteric reasons that I’ll not get into here but you can find out where that name came from by Googling the web for it.
This DLL is shipped with the latest debugger so you’ll find a version of it which matches V1.1 of the .NET framework in the debugger’s installation folder and you’ll also find a sub-folder labelled CLR10 which contains the previous version of the DLL.
You interact with the SOS.DLL extension by loading it up into WinDbg as you load any other extension. So, if I want to make use of SOS.DLL I issue;
.load SOS.DLL
And the debugger loads the extension for me and I can see it in the results of a;
.chain
command and I can set it as my default extension by using;
.setdll SOS.DLL
Before we go into SOS.DLL in more depth I’ll add a quick word about the symbols for the .NET framework. If you have the .NET framework SDK installed on your machine then you should have symbols for DLLs such asmscorwks.dll and mscoree.dll which make up the CLR installed onto your machine. On my machine (with VS.NET 2003 installed) these symbol files exist in a folder named;
C:/program files/Microsoft Visual Studio.NET 2003/SDK/v1.1/Symbols
So, this means that you can add this folder to your symbol path (see the previous post) and have WinDbg pick up symbols for the CLR bits for you locally. Alternatively, these symbols will come down from the symbol server anyway but it might save you some download time.
So, let’s consider that straight away what that adds. If I have a piece of managed code such as the following;
using System;
using System.Threading;
namespace ConsoleApplication12
{
class BadClass
{
public BadClass()
{
}
~BadClass()
{
Thread.Sleep(Timeout.Infinite);
}
}
class EntryPoint
{
staticvoid Main(string[] args)
{
BadClass c1 =new BadClass();
BadClass c2 =new BadClass();
System.GC.Collect();
System.GC.Collect();
System.GC.WaitForPendingFinalizers();
Console.WriteLine("Done");
Console.ReadLine();
}
}
}
Then I’ve created a bit of a monster in that what you should find is that this piece of code probably runs to completion if you run it indebug mode but it’ll never complete if you run it in release mode. This is because in release mode we madec1 and c2 eligible for garbage collection at an earlier point and that allows this code to run the finalizer forBadClass and that finalizer blocks the finalizer thread.
How to debug this with WinDbg? Run the process in release mode, attach WinDbg and have a look at the stack traces with a;
~* kb 200
command. I can see straight away from this that one of my threads looks like this;
2 Id: 7b8.a08 Suspend: 1 Teb: 7ffdd000 Unfrozen
ChildEBP RetAddr Args to Child
06bffd50 7c90fb74 7c8023ae 00000001 06bffd84 ntdll!KiFastSystemCallRet
06bffd54 7c8023ae 00000001 06bffd84 00000000 ntdll!NtDelayExecution+0xc
06bffdac 7929703b ffffffff 00000001 ffffffff KERNEL32!SleepEx+0x61
06bffdcc 792e42d5 ffffffff 06bffe04 06bffe34 mscorwks!Thread::UserSleep+0x93
06bffddc 0095a492 06bffde8 ffffffff 06bffe3c mscorwks!ThreadNative::Sleep+0x30
WARNING: Frame IP not in any known module. Following frames may be wrong.
06bffe34 791f9118 06bffeb0 7920fc3f ffffffff 0x95a492
06bffec0 791f91dd 04a619a4 00158010 00147880 mscorwks!MethodTable::CallFinalizer+0xee
06bffed4 791e1f08 04a619a4 7c8099f2 00000000 mscorwks!CallFinalizer+0x84
06bfff6c 791bbed0 04a619a4 00000000 0012fb9c mscorwks!CallFinalizer+0x245
06bfffb4 7c812ccb 00000000 0012fb9c 003e0000 mscorwks!GCHeap::FinalizerThreadStart+0xc2
06bfffec 00000000 791d0020 00000000 00000000 KERNEL32!BaseThreadStart+0x37
So, I can see that this is myfinalizer thread and somewhere in the call chain I can see that this thread is callingSleep which sounds like a bad idea to me for a finalizer method.
Instantly, you get a good picture of what might be going on here. As an aside, it’s worth taking a look around themscorwks, mscoree and friends DLLs to have a look at places where you can set breakpoints. You can do this with;
X mscoree!*
X mscorwks!*
So, our example so far has been nice but it doesn’t directly interact with managed code. If we want to do that then we need to look to theSOS.DLL and the functionality that it offers. We can see this with!SOS.help which gives;
0:003> !sos.help
SOS : Help
COMState | List COM state for each thread
ClrStack | Provides true managed stack trace, source and line numbers.
Additional parameters: -p[arams] -l[ocals] -r[egs] -a[ll].
DumpClass <addr> | Dump EEClass info
DumpDomain [<addr>] | List assemblies and modules in a domain
DumpHeap [-stat] [-min 100] [-max 2000] [-mt 0x3000000] [-type <partial type name>] [-fix] [start [end]] | Dump GC heap contents
DumpMD <addr> | Dump MethodDesc info
DumpMT [-MD] <addr> | Dump MethodTable info
DumpModule <addr> | Dump EE Module info
DumpObj <addr> | Dump an object on GC heap
DumpStack [-EE] [-smart] [top stack [bottom stack] | -EE only shows managed stack items.
DumpStackObjects [top stack [bottom stack]
DumpVC <mt> <addr> | Dump a value class object
EEHeap [-gc] [-win32] [-loader] | List GC/Loader heap info
EEStack [-short] [-EE] | List all stacks EE knows
EEVersion | List mscoree.dll version
FinalizeQueue [-detail] | Work queue for finalize thread
GCInfo [<MD>] [IP] | Dump GC encoding info for a managed method
GCRoot <addr> | Find roots on stack/handle for object
IP2MD <addr> | Find MethodDesc from IP
Name2EE <module name> <item name> | Find memory address of EE data given a class/method name
ObjSize [<addr>] | Find number of bytes that a root or all roots keep alive on GC heap.
ProcInfo [-env] [-time] [-mem] | Display the process info
RWLock [-all] <addr> | List info for a Read/Write lock
SyncBlk [-all|#] | List syncblock
ThreadPool | Display CLR threadpool state
Threads | List managed threads
Token2EE <module name> <mdToken> | Find memory address of EE data for metadata token
u [<MD>] [IP] | Unassembly a managed code
Looking at “Global State”
So, there are some simpler “global-state” commands in here which show us information about the CLR. These would be things such as;
!ProcInfo
0:003> !ProcInfo
---------------------------------------
Process Times
Process Started at: 2004 Aug 3 12:5:20.76
Kernel CPU time : 0 days 00:00:00.03
User CPU time : 0 days 00:00:00.04
Total CPU time : 0 days 00:00:00.07
---------------------------------------
Process Memory
WorkingSetSize: 7476 KB PeakWorkingSetSize: 7476 KB
VirtualSize: 142832 KB PeakVirtualSize: 142832 KB
PagefileUsage: 2916 KB PeakPagefileUsage: 2916 KB
---------------------------------------
32 percent of memory is in use.
Memory Availability (Numbers in MB)
Total Avail
Physical Memory 2046 1372
Page File 3944 3350
Virtual Memory 2047 1907
!Threads
0:003> !Threads
ThreadCount: 2
UnstartedThread: 0
BackgroundThread: 1
PendingThread: 0
DeadThread: 0
PreEmptive GC Alloc Lock
ID ThreadOBJ State GC Context Domain Count APT Exception
0 cd0 0014cf38 2000020 Enabled 00000000:00000000 00147880 0 Ukn
2 a08 00158010 2001220 Enabled 04a61d94:04a61ff4 00147880 0 Ukn (Finalizer)
!ThreadPool
0:003> !ThreadPool
Worker Thread: Total: 0 Running: 0 Idle: 0 MaxLimit: 0 MinLimit: 0
Work Request in Queue: 0
--------------------------------------
Number of Timers: 0
--------------------------------------
Completion Port Thread: Total: 0 Free: 0 MaxFree: 0 CurrentLimit: 0 MaxLimit: 1000 MinLimit: 0
!COMState
0:003> !COMState
ID TEB APT APTId CallerTID Context
0 cd0 7ffdf000 Ukn
1 e3c 7ffde000 Ukn
2 a08 7ffdd000 Ukn
3 d88 7ffdc000 Ukn
The last one here, COMState, is useful for seeing how your .NET threads have initilaised themselves to COM (i.e. are they STA threads or MTA threads and more detailed info on where they are with respect to COM apartments, contexts, etc).
!FinalizeQueue
0:003> !FinalizeQueue
SyncBlock to be cleaned up: 0
----------------------------------
generation 0 has 0 finalizable objects (0014d330->0014d330)
generation 1 has 0 finalizable objects (0014d330->0014d330)
generation 2 has 0 finalizable objects (0014d330->0014d330)
Ready for finalization 1 objects (0014d330->0014d334)
Statistics:
MT Count TotalSize Class Name
935108 2 24 ConsoleApplication12.BadClass
Total 2 objects
In our scenario you can get a clear picture of what’s present on the queue for finalization in that we have 2 instances ofBadClass resident in there.
!EEVersion
0:003> !EEVersion
1.1.4322.573 retail
Workstation build
Looking at Managed Memory
We can get a picture of the “global” state of memory within the application through thedumpheap command.
The simplest way to usedumpheap is to provide the –stat option which will give a complete list of what’s allocated on the managed heap. Note that for a big process you might need to go and get a cup of tea whilst this list is being created. In our small example dumpheap –stat gives us;
total 57 objects
Statistics:
MT Count TotalSize Class Name
79c077b4 1 20 System.Collections.ArrayList
79bff0a4 1 20 System.AppDomainSetup
79c19424 1 24 System.LocalDataStoreMgr
935108 2 24 ConsoleApplication12.BadClass
79bfcb44 1 32 System.SharedStatics
79c0d0cc 1 52 System.Collections.Hashtable
95236c 1 56 System.Char[]
79bfc414 1 64 System.ExecutionEngineException
79bfc2dc 1 64 System.StackOverflowException
79bfc1a4 1 64 System.OutOfMemoryException
9526b0 1 76 System.Byte[]
79bfdbec 1 80 System.AppDomain
952c28 1 144 System.Collections.Hashtable/bucket[]
952964 1 292 System.Int32[]
79bf9af8 26 1448 System.String
14d0f0 11 2800 Free
95209c 5 6328 System.Object[]
Total 57 objects
So we can see that we have 57 objects here. If you examine the line for our managed typeBadClass you’ll see that it has a MT column for its method table. This effectively describes the type and we can use it to restrict the output ofdumpheap by doing;
!dumpheap –mt 935108
0:003> !dumpheap -mt 935108
Address MT Size
04a61998 00935108 12
04a619a4 00935108 12
total 2 objects
Statistics:
MT Count TotalSize Class Name
935108 2 24 ConsoleApplication12.BadClass
Total 2 objects
Note that this actually gives usaddresses of the instances of our type (2 in our case) which we’ll come back to in a second but, first, how would we get the MethodTable for a particular type without sitting through !dumpheap –stat first? We can use theName2EE function as below;
!Name2EE ConsoleApplication12.exe ConsoleApplication12.BadClass
0:003> !Name2EE ConsoleApplication12.exe ConsoleApplication12.BadClass
--------------------------------------
MethodTable: 00935108
EEClass: 06c2334c
Name: ConsoleApplication12.BadClass
And now we know the MethodTable address for our type which is niceJ Now, coming back to those addresses. These allow us to dump out the objects themselves using the !dumpobj command;
!dumpobj 04a61998
0:003> !dumpobj 04a61998
Name: ConsoleApplication12.BadClass
MethodTable 0x00935108
EEClass 0x06c2334c
Size 12(0xc) bytes
mdToken: 02000002 (t:/Temp/ConsoleApplication12/bin/Release/ConsoleApplication12.exe)
This object isn’t so interesting so I’ve gone back and dumped out a hashtable so you can see what a real object looks like;
0:003> !dumpobj 04a61a90
Name: System.Collections.Hashtable
MethodTable 0x79c0d0cc
EEClass 0x79c0d20c
Size 52(0x34) bytes
mdToken: 020000f9 (c:/windows/microsoft.net/framework/v1.1.4322/mscorlib.dll)
FieldDesc*: 79c0d270
MT Field Offset Type Attr Value Name
79c0d0cc 4000395 4 CLASS instance 04a61d04 buckets
79c0d0cc 4000396 1c System.Int32 instance 0 count
79c0d0cc 4000397 20 System.Int32 instance 0 occupancy
79c0d0cc 4000398 24 System.Int32 instance 7 loadsize
79c0d0cc 4000399 28 System.Single instance 0.720000 loadFactor
79c0d0cc 400039a 2c System.Int32 instance 0 version
79c0d0cc 400039b 8 CLASS instance 00000000 keys
79c0d0cc 400039c c CLASS instance 00000000 values
79c0d0cc 400039d 10 CLASS instance 00000000 _hcp
79c0d0cc 400039e 14 CLASS instance 00000000 _comparer
79c0d0cc 400039f 18 CLASS instance 00000000 m_siInfo
79c0d0cc 4000394 0 CLASS shared static primes
>> Domain:Value 00147880:04a61ac4 <<
So, in essence, we can traverse the entire managed heap from within WinDbg here and we can see the details of every single instance that is stored on that managed heap and we can track down all the members of those instances ad infinitum. This is really, really powerful stuff.
Looking at Stack Frames
What about stack frames? Well, if we look at the original stack trace that I had when I was trying to diagnose my “hung application” problem we can see the frames look like this;
2 Id: 7b8.a08 Suspend: 1 Teb: 7ffdd000 Unfrozen
ChildEBP RetAddr Args to Child
06bffd50 7c90fb74 7c8023ae 00000001 06bffd84 ntdll!KiFastSystemCallRet
06bffd54 7c8023ae 00000001 06bffd84 00000000 ntdll!NtDelayExecution+0xc
06bffdac 7929703b ffffffff 00000001 ffffffff KERNEL32!SleepEx+0x61
06bffdcc 792e42d5 ffffffff 06bffe04 06bffe34 mscorwks!Thread::UserSleep+0x93
06bffddc 0095a492 06bffde8 ffffffff 06bffe3c mscorwks!ThreadNative::Sleep+0x30
WARNING: Frame IP not in any known module. Following frames may be wrong.
06bffe34 791f9118 06bffeb0 7920fc3f ffffffff 0x95a492
06bffec0 791f91dd 04a619a4 00158010 00147880 mscorwks!MethodTable::CallFinalizer+0xee
06bffed4 791e1f08 04a619a4 7c8099f2 00000000 mscorwks!CallFinalizer+0x84
06bfff6c 791bbed0 04a619a4 00000000 0012fb9c mscorwks!CallFinalizer+0x245
06bfffb4 7c812ccb 00000000 0012fb9c 003e0000 mscorwks!GCHeap::FinalizerThreadStart+0xc2
06bfffec 00000000 791d0020 00000000 00000000 KERNEL32!BaseThreadStart+0x37
So, what’s the bit that’s highlighted there? How come WinDbg can’t fathom this stack frame? Essentially what this points to is that in between themscorwks!MethodTable::CallFinalizer function and mscorwks!ThreadNative::Sleep function we have managed code. If we want to see the managed stack frames then we can issue;
!ClrStack
0:002> !ClrStack
Thread 2
ESP EIP
06bffe04 7c911444 [FRAME: ECallMethodFrame] [DEFAULT] Void System.Threading.Thread.Sleep(I4)
06bffe14 06d4010e [DEFAULT] [hasThis] Void ConsoleApplication12.BadClass.Finalize()
at [+0x1e] [+0x6]
which works on the current thread (i.e. the one that the debugger is “focused” on) and dumps out the managed stack – you can see that we now could guess that we have a problem in BadClass.Finalize which we kind of knew all along ;-)
We can also do!ClrStack –params –locals –regs –all to include parameter, local and register information which is going to be really helpful for real-world debugging. For managed objects that we find we can then go and do!dumpobj to take a look at those objects.
Setting BreakPoints
Setting breakpoints isn’t quite as easy for managed code in WinDbg as it might be but it’s far from impossible. What we have to work out to set a breakpoint on a manged function is the address of the code for that function.
For our type,BadClass let’s say that we want to set a breakpoint in its constructor. What we do is;
1) Get the MethodTable address for the type by issuing our !Name2EE command.
a. !Name2EE ConsoleApplication12.exe ConsoleApplication12.BadClass
0:002> !Name2EE ConsoleApplication12.exe ConsoleApplication12.BadClass
--------------------------------------
MethodTable: 00935108
EEClass: 06c2334c
Name: ConsoleApplication12.BadClass
2) We can now get hold of the method table for this class by using the !dumpmt command as below;
a. !dumpmt –md 00935108
0:002> !dumpmt -md 00935108
EEClass : 06c2334c
Module : 0015b610
Name: ConsoleApplication12.BadClass
mdToken: 02000002 (t:/Temp/ConsoleApplication12/bin/Release/ConsoleApplication12.exe)
MethodTable Flags : 180000
Number of IFaces in IFaceMap : 0
Interface Map : 00935148
Slots in VTable : 5
--------------------------------------
MethodDesc Table
Entry MethodDesc JIT Name
79bf84eb 79bf84f0 None [DEFAULT] [hasThis] String System.Object.ToString()
79bf8473 79bf8478 None [DEFAULT] [hasThis] Boolean System.Object.Equals(Object)
79bf848b 79bf8490 None [DEFAULT] [hasThis] I4 System.Object.GetHashCode()
009350fb 00935100 None [DEFAULT] [hasThis] Void ConsoleApplication12.BadClass.Finalize()
009350eb 009350f0 None [DEFAULT] [hasThis] Void ConsoleApplication12.BadClass..ctor()
3) We can see that the address for our constructor here is 009350eb. We can provide this as a breakpoint to WinDbg
a. bp 009350eb
b. and when that breakpoint hits we can issue a step command (“t”) and we’ll find that we’re inside the function that we wanted to break in.
Determining Roots
Last topic for this post – how to determine what is holding on to your CLR memory. I talked about how we can use!dumpheap to take a look at what’s on the managed heap but we can also use the fantastic!gcroot command in order to determine for any managed object what “set of roots” are actually causing that object to remain “alive” rather than be garbage collected by the GC.
This is a particularly useful command if you suspect you’ve got a memory leak somewhere in the sense that something is holding on to a managed object longer than it should be – you can break into the process with WinDbg and take a look at what “roots” are present for your particular object and, hopefully, that’d move you forward in determining what’s going on.
An example of the output here on!gcroot is as below let’s use the following piece of code;
class EntryPoint
{
privatestatic ArrayList al = new ArrayList();
staticvoid Main(string[] args)
{
for (int i = 0; i < 10; ++i)
{
switch (i % 3)
{
case 0:
al.Add("Hello");
break;
case 1:
al.Add(new Exception(""));
break;
case 2:
al.Add(newobject());
break;
}
}
Console.ReadLine();
}
}
So, if I want to know what objects are around at the point of the Console.ReadLine()call then I can use !dumpheap –stat to show me;
total 132 objects
Statistics:
MT Count TotalSize Class Name
79c53524 1 12 System.IO.TextReader/NullTextReader
79c0c46c 1 12 System.__Filters
79c0bfa8 1 12 System.Reflection.Missing
79c03458 1 12 System.IO.Stream/NullStream
79c8d504 1 16 System.IO.TextReader/SyncTextReader
79c58114 1 20 System.Text.CodePageEncoding/Decoder
79c3667c 1 20 System.Text.UTF8Encoding/UTF8Encoder
79c363c4 1 20 System.IO.TextWriter/NullTextWriter
79c077b4 1 20 System.Collections.ArrayList
79c03efc 1 20 System.Text.UTF8Encoding
79bff0a4 1 20 System.AppDomainSetup
79c61b94 2 24 System.Text.Encoding/DefaultEncoder
79bfcb44 1 32 System.SharedStatics
79bf8364 3 36 System.Object
79c36114 2 40 System.Text.CodePageEncoding
79bfb4f4 2 40 System.Text.StringBuilder
79c61cac 2 48 System.IO.TextWriter/SyncTextWriter
79c57f8c 1 56 System.IO.StreamReader/NullStreamReader
79c53fac 1 56 System.IO.StreamReader
79c618d4 3 60 System.IO.__ConsoleStream
79c0b37c 4 64 System.RuntimeType
79bfc414 1 64 System.ExecutionEngineException
79bfc2dc 1 64 System.StackOverflowException
79bfc1a4 1 64 System.OutOfMemoryException
79bfdbec 1 80 System.AppDomain
79c0c584 3 84 System.Reflection.MemberFilter
79c35d24 3 156 System.IO.StreamWriter
79bfbcdc 3 192 System.Exception
9526b0 6 1356 System.Byte[]
95236c 12 2008 System.Char[]
79bf9af8 56 3280 System.String
14d0f0 6 10220 Free
95209c 7 10440 System.Object[]
Total 132 objects
And if I now want to know what instances ofSystem.Exception are around I can do !dumpheap –mt 79bfbcdc and get;
0:003> !dumpheap -mt 79bfbcdc
Address MT Size
04a619fc 79bfbcdc 64
04a61a48 79bfbcdc 64
04a61a94 79bfbcdc 64
total 3 objects
Statistics:
MT Count TotalSize Class Name
79bfbcdc 3 192 System.Exception
Total 3 objects
And if I now want to knowwhy the first of these instances is still around I can use !gcroot on it to give me;
0:003> !gcroot 04a619fc
Scan Thread 0 (ec4)
ESP:12f644:Root:04a6197c(System.Collections.ArrayList)->04a61990(System.Object[])->04a619fc(System.Exception)
ESP:12f658:Root:04a6197c(System.Collections.ArrayList)->04a61990(System.Object[])
ESP:12f668:Root:04a6197c(System.Collections.ArrayList)->04a61990(System.Object[])
Scan Thread 2 (eb8)
Scan HandleTable 15f008
Scan HandleTable 149c88
HANDLE(Strong):9411c0:Root:05a62000(System.Object[])->04a6197c(System.Collections.ArrayList)
I can see that my System.Exception instance at 04a619fc is being referenced (rooted) by an array of objects at 04a61990 and that is rooted by an ArrayList at 04a619c.
So, we can chase back object references as far as we need to work out what’s going on with our managed memory and why particular instances are not yet eligible for GC.
Working with Exceptions
The WinDbg debugger does not know anything about managed exceptions. Whenever managed code throws an exception it throws a native exception of type0xe0434f4d
If you want to stop on managed exceptions then you can issue to the debugger;
sxe 0xe0434f4d
or
sxe clr (much easier!)
and it’ll break at the point where the exception is thrown. You can grab the exception record from an unmanaged point of view by issuing a display exception record command with a minus 1 as;
.exr -1
And this will give you a result such as;
0:000> .exr -1
ExceptionAddress: 7c82dad4 (KERNEL32!RaiseException+0x00000038)
ExceptionCode: e0434f4d (CLR exception)
ExceptionFlags: 00000001
NumberParameters: 0
I think there’s a nice way of turning this into a managed exception object but, at the time of writing, I can’t work it out. What you can do is to execute a!dumpstackobjects and the usual case would be to should find your exception living within the set of objects that you get back there.
Wrap-up
That’s it for this posting – if you want a more info on debugging with SOS.DLL check outthis big guide that’s over on MSDN which will give you more info and also have a look at resources such as;
http://blogs.msdn.com/mvstanton/archive/2004/04/05/108023.aspx
http://msdn.microsoft.com/msdnmag/issues/03/06/Bugslayer/default.aspx
Enjoy!