Hi all,
I realize that you probably haven't all had a chance to look at the Crash lab yet, considering I just published it on Friday but here is the review for it. I have to say that considering the amount of downloads for the buggy bits site I am a bit surprised with the lack of questions about the labs. Perhaps they were too easy, or perhaps people just haven't had the time to go through them yet... either case, feel free to ask questions or comment on the labs, even questions that aren't directly related to the labs are welcome. My comments and answers are shown inline in red.
I will be publishing a Lab 2 which will be a memory lab later this week.
Reproduce the problem
1. Browse to the reviews page http://localhost/BuggyBits/Reviews.aspx, you should see a couple of bogus reviews for BuggyBits
2. Click on the Refresh button in the reviews page. This will crash the w3wp.exe process (or aspnet_wp.exe on IIS 5)
Note: If you have Visual Studio installed a Just-In-Time Debugger message may pop up (just click no for the purposes of this excercise).
However since this message box will sit there and wait for user input in order to shut down the app you may want to disable JIT debugging if you have visual studio
installed on a test system.
Examine the eventlogs
1. Open the Application and System eventlogs, the information in the eventlogs will differ based on the OS and IIS version you are running. Among other events you may
have a System Event looking something like this...
Event Type: Warning Event Source: W3SVC Event Category: None Event ID: 1009 Date: 2008-02-08 Time: 10:12:06 User: N/A Computer: MYMACHINE Description: A process serving application pool 'DefaultAppPool' terminated unexpectedly. The process id was '4592'. The process exit code was '0xe0434f4d'. For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Q: What events do you see?
A: On my machine (Win2k3, .NET 2.0 SP1) i get the above message in the system eventlog and the following two messages in the application eventlog.
From the first one we can see that there was a nullreference exception of some kind...
Event Type: Error Event Source: .NET Runtime 2.0 Error Reporting Event Category: None Event ID: 5000 Date: 2008-02-11 Time: 11:13:18 User: N/A Computer: MYMACHINE Description: EventType clr20r3, P1 w3wp.exe, P2 6.0.3790.3959, P3 45d6968e, P4 app_code.wbwztx_4, P5 0.0.0.0, P6 47a81aee, P7 13, P8 0, P9 system.nullreferenceexception, P10 NIL.And the second one completely gives it away as a null reference exception that occurred in the finalizer/destructor for Review (Review.Finalize()).
The good thing about this is that we don't really even need to debug this on W2k3, .NET 2.0 SP1, but on some combinations of IIS/OS/Framework you will not get the 2nd error message.Event Type: Error Event Source: ASP.NET 2.0.50727.0 Event Category: None Event ID: 1334 Date: 2008-02-11 Time: 11:13:15 User: N/A Computer: PRATHER Description: An unhandled exception occurred and the process was terminated. Application ID: /LM/W3SVC/1/Root/BuggyBits Process ID: 4944 Exception: System.NullReferenceException Message: Object reference not set to an instance of an object. StackTrace: at Review.Finalize()Q: What does the exit code 0xe0434f4d mean?
A: 0xe0434f4d is the native exception code for .NET exceptions, any .net exception will be translated into 0xe0434f4d, whether it is a SQLException, NullReferenceException or a FileNotFoundException.
Q: Can you tell from the eventlogs what it was that caused the crash?
A: In this case as mentioned above, the 2nd application eventlog entry gives it away completely. The process crashed because there was an unhandled exception on the finalizer thread when finalizing an object of type Review.
Get a memory dump
1. Browse to the reviews page http://localhost/BuggyBits/Reviews.aspx again, but don't click refresh
2. Open a command prompt and move to the debuggers directory and type in "adplus -crash -pn w3wp.exe" and hit enter
Q: A new window should appear on the toolbar, what is it?
A: This is the debugger, if you maximize it you will see a cmd window with cdb running. Cdb is the commandline equivalent for windbg (i.e. windbg without a UI).
Q: What is the debugger waiting for? Hint: Check the help files for ADplus/crash mode in windbg
A: Here is a table showing what events/exceptions adplus monitors for by default, and what it will do if any of these occurr
Abbreviations L=Log entry in debuglog, T=Record Time, S=Log stack, M=Mini dump, F=Full dump, E=log in the eventlog.
You can change how these events are handled by providing your own cfg file to adplus.
Code Exception/Event Explanation/Note 1st Chance 2nd Chance av Access Violation An attempt to access to uninitialized memory.
This is also thrown for .net NullReferenceExceptionsLTSM LTSFE ch Invalid Handle An attempt to access data using an invalid handle. Eg. window, file or other native handle. LTSM LTSFE ii Illegal Instruction An attempt is made to execute an invalid instruction. LTSM LTSFE dz Integer divide Integer divide by zero. Even in .net. LTSM LTSFE c000008e Floating divide Floating point divide by zero. LTSM LTSFE iov Integer overflow An attempt to store more than 32/64 bits in an integer. LTSM LTSFE lsq Invalid Lock Sequence An attempt is made to execute an invalid lock sequence. LTSM LTSFE sov Stack overflow The memory used for the stak (method pointers, parameters, local vars etc.) exceeds the amount allocated for the stack. Typically caused by recursion. LTSM LTSFE eh C++ EH exception C++ EH Exception LTS LTSFE * Unknown exception Any exception not specifically defined in this list LTS LTSFE clr .NET Exception Any .net exception - LTSFE bpe Breakpoint A break point is hit or the debugger window is shut down. LTSME - wkd Wake Debugger Wake debugger. LTSME - ld Dll Load A dll is loaded in the process L L ud Dll Unload A dll is unloaded from the process L L epr Exit Process The process exits. LTSFE - aph Application Hang The Windows OS has determined that the process is no longer responding. LTSM LTSFE cce Ctrl+C Console App Ctrl+C or Ctrl+Break is passed to a console app. LTSM LTSFE dm Data Misaligned A unit of data is read from or written to an address that is not a multiple of the data size. LTSM LTSFE gp Guard page violation An attempt to access data within a guard page. LTSM LTSFE ip In Page IO Error I/O Error LTSM LTSFE isc Invalid System Call Invalid System Call LTSM LTSFE sbo Stack Buffer overflow Stack buffer overflow LTSM LTSFE
3. Reproduce the issue by clicking on the refresh button in the reviews page.
Q: What files got created in the dump folder? Note: The dump folder will be located under your debuggers directory with the name crash_mode and todays date and time
A: The following files were created in my crash mode folder (under the debuggers folder)
ADPlus_report.txt
<DIR> CDBScripts
PID-4628__W3WP.EXE__Date_02-11-2008__Time_13-14-0808.log
Process_List.txt
PID-4628__W3WP.EXE__1st_chance_AccessViolation__mini_17a4_2008-02-11_13-14-19-358_1214.dmp
PID-4628__W3WP.EXE__1st_chance_Process_Shut_Down__full_17a4_2008-02-11_13-14-31-889_1214.dmp
PID-4628__W3WP.EXE__2nd_chance_NET_CLR__full_17a4_2008-02-11_13-14-20-093_1214.dmpSo we have the normal log files, CDBScripts, ADPlus_report.txt and the process list which I described in the review for lab 1
Apart from this we have a minidump for a 1st chance access violation (probably our NullReferenceException), and then a 2nd chance CLR exception (the same nullreference exception, but now 2nd chance since it wasn't handled), followed by a process shutdown dump. You can see in which order they occurred by looking at the timestamps on them.
Open the dump in windbg
1. Open the dump file labeled 2nd Chance CLR Exception in windbg (file/open crash dump). Note that this dump got created just before the 1st chance process shutdown.
Note: if you throw an exception (.net or other) you have a chance to handle it in a try/catch block. The first time it is thrown it becomes a 1st chance exception and is non-fatal. If you don't handle the exception it will become a 2nd chance exception (unhandled exception) and any 2nd chance exceptions will terminate the process.
2. Set up the symbol path and load sos (see the setup instructions for more info)
In a crash dump, the active thread is the one that caused the exceptions (since the dump is triggered on an exception).
Q: Which thread is active when you open the dump? Hint: check the command bar at the bottom of the windbg window.
A: The active thread is thread #20, which if I compare it to the !threads output is the Finalizer thread
20 2 116c 001ac540 b220 Enabled 02f0d28c:02f0f130 00199400 0 MTA (Finalizer) System.NullReferenceException (02f0cb3c)
In the !threads output we can also see that this thread threw a NullReferenceException and that the address of the Exception is 0x02f0cb3c
Examine the callstacks and the exception
1. Examine the native and managed callstacks.
kb 2000
!clrstackQ: What type of thread is it?
A: We can see from kb that this is in fact the finalizer thread but !clrstack does not give us a whole lot of information about what the managed stack looks like because we are deep in the exception handler code.
0:020> kL 2000 ChildEBP RetAddr 02a0f9d8 79f071ac kernel32!RaiseException+0x53 02a0fa38 7a10733d mscorwks!RaiseTheExceptionInternalOnly+0x2a8 02a0fa50 7a1073b1 mscorwks!RaiseTheException+0x4e 02a0fa78 7a1073e2 mscorwks!RealCOMPlusThrowWorker+0x71 02a0faa4 7a1073f0 mscorwks!RealCOMPlusThrow+0x30 02a0fab4 7a14f0d4 mscorwks!RealCOMPlusThrow+0xd 02a0fbe4 7a025a5b mscorwks!Thread::RaiseCrossContextException+0x41f 02a0fc98 79f01747 mscorwks!Thread::DoADCallBack+0x293 02a0fcb8 79f04a71 mscorwks!ManagedThreadBase_DispatchInner+0x47 02a0fcc4 79f04b01 mscorwks!ManagedThreadBase::FinalizerAppDomain+0x25 02a0fd20 79efe5b4 mscorwks!SVR::DoOneFinalization+0x141 02a0fda8 79efe70a mscorwks!SVR::FinalizeAllObjects+0x56 02a0fdc0 79ef3207 mscorwks!SVR::GCHeap::FinalizerThreadWorker+0xe7 02a0fdd4 79ef31a3 mscorwks!ManagedThreadBase_DispatchInner+0x4f 02a0fe68 79ef30c3 mscorwks!ManagedThreadBase_DispatchMiddle+0xb1 02a0fea4 79fb9643 mscorwks!ManagedThreadBase_DispatchOuter+0x6d 02a0fecc 79fb960d mscorwks!ManagedThreadBase_NoADTransition+0x32 02a0fedc 79fd0c91 mscorwks!ManagedThreadBase::FinalizerBase+0xd 02a0ff14 79f95a2e mscorwks!SVR::GCHeap::FinalizerThreadStart+0xbb 02a0ffb8 7d4dfe21 mscorwks!Thread::intermediateThreadProc+0x49 02a0ffec 00000000 kernel32!BaseThreadStart+0x34 0:020> !clrstack OS Thread Id: 0x116c (20) ESP EIP 02a0fa8c 7d4e2366 [GCFrame: 02a0fa8c] 02a0faf0 7d4e2366 [GCFrame: 02a0faf0] 02a0fcec 7d4e2366 [GCFrame: 02a0fcec]Q: What is this thread doing?
A: It is trying to call the finalizer of some object and in doing so it threw an exception that is now being rethrown as a 2nd chance exception.
2. Examine the exception thrown
!pe
Note: !pe/!PrintException will print out the current exception being thrown on this stack if no parameters are given
Q: What type of exception is it?
A: A NullReferenceException
0:020> !pe Exception object: 02f0cb3c Exception type: System.NullReferenceException Message: Object reference not set to an instance of an object. InnerException: StackTrace (generated): StackTraceString: HResult: 80004003
Note: In some cases, like this one where the exception has been rethrown, the original stacktrace may not be available in the exception. In cases like this you may get more information if you find the original exception
3. Look at the objects on the stack to find the address of the original exception
!dso
Q: What is the address of the original exception
Hint: Look at your previous pe output to see the address of the rethrown exception. Compare this to the addresses of the objects on the stack. You should have multiple exceptions, a few with the rethrown exception address but one of the bottommost exceptions will be the original one (look for one with a different address).
A: The only NullReferenceException with a different address than the one we listed before (0x2f0cb3c) is 0x6f1e5e4
0:020> !dso OS Thread Id: 0x116c (20) ESP/REG Object Name 02a0f9f4 02f0cb3c System.NullReferenceException 02a0f9f8 02f0cb3c System.NullReferenceException 02a0fa40 02f0cb3c System.NullReferenceException 02a0fa58 02f0cb3c System.NullReferenceException 02a0faac 02f0cb3c System.NullReferenceException 02a0fabc 02f0cb3c System.NullReferenceException 02a0fb58 06f1e5e4 System.NullReferenceException 02a0fb5c 02f0bd70 System.Byte[] 02a0fb78 02f0cb3c System.NullReferenceException 02a0fb7c 02f0bd70 System.Byte[] 02a0fb8c 02f0bd70 System.Byte[] 02a0fc58 06eb281c System.RuntimeType 02a0fcb0 02efd4b0 Review
4. Print out the original exception and look at the information and the callstack
!pe <original exception address>
0:020> !pe 06f1e5e4 Exception object: 06f1e5e4 Exception type: System.NullReferenceException Message: Object reference not set to an instance of an object. InnerException: StackTrace (generated): SP IP Function 02A0F950 0FE90F24 App_Code_wbwztx_4!Review.Finalize()+0x14 StackTraceString: HResult: 80004003Q: In what method is the exception thrown?
A: In the finalizer/destructor for the Review class
Q: What object is being finalized?
Note: you could actually have gotten this information by dumping out the _exceptionMethodString of the rethrown exception as well, but with !pe of the original exception you get the information in a cleaner way.
A: A Review object
Q: Normally exceptions thrown in ASP.NET are handled with the global exception handler and an error page is shown to the user. Why did this not occurr here? Why did it cause a crash?
A: The global exception handler for ASP.NET only handles exceptions thrown when processing a request. In this case the exception is thrown on the finalizer which means that there is no user there to see the exception and therefore there is noone to show the error page to.
In 1.1. and if you use legacy exception handling in 2.0, exceptions on the finalizer are non-fatal, i.e. they will not crash the process, instead the finalizer would just stop finalizing the current object. This is very dangerous since this means that you may not process code that releases native objects and thus you may end up with a memory leak that is very hard to troubleshoot or other resource leaks like connection leaks etc. All unhandled exceptions on the finalizer thread should be investigated and taken seriously.
Examine the code for verification
1. Open Review.cs to find the destructor/finalizer for the Review class
~Review() { if (quote.ToString() != string.Empty) { quote = null; } }
Q: which line or method could have caused the exception
A: In this case the code for the Review destructor is fairly short so it is pretty easy to pinpoint where the exception occurred. The only instruction in here that could have caused a NullRef is the quote.ToString() and this would occurr if quote = null.
As an extra excercise you can also examine the disassembly of the function to try to pinpoint better where in the function the exception is caused
!u <IP shown in the exceptionstack>
0:020> !u 0FE90F24 Normal JIT generated code Review.Finalize() Begin 0fe90f10, size 57 0fe90f10 55 push ebp 0fe90f11 8bec mov ebp,esp 0fe90f13 57 push edi 0fe90f14 56 push esi 0fe90f15 53 push ebx 0fe90f16 83ec14 sub esp,14h 0fe90f19 33c0 xor eax,eax 0fe90f1b 8945e8 mov dword ptr [ebp-18h],eax 0fe90f1e 8bf1 mov esi,ecx 0fe90f20 8b4e04 mov ecx,dword ptr [esi+4] 0fe90f23 8b01 mov eax,dword ptr [ecx] 0fe90f25 ff5028 call dword ptr [eax+28h] 0fe90f28 8bc8 mov ecx,eax 0fe90f2a 8b15ac46eb0a mov edx,dword ptr ds:[0AEB46ACh] ("") 0fe90f30 e8db2a4d69 call mscorlib_ni+0x2a3a10 (79363a10) (System.String.Equals(System.String, System.String), mdToken: 06000143) 0fe90f35 85c0 test eax,eax 0fe90f37 7505 jne 0fe90f3e 0fe90f39 33d2 xor edx,edx 0fe90f3b 895604 mov dword ptr [esi+4],edx 0fe90f3e c745e400000000 mov dword ptr [ebp-1Ch],0 0fe90f45 c745e8fc000000 mov dword ptr [ebp-18h],0FCh 0fe90f4c 685e0fe90f push 0FE90F5Eh 0fe90f51 eb00 jmp 0fe90f53 0fe90f53 58 pop eax 0fe90f54 ffe0 jmp eax 0fe90f56 8d65f4 lea esp,[ebp-0Ch] 0fe90f59 5b pop ebx 0fe90f5a 5e pop esi 0fe90f5b 5f pop edi 0fe90f5c 5d pop ebp 0fe90f5d c3 ret 0fe90f5e c745e800000000 mov dword ptr [ebp-18h],0 0fe90f65 ebef jmp 0fe90f56
If we look at the disassembly below and compare it to the method above we find a few things.
1. The exception address 0x0FE90F24 falls on the bolded lines
2. Right before the bolded line we set EAX to ptr(ESI+0x4) and it is when we use EAX that we access violate/get the nullreference exception. ESI is typically used to hold the this
pointer (in this case the pointer to our Review object) so the question is really what is at ESI+0x4
Dumping out the Review object that we found with !dso we can see that at offset 0x4 we have the member variable quote and looking at the value we can see that it is null.
0:020> !do 02efd4b0 Name: Review MethodTable: 0fe11cc4 EEClass: 0fde5860 Size: 16(0x10) bytes (C:/WINDOWS/Microsoft.NET/Framework/v2.0.50727/Temporary ASP.NET Files/buggybits/b27906cc/f5f91899/App_Code.wbwztx_4.dll) Fields: MT Field Offset Type VT Attr Value Name 790fd8c4 4000008 4 System.String 0 instance 00000000 quote 790fd8c4 4000009 8 System.String 0 instance 00000000 source
3. If that wasn't enough proof we could also have seen approximately where we were at in the function by looking at the instructions below the point where we threw the exception. We have an instruction to move "" into edx, i.e. an empty string (string.Empty), and right after that there is a string compare operation which matches up to if(quote.ToString() != string.Empty)
Btw, the reason that quote was null is because ClearReview() sets it to null... The proper way to resolve this issue would have been to not have a finalizer (since we don't really need one given that we don't have to clean up any native resources) or at the very least we should have called GC.SuppressFinalize() in the ClearReview() method.
Until next time
Tess
# re: .NET Debugging Demos Lab 2: Crash - Review
I just wanted to say that the labs are great. And because of the hints is kind of easy to go through them and this is good because I'm pretty ignorant about this type of debugging. I'm sure they'll get more difficult later on.
# re: .NET Debugging Demos Lab 2: Crash - Review
Tess,
What about a screen cast for the lab ?
# Link Listing - February 11, 2008
Sharepoint Announcing the DoD 5015.2 Resource Kit for SharePoint Server 2007 [Via: sptblog ] WPF The...
# Link Listing - February 11, 2008
Link Listing - February 11, 2008
# re: .NET Debugging Demos Lab 2: Crash - Review
I aggree with Lucian Bargaoanu: "The labs are great." I am also pleased wit the hint because of my ignorance with this kind of debugging.
The lack of questions about the lab you notice, may be caused by the quality of the labs. So just keep on going as you are; I lik it.
# re: .NET Debugging Demos Lab 2: Crash - Review
Great to hear:) I'll post a memory lab today or tomorrow depending on if when i can get an hour or two freed up.
As a teaser, the mem leak will be produced by running tinyget on 30 threads, 100 loops for links.aspx
# re: .NET Debugging Demos Lab 2: Crash - Review
“examine the disassembly ” parts is difficult for me
# re: .NET Debugging Demos Lab 2: Crash - Review
agreed, I think that is a bit difficult overall... and not needed in most cases, which is why i just added it as an extra excercise.
do you have any specific questions around it?
# re: .NET Debugging Demos Lab 2: Crash - Review
Dear, questions will come, don't worry!
today i spent 8 hours to make 2 reviews articles to chinese, I have not run the code, just translate to catch up with you. i will investigate them, then ask you something.
-------
tired,over here
# re: .NET Debugging Demos Lab 2: Crash - Review
I'm curious, is there something you did to cause the disassembly to show that the memory contained an empty string? I do not see the same results when disassembling. My output does not show the ("") with the mov instruction.
01bc0f20 8bc8 mov ecx,eax
01bc0f22 8b158c46550b mov edx,dword ptr ds:[0B55468Ch]
01bc0f28 e8a3db7a77 call mscorlib_ni+0x2aead0 (7936ead0) (System.String.Equals(System.String, System.String), mdToken: 06000142)
Thanks,
Kevin
# re: .NET Debugging Demos Lab 2: Crash - Review
interesting, no nothing special, it might be slightly different versions of sos.dll...
# re: .NET Debugging Demos Lab 2: Crash - Review
A littel comments for exception object:
0:018> kb
ChildEBP RetAddr Args to Child
01e9f9fc 79f97065 e0434f4d 00000001 00000001 kernel32!RaiseException+0x53
01e9fa5c 7a05b941 02678b1c 00000000 00000000 mscorwks!RaiseTheExceptionInternalOnly+0x226
Review the first args of the 1st line in callstack: kernel32!RaiseException+0x53, "e0434f3d", you can also found this strange number at eventvwr log just descripbed in tess's lab2. This is a common CLR exception number.
Second, the real CLR exception object is identified in the first arg of mscorwks!RaiseTheExceptionInternalOnly
so, you can dumpobject to examine this exception object by !do 02678b1c or directly by !pe 02678b1c . If you type command !do, then you can found below info:
790fa3e0 40000b7 c System.String 0 instance 02679cc4 _exceptionMethodString
we can continously !do 02679cc4 to see more information about this exception.
Thanks for tess's great post!!! I'd learned much more debugging skill and thoughts from you, thanks again!
# re: .NET Debugging Demos Lab 2: Crash - Review
Thank you for all the posting. This is the best place to learn managed debugging. :)
I have one question on !dso.
I am getting the following
0:018> !dso
OS Thread Id: 0xb90 (18)
ESP/REG Object Name
01e9fa18 0266174c System.NullReferenceException
01e9fa1c 0266174c System.NullReferenceException
01e9fa64 0266174c System.NullReferenceException
01e9fa78 0266174c System.NullReferenceException
01e9facc 0266174c System.NullReferenceException
01e9fad8 0266174c System.NullReferenceException
01e9fb74 02659b0c System.NullReferenceException
01e9fb78 0266040c System.Byte[]
01e9fba4 0266174c System.NullReferenceException
01e9fba8 0266040c System.Byte[]
01e9fbb8 0266040c System.Byte[]
01e9fccc 025f75c0 System.CodeDom.Compiler.TempFileCollection
The difference between this and the one shows in the lab is that the bottom piece. Your output shows
0:020> !dso
OS Thread Id: 0x116c (20)
ESP/REG Object Name
02a0f9f4 02f0cb3c System.NullReferenceException
02a0f9f8 02f0cb3c System.NullReferenceException
02a0fa40 02f0cb3c System.NullReferenceException
02a0fa58 02f0cb3c System.NullReferenceException
02a0faac 02f0cb3c System.NullReferenceException
02a0fabc 02f0cb3c System.NullReferenceException
02a0fb58 06f1e5e4 System.NullReferenceException
02a0fb5c 02f0bd70 System.Byte[]
02a0fb78 02f0cb3c System.NullReferenceException
02a0fb7c 02f0bd70 System.Byte[]
02a0fb8c 02f0bd70 System.Byte[]
02a0fc58 06eb281c System.RuntimeType
02a0fcb0 02efd4b0 Review <<--
So in my case, I am not able to follow some piece of the lab. I am running IIS 6.0 with Win2K3 SP2.
I am interested on getting the offset like the one in the lab
"Dumping out the Review object that we found with !dso we can see that at offset 0x4 we have the member variable quote and looking at the value we can see that it is null."
Is there another way to get the offset?
# re: .NET Debugging Demos Lab 2: Crash - Review
Found it. Sorry, should have tried a little harder before posting the question.
0:018> !dumpheap -mt 01f98ba4
Address MT Size
02644834 01f98ba4 16
02644a74 01f98ba4 16
total 2 objects
Statistics:
MT Count TotalSize Class Name
01f98ba4 2 32 Review
Total 2 objects
0:018> !do 02644834
Name: Review
MethodTable: 01f98ba4
EEClass: 024354ac
Size: 16(0x10) bytes
(C:/WINDOWS/Microsoft.NET/Framework/v2.0.50727/Temporary ASP.NET Files/debug1/b1d4288d/b06837fb/App_Code.jbom0q7o.dll)
Fields:
MT Field Offset Type VT Attr Value Name
790fa3e0 4000008 4 System.String 0 instance 00000000 quote
790fa3e0 4000009 8 System.String 0 instance 00000000 source
# re: .NET Debugging Demos Lab 2: Crash - Review
I have the same problem as Raymond, the Review object doesn't show when I run !dso
I can get to it if I run !dumpheap -stat, find it "manually", then use !dumpheap -mt [address]
but there's gotta be an easier way than that
# re: .NET Debugging Demos Lab 2: Crash - Review
if you dont see the review object with !dso that probably means that you caught the dump when it was finalizing some other object, in which case you would need to run !dumpheap -stat to get any review objects on the heap. Can you post your !dso output?
# re: .NET Debugging Demos Lab 2: Crash - Review
is pretty much the same (other than the addresses) as the one Raymond posted
0:018> !dso
OS Thread Id: 0xb90 (18)
ESP/REG Object Name
01e9fa18 0266174c System.NullReferenceException
01e9fa1c 0266174c System.NullReferenceException
01e9fa64 0266174c System.NullReferenceException
01e9fa78 0266174c System.NullReferenceException
01e9facc 0266174c System.NullReferenceException
01e9fad8 0266174c System.NullReferenceException
01e9fb74 02659b0c System.NullReferenceException
01e9fb78 0266040c System.Byte[]
01e9fba4 0266174c System.NullReferenceException
01e9fba8 0266040c System.Byte[]
01e9fbb8 0266040c System.Byte[]
01e9fccc 025f75c0 System.CodeDom.Compiler.TempFileCollection
is there an easier way to locate a specific object though?
# re: .NET Debugging Demos Lab 2: Crash - Review
I'm not sure why the review object is not showing up, it is probably gone by the time you gathered the dump.
Looking at the review object is just an additional step to verify the issue but you can verify it through looking at the exception and through looking at the code as well so getting the review object is not crucial to the issue.
# re: .NET Debugging Demos Lab 2: Crash - Review
Hey Tess,
Another great one! Thanks! Is it beneficial to have a good understanding of assembly language to be proficient at this? I'm assuming it does but wanted to get your take.
And speaking of that, in the 0:020> !u 0FE90F24 command, I had a tough time seeing the lines in bold type. Tried on two different monitors. Screen res on both is cranked to the highest setting possible. Could be other settings I guess. Anyway, just thought I'd mention it.
Take care.
KB
# re: .NET Debugging Demos Lab 2: Crash - Review
Hi KB,
No it is not essential to have a good understanding of assembly language to be proficient at debugging .net. I just wanted to add it for completeness.
I am not sure why the bolded lines are not showing up online. The bolding is in the html but I can't see it online either now that you mention it... either way it is these two lines that were bolded
0fe90f23 8b01 mov eax,dword ptr [ecx]
0fe90f25 ff5028 call dword ptr [eax+28h]
Thanks
Tess
# re: .NET Debugging Demos Lab 2: Crash - Review
Hi Tess,
This is the best blog which i could find for debugging .net code. I have a strange problem reported by a user. The problem is that a customer application crashes at a certain point of time. When we had a look at the event viewer and the dumps and saw a Null Reference exception in our code.
SP IP Function
1E57ED38 1B1F5023 com.uni.jellybeans.CommsManager.Run()
1E57ED70 793B0D1F System.Threading.ThreadHelper.ThreadStart_Context(System.Object)
1E57ED78 793740AB System.Threading.ExecutionContext.runTryCode(System.Object)
00000000 00000001 System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode, CleanupCode, System.Object)
1E57F204 79373FF7 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
1E57F21C 79373EDE System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
1E57F234 793B0C68 System.Threading.ThreadHelper.ThreadStart()
I followed your instructions and was successful in diassembling the code. But i could not find the exact string which is causing the failure in our code. i would need your assistance on this. The following is the dso of the dump
!dso
OS Thread Id: 0x1dc0 (58)
ESP/REG Object Name
1e57f528 10ebe164 System.NullReferenceException
1e57f52c 10ebe164 System.NullReferenceException
1e57f574 10ebe164 System.NullReferenceException
1e57f58c 10ebe164 System.NullReferenceException
1e57f5e0 10ebe164 System.NullReferenceException
1e57f5f0 10ebe164 System.NullReferenceException
1e57f68c 04ed5b00 System.NullReferenceException
1e57f690 10ebd12c System.Byte[]
1e57f6ac 10ebe164 System.NullReferenceException
1e57f6b0 10ebd12c System.Byte[]
1e57f6c0 10ebd12c System.Byte[]
Please let me know if you need anything further to analyze
# re: .NET Debugging Demos Lab 2: Crash - Review
Hi Thyagu,
From this we can see that the app is starting a new thread and runs com.uni.jellybeans.CommsManager.Run() on it. and in there there is a null reference exception. If that is not handled then that would definitely lead to a crash.
Its hard to say anything else about what is causing the null ref just based on the stack above but you can do
!u 1B1F5023 which will unassemble the code for com.uni.jellybeans.CommsManager.Run() and then locate where in the !u output that 1B1F5023 falls to get a bit more info about where you are getting the nullref. If that doesnt help then perhaps looking at the code and adding some debug.writelines would help
# re: .NET Debugging Demos Lab 2: Crash - Review
Hi Tess,
Thanks a ton for your quick response. I would like to admit that i am an amateur in analyzing this dumps. I have pasted below the failing disassembly code.
1b1f5016 8b0d3cbd9612 mov ecx,dword ptr ds:[1296BD3Ch]
1b1f501c 8b45d8 mov eax,dword ptr [ebp-28h]
1b1f501f 8b5844 mov ebx,dword ptr [eax+44h]
1b1f5022 8b530c mov edx,dword ptr [ebx+0Ch]
1b1f5025 e8c685175e call mscorlib_ni+0x2ad5f0 (7936d5f0) (System.String.Concat(System.String, System.String), mdToken: 060001c7)
1b1f502a 50 push eax
1b1f502b 8bcf mov ecx,edi
1b1f502d 8bd6 mov edx,esi
1b1f502f 8b01 mov eax,dword ptr [ecx]
1b1f5031 ff505c call dword ptr [eax+5Ch]
1b1f5034 8d65f4 lea esp,[ebp-0Ch]
I would be pleased if you can help me or guide me on further action.
Thanks a lot for your help.
Cheers,
Thyagu
# re: .NET Debugging Demos Lab 2: Crash - Review
Hi Thyagu
The NullRef happens between these lines.
1b1f5022 8b530c mov edx,dword ptr [ebx+0Ch]
1b1f5025 e8c685175e call mscorlib_ni+0x2ad5f0 (7936d5f0) (System.String.Concat(System.String, System.String), mdToken: 060001c7)
The top line is taking ebx+0xc (probably a membervariable of some object) and stores it in a temporary register
The second line calls string.concat so it is doing something along the lines of str1 = str2 + str3
I would look for the line in code that concatenates strings and see if any of the strings that it concatenates may be null. and also look at the line of code above it to see if could cause a null ref.
I am not exactly sure how the code looks but looking at the disassembly we have
mov ecx,dword ptr ds:[1296BD3Ch]
=> ecx = whatever is stored at 0x1296BD3C
(likely some static string)
mov eax,dword ptr [ebp-28h]
=> eax = what is stored at ebp-0x28
(some parameter or local) i'll call it myparam for now
ebx,dword ptr [eax+44h]
=> ebx = what is stored at eax+0x44
(some membervariable of myparam) i'll call it mymembervar
edx,dword ptr [ebx+0Ch]
=> edx = ebx + 0xc
(some membervar of mymembervar)
so the code should look something like this
str = somestring + myParam.myMemberVar.membervar
and one of these is probably null
# re: .NET Debugging Demos Lab 2: Crash - Review
Hi Tess,
Thanks for your quick response.
I spent the whole day in figuring where the failure had happened, but i was not successful.
If there is any alternative way to figure out which object is responsible for the Null Reference exception, it would be helpful.
FYI : This problem is not reproducible in our environment, so we can't debug the code and find out the reason.
Your help will be very much appreciated.
Cheers,
Thyagu
# re: .NET Debugging Demos Lab 2: Crash - Review
if the above didnt help then the best would be to add tracing statements as i mentioned before to print out what objects are null etc. inside that method...
# re: .NET Debugging Demos Lab 2: Crash - Review
Hello Tess,
I agree with the other folks ... nice work on these labs!
I am a little late to the party as this Lab2 was posted a few months ago, but I had a question about your explanation of the assembly code.
You made this comment:
2. Right before the bolded line we set EAX to ptr(ESI+0x4) and it is when we use EAX that we access violate/get the nullreference exception.
Did you mean to say ECX here instead of EAX?
Thanks again for sharing your knowledge.
Cheers,
PaulG
# re: .NET Debugging Demos Lab 2: Crash - Review
0fe90f20 8b4e04 mov ecx,dword ptr [esi+4]
0fe90f23 8b01 mov eax,dword ptr [ecx]
Well actually both, first we set ecx to ptr[esi+4], and then we move the contents of this pointer to eax
# re: .NET Debugging Demos Lab 2: Crash - Review
Thanks Tess ... that makes sense.
I have another question ... I hope you don't mind.
I liked how you displayed the assembly code and then used the IP to show us which line of assembly code was causing the access violation / null reference.
Is there a way to point us to the line of IL that is causing the null reference?
I can find the IL by printing my exception:
0:016> !pe 03cdbb0c
Exception object: 03cdbb0c
Exception type: System.NullReferenceException
Message: Object reference not set to an instance of an object.
InnerException: <none>
StackTrace (generated):
SP IP Function
025BF520 027A0F1C App_Code_ie7jwbrp!Review.Finalize()+0x14
StackTraceString: <none>
HResult: 80004003
Then I use my IP to find the corresponding method desc:
0:016> !ip2md 027A0F1C
MethodDesc: 029b0ae0
Method Name: Review.Finalize()
Class: 025c54ac
MethodTable: 029b0b0c
mdToken: 06000013
Module: 00ee8670
IsJitted: yes
m_CodeOrIL: 027a0f08
And then I dump the IL for this method desc:
0:016> !DumpIL 029b0ae0
ilAddr = 025227f4
.try
{
IL_0000: ldarg.0
IL_0001: ldfld Review::quote
IL_0006: callvirt System.Object::ToString
IL_000b: ldsfld System.String::Empty
IL_0010: call System.String::op_Inequality
IL_0015: brfalse.s IL_001e
IL_0017: ldarg.0
IL_0018: ldnull
IL_0019: stfld Review::quote
IL_001e: leave.s IL_0027
} // end .try
.finally
{
IL_0020: ldarg.0
IL_0021: call System.Object::Finalize
IL_0026: endfinally
} // end .finally
IL_0027: ret
Because we know that we are calling ToString() on a null string we it is obvious that this line is the culprit.
IL_0006: callvirt System.Object::ToString
But is there some commnad that will point us exactly to this line of IL?
Cheers again,
PaulG
# re: .NET Debugging Demos Lab 2: Crash - Review
unfortunately there is no way (other than matching it up to the assembly) to find out which line is causing the issue
# re: .NET Debugging Demos Lab 2: Crash - Review
In step one, regarding the application event log entry for 2003, you stated: "The process crashed because there was an unhandled exception on the finalizer thread when finalizing an object of type Review."
I'm assuming that you just know it was in the finalizer thread because you wrote it, right? Just looking at the event log entry, we can obviously tell what the exception was, but I'm missing how one would know where it occured without walking through the debugging.
On that note, do you know of a link that breaks down what the fields are in the event ID 5000 entry? I've not had any luck finding that reference...
Thanks! These labs have been a big help!
# re: .NET Debugging Demos Lab 2: Crash - Review
_______________________
Message: Object reference not set to an instance of an object.
StackTrace: at Review.Finalize()
______________________
The stack for the exception shows that it is on the finalizer i.e. in the finalize method for the review object, so that is how I know...
Regarding the breakdown of the 5000 entry, what specifically are you looking for?
# re: .NET Debugging Demos Lab 2: Crash - Review
For the 5000 entry, I'm looking for what are the P fields.
So, for instance, for this entry:
EventType clr20r3, P1 w3wp.exe, P2 6.0.3790.3959, P3 45d6968e, P4 app_code.wbwztx_4, P5 0.0.0.0, P6 47a81aee, P7 13, P8 0, P9 system.nullreferenceexception, P10 NIL
P5 looks like an emtpy IP address. Is it? What about P10? What is Nil? What is the address that is being shows in P3?
I'm not really worried about what the answers to those questions are for this specific example.. just the generic answer such as "P1 is the executable that generated the exception". I just haven't had any luck finding a reference for that, and was curious about it.
# re: .NET Debugging Demos Lab 2: Crash - Review
In the walkthrough you state "ESI is typically used to hold the this pointer". Is there a compilation somewhere of this type of information when debugging .NET applications?
Also with your help I found it very easy to identify which x86 assembler instruction the exception occurred on, however determining which .NET source line or as a minimum which IL instruction in the method the exception occurred on. I can use WinDBG to produce the assembler version of the JITed method or the IL version of the method but I need an IL version of the method interspersed with the assembler instructions so that I can correlate.
# re: .NET Debugging Demos Lab 2: Crash - Review
Robin,
The fact that ESI usually holds the this pointer is not .net specific. I dont know off hand if there is a compilation of what registers are usually used for but the most common ones are that ESI holds this, and eax holds the return value of a function.
Regarding the IL, you can run !dumpil on the method descriptor to get the il of the method.
# re: .NET Debugging Demos Lab 2: Crash - Review
When I try this example, w3wp.exe doesn't crash; I get the normal red and yellow exception page, with the message:
Server Error in '/BuggyBits' Application.
--------------------------------------------------------------------------------
Validation of viewstate MAC failed. If this application is hosted by a Web Farm or cluster, ensure that <machineKey> configuration specifies the same validationKey and validation algorithm. AutoGenerate cannot be used in a cluster.
What am I doing wrong (or right??)
# re: .NET Debugging Demos Lab 2: Crash - Review
That sounds like a config error. Do any of the pages work?