Working with Kernel Cores The most common type of kernel debug target is a core file, saved from a prior system crash. In the following sections, we highlight some of the introductory steps as used with mdb to explore a kernel core image. 14.1.1. Locating and Attaching the Target If a system has crashed, then we should have a core image saved in /var/crash on the target machine. The mdb debugger should be invoked from a system with the same architecture and Solaris revision as the crash image. The first steps are to locate the appropriate saved image and then to invoke mdb.
# cd /var/crash/nodename
# ls bounds unix.1 unix.3 unix.5 unix.7 vmcore.1 vmcore.3 vmcore.5 vmcore.7 unix.0 unix.2 unix.4 unix.6 vmcore.0 vmcore.2 vmcore.4 vmcore.6
# mdb -k unix.7 vmcore.7 Loading modules: [ unix krtld$c genunix specfs dtrace ufs ip sctp usba uhci s1394 fcp fctl nca lofs zfs random nfs audiosup sppp crypto md fcip logindmux ptm ipc ] >
14.1.2. Examining Kernel Core Summary Information The kernel core contains important summary information from which we can extract the following:
We can use the ::showrev and ::status dcmds to extract this information.
> ::showrev Hostname: zones-internal Release: 5.11 Kernel architecture: i86pc Application architecture: i386 Kernel version: SunOS 5.11 i86pc snv_27 Platform: i86pc > ::status debugging crash dump vmcore.2 (32-bit) from zones-internal operating system: 5.11 snv_27 (i86pc) panic message: BAD TRAP: type=e (#pf Page fault) rp=d2a587c8 addr=0 occurred in module "unix" due to a NULL pointer dereference dump content: kernel pages only > ::panicinfo cpu 0 thread d2a58de0 message BAD TRAP: type=e (#pf Page fault) rp=d2a587c8 addr=0 occurred in module "unix" due to a NULL pointer dereference gs fe8301b0 fs fec30000 es fe8d0160 ds d9820160 edi 0 esi dc062298 ebp d2a58828 esp d2a58800 ebx de453000 edx d2a58de0 ecx 1 eax 0 trapno e err 2 eip fe82ca58 cs 158 eflags 10282 uesp fe89ab0d ss 0 gdt fec1f2f002cf idt fec1f5c007ff ldt 140 task 150 cr0 8005003b cr2 0 cr3 4cb3000 cr4 6d8
14.1.3. Examining the Message Buffer The kernel keeps a cyclic buffer of the recent kernel messages. In this buffer we can observe the messages up to the time of the panic. The ::msgbuf dcmd shows the contents of the buffer.
> ::msgbuf MESSAGE /pseudo/zconsnex@1/zcons@5 (zcons5) online /pseudo/zconsnex@1/zcons@6 (zcons6) online /pseudo/zconsnex@1/zcons@7 (zcons7) online pseudo-device: ramdisk1024 ... panic[cpu0]/thread=d2a58de0: BAD TRAP: type=e (#pf Page fault) rp=d2a587c8 addr=0 occurred in module "unix" due to a NULL pointer dereference
sched: #pf Page fault Bad kernel fault at addr=0x0 pid=0, pc=0xfe82ca58, sp=0xfe89ab0d, eflags=0x10282 cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6d8<xmme,fxsr,pge,mce,pse,de> cr2: 0 cr3: 4cb3000 gs: fe8301b0 fs: fec30000 es: fe8d0160 ds: d9820160 edi: 0 esi: dc062298 ebp: d2a58828 esp: d2a58800 ebx: de453000 edx: d2a58de0 ecx: 1 eax: 0 trp: e err: 2 eip: fe82ca58 cs: 158 efl: 10282 usp: fe89ab0d ss: 0 ...
14.1.4. Obtaining a Stack Trace of the Running Thread We can obtain a stack backtrace of the current thread by using the $C command. Note that the displayed arguments to each function are not necessarily accurate. On each platform, the meaning of the shown arguments is as follows:
-
SPARC. The values of the arguments if they are available from a saved stack frame, assuming they are not overwritten by use of registers during the called function. With SPARC architectures, a function's input argument registers are sometimes saved on the way out of a functionif the input registers are reused during the function, then values of the input arguments are overwritten and lost. -
x86. Accurate values of the input arguments. Input arguments are always saved onto the stack and can be accurately displayed -
x64. The values of the arguments, assuming they are available. As with the SPARC architectures, input arguments are passed in registers and may be overwritten.
> $C d2a58828 atomic_add_32+8(0) d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0) d2a58880 nfs4_inactive+0x41() d2a5889c fop_inactive+0x15(dc1c29c0, 0) d2a588b0 vn_rele+0x4b(dc1c29c0) d2a588c0 snf_smap_desbfree+0x59(dda94080) d2a588dc dblk_lastfree_desb+0x13(de45b520, d826fb40) d2a588f4 dblk_decref+0x4e(de45b520, d826fb40) d2a58918 freemsg+0x69(de45b520) d2a5893c FreeTxSwPacket+0x3b(d38b84f0) d2a58968 CleanTxInterrupts+0xb4(d2f9cac0) d2a589a4 e1000g_send+0xf6(d2f9cac0, d9ffba00) d2a589c0 e1000g_m_tx+0x22() d2a589dc dls_tx+0x16(d4520f68, d9ffba00) d2a589f4 str_mdata_fastpath_put+0x1e(d3843f20, d9ffba00) d2a58a40 tcp_send_data+0x62d(db0ecac0, d97ee250, d9ffba00) d2a58aac tcp_send+0x6b6(d97ee250, db0ecac0, 564, 28, 14, 0) d2a58b40 tcp_wput_data+0x622(db0ecac0, 0, 0) d2a58c28 tcp_rput_data+0x2560(db0ec980, db15bd20, d2d45f40) d2a58c40 tcp_input+0x3c(db0ec980, db15bd20, d2d45f40) d2a58c78 squeue_enter_chain+0xe9(d2d45f40, db15bd20, db15bd20, 1, 1) d2a58cec ip_input+0x658(d990e554, d3164010, 0, e) d2a58d40 i_dls_link_ether_rx+0x156(d4523db8, d3164010, db15bd20) d2a58d70 mac_rx+0x56(d3520200, d3164010, db15bd20) d2a58dac e1000g_intr+0xa6(d2f9cac0, 0) d2a58ddc intr_thread+0x122()
14.1.5. Which Process? If the stack trace is of a kernel housekeeping or interrupt thread, the process reported for the thread will be that of p0"sched." The process pointer for the thread can be obtained with ::tHRead, and ::ps will then display summary information about that process. In this example, the thread is an interrupt thread (as indicated by the top entry in the stack from $C), and the process name maps to sched.
> d2a58de0::thread -p ADDR PROC LWP CRED d2a58de0 fec1d280 0 d9d1cf38 > fec1d280::ps -t S PID PPID PGID SID UID FLAGS ADDR NAME R 0 0 0 0 0 0x00000001 fec1d280 sched T t0 <TS_STOPPED>
14.1.6. Disassembling the Suspect Code Once we've located the thread of interest, we often learn more about what happened by disassembling the target and looking at the instruction that reportedly caused the panic. MDB's ::dis dcmd will disassemble the code around the target instruction that we extract from the stack backtrace.
> $C d2a58828 atomic_add_32+8(0) d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0) d2a58880 nfs4_inactive+0x41() d2a5889c fop_inactive+0x15(dc1c29c0, 0) d2a588b0 vn_rele+0x4b(dc1c29c0) ... > nfs4_async_inactive+0x3b::dis nfs4_async_inactive+0x1a: pushl $0x28 nfs4_async_inactive+0x1c: call +0x51faa30 <kmem_alloc> nfs4_async_inactive+0x21: addl $0x8,%esp nfs4_async_inactive+0x24: movl %eax,%esi nfs4_async_inactive+0x26: movl $0x0,(%esi) nfs4_async_inactive+0x2c: movl -0x4(%ebp),%eax nfs4_async_inactive+0x2f: movl %eax,0x4(%esi) nfs4_async_inactive+0x32: movl 0xc(%ebp),%edi nfs4_async_inactive+0x35: pushl %edi nfs4_async_inactive+0x36: call +0x51b7cdc <crhold> nfs4_async_inactive+0x3b: addl $0x4,%esp nfs4_async_inactive+0x3e: movl %edi,0x8(%esi) nfs4_async_inactive+0x41: movl $0x4,0xc(%esi) nfs4_async_inactive+0x48: leal 0xe0(%ebx),%eax nfs4_async_inactive+0x4e: movl %eax,-0x8(%ebp) nfs4_async_inactive+0x51: pushl %eax nfs4_async_inactive+0x52: call +0x51477f4 <mutex_enter> nfs4_async_inactive+0x57: addl $0x4,%esp nfs4_async_inactive+0x5a: cmpl $0x0,0xd4(%ebx) nfs4_async_inactive+0x61: je +0x7e <nfs4_async_inactive+0xdf> nfs4_async_inactive+0x63: cmpl $0x0,0xd0(%ebx) > crhold::dis crhold: pushl %ebp crhold+1: movl %esp,%ebp crhold+3: andl $0xfffffff0,%esp crhold+6: pushl $0x1 crhold+8: movl 0x8(%ebp),%eax crhold+0xb: pushl %eax crhold+0xc: call -0x6e0b8 <atomic_add_32> crhold+0x11: movl %ebp,%esp crhold+0x13: popl %ebp crhold+0x14: ret > atomic_add_32::dis atomic_add_32: movl 0x4(%esp),%eax atomic_add_32+4: movl 0x8(%esp),%ecx atomic_add_32+8: lock addl %ecx,(%eax) atomic_add_32+0xb: ret
14.1.7. Displaying General-Purpose Registers In this example, the system had a NULL pointer reference at atomic_add_ 32+8(0). The faulting instruction was atomic, referencing the memory at the location pointed to by %eax. By looking at the registers at the time of the panic, we can see that %eax was indeed NULL. The next step is to attempt to find out why %eax was NULL.
> ::regs %cs = 0x0158 %eax = 0x00000000 %ds = 0xd9820160 %ebx = 0xde453000 %ss = 0x0000 %ecx = 0x00000001 %es = 0xfe8d0160 %edx = 0xd2a58de0 %fs = 0xfec30000 %esi = 0xdc062298 %gs = 0xfe8301b0 %edi = 0x00000000
%eip = 0xfe82ca58 atomic_add_32+8 %ebp = 0xd2a58828 %esp = 0xd2a58800
%eflags = 0x00010282 id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0 status=<of,df,IF,tf,SF,zf,af,pf,cf>
%uesp = 0xfe89ab0d %trapno = 0xe %err = 0x2
14.1.8. Navigating the Stack Backtrace The function prototype for atomic_add_32() reveals that the first argument is a pointer to the memory location to be added. Since this was an x86 machine, the arguments reported by the stack backtrace are known to be useful, and we can look to see where the NULL pointer was handed downin this case nfs4_async_inactive().
void atomic_add_32(volatile uint32_t *target, int32_t delta) { *target += delta; }
> atomic_add_32::dis atomic_add_32: movl 0x4(%esp),%eax atomic_add_32+4: movl 0x8(%esp),%ecx atomic_add_32+8: lock addl %ecx,(%eax) atomic_add_32+0xb: ret > $C d2a58828 atomic_add_32+8(0) d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0) d2a58880 nfs4_inactive+0x41() d2a5889c fop_inactive+0x15(dc1c29c0, 0) d2a588b0 vn_rele+0x4b(dc1c29c0) ...
> $C d2a58828 atomic_add_32+8(0) d2a58854 nfs4_async_inactive+0x3b(dc1c29c0, 0) d2a58880 nfs4_inactive+0x41() d2a5889c fop_inactive+0x15(dc1c29c0, 0) d2a588b0 vn_rele+0x4b(dc1c29c0) ... > nfs4_async_inactive+0x3b::dis nfs4_async_inactive+0x1a: pushl $0x28 nfs4_async_inactive+0x1c: call +0x51faa30 <kmem_alloc> nfs4_async_inactive+0x21: addl $0x8,%esp nfs4_async_inactive+0x24: movl %eax,%esi nfs4_async_inactive+0x26: movl $0x0,(%esi) nfs4_async_inactive+0x2c: movl -0x4(%ebp),%eax nfs4_async_inactive+0x2f: movl %eax,0x4(%esi) nfs4_async_inactive+0x32: movl 0xc(%ebp),%edi nfs4_async_inactive+0x35: pushl %edi nfs4_async_inactive+0x36: call +0x51b7cdc <crhold> nfs4_async_inactive+0x3b: addl $0x4,%esp nfs4_async_inactive+0x3e: movl %edi,0x8(%esi) nfs4_async_inactive+0x41: movl $0x4,0xc(%esi) nfs4_async_inactive+0x48: leal 0xe0(%ebx),%eax nfs4_async_inactive+0x4e: movl %eax,-0x8(%ebp) nfs4_async_inactive+0x51: pushl %eax nfs4_async_inactive+0x52: call +0x51477f4 <mutex_enter> nfs4_async_inactive+0x57: addl $0x4,%esp nfs4_async_inactive+0x5a: cmpl $0x0,0xd4(%ebx) nfs4_async_inactive+0x61: je +0x7e <nfs4_async_inactive+0xdf> nfs4_async_inactive+0x63: cmpl $0x0,0xd0(%ebx) ...
Looking at the disassembly, it appears that there is an additional function call, which is omitted from the stack backtrack (typically due to tail call compiler optimization). The call is to crhold(), passing the address of a credential structure from the arguments to nfs4_async_inactive(). Here we can see that crhold() does in fact call atomic_add_32().
/* * Put a hold on a cred structure. */ void crhold(cred_t *cr) { atomic_add_32(&cr->cr_ref, 1); }
> crhold::dis crhold: pushl %ebp crhold+1: movl %esp,%ebp crhold+3: andl $0xfffffff0,%esp crhold+6: pushl $0x1 crhold+8: movl 0x8(%ebp),%eax crhold+0xb: pushl %eax crhold+0xc: call -0x6e0b8 <atomic_add_32> crhold+0x11: movl %ebp,%esp crhold+0x13: popl %ebp crhold+0x14: ret
|