Author: uNDeaD
Email: someonebw@gmail.com
Blog: http://blog.csdn.net/undead
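Please credit the source when reposting.
Dynamically observing a process in a production environment!
Running pstack on the process showed the following problem: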
----------------- lwp# 757 / thread# 757 --------------------
ffffffff07413f58 startTaskHook(), exit value = 0x0000000000000000
** zombie (exited, not detached, not yet joined) **
----------------- lwp# 758 / thread# 758 --------------------
ffffffff07213f58 startTaskHook(), exit value = 0x0000000000000000
** zombie (exited, not detached, not yet joined) **
----------------- lwp# 759 / thread# 759 --------------------
ffffffff06f13f58 startTaskHook(), exit value = 0x0000000000000000
** zombie (exited, not detached, not yet joined) **
----------------- lwp# 760 / thread# 760 --------------------
ffffffff06d13f58 startTaskHook(), exit value = 0x0000000000000000
** zombie (exited, not detached, not yet joined) **
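zombie: zombie threads had actually turned up inside the process.
My guess was that something went wrong when the program initialized its threads!
Dump a core of the process with gcore and take a look at it in the debugger!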
> startTaskHook::dis
groupsend.so`startTaskHook: save %sp, -0xc0, %sp
groupsend.so`startTaskHook+4: call +0x105684
groupsend.so`startTaskHook+8: mov %i0, %o0
groupsend.so`startTaskHook+0xc: return %i7 + 8
groupsend.so`startTaskHook+0x10:clr %o0
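With the function located, I passed the result to the developers, who quickly tracked down the cause: the thread was created without its parameters being set. That matches the pstack output: a thread that is left joinable (the default) and is never joined or detached stays a zombie after it exits.
The relevant OpenSolaris source is attached below.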
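The state pstack reports ("exited, not detached, not yet joined") can be avoided in two ways: create the thread detached, or join it after it finishes. The following is only a minimal sketch in C under that assumption. The names task, spawn_detached and spawn_and_join are hypothetical and do not come from groupsend.so, and the post does not show the fix the developers actually applied.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the real start routine (startTaskHook). */
void *task(void *arg)
{
    int *done = arg;
    *done = 1;              /* record that the task ran */
    return NULL;
}

/*
 * Option 1: create the thread detached. Its resources are reclaimed
 * as soon as it exits, so it never lingers as a zombie.
 * Returns 0 on success or an errno-style error code.
 */
int spawn_detached(void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    assert(fn != NULL);
    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;
    err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (err == 0)
        err = pthread_create(&tid, &attr, fn, arg);
    (void) pthread_attr_destroy(&attr);
    return err;
}

/*
 * Option 2: keep the default (joinable) state and join the thread,
 * which releases it once it has exited.
 * Returns 0 on success or an errno-style error code.
 */
int spawn_and_join(void *(*fn)(void *), void *arg)
{
    pthread_t tid;
    int err;

    assert(fn != NULL);
    err = pthread_create(&tid, NULL, fn, arg);
    if (err != 0)
        return err;
    return pthread_join(tid, NULL);
}
```

Which option fits depends on whether the creator needs the thread's exit value: joining keeps it available, detaching discards it.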
-------------
#pragma weak _pthread_create = pthread_create
int
pthread_create(pthread_t *thread, const pthread_attr_t *attr,
	void * (*start_routine)(void *), void *arg)
{
	ulwp_t *self = curthread;
	const thrattr_t *ap = attr? attr->__pthread_attrp : def_thrattr();
	const pcclass_t *pccp;
	long flag;
	pthread_t tid;
	int error;

	update_sched(self);

	if (ap == NULL)
		return (EINVAL);

	/* validate explicit scheduling attributes */
	if (ap->inherit == PTHREAD_EXPLICIT_SCHED &&
	    (ap->policy == SCHED_SYS ||
	    (pccp = get_info_by_policy(ap->policy)) == NULL ||
	    ap->prio < pccp->pcc_primin || ap->prio > pccp->pcc_primax))
		return (EINVAL);

	flag = ap->scope | ap->detachstate | ap->daemonstate | THR_SUSPENDED;
	error = _thrp_create(ap->stkaddr, ap->stksize, start_routine, arg,
	    flag, &tid, ap->guardsize);
	if (error == 0) {
		if (ap->inherit == PTHREAD_EXPLICIT_SCHED &&
		    (ap->policy != self->ul_policy ||
		    ap->prio != (self->ul_epri? self->ul_epri : self->ul_pri)))
			/*
			 * The SUSv3 specification requires pthread_create()
			 * to fail with EPERM if it cannot set the scheduling
			 * policy and parameters on the new thread.
			 */
			error = _thr_setparam(tid, ap->policy, ap->prio);
		if (error) {
			/*
			 * We couldn't determine this error before
			 * actually creating the thread. To recover,
			 * mark the thread detached and cancel it.
			 * It is as though it was never created.
			 */
			ulwp_t *ulwp = find_lwp(tid);
			if (ulwp->ul_detached == 0) {
				ulwp->ul_detached = 1;
				ulwp->ul_usropts |= THR_DETACHED;
				(void) __lwp_detach(tid);
			}
			ulwp->ul_cancel_pending = 2; /* cancelled on creation */
			ulwp->ul_cancel_disabled = 0;
			ulwp_unlock(ulwp, self->ul_uberdata);
		} else if (thread) {
			*thread = tid;
		}
		(void) thr_continue(tid);
	}

	/* posix version expects EAGAIN for lack of memory */
	if (error == ENOMEM)
		error = EAGAIN;
	return (error);
}
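In the source above, the detach state comes straight from the attributes (ap->detachstate), and a NULL attr falls back to the defaults. A small sketch in C of how one might confirm that a default-initialized attribute object is joinable; the function name default_is_joinable is hypothetical and is used here only for illustration.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Returns 1 if a freshly initialized attribute object is joinable,
 * 0 if it is detached, and -1 if the attribute calls fail.
 */
int default_is_joinable(void)
{
    pthread_attr_t attr;
    int state;
    int err;

    if (pthread_attr_init(&attr) != 0)
        return -1;
    err = pthread_attr_getdetachstate(&attr, &state);
    (void) pthread_attr_destroy(&attr);
    if (err != 0)
        return -1;

    /* POSIX defines only these two detach states. */
    assert(state == PTHREAD_CREATE_JOINABLE ||
        state == PTHREAD_CREATE_DETACHED);
    return state == PTHREAD_CREATE_JOINABLE ? 1 : 0;
}
```

POSIX specifies PTHREAD_CREATE_JOINABLE as the default, which is why a thread created with default attributes has to be joined or detached explicitly.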