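Zombie process: when a process creates a child with fork and the child exits, but the parent does not call wait or waitpid to collect the child's status, the child's process descriptor stays in the system. Such a process is called a zombie (defunct) process. When any process exits, the kernel releases all of its resources, including open files and memory, but it keeps a small amount of information about it: the process ID, the termination status of the process, and the amount of CPU time taken by the process. This information is released only when the parent retrieves it through wait / waitpid. That is where the problem arises: if the parent never calls wait / waitpid, the retained information is never released and the PID stays occupied. The number of PIDs available to the system is limited, so if zombies are produced in large numbers, the system can run out of PIDs and become unable to create new processes. This is the harm zombie processes cause, and it should be avoided.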
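From the system's point of view, there are two ways to deal with zombie processes:
1 Find the zombie's parent and kill the parent. The zombie then becomes an orphan, orphans are adopted by the init process, and init reaps the zombie.
2 Reboot the system, because a zombie process cannot be killed (it has already exited).
The following test shows this: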
[root@limt ~]# ps -ef|grep 21165
root 21165 7459 0 05:51 pts/1 00:00:00 ./a.out
root 21166 21165 0 05:51 pts/1 00:00:00 [a.out] <defunct>
root 21190 8866 0 05:52 pts/3 00:00:00 grep 21165
[root@limt ~]# kill 21165
[root@limt ~]#
[root@limt ~]#
[root@limt ~]# ps -ef|grep 21165 //after the parent is killed, the zombie child disappears too
root 21196 8866 0 05:52 pts/3 00:00:00 grep 21165
[root@limt ~]# ps -ef|grep 16704
root 16704 16703 0 04:06 pts/1 00:00:00 [a.out] <defunct>
root 16719 8866 0 04:06 pts/3 00:00:00 grep 16704
[root@limt ~]#
[root@limt ~]#
[root@limt ~]# kill -9 16704
[root@limt ~]#
[root@limt ~]# ps -ef|grep 16704
root 16704 16703 0 04:06 pts/1 00:00:00 [a.out] <defunct>
root 16725 8866 0 04:06 pts/3 00:00:00 grep 16704
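</pre><p></p><p>From the developer's point of view, there are two ways to avoid zombie processes: 1 Handle it with the signal function: the parent receives a SIGCHLD signal every time a child exits, and the handler can call wait / waitpid. 2 Call fork twice, so that the child's parent becomes the init process; many daemons are implemented this way. The simple programs below demonstrate how a zombie process comes about:</p><pre code_snippet_id="554626" snippet_file_name="blog_20141217_3_6668512" name="code" class="cpp">I The parent forks a child and exits immediately without calling waitpid, while the child sleeps for 120 seconds and then exits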
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
    pid_t pid = fork();
    if (pid < 0) {              /* fork failed: no child was created */
        perror("fork");
        return EXIT_FAILURE;
    }
    if (pid == 0) {             /* child: outlives the parent */
        fprintf(stderr, "I am child,%d\n", (int)getpid());
        sleep(120);
        fprintf(stderr, "Child exits\n");
        return EXIT_SUCCESS;
    }
    /* parent: exits immediately without calling waitpid */
    fprintf(stderr, "I am parent,%d\n", (int)getpid());
    fprintf(stderr, "parent exits\n");
    return EXIT_SUCCESS;
}
[root@limt ~]# gcc Zom.c
[root@limt ~]# ./a.out
I am parent,15019
parent exits
I am child,15020
The output shows that the parent exits first, while the child goes on to run sleep
[root@limt ~]# ps -ef|grep 15019 //check the parent
root 15046 8866 0 03:29 pts/3 00:00:00 grep 15019
[root@limt ~]#
[root@limt ~]# ps -ef|grep 15020 //check the child
root 15020 1 0 03:29 pts/1 00:00:00 ./a.out
root 15056 8866 0 03:29 pts/3 00:00:00 grep 15020
ps shows that the parent is gone and that the child's parent PID is now 1, that is, the init process
II The parent forks a child, calls waitpid, and then exits, while the child sleeps for 120 seconds and then exits
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
    pid_t pid = fork();
    if (pid < 0) {              /* fork failed: no child was created */
        perror("fork");
        return EXIT_FAILURE;
    }
    if (pid == 0) {             /* child */
        fprintf(stderr, "I am child,%d\n", (int)getpid());
        sleep(120);
        fprintf(stderr, "Child exits\n");
        return EXIT_SUCCESS;
    }
    /* parent: blocks in waitpid until the child exits, which reaps it */
    fprintf(stderr, "I am parent,%d\n", (int)getpid());
    if (waitpid(pid, NULL, 0) < 0)
        perror("waitpid");
    fprintf(stderr, "parent exits\n");
    return EXIT_SUCCESS;
}
[root@limt ~]# gcc Zom.c
[root@limt ~]# ./a.out
I am parent,15187
I am child,15188
Child exits
parent exits
The output shows that the child exits first; the parent waits until the child has exited before exiting itself
[root@limt ~]# ps -ef|grep 15187 //check the parent
root 15187 7459 0 03:32 pts/1 00:00:00 ./a.out
root 15188 15187 0 03:32 pts/1 00:00:00 ./a.out
root 15197 8866 0 03:33 pts/3 00:00:00 grep 15187
[root@limt ~]# ps -ef|grep 15188 //check the child
root 15188 15187 0 03:32 pts/1 00:00:00 ./a.out
root 15207 8866 0 03:33 pts/3 00:00:00 grep 15188
III The parent forks a child, calls waitpid, and then calls sleep, while the child sleeps for 40 seconds and then exits
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
    pid_t pid = fork();
    if (pid < 0) {              /* fork failed: no child was created */
        perror("fork");
        return EXIT_FAILURE;
    }
    if (pid == 0) {             /* child */
        fprintf(stderr, "I am child,%d\n", (int)getpid());
        sleep(40);
        fprintf(stderr, "Child exits\n");
        return EXIT_SUCCESS;
    }
    /* parent: waitpid blocks for about 40 seconds, then the parent sleeps */
    fprintf(stderr, "I am parent,%d\n", (int)getpid());
    if (waitpid(pid, NULL, 0) < 0)
        perror("waitpid");
    fprintf(stderr, "sleep....\n");
    sleep(120);
    fprintf(stderr, "parent exits\n");
    return EXIT_SUCCESS;
}
[root@limt ~]# ./a.out
I am parent,15673
I am child,15674
Child exits //printed after 40 seconds
sleep.... //printed after 40 seconds
parent exits
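The output shows that the parent's sleep did not start right away but only after the child had finished; the parent was in fact blocked waiting for waitpid to return
[root@limt ~]# ps -ef|grep 15673 //within the first 40 seconds of the run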
root 15673 7459 0 03:43 pts/1 00:00:00 ./a.out
root 15674 15673 0 03:43 pts/1 00:00:00 ./a.out
root 15681 8866 0 03:44 pts/3 00:00:00 grep 15673
[root@limt ~]# ps -ef|grep 15674 //within the first 40 seconds of the run
root 15674 15673 0 03:43 pts/1 00:00:00 ./a.out
root 15692 8866 0 03:44 pts/3 00:00:00 grep 15674
[root@limt ~]#
[root@limt ~]# ps -ef|grep 15673 //after the first 40 seconds
root 15673 7459 0 03:43 pts/1 00:00:00 ./a.out
root 15725 8866 0 03:44 pts/3 00:00:00 grep 15673
[root@limt ~]# ps -ef|grep 15674 //after the first 40 seconds, the child is gone
root 15727 8866 0 03:44 pts/3 00:00:00 grep 15674
[root@limt ~]#
[root@limt ~]# ps -ef|grep 15673 //after the parent's 120-second sleep has ended
root 15798 8866 0 03:46 pts/3 00:00:00 grep 15673
IV The parent forks a child and, instead of calling waitpid, sleeps for 120 seconds, while the child sleeps for 1 second and then exits
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
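int main(int argc, char *argv[])
{
    pid_t pid = fork();
    if (pid < 0) {              /* fork failed: no child was created */
        perror("fork");
        return EXIT_FAILURE;
    }
    if (pid == 0) {             /* child: exits long before the parent */
        fprintf(stderr, "I am child,%d\n", (int)getpid());
        sleep(1);
        fprintf(stderr, "Child exits\n");
        return EXIT_SUCCESS;
    }
    /* parent: never calls waitpid, so the exited child stays a zombie
     * for as long as the parent is alive */
    fprintf(stderr, "I am parent,%d\n", (int)getpid());
    fprintf(stderr, "sleep....\n");
    sleep(120);
    fprintf(stderr, "parent exits\n");
    return EXIT_SUCCESS;
}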
[root@limt ~]# ./a.out
I am parent,16026
sleep....
I am child,16027
Child exits
parent exits
The output shows that the child exits before the parent
[root@limt ~]# ps -ef|grep 16026 //check the parent
root 16026 7459 0 03:51 pts/1 00:00:00 ./a.out
root 16027 16026 0 03:51 pts/1 00:00:00 [a.out] <defunct>
root 16039 8866 0 03:51 pts/3 00:00:00 grep 16026
[root@limt ~]#
[root@limt ~]# ps -ef|grep 16027 //check the child: it is in the zombie state
root 16027 16026 0 03:51 pts/1 00:00:00 [a.out] <defunct>
root 16046 8866 0 03:51 pts/3 00:00:00 grep 16027
[root@limt ~]# top //top shows one zombie process
top - 04:02:20 up 2:46, 4 users, load average: 0.21, 0.15, 0.15
Tasks: 280 total, 1 running, 278 sleeping, 0 stopped, 1 zombie
Cpu(s): 1.2%us, 0.6%sy, 0.0%ni, 97.4%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2050752k total, 1819480k used, 231272k free, 141876k buffers
Swap: 6291448k total, 0k used, 6291448k free, 775368k cached
[root@limt ~]# ps -ef|grep 16027 //after 120 seconds, once the parent has exited, the zombie is gone
root 16513 8866 0 04:01 pts/3 00:00:00 grep 16027