system popen -> exec fork waitpid

king523103

Category: Linux Programming

Article link: https://blog.csdn.net/king523103/article/details/46726941

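Applications usually run shell commands with popen or system. Both are C library functions built on fork, exec, and waitpid rather than system calls themselves, and reading their source makes the differences clear.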

popen

/* 
 *  popen.c     Written by W. Richard Stevens 
 */  
#include    <sys/wait.h>   
#include    <errno.h>   
#include    <fcntl.h>   
#include    "ourhdr.h"   
  
static pid_t    *childpid = NULL;  /* ptr to array allocated at run-time */  
static int      maxfd;  	   /* from our open_max(), {Prog openmax} */  
#define SHELL   "/bin/sh"   
  
FILE *  popen(const char *cmdstring, const char *type)  
{  
    int     i, pfd[2];  
    pid_t   pid;  
    FILE    *fp;  
  
    /* type must be exactly "r" or "w" */
    if ((type[0] != 'r' && type[0] != 'w') || type[1] != 0) {
        errno = EINVAL;     /* required by POSIX.2 */  
        return(NULL);  
    }  
    if (childpid == NULL) {         /* first call: allocate zeroed out array for child pids */
        maxfd = open_max();  
        if ( (childpid = calloc(maxfd, sizeof(pid_t))) == NULL)  
            return(NULL);  
    }  
  
    if (pipe(pfd) < 0)              /* int pipe(int pipefd[2]): pfd[0] is the read end, pfd[1] the write end */
        return(NULL);   /* errno set by pipe() */  
  
    if ( (pid = fork()) < 0)        /* the child inherits the parent's open descriptors, including pfd[0] and pfd[1] */
        return(NULL);   /* errno set by fork() */  
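    else if (pid == 0)                    /* child */
    {
        /* "r": the parent closes the write end and the child closes the read end;
           "w": the parent closes the read end and the child closes the write end */
        if (*type == 'r')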
        {  
            close(pfd[0]);                                 /* close the read end */
            if (pfd[1] != STDOUT_FILENO) 
            {  
                dup2(pfd[1], STDOUT_FILENO);               /* make standard output refer to the pipe's write end */
                close(pfd[1]);  
            }  
        } 
        else 
        {  
            close(pfd[1]);                                 /* close the write end */
            if (pfd[0] != STDIN_FILENO) 
            {  
                dup2(pfd[0], STDIN_FILENO);  
                close(pfd[0]);  
            }  
        }  
        /* close all descriptors in childpid[] */  
        for (i = 0; i < maxfd; i++)  
            if (childpid[ i ] > 0)  
                close(i);  
  
        execl(SHELL, "sh", "-c", cmdstring, (char *) 0);  
        _exit(127);  
    }  
    /* parent continues here */
    /* wrap the parent's end of the pipe in a stdio stream and return it */
    if (*type == 'r') {  
        close(pfd[1]);  
        if ( (fp = fdopen(pfd[0], type)) == NULL)  
            return(NULL);  
    } else {  
        close(pfd[0]);  
        if ( (fp = fdopen(pfd[1], type)) == NULL)  
            return(NULL);  
    }  
    /* remember which child belongs to this stream's descriptor, so that pclose can find the pid from fp and wait for it */
    childpid[fileno(fp)] = pid; 
    return(fp);  
}  
  
int  pclose(FILE *fp)  
{  
  
    int     fd, stat;  
    pid_t   pid;  
  
    if (childpid == NULL)  
        return(-1);     /* popen() has never been called */  
    /* look up the child's pid */
    fd = fileno(fp);  
    if ( (pid = childpid[fd]) == 0)  
        return(-1);     /* fp wasn't opened by popen() */  
  
    childpid[fd] = 0;  
    if (fclose(fp) == EOF)  
        return(-1);  
    /* wait for the child to terminate and collect its status */
    while (waitpid(pid, &stat, 0) < 0)  
        if (errno != EINTR)  
            return(-1); /* error other than EINTR from waitpid() */  
  
    return(stat);   /* return child's termination status */  
}

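The call sequence of popen is:

1. Call pipe to create one pipe, which yields two descriptors: pfd[0] for reading and pfd[1] for writing.

2. fork a child process.

3. Depending on the type ("r" or "w"), each process closes the end of the pipe it does not use.

4. The child redirects STDIN or STDOUT to its end of the pipe with dup2, then calls execl.

5. The parent wraps its end of the pipe in a stdio stream with fdopen and returns that FILE * (not the raw file descriptor).

pclose must call waitpid on the recorded child pid; without it the terminated child would remain a zombie.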

The system function

#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>
int system(const char * cmdstring)
{
    pid_t pid;
    int status;
    if(cmdstring == NULL){
         return (1);    /* nonzero: a command processor is available */
    }

    if((pid = fork())<0){
            status = -1;
    }
    else if(pid == 0){  /* child */
        execl("/bin/sh", "sh", "-c", cmdstring, (char *)0);
        _exit(127); /* reached only if execl fails */
    }
    else{               /* parent: wait for the child, retrying if interrupted */
        while(waitpid(pid, &status, 0) < 0){
             if(errno != EINTR){
                  status = -1;
                  break;
            }
        }
    }
    return status;
}

Call sequence:

1 fork creates a child process.
2 The child calls execl("/bin/sh", "sh", "-c", cmdstring, (char *)0);
3 The parent calls waitpid and waits for the child to exit.

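Return value:

When cmdstring is NULL, this implementation returns 1 (a nonzero value means a shell is available).

When fork or waitpid fails, it returns -1.

When execl fails (for example because /bin/sh cannot be executed), the child calls _exit(127), so the returned status carries exit code 127. The shell itself also exits with 127 when the command is not found.

Otherwise the return value is the wait status that waitpid stored for the shell, not a value returned by execl (execl returns only on failure). On Linux the status has two parts: bits 0-6 hold the number of the signal that terminated the child and bit 7 is the core-dump flag, while bits 8-15 hold the value the command passed to exit or returned from main. For example, if the command exits normally with 1, system returns 256.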

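The status reports how the child ended. There are three cases:

The program named by cmdstring called exit or returned from main.
The process was terminated by a signal.
The process was stopped by a signal (reported only when waitpid is called with WUNTRACED or the child is being traced).

Each case can be detected and decoded with the macros from <sys/wait.h>:

Normal termination (for example the child called exit(1) or main returned 1): test with WIFEXITED, extract the exit value with WEXITSTATUS.
Terminated by a signal: test with WIFSIGNALED, extract the signal number with WTERMSIG.
Stopped by a signal: test with WIFSTOPPED, extract the signal number with WSTOPSIG.

Also note that a conforming system blocks SIGCHLD and ignores SIGINT and SIGQUIT in the calling process while it waits; the simplified code above leaves that signal handling out.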

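From the code above:

popen creates one pipe more than system does and uses it to exchange data with the child.

system blocks after creating the child until the child has finished and then returns its status.

In terms of cost, popen can use more memory than system. The reason is copy-on-write: the parent of popen keeps running while the child exists, so it is more likely to write to pages that are still shared, whereas the parent of system is blocked in waitpid.

Both functions call fork, execl, and waitpid, which are covered one by one below.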

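The fork function

The fork function is backed in the kernel by the system call entry points sys_fork(), sys_clone(), and sys_vfork(). In the kernel version quoted here, all three end up in do_fork. Its most distinctive feature is that it is called once but returns twice.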

/*
 *  Ok, this is the main fork-routine.
 *
 * It copies the process, and if successful kick-starts
 * it and waits for it to finish using the VM if required.
 */
long do_fork(unsigned long clone_flags,
	      unsigned long stack_start,
	      struct pt_regs *regs, 
	      unsigned long stack_size,
	      int __user *parent_tidptr,
	      int __user *child_tidptr)
{
	p = dup_task_struct(current);	/* allocates the kernel stack, thread_info and task_struct (in the kernel source this call sits inside copy_process) */
	......
	/* create the child and copy the parent's information into it */
	p = copy_process(clone_flags, stack_start, regs, stack_size,
			child_tidptr, NULL, trace);
	/*
	 * Do this prior waking up the new thread - the thread pointer
	 * might get invalid after that point, if the thread exits quickly.
	 */
	if (!IS_ERR(p)) {
		...
		if (unlikely(clone_flags & CLONE_STOPPED)) 
		{
			/* We'll start up with an immediate SIGSTOP. */
			sigaddset(&p->pending.signal, SIGSTOP);
			set_tsk_thread_flag(p, TIF_SIGPENDING);
			__set_task_state(p, TASK_STOPPED);
		} 
		else {
			wake_up_new_task(p, clone_flags);
		}
		...
	} 
	else {
		nr = PTR_ERR(p);
	}
	return nr;
}

The copy_process function above does most of the work of copying the parent's state.

static struct task_struct *copy_process(unsigned long clone_flags,
					unsigned long stack_start,
					struct pt_regs *regs,
					unsigned long stack_size,
					int __user *child_tidptr,
					struct pid *pid,
					int trace)
{        
        .....	

	/* Perform scheduler related setup. Assign this task to a CPU. */
	sched_fork(p, clone_flags);

	retval = perf_counter_init_task(p);
	if (retval)
		goto bad_fork_cleanup_policy;

	if ((retval = audit_alloc(p)))                            /* allocate the audit context */
		goto bad_fork_cleanup_policy;
	/* copy all the process information */
	if ((retval = copy_semundo(clone_flags, p)))              /* System V semaphore undo list */
		goto bad_fork_cleanup_audit;
	if ((retval = copy_files(clone_flags, p)))                /* file descriptor table (open files) */
		goto bad_fork_cleanup_semundo;
	if ((retval = copy_fs(clone_flags, p)))                   /* filesystem info (current directory, root directory, ...) */
		goto bad_fork_cleanup_files;
	if ((retval = copy_sighand(clone_flags, p)))              /* signal handler table */
		goto bad_fork_cleanup_fs;
	if ((retval = copy_signal(clone_flags, p)))               /* per-process signal state (signal_struct) */
		goto bad_fork_cleanup_sighand;
	if ((retval = copy_mm(clone_flags, p)))                   /* mm_struct: sets up the page tables, which are either shared with the parent or copied */
		goto bad_fork_cleanup_signal;
	if ((retval = copy_namespaces(clone_flags, p)))           /* namespaces */
		goto bad_fork_cleanup_mm;
	if ((retval = copy_io(clone_flags, p)))                   /* I/O context */
		goto bad_fork_cleanup_namespaces;
	retval = copy_thread(clone_flags, stack_start, stack_size, p, regs);  /* thread state; fills in task_struct->thread */
	if (retval)
		goto bad_fork_cleanup_io;

	.....
	
	return p;

        ..... error handling
}

copy_thread is architecture specific, but in every case it copies the thread's execution context (registers, stack pointer, and so on).

arm:

int
copy_thread(unsigned long clone_flags, unsigned long stack_start,
	    unsigned long stk_sz, struct task_struct *p, struct pt_regs *regs)
{
	struct thread_info *thread = task_thread_info(p);
	struct pt_regs *childregs = task_pt_regs(p);

	*childregs = *regs;
	childregs->ARM_r0 = 0;
	childregs->ARM_sp = stack_start;

	memset(&thread->cpu_context, 0, sizeof(struct cpu_context_save));
	thread->cpu_context.sp = (unsigned long)childregs;
	thread->cpu_context.pc = (unsigned long)ret_from_fork;

	if (clone_flags & CLONE_SETTLS)
		thread->tp_value = regs->ARM_r3;

	return 0;
}

mips:

int copy_thread(unsigned long clone_flags, unsigned long usp,
	unsigned long unused, struct task_struct *p, struct pt_regs *regs)
{
	...
	if (is_fpu_owner())
		save_fp(p);

	if (cpu_has_dsp)
		save_dsp(p);
        ...
	childregs->regs[7] = 0;	/* Clear error flag */

	childregs->regs[2] = 0;	/* Child gets zero as return value */
	regs->regs[2] = p->pid;
        ....
	p->thread.reg29 = (unsigned long) childregs;
	p->thread.reg31 = (unsigned long) ret_from_fork;

       ....
	return 0;
}

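As shown above, dup_task_struct (called from copy_process in the kernel source) creates the new process, which cannot run yet, and copy_process then copies the parent's information into it. The child inherits:

Real user ID, real group ID, effective user ID, effective group ID
Supplementary group IDs, process group ID, session ID
Controlling terminal
Set-user-ID and set-group-ID flags
Current working directory and root directory
File mode creation mask
Signal mask and dispositions
Open file descriptors. Every descriptor the parent has open is duplicated into the child, and descriptors with the same number in parent and child refer to the same file structure in the kernel, so the reference count of that file structure is incremented.
Environment variables
Attached shared memory segments
Data, text, heap, and .bss segments. The text segment (the loaded executable code) is read-only in memory, so parent and child share it; the data, heap, and stack are logically private copies of the parent's.
Resource limits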

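Differences between child and parent:

The return value of fork
The process ID and the parent process ID
The child's tms_utime, tms_stime, tms_cutime, and tms_cstime (process times) are set to 0
File locks set by the parent are not inherited by the child
Pending alarms are cleared for the child
The child's set of pending signals is initialized to the empty set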

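Finally the new process is added to the kernel's run queue, which makes it eligible for scheduling.

The copy_thread function above shows why fork is called once but returns twice with different results. It copies the parent's saved registers (the pt_regs frame on the kernel stack) into the child, so both processes resume at the same point after the system call, and it sets the child's return-value register to 0 (ARM_r0 on arm, regs[2] on mips), while the parent receives the child's pid.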

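A naive fork would therefore be expensive: copying most of the parent's data, heap, and stack makes the call slow.

Linux optimizes fork with copy-on-write (COW). When fork creates the child, the parent's memory is not copied right away; a page is copied only when one of the processes writes to it. The actual cost of fork is copying the parent's page tables and creating a unique process descriptor for the child.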

Reference: http://blog.chinaunix.net/uid-24774106-id-3048281.html

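The exec functions

exec is really a family of functions. A successful exec replaces the calling process's data, text, heap, and stack with those of a new program and keeps only the process ID. From the user's point of view, a new program has replaced the old one inside the same process.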

Header file	#include <unistd.h>
Purpose	Execute a file
Prototypes	int execl(const char *path, const char *arg, ...)
	int execv(const char *path, char *const argv[])
	int execle(const char *path, const char *arg, ..., char *const envp[])
	int execve(const char *path, char *const argv[], char *const envp[])
	int execlp(const char *file, const char *arg, ...)
	int execvp(const char *file, char *const argv[])
Return value	On success: the function does not return
Return value	On error: returns -1 and records the cause in errno

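The suffixes tell the functions apart:

"l" and "v": the arguments are passed as a list (l) or as an array (v).

"p": the executable may be given by file name only, without a full path; it is looked up in the directories listed in the PATH environment variable.

"e": the caller supplies the environment for the new program through envp. Functions without this suffix pass on the caller's current environment.

Of these six functions, execve is the base (on Linux it is the system call); the other five are library wrappers around it.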

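After exec the new program keeps the following attributes of the calling process:

Environment variables (unless execle or execve supplies a new environment)
Process ID and parent process ID
Real user ID and real group ID
Supplementary group IDs
Process group ID
Session ID
Controlling terminal
Current working directory
Root directory
File mode creation mask
File locks
Process signal mask
Pending signals
Resource limits
tms_utime, tms_stime, tms_cutime, and tms_cstime values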

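What happens to open files depends on each descriptor's close-on-exec flag (FD_CLOEXEC). If the flag is set, the descriptor is closed when exec runs; otherwise it stays open. Unless the flag has been set explicitly with fcntl, the default is to keep descriptors open across exec, and this is what makes I/O redirection possible.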

waitpid

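As the system function shows, after fork creates a child and the child exits, the parent has to reclaim some of the child's resources. Until the parent does so, the child remains a zombie process.

When a child exits, the kernel sends SIGCHLD to the parent, but the default action for this signal is to ignore it, so by itself it cleans up nothing. The child's termination status stays in the kernel until the parent calls wait or waitpid; that call hands the status to the parent and releases the resources the child still holds.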
