[Linux05 - Process Control] Process Creation, Process Waiting, Process Exit, and Process Program Replacement (with a Simple Shell Implementation)

周杰偷奶茶

Last modified: 2023-01-25 21:37:13

First published: 2023-01-19 16:40:41

Article link: https://blog.csdn.net/BaconZzz/article/details/128736118

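Preface

This installment covers process control.

My knowledge is limited, so corrections are welcome!

Process control consists of four main parts:

Process creation
Process exit
Process waiting
Process program replacement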

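Process creation

How to create a process

With fork.

#`fork`

What it is

The function that creates a child process. (Its usage has been covered before.)

Why use it

To have a child process run part of the parent's code (for example, creating a child to wait on a pending request)
To have a child process run some other program

How it creates a process

Create a task_struct object for the child
Copy most attributes of the parent's task_struct object into the child's task_struct object
Create a process address space for the child
Give the child's address space the same code and data as the parent's (shared at first rather than physically copied; see copy-on-write below)
Create and set up the page tables
Put the child on the process list
…
Return the pid (by this point the core work is already done)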

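#Copy-on-write

After the child is created, parent and child initially share the same code and data. The shared pages are mapped read-only, so when either side tries to write, a page fault occurs and the kernel performs a copy-on-write: it gives the writer its own private copy of the page being written, and only then does the write go through.

What is the benefit?

No write: code and data stay shared, no new memory is allocated and nothing is copied, so it is efficient. A big win.
Write: memory is allocated and data is copied only for what is actually written. No loss.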

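[Both if and else entered? Two return values?]

fork felt strange when we first used it: how can both the if and the else if branch run, and how can one function return two values? Now we can explain it.

Before fork returns the pid, its core code (the work of creating the child) has already finished, and parent and child have split into two execution flows. Both of them execute whatever is left of fork.

Both parent and child return from fork. Each returns once with its own value; it is not one call producing two return values.
Parent and child each enter the if / else if / else branch that matches their own return value. Each process takes one branch; no single process takes both.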

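Process exit

#Exit code

What it is

Every main function we write ends with return 0;. What is it for?

We write code to get something done. But how do we know how well it went?

return 0; reports how that "something" went through a return value. (Much like the parent of a zombie process retrieving the child's exit information.)

That 0 is the exit code!

Exit code: an identifier for a process's exit information

By convention 0 means success and non-zero means failure (different non-zero values can identify different errors).

#The shell special parameter `?`

Strictly speaking ? is a special parameter of the shell rather than an environment variable. It always holds the exit code of the most recently finished command.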

int add(int x, int y)
{
    return x + y;
}

int main()
{
    int ret = add(1, 1) + 1;
    if(ret != 2)
        return 1;
    else 
        return 0;
}

[bacon@VM-12-5-centos 4]$ ./myproc 
[bacon@VM-12-5-centos 4]$ echo $?
1
[bacon@VM-12-5-centos 4]$ echo $?
0
[bacon@VM-12-5-centos 4]$ echo $?
0

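We added 1 to the result, so the answer is wrong and main returns 1.

Wait, why did the exit code change when we looked again?

? always holds the exit code of the most recently finished command, and echo is itself a command (a shell built-in in bash). It ran successfully, so it returned 0.

Bare exit codes are not very human-friendly, though, so they can be converted into descriptive messages. strerror describes errno values, and a program is free to use those same numbers as its exit codes.

`strerror`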

NAME
       strerror, strerror_r - return string describing error number

SYNOPSIS
       #include <string.h>

       char *strerror(int errnum);

Let's try it:

int main()
{   
    for(int i =0; i <100; ++i)
    {
        printf("[%d]: %s\n", i, strerror(i));
    }    
    return 0;
}

[bacon@VM-12-5-centos 4]$ ./myproc 
[0]: Success
[1]: Operation not permitted
[2]: No such file or directory
[3]: No such process
[4]: Interrupted system call
[5]: Input/output error
[6]: No such device or address
[7]: Argument list too long
[8]: Exec format error
[9]: Bad file descriptor
[10]: No child processes
//...

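How a process can exit

The code ran to completion and the result is correct: return 0;
The code ran to completion and the result is incorrect: return a non-zero value; (this is where the exit code matters)
The code did not finish because the program crashed: the exit code is meaningless

How to make a process exit

1. Return from `main`

We have been doing this all along, so there is nothing more to say.

2. Call the library function `exit`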

NAME
       exit - cause normal process termination

SYNOPSIS
       #include <stdlib.h>

       void exit(int status);

DESCRIPTION
       The exit() function causes normal process termination and the value of status & 0377 is returned to the parent
       (see wait(2))

Let's try it:

int main()
{       
    printf("hello world\n");
    
    exit(10);
}

[bacon@VM-12-5-centos 4]$ ./myproc 
hello world
[bacon@VM-12-5-centos 4]$ echo $?
10

3. Call the system call `_exit`

NAME
       _exit, _Exit - terminate the calling process

SYNOPSIS
       #include <unistd.h>

       void _exit(int status);

       #include <stdlib.h>

       void _Exit(int status);

   Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

       _Exit():
           _XOPEN_SOURCE >= 600 || _ISOC99_SOURCE ||
           _POSIX_C_SOURCE >= 200112L;
           or cc -std=c99

DESCRIPTION
       The function _exit() terminates the calling process "immediately".  Any
       open file descriptors belonging to the process are closed; any children
       of the process are inherited by process 1, init, and the process's par‐
       ent is sent a SIGCHLD signal.

Let's try it:

int main()
{
    printf("hello world\n");
    
   _exit(20);
}

[bacon@VM-12-5-centos 4]$ ./myproc 
hello world
[bacon@VM-12-5-centos 4]$ echo $?
20

They look about the same, don't they? Underneath, exit is implemented with _exit.

But is there really no difference? Let's look at an example.

int main()
{
    printf("hello world");
    
    sleep(2);
  
    exit(10);
}

[bacon@VM-12-5-centos 4]$ ./myproc 
hello world[bacon@VM-12-5-centos 4]$

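exit: because there is no \n to flush the buffer, the program sleeps for 2 seconds first, and the data is printed only when the buffer is flushed as the program exits.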

int main()
{
    printf("hello world");
    
    sleep(2);
  
    _exit(20);
}

[bacon@VM-12-5-centos 4]$ ./myproc 
[bacon@VM-12-5-centos 4]$

_exit: nothing was printed at all! So exit flushes the buffer when it terminates the process, and _exit does not.

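Why bother with a separate exit? Wouldn't a single _exit that flushes the buffer be nicer?

_exit is a system call and exit is a library function; the former is the lower-level one.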

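exit is implemented on top of _exit. If the buffer lived inside the operating system, _exit would be able to flush it too. But _exit does not flush it, so the buffer cannot be at the operating-system level (otherwise both would flush it). It can only live in user space.

That is why only the user-level exit can flush it. It is not that _exit refuses to flush; there simply is no such buffer on its turf for it to flush.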

*The details will be covered in the basic I/O post

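Process waiting

Why process waiting exists

As mentioned earlier, the resources held by a zombie process are never released on their own, which can leak memory. Process waiting solves this problem. Beyond that, the parent needs to know how the child's task went; by waiting, it reclaims the child's resources and retrieves the child's exit information.

Retrieve the child's exit information
Reclaim the child's resources

How to wait for a process

Let's start with the simple wait, just to see it in action

`pid_t wait(int* status)`

Purpose: wait for any child process
Return value: the pid of the child that was waited for on success, -1 on failure
Parameter: an output parameter that receives the child's exit status; pass NULL if you don't care about it (to keep the demo simple we won't use it yet)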

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
    pid_t id = fork();
    if(id == 0)
    {
        int cnt = 10;
        while(cnt)
        {
            printf("child process: pid=%d ppid=%d; on:%d\n", getpid(), getppid(), cnt--);
            sleep(1);
        }
        printf("child process exit now!\n");
        exit(0);
    }

    sleep(15);
    pid_t ret = wait(NULL);
    if(ret > 0) //wait returns the child's pid on success
        printf("wait success: %d\n", ret);
    sleep(5);
    return 0;
}

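A child is created and exits after 10 seconds. Five seconds after that, the parent's wait call picks up the child and reclaims its resources.

We can see the parent successfully waiting for the child and reclaiming its resources. (The exit information would come back through status; since NULL was passed, we are not looking at it yet.)

What if I want to wait for a specific process? Look at the waitpid system call.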

`pid_t waitpid`

NAME
       waitpid - wait for process to change state

SYNOPSIS
       #include <sys/types.h>
       #include <sys/wait.h>

       pid_t waitpid(pid_t pid, int *status, int options);

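Purpose: wait for a specific child process or for any child process
Return value: the pid of the child that was waited for, or -1 on failure
Parameters
- pid: if pid > 0, waitpid waits for the child whose pid equals pid; if pid == -1, it waits for any child
- status: stores the child's exit information as a bit field (output parameter)
- options: how to wait
  - 0: blocking wait (covered later)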

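A child process can exit in only three ways:

The code ran to completion and the result is correct
The code ran to completion and the result is incorrect
The code did not finish because the program crashed

That means status has to be able to represent all three. So how is the status bit field laid out?

#Bit layout of status

We only need the low 16 bits of status, and what they store depends on how the child ended:

Code ran to completion
- bits 8~15 (the second-lowest byte) store the process's exit code
- bits 0~7 (the lowest byte) are 0
Code terminated abnormally
- bits 0~6 (the low seven bits) store the terminating signal
- bit 7 is the core dump flag (ignore it for now)

How do we extract them? With a few bit operations: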

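//terminating signal: bits 0~6 (bit 7 is the core dump flag; ignore it for now)
//0x7F = 0111 1111b (AND keeps bits 0~6)
int sig = status & 0x7F; //named sig so it does not clash with signal() from <signal.h>

//exit code: bits 8~15
//shifting right by 8 moves bits 8~15 down to bits 0~7
//0xFF = 1111 1111b (AND keeps bits 0~7)
int exit_code = (status >> 8) & 0xFF;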

We have met terminating signals before (kill -9). Let's see which ones exist:

[bacon@VM-12-5-centos wait1]$ kill -l
 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL       5) SIGTRAP
 6) SIGABRT      7) SIGBUS       8) SIGFPE       9) SIGKILL     10) SIGUSR1
11) SIGSEGV     12) SIGUSR2     13) SIGPIPE     14) SIGALRM     15) SIGTERM
16) SIGSTKFLT   17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU     25) SIGXFSZ
26) SIGVTALRM   27) SIGPROF     28) SIGWINCH    29) SIGIO       30) SIGPWR
31) SIGSYS      34) SIGRTMIN    35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3
38) SIGRTMIN+4  39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7
58) SIGRTMAX-6  59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX

As for core dump, we will come back to it later.

Now that we know what each parameter means, let's use them.

Code ran to completion

int main()
{
    pid_t id = fork();
    if(id == 0)
    {
        int cnt = 5;
        while(cnt)
        {
            printf("child process: pid=%d ppid=%d; on:%d\n", getpid(), getppid(), cnt--);
            sleep(1);
        }
        printf("child process exit now!\n");
        exit(233);
    }

    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if(ret > 0) //waitpid returns the child's pid on success
        printf("wait success: pid=%d | exit code=%d | signal=%d\n", ret, (status >> 8) & 0xFF, status & 0x7F);
    sleep(5);
    return 0;
}

[bacon@VM-12-5-centos wait1]$ ./test 
child process: pid=22125 ppid=22124; on:5
child process: pid=22125 ppid=22124; on:4
child process: pid=22125 ppid=22124; on:3
child process: pid=22125 ppid=22124; on:2
child process: pid=22125 ppid=22124; on:1
child process exit now!
wait success: pid=22125 | exit code=233 | signal=0

Bits 8~15 (the second-lowest byte) hold the process's exit code: 233
Bits 0~7 (the lowest byte) are 0

Code terminated abnormally

int main()
{
    pid_t id = fork();
    if(id == 0)
    {
        int cnt = 5;
        while(cnt)
        {
            printf("child process: pid=%d ppid=%d; on:%d\n", getpid(), getppid(), cnt--);
       
            sleep(1);
            int a = 5;
            a /= 0; //abnormal termination: integer division by zero
        }
        printf("child process exit now!\n");
        exit(233);
    }
    
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if(ret > 0) //waitpid returns the child's pid on success
        printf("wait success: pid=%d | exit code=%d | signal=%d\n", ret, (status >> 8) & 0xFF, status & 0x7F);
    sleep(5);
    return 0;
}

[bacon@VM-12-5-centos wait1]$ ./test 
child process: pid=23423 ppid=23422; on:5
wait success: pid=23423 | exit code=0 | signal=8

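The low seven bits (0~6) hold the terminating signal: 8

In the kill -l list, signal 8 is SIGFPE. Despite its name ("floating-point exception"), it is also the signal raised for integer division by zero, so this checks out.

The essence of process waiting

We keep talking about retrieving the child's exit information, so

[After a child exits, where is its exit information kept?]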

struct task_struct {
	...

/* task state */
	int exit_state;
	int exit_code, exit_signal;
	int pdeath_signal;  /*  The signal sent when the parent dies  */
	
	...
}

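It is kept in the child's task_struct object. When the parent wants this information and calls waitpid(id, &status, 0);, the operating system reads it from the child's task_struct object and places it in status.

(wait/waitpid are system calls executed by the kernel, which has both the right and the ability to read the child's task_struct object.)

So the essence of waiting is: fetch the information from the child's task_struct object and put it into status.

#Macros for process waiting

After all that, extracting the information with bit operations feels tedious. The experts thought so too, so there are macros for it:

`WIFEXITED(status)`

Purpose: check whether the process exited normally (true if it did)

`WEXITSTATUS(status)`

Purpose: get the process's exit code (meaningful only when WIFEXITED(status) is true)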

int main()
{
    pid_t id = fork();
    if(id == 0)
    {
        int cnt = 5;
        while(cnt)
        {
            printf("child process: pid=%d ppid=%d; on:%d\n", getpid(), getppid(), cnt--);
            sleep(1);
        }
        printf("child process exit now!\n");
        exit(233);
    }

    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if(ret > 0)
    {
        if(WIFEXITED(status))
        {
            printf("exit code = %d\n", WEXITSTATUS(status));
        }
        else 
        {
            printf("error!");
            //...
        }
    }
    return 0;
}

[bacon@VM-12-5-centos wait1]$ ./test 
child process: pid=10945 ppid=10944; on:5
child process: pid=10945 ppid=10944; on:4
child process: pid=10945 ppid=10944; on:3
child process: pid=10945 ppid=10944; on:2
child process: pid=10945 ppid=10944; on:1
child process exit now!
exit code = 233

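With basic process waiting covered, let's talk about the "blocking wait" mentioned earlier.

Blocking and non-blocking waits

Where there is a blocking wait, there is naturally a non-blocking wait. An example shows the difference.

Zhang San calls Li Si to team up for a game: "Coming?" "I still have a bit of homework left, hang on."

Zhang San now has to wait for Li Si, and there are two ways to do it:

Stay on the line and do nothing but wait
Get on with his own things and call back every so often to ask whether Li Si is done

The phone call is the system call: it checks whether Li Si is ready to play.

The first is a blocking wait: the call does not return until Li Si is ready, and Zhang San can do nothing else in the meantime.

The second is a non-blocking wait: each call returns immediately, and Zhang San checks again after a while.

Repeating a non-blocking wait over and over is called polling.

The last parameter of waitpid controls the wait mode. 0 means a blocking wait, and there is a macro for the non-blocking one.

`WNOHANG`: non-blocking wait.

We have already seen blocking waits. Now let's look at non-blocking waits and polling, and learn more about how to use waitpid along the way.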

int main()
{
    pid_t id = fork();
    if(id == 0)
    {
        //child
        int cnt = 5;
        while(cnt)
        {
            printf("child process: pid=%d ppid=%d; on:%d\n", getpid(), getppid(), cnt--);
       
            sleep(1);
        }
        printf("child process exit now!\n");
        exit(233);
    }

    //parent
    //polling
    int status = 0;
    while(1)
    {
        pid_t ret = waitpid(id, &status, WNOHANG); //non-blocking wait
        
        //the waitpid call failed
        if(ret < 0)
        {
            printf("wait call failed!\n");
            break;
        }
        //the waitpid call succeeded
        else if(ret == 0) //not a failure; the child simply has not exited yet
        {
            printf("waitpid call success, but child process is still running...\n");
        }
        //the waitpid call succeeded
        else //ret == id: the child whose pid is id has been waited for
        {
            printf("wait success: exit code = %d | exit signal = %d\n", WEXITSTATUS(status), status & 0x7F);
            break;
        }
        sleep(1);
    }
    return 0;
}

[bacon@VM-12-5-centos poll]$ ./test 
waitpid call success, but child process is still running...
child process: pid=11013 ppid=11012; on:5
waitpid call success, but child process is still running...
child process: pid=11013 ppid=11012; on:4
child process: pid=11013 ppid=11012; on:3
waitpid call success, but child process is still running...
waitpid call success, but child process is still running...
child process: pid=11013 ppid=11012; on:2
waitpid call success, but child process is still running...
child process: pid=11013 ppid=11012; on:1
waitpid call success, but child process is still running...
child process exit now!
wait success: exit code = 233 | exit signal = 0

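But one question remains: isn't a blocking wait enough? What is a non-blocking wait good for?

The benefit of non-blocking waits

The parent can get on with its own work.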

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>

#define NUM 10

typedef void (*func_t)(); //func_t: pointer to a function that takes no arguments and returns void

func_t task2Solve[NUM]; //array of function pointers holding the tasks to run

//sample tasks
void task1()
{
    printf("solving task1\n");
}

void task2()
{
    printf("solving task2\n");
}

void task3()
{
    printf("solving task3\n");
}

void loadTask()
{
    memset(task2Solve, 0, sizeof(task2Solve));
    task2Solve[0] = task1;
    task2Solve[1] = task2;
    task2Solve[2] = task3;
}

int main()
{
    pid_t id = fork();
    assert(id != -1);
    if(id == 0)
    {
        //child
        int cnt = 5;
        while(cnt)
        {
            printf("child process: pid=%d ppid=%d; on:%d\n", getpid(), getppid(), cnt--);
       
            sleep(1);
        }
        printf("child process exit now!\n");
        exit(233);
    }

    //parent
    loadTask();
    int status = 0;
    while(1)
    {
        pid_t ret = waitpid(id, &status, WNOHANG); //non-blocking wait
        
        if(ret < 0)
        {
            printf("wait call failed!\n");
            break;
        }
        else if(ret == 0)
        {
            printf("waitpid call success, but child process is still running...\n");
            for(int i = 0; task2Solve[i] != NULL; ++i)
            {
                task2Solve[i](); //callback: the child has not exited yet, so the parent does its own work
            }
        }
        else
        {
            printf("wait success: exit code = %d | exit signal = %d\n", WEXITSTATUS(status), status & 0x7F);
            break;
        }
        sleep(1);
    }
    return 0;
}

[bacon@VM-12-5-centos poll]$ ./test 
waitpid call success, but child process is still running...
solving task1
solving task2
solving task3
child process: pid=14865 ppid=14864; on:5
waitpid call success, but child process is still running...
solving task1
solving task2
solving task3
child process: pid=14865 ppid=14864; on:4
child process: pid=14865 ppid=14864; on:3
waitpid call success, but child process is still running...
solving task1
solving task2
solving task3
waitpid call success, but child process is still running...
solving task1
solving task2
solving task3
child process: pid=14865 ppid=14864; on:2
waitpid call success, but child process is still running...
solving task1
child process: pid=14865 ppid=14864; on:1
solving task2
solving task3
waitpid call success, but child process is still running...
child process exit now!
solving task1
solving task2
solving task3
wait success: exit code = 233 | exit signal = 0

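Process program replacement (important)

To learn program replacement, we first have to answer one question:

Why create a child process?
- To have the child run part of the parent's code (part of the code the parent loaded from disk)
- To have the child run a different program (load another program from disk and run it)

The second case, "create a child and have it run a different program", is process program replacement.

A first look

The six program replacement functions: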

NAME
       execl, execlp, execle, execv, execvp, execvpe - execute a file

SYNOPSIS
       #include <unistd.h>

       extern char **environ;

       int execl(const char *path, const char *arg, ...);
       int execlp(const char *file, const char *arg, ...);
       int execle(const char *path, const char *arg,
                  ..., char * const envp[]);
       int execv(const char *path, char *const argv[]);
       int execvp(const char *file, char *const argv[]);
       int execvpe(const char *file, char *const argv[],char *const envp[]);

Let's pick the simple execl:

int execl(const char* path, const char* arg, ...)

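Purpose: load the specified program into memory and have the calling process run it

If you want me to run a program, you have to tell me where it is and then how to run it
Parameters
- path: the file to load
- arg: the command-line arguments
- ...: variadic argument list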

#include <stdio.h>
#include <unistd.h>

int main()
{
    printf("process running...\n");

    execl("/usr/bin/ls",  //what to run
            "ls", "--color=auto", "-a", "-l", NULL); //how to run it
    //the argument list of every exec function ends with NULL

    printf("process running done!\n");

    return 0;
}

[bacon@VM-12-5-centos substitution]$ ./test 
process running...
total 28
drwxrwxr-x  2 bacon bacon 4096 Jan 17 07:57 .
drwxrwxr-x 11 bacon bacon 4096 Jan 17 07:48 ..
-rw-rw-r--  1 bacon bacon   74 Jan 17 07:48 makefile
-rwxrwxr-x  1 bacon bacon 8408 Jan 17 07:57 test
-rw-rw-r--  1 bacon bacon  390 Jan 17 07:57 test.c

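Huh? Why didn't the last printf run? To answer that, we need to understand how program replacement works

How program replacement works

What it is

The essence of program replacement: the code and data of the specified program are loaded straight into the calling process, overwriting the code and data that were there.

[Does program replacement create a new process?]

No. It only loads new code and data over the old.

That explains why the last printf did not run: the process's code and data had already been replaced by the new program.

If the execl call fails, the new code and data are never loaded, the last printf is still there, and it prints as usual.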

int main()
{
    printf("process running...\n");
  
    execl("/usr/bin/lsfdsajidofjio",  //deliberately wrong path
            "ls", "--color=auto", "-a", "-l", NULL); 

    printf("process running done!\n");

    return 0;
}

[bacon@VM-12-5-centos substitution]$ ./test 
process running...
process running done!

execl returns only when it fails, and the value it returns is -1. Let's check whether the call really failed.

int main()
{
    //.c ==> exe ==> load ==> process ==> run ==> execute code
    printf("process running...\n");

    //load ==> execute code
    int ret = execl("/usr/bin/lsfdsajidofjio",  //what to run
            "ls", "--color=auto", "-a", "-l", NULL); //how to run it
    //the argument list of every exec function ends with NULL
    printf("ret = %d\n", ret);

    printf("process running done!\n");

    return 0;
}

[bacon@VM-12-5-centos substitution]$ ./test 
process running...
ret = -1
process running done!

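The result is as expected.

Why doesn't it return on success? Because once the replacement succeeds, a return value would be meaningless: the original code is gone, so nothing is left that could use it.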

Look at this code:

int main()
{
    //.c ==> exe ==> load ==> process ==> run ==> execute code
    printf("process running...\n");
    
    pid_t id = fork();
    assert(id != -1);

    if(id == 0)
    {
        sleep(1);
        //will the replacement here affect the parent?
        execl("/usr/bin/ls", "ls", "-a", "-l", "--color=auto", NULL);
        exit(1); //on success this code is overwritten and never runs, so reaching this line means the call failed and returned
    }
    
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);

    if(ret == id) printf("wait success: exit code = %d  |  sig = %d\n", ((status)>>8) & 0xFF, status & 0x7F);



    printf("process running done!\n");

    return 0;
}

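[Will program replacement in the child overwrite the parent's code and data too?]

No, because processes are independent.

Loading the new program counts as a write. At that point the child stops sharing the code and data it had in common with the parent and gets its own, while the parent's pages stay untouched. The replacement in the child does not affect the parent at all!

#Using the `exec` family of functions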

NAME
       execl, execlp, execle, execv, execvp, execvpe - execute a file

SYNOPSIS
       #include <unistd.h>

       extern char **environ;

       int execl(const char *path, const char *arg, ...);
       int execlp(const char *file, const char *arg, ...);
       int execle(const char *path, const char *arg, ..., char * const envp[]);

       int execv(const char *path, char *const argv[]);
       int execvp(const char *file, char *const argv[]);
       int execvpe(const char *file, char *const argv[],char *const envp[]);

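The names of these functions are meaningful:

l: pass the arguments as a list (one by one, as arg)
- p: pass the file parameter as a file name (searched for in PATH automatically)
- e: pass an environment variable array envp
v: pass the arguments as a vector (one array, argv)
- p: pass the file parameter as a file name (searched for in PATH automatically)
- e: pass an environment variable array envp

Before going through them, let's try replacing a process with a program of our own.

First, make builds only the first target by default, so the makefile can be written like this: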

.PHONY:all
all: test bin 


test:test.c
	gcc -o test -std=c99 test.c

bin:bin.c
	gcc -o bin -std=c99 bin.c

.PHONY:clean
clean:
	rm -f test
	rm -f bin

execl("./bin", "./bin", NULL);

[bacon@VM-12-5-centos substitution]$ ./test
process running...
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
wait success: exit code = 0  |  sig = 0
process running done!

`int execlp(const char *file, const char *arg, ...);`

The rest of the code is the same as in the execl example; only the exec call changes.

execlp("ls", "ls", "-a", "-l", "--color=auto", NULL);

`int execle(const char *path, const char *arg, ..., char * const envp[]);`

1. Pass custom environment variables (without the system ones)

//test.c
int main()
{
    printf("process running...\n");
    
    pid_t id = fork();
    assert(id != -1);

    if(id == 0)
    {
        sleep(1);
      
        char* const myenvp[] = { "MYENV=8848", NULL };
        execle("./bin", "bin",  NULL, myenvp); 

        exit(1);
    }
    
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);

    if(ret == id) printf("wait success: exit code = %d  |  sig = %d\n", ((status)>>8) & 0xFF, status & 0x7F);

    printf("process running done!\n");

    return 0;
}

//bin.c
int main()
{
    //system
    printf("PATH=%s\n", getenv("PATH"));
    printf("PWD=%s\n", getenv("PWD"));
    //custom
    printf("MYENV=%s\n", getenv("MYENV"));

    printf("This is the other .exe!\n");
    printf("This is the other .exe!\n");
    printf("This is the other .exe!\n");
    printf("This is the other .exe!\n");
    printf("This is the other .exe!\n");
    printf("This is the other .exe!\n");
    printf("This is the other .exe!\n");
    printf("This is the other .exe!\n");
    printf("This is the other .exe!\n");
    printf("This is the other .exe!\n");

    return 0;
}

[bacon@VM-12-5-centos substitution]$ ./test
process running...
PATH=(null)
PWD=(null)
MYENV=8848
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
wait success: exit code = 0  |  sig = 0
process running done!

We passed a custom environment, so the system variables are not available.

2. Pass the system environment

int main()
{
    printf("process running...\n");
    
    pid_t id = fork();
    assert(id != -1);

    if(id == 0)
    {
        sleep(1);
        
        extern char** environ;
        char* const myenvp[] = { "MYENV=8848", NULL };
        execle("./bin", "bin",  NULL, environ); 

        exit(1); 
    }
    
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);

    if(ret == id) printf("wait success: exit code = %d  |  sig = %d\n", ((status)>>8) & 0xFF, status & 0x7F);

    printf("process running done!\n");

    return 0;
}

[bacon@VM-12-5-centos substitution]$ ./test
process running...
PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/bacon/.local/bin:/home/bacon/bin
PWD=/home/bacon/linux/5-process_control/substitution
MYENV=(null)
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
wait success: exit code = 0  |  sig = 0
process running done!

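We passed the system environment, so the custom variable is not available.

In fact, even if we never pass the system environment explicitly, the child inherits it from the parent. How?

Recall that part of the process address space holds the environment variables; they are copied when the child's address space is created.

What if I want both the system variables and my custom one?

`int putenv(char* string)`

Purpose: change or add an environment variable (in effect, add it to the environment table)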

int main()
{
    printf("process running...\n");
    
    pid_t id = fork();
    assert(id != -1);

    if(id == 0)
    {
        sleep(1);
        
        extern char** environ;
        putenv((char*)"MYENV=8848"); //add the given variable to the environment table that environ points to
        execle("./bin", "bin",  NULL, environ); 

        exit(1); 
    }
    
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);

    if(ret == id) printf("wait success: exit code = %d  |  sig = %d\n", ((status)>>8) & 0xFF, status & 0x7F);

    printf("process running done!\n");

    return 0;
}

[bacon@VM-12-5-centos substitution]$ ./test
process running...
PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/bacon/.local/bin:/home/bacon/bin
PWD=/home/bacon/linux/5-process_control/substitution
MYENV=8848
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
This is the other .exe!
wait success: exit code = 0  |  sig = 0
process running done!

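The custom variable is added to the environment table that environ points to, and that table is what we pass when replacing the program. Now both the custom and the system variables are available.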

`int execv(const char *path, char *const argv[]);`

char* const myargv[] = {
            "ls", 
            "-a", 
            "-l", 
            "--color=auto", 
            NULL
        };
execv("/usr/bin/ls", myargv);

`int execvp(const char *file, char *const argv[]);`

…

`int execvpe(const char *file, char *const argv[], char *const envp[]);`

…

There is also one system call, and all of the functions above are wrappers around it.

`int execve(const char *filename, char *const argv[], char *const envp[]);`

NAME
       execve - execute program

SYNOPSIS
       #include <unistd.h>

       int execve(const char *filename, char *const argv[],
                  char *const envp[]);

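#Loader

On Linux, the exec family of functions is also called the loader.

The main function is loaded through the loader as well.

int main(int argc, char* argv[], char* env[])

int execle(const char* path, const char* arg, ..., char* const envp[]);

argc and argv come from arg, and env comes from envp.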

Application: `shell`

A simple shell implementation

makefile

myshell:myshell.c
	gcc -o $@ $^ -std=c99 #-DDEBUG #-DDEBUG: define the macro DEBUG

.PHONY:clean
clean:
	rm -f myshell

myshell.c

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <string.h>

#define NUM 1024
#define OPT_NUM 32


char lineCommand[NUM];
char* myargv[OPT_NUM];

int main()
{
    while(1)
    {
        //1. prompt
        printf("[%s@%s]# ", getenv("USER"), getenv("HOSTNAME"));
        fflush(stdout);
        
        //2. read the command line (dropping the \n the user typed)
        char* ret = fgets(lineCommand, sizeof(lineCommand) - 1, stdin); 
        assert(ret != NULL); 
        lineCommand[strlen(lineCommand) - 1] = 0; //drop the user's \n
    
        //*test: reading the command
        //printf("test: %s\n", lineCommand);
    
        //3. parse the command (split the string)
        //"ls -a -l" ==> "ls" "-a" "-l"
        myargv[0] = strtok(lineCommand, " ");
        //when no tokens are left strtok returns NULL, and myargv happens to need a NULL terminator, so...
        int i = 1;
        while(myargv[i++] = strtok(NULL, " "));
        if(myargv[0] == NULL) continue; //empty line: nothing to run
    
#ifdef DEBUG 
        //*test: command parsing
        for(int j = 0; myargv[j]; ++j) printf("myargv[%d] = %s\n", j, myargv[j]);
#endif

        //4. run the command
        pid_t id = fork();
        assert(id != -1);

        if(id == 0)
        {
            execvp(myargv[0], myargv);
            exit(1);
        }

        waitpid(id, NULL, 0);

    }

    return 0;
}

[bacon@VM-12-5-centos myshell]$ ./myshell 
[bacon@VM-12-5-centos]# ls
makefile  myshell  myshell.c
[bacon@VM-12-5-centos]# pwd
/home/bacon/linux/5-process_control/myshell

It has four main parts:

Print the prompt
Read the command
Parse the command (split the string)
Run the command (program replacement)

It basically runs, but there is a bug:

[bacon@VM-12-5-centos]# pwd
/home/bacon/linux/5-process_control/myshell
[bacon@VM-12-5-centos]# cd ..
[bacon@VM-12-5-centos]# pwd
/home/bacon/linux/5-process_control/myshell
[bacon@VM-12-5-centos]# cd ..
[bacon@VM-12-5-centos]# pwd
/home/bacon/linux/5-process_control/myshell

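Why does this happen? To find out, we first need to understand the current working directory (cwd) that pwd reports.

cwd (current working directory) is the directory a process is currently working in, and exe is the location of its executable on disk.
Since the cwd belongs to a process, the cwds of different processes are naturally independent of one another.

Our shell runs commands by "replacing the program inside a child process".

The child runs cd .. and changes the child's cwd (inherited from the parent, yet independent of it)
The child exits, and the changed cwd goes with it
The parent's cwd never changed, so it was all for nothing.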

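Now that we know the cause (the child's cwd does not affect the parent's), how do we fix it?

With a "built-in command". What is a built-in command? Think of it as a command that runs in the current process instead of in a newly created child.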

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <string.h>

#define NUM 1024
#define OPT_NUM 32


char lineCommand[NUM];
char* myargv[OPT_NUM];

int main()
{
    while(1)
    {
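        //prompt
        printf("[%s@%s]# ", getenv("USER"), getenv("HOSTNAME"));
        fflush(stdout);
        
        //read the command line (dropping the \n the user typed)
        char* ret = fgets(lineCommand, sizeof(lineCommand) - 1, stdin); 
        assert(ret != NULL); 
        lineCommand[strlen(lineCommand) - 1] = 0; //drop the user's \n
    
        //test: reading the command
        //printf("test: %s\n", lineCommand);
    
        //parse the command (split the string)
        //"ls -a -l" ==> "ls" "-a" "-l"
        myargv[0] = strtok(lineCommand, " ");

        //when no tokens are left strtok returns NULL, and myargv happens to need a NULL terminator, so...
        int i = 1;
        while(myargv[i++] = strtok(NULL, " ")); //the last round stores NULL in myargv[i] first, then the condition is false and the loop ends
        if(myargv[0] == NULL) continue; //empty line: nothing to run
    
        //built-in command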
        if(myargv[0] != NULL && strcmp(myargv[0], "cd") == 0)
        {
            if(myargv[1] != NULL) chdir(myargv[1]);
            continue;
        }

#ifdef DEBUG 
        //test: command parsing
        for(int j = 0; myargv[j]; ++j) printf("myargv[%d] = %s\n", j, myargv[j]);
#endif

        //run the command
        pid_t id = fork();
        assert(id != -1);

        if(id == 0)
        {
            execvp(myargv[0], myargv);
            exit(1);
        }

        waitpid(id, NULL, 0);

    }

    return 0;
}

[bacon@VM-12-5-centos myshell]$ ./myshell 
[bacon@VM-12-5-centos]# pwd
/home/bacon/linux/5-process_control/myshell
[bacon@VM-12-5-centos]# cd ..
[bacon@VM-12-5-centos]# pwd
/home/bacon/linux/5-process_control
[bacon@VM-12-5-centos]# cd ..
[bacon@VM-12-5-centos]# pwd
/home/bacon/linux

We can also implement one more built-in command, echo.

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <string.h>

#define NUM 1024
#define OPT_NUM 32


char  lineCommand[NUM];
char* myargv[OPT_NUM];
int   lastCode = 0;
int   lastSig = 0;

int main()
{
    while(1)
    {
        //prompt
        printf("[%s@%s]# ", getenv("USER"), getenv("HOSTNAME"));
        fflush(stdout);
        
        //read the command line (dropping the \n the user typed)
        char* ret = fgets(lineCommand, sizeof(lineCommand) - 1, stdin); 
        assert(ret != NULL); 
        lineCommand[strlen(lineCommand) - 1] = 0; //drop the user's \n
    
        //test: reading the command
        //printf("test: %s\n", lineCommand);
    
        //parse the command (split the string)
        //"ls -a -l" ==> "ls" "-a" "-l"
        myargv[0] = strtok(lineCommand, " ");

        //when no tokens are left strtok returns NULL, and myargv happens to need a NULL terminator, so...
        int i = 1;
        while(myargv[i++] = strtok(NULL, " "));
        if(myargv[0] == NULL) continue; //empty line: nothing to run
    
        //built-in commands
        if(myargv[0] != NULL && strcmp(myargv[0], "cd") == 0)
        {
            if(myargv[1] != NULL) chdir(myargv[1]);
            continue;
        }

        if(myargv[0] != NULL && myargv[1] != NULL && strcmp(myargv[0], "echo") == 0)
        {
            //"$?" is really expanded by the shell to the last exit status; this is only a demo, so the check is hard-coded
            if(strcmp(myargv[1], "$?") == 0)
            {
                printf("exitCode = %d  |  exitSig = %d\n", lastCode, lastSig);
            }
            else 
            {
                printf("%s\n", myargv[1]);
            }
            continue;
        }

#ifdef DEBUG 
        //test: command parsing
        for(int j = 0; myargv[j]; ++j) printf("myargv[%d] = %s\n", j, myargv[j]);
#endif

        //run the command
        pid_t id = fork();
        assert(id != -1);

        if(id == 0)
        {
            execvp(myargv[0], myargv);
            exit(1);
        }

        int status = 0;
        pid_t wait_ret = waitpid(id, &status, 0);
        assert(wait_ret > 0);

        lastCode = (status >> 8) & 0xFF;
        lastSig = status & 0x7F;

    }

    return 0;
}