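How fork() behaves when called inside a for loop
This problem comes from Exercise 3.5 in Chapter 3 of Operating System Concepts (9th edition). This post analyzes what happens when fork() is called inside a for loop.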
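1. Background
How the fork() system call works:
The fork() system call creates a new process. The process that calls fork() is the parent, and the newly created process is the child. The child's address space is a copy of the parent's. Once fork() returns, both processes (parent and child) continue with the instruction that follows the call. One point needs particular attention: fork() returns 0 in the child, while in the parent it returns the pid (process identifier) of the new child.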
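Interpretation:
The sentence "the child's address space is a copy of the parent's" confused me at first. In plain terms: right after fork() returns, the two processes have the same code and the same variable values (the book also describes the child as a copy of the parent), but they are distinct processes with different process identifiers (pids). Because each one works on its own copy, a change made by one process after the fork is not visible to the other.
With the concept and mechanics of fork() clear, we can move on to the exercise.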
2. Problem analysis
Problem statement:
How many processes does the following C program create?
#include <stdio.h>
#include <unistd.h>

int main() {
    int i;
    for (i = 0; i < 4; i++) {
        fork();  /* return value deliberately ignored */
    }
    return 0;
}
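To make the run easier to follow, I modified the original code slightly, adding print statements and a wait() call. These change the order in which the processes run, but not how many processes are created.
Notes:
getpid() returns the process identifier of the calling process, printed as pid. getppid() returns the process identifier of the calling process's parent, printed as ppid. wait(NULL) blocks until a child of the calling process terminates; since each parent calls it right after fork(), it waits for the child it has just created.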
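#include <stdio.h>
#include <sys/types.h>  /* pid_t */
#include <sys/wait.h>   /* wait() */
#include <unistd.h>     /* fork(), getpid(), getppid() */

int main() {
    int i;
    int n = 4;  /* loop bound */
    printf("main process pid is :%d\n", getpid());
    pid_t pid;
    for (i = 0; i < n; i++) {
        pid = fork();
        if (pid < 0) {
            /* fork() failed: no child was created */
            printf("fork failed...\n");
            return 1;
        } else if (pid == 0) {
            /* child: fork() returned 0 */
            printf("child process... pid=%d ppid=%d current i=%d\n", getpid(), getppid(), i);
        } else {
            /* parent: fork() returned the child's pid */
            printf("parent process... pid=%d ppid=%d current i=%d\n", getpid(), getppid(), i);
            wait(NULL);  /* block until the child just created has finished */
            printf("parent process... pid=%d his child compeleted...\n", getpid());
        }
    }
    return 0;
}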
Output:
main process pid is :29574
parent process... pid=29574 ppid=22955 current i=0
child process... pid=29575 pid=29574 current i=0
parent process... pid=29575 ppid=29574 current i=1
child process... pid=29576 pid=29575 current i=1
parent process... pid=29576 ppid=29575 current i=2
child process... pid=29577 pid=29576 current i=2
parent process... pid=29576 his child compeleted...
parent process... pid=29575 his child compeleted...
parent process... pid=29575 ppid=29574 current i=2
child process... pid=29578 pid=29575 current i=2
parent process... pid=29575 his child compeleted...
parent process... pid=29574 his child compeleted...
parent process... pid=29574 ppid=22955 current i=1
child process... pid=29579 pid=29574 current i=1
parent process... pid=29579 ppid=29574 current i=2
child process... pid=29580 pid=29579 current i=2
parent process... pid=29579 his child compeleted...
parent process... pid=29574 his child compeleted...
parent process... pid=29574 ppid=22955 current i=2
child process... pid=29581 pid=29574 current i=2
parent process... pid=29574 his child compeleted...
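The full n = 4 run (listed in section 3) contains 16 processes in total, counting the original main process. Note that the capture above stops at i = 2 and shows only 8 distinct PIDs, which is what a run with n = 3 produces; in its child lines the second "pid=" value is the parent's PID. Looking closer, the way the processes are created resembles building a tree. Still, drawing conclusions from a single run is not very rigorous.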
3. Dissecting how it works
We set the loop bound n to 1 through 5 in turn and look at the output each time.
For n == 1, the output is:
main process pid is :516
parent process... pid=516 ppid=22955 current i=0
child process... pid=517 ppid=516 current i=0
parent process... pid=516 his child compeleted...
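As a tree (in the figure, the number in each box is the process identifier, pid, and the number in each circle is the value of i in that process):
For n == 2, the output is: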
main process pid is :623
parent process... pid=623 ppid=22955 current i=0
child process... pid=624 ppid=623 current i=0
parent process... pid=624 ppid=623 current i=1
child process... pid=625 ppid=624 current i=1
parent process... pid=624 his child compeleted...
parent process... pid=623 his child compeleted...
parent process... pid=623 ppid=22955 current i=1
child process... pid=626 ppid=623 current i=1
parent process... pid=623 his child compeleted...
As a tree:
For n == 3, the output is:
main process pid is :1120
parent process... pid=1120 ppid=22955 current i=0
child process... pid=1121 ppid=1120 current i=0
parent process... pid=1121 ppid=1120 current i=1
child process... pid=1122 ppid=1121 current i=1
parent process... pid=1122 ppid=1121 current i=2
child process... pid=1123 ppid=1122 current i=2
parent process... pid=1122 his child compeleted...
parent process... pid=1121 his child compeleted...
parent process... pid=1121 ppid=1120 current i=2
child process... pid=1124 ppid=1121 current i=2
parent process... pid=1121 his child compeleted...
parent process... pid=1120 his child compeleted...
parent process... pid=1120 ppid=22955 current i=1
child process... pid=1125 ppid=1120 current i=1
parent process... pid=1125 ppid=1120 current i=2
child process... pid=1126 ppid=1125 current i=2
parent process... pid=1125 his child compeleted...
parent process... pid=1120 his child compeleted...
parent process... pid=1120 ppid=22955 current i=2
child process... pid=1127 ppid=1120 current i=2
parent process... pid=1120 his child compeleted...
As a tree:
For n == 4, the output is:
main process pid is :1758
parent process... pid=1758 ppid=22955 current i=0
child process... pid=1759 ppid=1758 current i=0
parent process... pid=1759 ppid=1758 current i=1
child process... pid=1760 ppid=1759 current i=1
parent process... pid=1760 ppid=1759 current i=2
child process... pid=1761 ppid=1760 current i=2
parent process... pid=1761 ppid=1760 current i=3
child process... pid=1762 ppid=1761 current i=3
parent process... pid=1761 his child compeleted...
parent process... pid=1760 his child compeleted...
parent process... pid=1760 ppid=1759 current i=3
child process... pid=1763 ppid=1760 current i=3
parent process... pid=1760 his child compeleted...
parent process... pid=1759 his child compeleted...
parent process... pid=1759 ppid=1758 current i=2
child process... pid=1764 ppid=1759 current i=2
parent process... pid=1764 ppid=1759 current i=3
child process... pid=1765 ppid=1764 current i=3
parent process... pid=1764 his child compeleted...
parent process... pid=1759 his child compeleted...
parent process... pid=1759 ppid=1758 current i=3
child process... pid=1766 ppid=1759 current i=3
parent process... pid=1759 his child compeleted...
parent process... pid=1758 his child compeleted...
parent process... pid=1758 ppid=22955 current i=1
child process... pid=1767 ppid=1758 current i=1
parent process... pid=1767 ppid=1758 current i=2
child process... pid=1768 ppid=1767 current i=2
parent process... pid=1768 ppid=1767 current i=3
child process... pid=1769 ppid=1768 current i=3
parent process... pid=1768 his child compeleted...
parent process... pid=1767 his child compeleted...
parent process... pid=1767 ppid=1758 current i=3
child process... pid=1770 ppid=1767 current i=3
parent process... pid=1767 his child compeleted...
parent process... pid=1758 his child compeleted...
parent process... pid=1758 ppid=22955 current i=2
child process... pid=1771 ppid=1758 current i=2
parent process... pid=1771 ppid=1758 current i=3
child process... pid=1772 ppid=1771 current i=3
parent process... pid=1771 his child compeleted...
parent process... pid=1758 his child compeleted...
parent process... pid=1758 ppid=22955 current i=3
child process... pid=1773 ppid=1758 current i=3
parent process... pid=1758 his child compeleted...
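As a tree (for convenience, the pid labels are omitted from the figures from here on):
For n == 5, the output is omitted for reasons of space; only the tree is shown.
The trees should make the pattern fairly clear. Here we take the n == 5 tree as the example: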
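- First, the main process runs the for loop. Ignoring the children for now and looking only at main's own steps, it clearly goes around the loop 5 times, so it calls fork() 5 times and creates 5 new processes. These form the second level of the tree; the number in each node is the value of i when that process was created.
- Next, consider the first new process, node 0 on the second level. After fork() returns, the new process also continues with the code after the call, which here means continuing the for loop. The first thing it does is i++, so i == 1. For this node 0 (light red in the figure), the loop therefore starts from i == 1 and runs only 4 more times, creating 4 new processes.
- The remaining nodes follow the same rule, which yields the complete process tree.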
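4. Conclusion and an unexpected finding
4.1 Conclusion:
The most straightforward method is to draw the process tree as described above and count its nodes.
4.2 An unexpected finding
Going from the first figure to the fifth and looking at the number of nodes on each level of the process tree, it is easy to see that the per-level counts are symmetric about the middle of the tree. That looks a lot like a familiar piece of mathematics: Pascal's triangle (known in Chinese as Yang Hui's triangle).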
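The observations first:
- The loop bound n and the depth of the process tree, depth, are related by depth = n + 1.
- The loop bound n and the row of Pascal's triangle, line, are related by line = n + 1 (counting the top row, which contains a single 1, as row 1).
- Combining the two: for a given n, the node counts of the process tree's levels, from top to bottom, match the entries of row line = n + 1 of Pascal's triangle one to one.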
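- n == 1: tree depth is n + 1 = 2; node counts per level, top to bottom: 1, 1
- n == 2: tree depth is n + 1 = 3; node counts per level, top to bottom: 1, 2, 1
- n == 3: tree depth is n + 1 = 4; node counts per level, top to bottom: 1, 3, 3, 1
- n == 4: tree depth is n + 1 = 5; node counts per level, top to bottom: 1, 4, 6, 4, 1
- n == 5: tree depth is n + 1 = 6; node counts per level, top to bottom: 1, 5, 10, 10, 5, 1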
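Personal conclusion: given the loop bound n, the total number of processes can be computed as: total processes = sum of row n + 1 of Pascal's triangle. That sum is 2^n, which gives 16 for n = 4. The count includes the original main process, so the loop itself creates 2^n - 1 new processes.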
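The row-sum rule can be backed by a short counting argument. This is a sketch added for illustration rather than something taken from the textbook; the symbols $i_1, \dots, i_k$ and $N(n)$ are introduced here.

```latex
% A process at depth k (main is depth 0) is identified by the loop indices
% at which its ancestors and the process itself were forked. These indices
% are strictly increasing:  0 <= i_1 < i_2 < ... < i_k <= n - 1,
% so each process at depth k corresponds to a k-element subset of {0, ..., n-1}.
\[
  \#\{\text{processes at depth } k\} = \binom{n}{k}, \qquad k = 0, 1, \dots, n
\]
% Summing over all levels and applying the binomial theorem to (1 + 1)^n:
\[
  N(n) = \sum_{k=0}^{n} \binom{n}{k} = (1 + 1)^n = 2^n
\]
```

For n = 4 this gives N(4) = 1 + 4 + 6 + 4 + 1 = 16.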