Notes and thoughts on fork

Latest recommended article published on 2021-12-23 12:28:23

wtt1002

Category: Java Virtual Machine

Article link: https://blog.csdn.net/lxsz0214/article/details/82827960


fork on Linux:

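A curious property of fork is that it is called once but returns twice, once in each of the two resulting processes. It has three possible return values:
    1) in the parent process, fork returns the process ID of the newly created child;
    2) in the child process, fork returns 0;
    3) on failure, fork returns -1 (a negative value) to the caller and no child is created.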

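Once fork has completed successfully there are two processes: the parent and the child. fork returns 0 in the child and the child's process ID in the parent, so the return value tells the code which of the two processes it is running in.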

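    To quote another user's explanation of why fpid differs between parent and child: "Think of it as a linked list. Processes form a linked list: the parent's fpid (the p stands for pointer) points to the child's process ID, and because the child has no child of its own, its fpid is 0." This is only a mnemonic; fpid is simply the value that fork returns in each process.

fork can fail for two common reasons:
    1) the number of processes has reached a system-imposed limit, in which case errno is set to EAGAIN;
    2) the system does not have enough memory, in which case errno is set to ENOMEM.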
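
The three return values and the two error codes above can be handled as in the following minimal sketch. The helper name fork_and_report and the exit code 7 are assumptions made for illustration; only fork, waitpid, getpid, getppid and the errno values come from the standard API.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once. In the parent it returns the child's
 * exit code, or -1 if fork() or waitpid() failed. The child never returns. */
int fork_and_report(void)
{
    fflush(stdout);                 /* do not copy buffered output into the child */
    pid_t fpid = fork();

    if (fpid < 0) {
        /* Error: no child was created; errno says why. */
        int err = errno;
        if (err == EAGAIN)
            fprintf(stderr, "fork: process limit reached (EAGAIN)\n");
        else if (err == ENOMEM)
            fprintf(stderr, "fork: out of memory (ENOMEM)\n");
        else
            fprintf(stderr, "fork: %s\n", strerror(err));
        return -1;
    }
    if (fpid == 0) {
        /* Child: fork() returned 0. */
        printf("child:  pid=%d ppid=%d\n", (int)getpid(), (int)getppid());
        fflush(stdout);
        _exit(7);                   /* arbitrary exit code chosen for this demo */
    }
    /* Parent: fork() returned the child's PID. */
    printf("parent: pid=%d child=%d\n", (int)getpid(), (int)fpid);
    int status = 0;
    if (waitpid(fpid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_and_report() from main should return 7 in the parent when the fork succeeds. The child leaves through _exit so that it never returns into the caller's code.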

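#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    int i = 0;
    printf("i son/pa ppid pid  fpid\n");
    // ppid: PID of the current process's parent
    // pid:  PID of the current process
    // fpid: value that fork() returned to the current process
    for (i = 0; i < 2; i++) {
        fflush(stdout);  // keep buffered output from being duplicated in the child
        pid_t fpid = fork();
        if (fpid < 0)
            perror("fork");
        else if (fpid == 0)
            printf("%d child  %4d %4d %4d\n", i, (int)getppid(), (int)getpid(), (int)fpid);
        else
            printf("%d parent %4d %4d %4d\n", i, (int)getppid(), (int)getpid(), (int)fpid);
    }
    return 0;
}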

Output:

    i son/pa ppid pid  fpid
    0 parent 2043 3224 3225
    0 child  3224 3225    0
    1 parent 2043 3224 3226
    1 parent 3224 3225 3227
    1 child     1 3227    0
    1 child     1 3226    0

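When i=0:

fork creates a child process. In the current (parent) process, fpid holds the child's ID, so the else branch runs and prints "0 parent 2043 3224 3225". This shows that the current process has ID 3224, its parent has ID 2043, and its child has ID 3225.

In the child, fpid is 0, so the if condition holds and it prints "0 child 3224 3225 0". The current process has ID 3225, its parent has ID 3224, and it has no child of its own yet.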

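When i=1:

Two processes now enter the second iteration of the loop: 3224 and 3225.

Process 3224 calls fork and creates a child with ID 3226. In process 3224, fpid is the child's ID, 3226, so the else branch runs and prints "1 parent 2043 3224 3226": the current process is 3224, its parent is 2043, and its new child is 3226. In the child, fpid is 0, the if condition holds, and it prints "1 child 1 3226 0". Why is the parent of process 3226 shown as 1? That is explained below.

Process 3225 calls fork and creates a child with ID 3227. In process 3225, fpid is the child's ID, 3227, so the else branch runs and prints "1 parent 3224 3225 3227": the current process is 3225, its parent is 3224, and its new child is 3227. In the child, fpid is 0, the if condition holds, and it prints "1 child 1 3227 0".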
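
The walk-through implies that every loop iteration doubles the number of processes, so two iterations give four. The sketch below checks that by letting every descendant write one byte into a pipe that the original process then reads until end-of-file. The helper name count_processes is hypothetical, and the count is only 2^rounds if no fork() call fails.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `rounds` iterations of a fork() loop and returns
 * how many processes reached the end of the loop (the caller included),
 * or -1 on a pipe/read error. Descendants exit inside this function. */
int count_processes(int rounds)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t origin = getpid();
    for (int i = 0; i < rounds; i++) {
        if (fork() < 0)
            break;                  /* this process stops forking on failure */
    }

    if (getpid() != origin) {
        /* Descendant: report one byte, then exit without returning. */
        char mark = 'x';
        int ok = write(fds[1], &mark, 1) == 1;
        _exit(ok ? 0 : 1);
    }

    /* Original process: close our write end so that read() reports
     * end-of-file once every descendant has exited. */
    close(fds[1]);
    int count = 1;                  /* the original process itself */
    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        count += (int)n;
    close(fds[0]);

    while (wait(NULL) > 0) {        /* reap our direct children */
    }
    return n < 0 ? -1 : count;
}
```

With rounds set to 2, as in the program above, the function should return 4: the original process plus 3225, 3226 and 3227 in the sample run.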

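Why do processes 3226 and 3227, which clearly have parents of their own, print "1 child 1 3226 0" and "1 child 1 3227 0", showing a parent ID of 1?
Once p3224 and p3225 finish the second iteration, main returns and both processes exit, since they have nothing left to do. By the time p3226 and p3227 call getppid, their parents are already gone. The operating system does not leave a process without a parent, so the orphans p3226 and p3227 are re-parented to p1, the init process, which stays alive for as long as the system is running. (On some modern Linux systems orphans are adopted by a designated "subreaper" process instead, so the value printed may not be 1.)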
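
If the parent waits for its child before exiting, the child is never orphaned and getppid reports the real parent. The following is a minimal sketch of that idea, not a change to the program above; the helper name child_sees_real_parent is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child and waits for it. Returns 1 if the
 * child saw the real parent's PID from getppid(), 0 if it did not,
 * and -1 on error. */
int child_sees_real_parent(void)
{
    pid_t parent = getpid();
    pid_t fpid = fork();
    if (fpid < 0)
        return -1;

    if (fpid == 0) {
        /* The parent is still alive (it blocks in waitpid below),
         * so the child has not been re-parented. */
        _exit(getppid() == parent ? 0 : 1);
    }

    int status = 0;
    if (waitpid(fpid, &status, 0) < 0)   /* parent stays alive until the child exits */
        return -1;
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Applied to the loop program, the same effect could be had by calling wait in each process before main returns, at the cost of changing the output discussed above.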

fork/join in Java

The following is based on an example by Liao Xuefeng.

import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

/**
 * Package: algorithm.fork_join
 * Description: sums an array using the fork/join framework
 * Author: TingTing W
 * Date: Created in 2018/9/24 9:26
 */
public class SumTask extends RecursiveTask<Long> {
    static final int THRESHOLD = 100;
    long[] array;
    int start;
    int end;

    SumTask(long[] array, int start, int end) {
        this.array = array;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Long compute() {
        if (end - start <= THRESHOLD) {
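            // The task is small enough: compute it directly.
            long sum = 0;
            for (int i = start; i < end; i++) {
                sum += array[i];
            }
            // Sleep for one second to simulate a slow computation.
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
            }
            System.out.println(String.format("compute %d~%d = %d", start, end, sum));
            return sum;
        }
        // The task is too large: split it in two.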
        int middle = (end + start) / 2;
        System.out.println(String.format("split %d~%d ==> %d~%d, %d~%d", start, end, start, middle, middle, end));
        SumTask subtask1 = new SumTask(this.array, start, middle);
        SumTask subtask2 = new SumTask(this.array, middle, end);
        invokeAll(subtask1, subtask2);
        //subtask1.fork();
        //subtask2.fork();
        Long subresult1 = subtask1.join();
        Long subresult2 = subtask2.join();
        Long result = subresult1 + subresult2;
        System.out.println("result = " + subresult1 + " + " + subresult2 + " ==> " + result);
        return result;
    }
    public static void main(String[] args) throws Exception {
        // Create an array filled with random numbers:
        long[] array = new long[400];
        fillRandom(array);
        // fork/join task:
        ForkJoinPool fjp = new ForkJoinPool(4); // parallelism level of 4
        ForkJoinTask<Long> task = new SumTask(array, 0, array.length);
        long startTime = System.currentTimeMillis();
        Long result = fjp.invoke(task);
        long endTime = System.currentTimeMillis();
        System.out.println("Fork/join sum: " + result + " in " + (endTime - startTime) + " ms.");
    }
    private static void fillRandom(long[] array){
        Random ra =new Random();
        for (int i = 0; i <array.length; i++){
            array[i] = ra.nextInt();
        }
    }
}

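Why the code calls invokeAll(subtask1, subtask2); rather than subtask1.fork(); subtask2.fork();:

The thread that runs compute() is itself a worker thread. When it calls fork() on both subtasks, it hands both of them off to be run by other workers and then stops to wait in join() without doing any of the work itself. That wastes one worker thread in the fork/join pool, so running the 4 leaf subtasks concurrently takes at least 7 threads.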

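As an analogy, suppose a hotel has 400 rooms and 4 cleaners, each of whom can clean 100 rooms a day. With all 4 cleaners working at full capacity, cleaning all 400 rooms takes exactly 1 day.

With invokeAll(subtask1, subtask2), fork/join works like this: cleaner A is assigned all 400 rooms, sees that this is too much for one person, splits the job into two lots of 200, and calls in B to take one of them. A and B then each find that 200 is still too much, so A splits his 200 into two lots of 100 and gives one to C, and B likewise gives one lot of 100 to D. In the end each of the 4 cleaners has 100 rooms, and working concurrently they finish in exactly 1 day.

With subtask1.fork(); subtask2.fork(): after splitting the 400 rooms into two lots of 200, A gives one lot to B and the other to C. A then becomes a supervisor who does no cleaning and simply reports back once B and C are done. When B and C split their 200 rooms into two lots of 100, they become supervisors too. A job that needed only 4 cleaners now needs 7 to finish within 1 day, and 3 of them do no cleaning.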

Output when using invokeAll(subtask1, subtask2):

split 0~400 ==> 0~200, 200~400
split 0~200 ==> 0~100, 100~200
split 200~400 ==> 200~300, 300~400
compute 0~100 = -23250455557
compute 200~300 = -25966646373
compute 300~400 = -2068808265
compute 100~200 = 3518158950
result = -25966646373 + -2068808265 ==> -28035454638
result = -23250455557 + 3518158950 ==> -19732296607
result = -19732296607 + -28035454638 ==> -47767751245
Fork/join sum: -47767751245 in 1022 ms.

Output when using subtask1.fork(); subtask2.fork():

split 0~400 ==> 0~200, 200~400
split 200~400 ==> 200~300, 300~400
split 0~200 ==> 0~100, 100~200
compute 200~300 = 746524341
compute 0~100 = -14877182962
compute 100~200 = 3577160199
result = -14877182962 + 3577160199 ==> -11300022763
compute 300~400 = -6444503009
result = 746524341 + -6444503009 ==> -5697978668
result = -11300022763 + -5697978668 ==> -16998001431
Fork/join sum: -16998001431 in 2023 ms.

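Reading the source of invokeAll() shows that, of the N tasks passed in, N-1 are handed to other threads with fork(), while one is kept and run by the calling thread itself. This makes full use of the pool and keeps the calling worker from sitting idle.

When invokeAll is passed two tasks, only t2 is forked; t1 is run directly by the calling thread: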

 /**
     * Forks the given tasks, returning when {@code isDone} holds for
     * each task or an (unchecked) exception is encountered, in which
     * case the exception is rethrown. If more than one task
     * encounters an exception, then this method throws any one of
     * these exceptions. If any task encounters an exception, the
     * other may be cancelled. However, the execution status of
     * individual tasks is not guaranteed upon exceptional return. The
     * status of each task may be obtained using {@link
     * #getException()} and related methods to check if they have been
     * cancelled, completed normally or exceptionally, or left
     * unprocessed.
     *
     * @param t1 the first task
     * @param t2 the second task
     * @throws NullPointerException if any task is null
     */
    public static void invokeAll(ForkJoinTask<?> t1, ForkJoinTask<?> t2) {
        int s1, s2;
        t2.fork();
        if ((s1 = t1.doInvoke() & DONE_MASK) != NORMAL)
            t1.reportException(s1);
        if ((s2 = t2.doJoin() & DONE_MASK) != NORMAL)
            t2.reportException(s2);
    }

When the tasks are passed as varargs, only length-1 of them are forked; the task at index 0 is run by the calling thread:

    /**
     * Forks the given tasks, returning when {@code isDone} holds for
     * each task or an (unchecked) exception is encountered, in which
     * case the exception is rethrown. If more than one task
     * encounters an exception, then this method throws any one of
     * these exceptions. If any task encounters an exception, others
     * may be cancelled. However, the execution status of individual
     * tasks is not guaranteed upon exceptional return. The status of
     * each task may be obtained using {@link #getException()} and
     * related methods to check if they have been cancelled, completed
     * normally or exceptionally, or left unprocessed.
     *
     * @param tasks the tasks
     * @throws NullPointerException if any task is null
     */
    public static void invokeAll(ForkJoinTask<?>... tasks) {
        Throwable ex = null;
        int last = tasks.length - 1;
        for (int i = last; i >= 0; --i) {
            ForkJoinTask<?> t = tasks[i];
            if (t == null) {
                if (ex == null)
                    ex = new NullPointerException();
            }
            else if (i != 0)
                t.fork();
            else if (t.doInvoke() < NORMAL && ex == null)
                ex = t.getException();
        }
        for (int i = 1; i <= last; ++i) {
            ForkJoinTask<?> t = tasks[i];
            if (t != null) {
                if (ex != null)
                    t.cancel(false);
                else if (t.doJoin() < NORMAL)
                    ex = t.getException();
            }
        }
        if (ex != null)
            rethrow(ex);
    }

References:

http://www.cnblogs.com/bastard/p/2664896.html

https://www.liaoxuefeng.com/article/001493522711597674607c7f4f346628a76145477e2ff82000
