Let's talk about Java (Part 1): Concurrency

In the world of programming, the topic that comes up most often, and carries the most real technical depth, is concurrent programming. When concurrency is mentioned, many people immediately think of multithreading or multiprocessing. Are there other ways to program concurrently? There certainly are. The book "Seven Concurrency Models in Seven Weeks" covers seven concurrency models, and this post walks through those seven techniques together with some of my own thoughts.

Before we get into concurrency, let us first distinguish concurrency from parallelism. Before writing this post I was not clear about the difference between the two myself; only after studying their definitions carefully did it really become clear.

Concurrency is about dealing with lots of things at once.

Parallelism is about doing lots of things at once.


Non-native English speakers (myself included) may not feel the difference between "dealing with" and "doing" very strongly, but roughly speaking, "deal with" carries the sense of handling or coping with something. Here is an example.

My wife is a teacher. Like most teachers she is a master of multitasking. At any one instant she is only doing one thing, but she has to deal with many things concurrently: while listening to one child read, she might break off to calm down a rowdy classroom or answer a question. This is concurrent, but it is not parallel (there is only one of her).

If she had an assistant (one of them listening to a child read, the other answering questions), we would have something that is both concurrent and parallel.


Here is how I think about serial, concurrent, and parallel. Imagine a cook who has to do two things: cook the rice and stir-fry the dishes.

Serial: the cook makes the rice first, waits until it is done, and only then starts on the dishes. (Probably no cook actually works like this.)

Concurrent: the cook puts the rice in the rice cooker, cooks the dishes while the rice steams, and both are ready at the same time. (Good scheduling.)

Parallel: there are two cooks, one making the rice and one making the dishes, and both finish at the same time.


Generally speaking, to improve overall system performance you combine parallelism with concurrency. You can think of a distributed system as the parallel part, and each node of that distributed system as a concurrent node.


I. Threads and locks

As the title suggests, this is the style of concurrent programming everyone knows best: concurrency through multiple threads and locks. Before talking about multithreading we should mention the single thread. A single thread is simply serial processing, known in English as sequential programming; as the name says, it handles work in order, one item after another. In the multi-core era this clearly cannot exploit all of the hardware's performance.

So multithreading was introduced: several threads handle tasks at the same time, so performance should multiply, right? Not necessarily. Imagine handling ten tasks by yourself: you are slower, but there is no cost of communicating with anyone else. Hand the ten tasks to ten people and efficiency looks higher, but now communication is unavoidable: who does which task has to be negotiated, and during that negotiation two people may want the same task at the same time. That is exactly the contention problem that multithreading introduces.

Multithreading has to deal with the following questions:

1. How many threads should be allocated?

2. How do the threads stay synchronized?

The first question looks simple but is not easy to answer. There is, however, a basic formula: optimal thread count = ((thread wait time + thread CPU time) / thread CPU time) * number of CPU cores.

Rearranging the formula gives:

optimal thread count = (thread wait time / thread CPU time + 1) * number of CPU cores


For example, if each thread spends on average 0.5 s of CPU time and 1.5 s waiting (non-CPU time such as IO), and there are 8 CPU cores, the formula estimates ((0.5 + 1.5) / 0.5) * 8 = 32 threads.

The formula shows that the longer threads spend waiting (IO-bound work: network IO, disk IO, and so on), the more threads you want, and the longer the CPU time (compute-bound work such as encryption, video compression, or file compression), the fewer threads you want. In the limit where CPU time dominates completely, the optimal thread count is simply the number of CPU cores. The common advice of "one thread per CPU core" therefore has some truth to it, but only for compute-bound workloads.
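As a quick sanity check, the rearranged formula is one line of arithmetic. A minimal sketch, using the example figures above (0.5 s CPU time, 1.5 s wait time) as assumed inputs:

public class ThreadCountEstimate {
    public static void main(String[] args) {
        double cpuTime = 0.5;   // average CPU time per task in seconds (example value)
        double waitTime = 1.5;  // average wait (IO) time per task in seconds (example value)
        int cores = Runtime.getRuntime().availableProcessors();
        int optimalThreads = (int) Math.round((waitTime / cpuTime + 1) * cores);
        System.out.println("Optimal thread count: " + optimalThreads);
    }
}

The abstract PoolSizeCalculator below (credited in its Javadoc to Niklas Schlimm and building on ideas from Heinz Kabutz) automates the same estimate by measuring the CPU and wait time of a sample task: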

import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.BlockingQueue;

/**
 * A class that calculates the optimal thread pool boundaries. It takes the desired target utilization and the desired
 * work queue memory consumption as input and returns thread count and work queue capacity.
 *
 * @author Niklas Schlimm
 */
public abstract class PoolSizeCalculator {

    /**
     * The sample queue size to calculate the size of a single {@link Runnable} element.
     */
    private final int SAMPLE_QUEUE_SIZE = 1000;

    /**
     * Accuracy of test run. It must finish within 20ms of the testTime otherwise we retry the test. This could be
     * configurable.
     */
    private final int EPSYLON = 20;

    /**
     * Control variable for the CPU time investigation.
     */
    private volatile boolean expired;

    /**
     * Time (millis) of the test run in the CPU time calculation.
     */
    private final long testtime = 3000;

    /**
     * Calculates the boundaries of a thread pool for a given {@link Runnable}.
     *
     * @param targetUtilization    the desired utilization of the CPUs (0 <= targetUtilization <= 1)
     * @param targetQueueSizeBytes the desired maximum work queue size of the thread pool (bytes)
     */
    protected void calculateBoundaries(BigDecimal targetUtilization, BigDecimal targetQueueSizeBytes) {
        calculateOptimalCapacity(targetQueueSizeBytes);
        Runnable task = createTask();
        start(task);
        start(task); // warm up phase
        long cputime = getCurrentThreadCPUTime();
        start(task); // test interval
        cputime = getCurrentThreadCPUTime() - cputime;
        long waittime = (testtime * 1000000) - cputime;
        calculateOptimalThreadCount(cputime, waittime, targetUtilization);
    }

    private void calculateOptimalCapacity(BigDecimal targetQueueSizeBytes) {
        long mem = calculateMemoryUsage();
        BigDecimal queueCapacity = targetQueueSizeBytes.divide(new BigDecimal(mem), RoundingMode.HALF_UP);
        System.out.println("Target queue memory usage (bytes): " + targetQueueSizeBytes);
        System.out.println("createTask() produced " + creatTask().getClass().getName() + " which took " + mem
                + " bytes in a queue");
        System.out.println("Formula: " + targetQueueSizeBytes + " / " + mem);
        System.out.println("* Recommended queue capacity (bytes): " + queueCapacity);
    }

    /**
     * Brian Goetz' optimal thread count formula, see 'Java Concurrency in Practice' (chapter 8.2)
     *
     * @param cpu               cpu time consumed by considered task
     * @param wait              wait time of considered task
     * @param targetUtilization target utilization of the system
     */
    private void calculateOptimalThreadCount(long cpu, long wait, BigDecimal targetUtilization) {
        BigDecimal waitTime = new BigDecimal(wait);
        BigDecimal computeTime = new BigDecimal(cpu);
        BigDecimal numberOfCPU = new BigDecimal(Runtime.getRuntime().availableProcessors());
        BigDecimal optimalthreadcount = numberOfCPU.multiply(targetUtilization).multiply(
                new BigDecimal(1).add(waitTime.divide(computeTime, RoundingMode.HALF_UP)));
        System.out.println("Number of CPU: " + numberOfCPU);
        System.out.println("Target utilization: " + targetUtilization);
        System.out.println("Elapsed time (nanos): " + (testtime * 1000000));
        System.out.println("Compute time (nanos): " + cpu);
        System.out.println("Wait time (nanos): " + wait);
        System.out.println("Formula: " + numberOfCPU + " * " + targetUtilization + " * (1 + " + waitTime + " / "
                + computeTime + ")");
        System.out.println("* Optimal thread count: " + optimalthreadcount);
    }

    /**
     * Runs the {@link Runnable} over a period defined in {@link #testtime}. Based on Heinz Kabbutz' ideas
     * (http://www.javaspecialists.eu/archive/Issue124.html).
     *
     * @param task the runnable under investigation
     */
    public void start(Runnable task) {
        long start = 0;
        int runs = 0;
        do {
            if (++runs > 5) {
                throw new IllegalStateException("Test not accurate");
            }
            expired = false;
            start = System.currentTimeMillis();
            Timer timer = new Timer();
            timer.schedule(new TimerTask() {
                public void run() {
                    expired = true;
                }
            }, testtime);
            while (!expired) {
                task.run();
            }
            start = System.currentTimeMillis() - start;
            timer.cancel();
        } while (Math.abs(start - testtime) > EPSYLON);
        collectGarbage(3);
    }

    private void collectGarbage(int times) {
        for (int i = 0; i < times; i++) {
            System.gc();
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
    }

    /**
     * Calculates the memory usage of a single element in a work queue. Based on Heinz Kabbutz' ideas
     * (http://www.javaspecialists.eu/archive/Issue029.html).
     *
     * @return memory usage of a single {@link Runnable} element in the thread pools work queue
     */
    public long calculateMemoryUsage() {
        BlockingQueue<Runnable> queue = createWorkQueue();
        for (int i = 0; i < SAMPLE_QUEUE_SIZE; i++) {
            queue.add(createTask());
        }
        long mem0 = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        long mem1 = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        queue = null;
        collectGarbage(15);
        mem0 = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        queue = createWorkQueue();
        for (int i = 0; i < SAMPLE_QUEUE_SIZE; i++) {
            queue.add(createTask());
        }
        collectGarbage(15);
        mem1 = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        return (mem1 - mem0) / SAMPLE_QUEUE_SIZE;
    }

    /**
     * Create your runnable task here.
     *
     * @return an instance of your runnable task under investigation
     */
    protected abstract Runnable createTask();

    /**
     * Return an instance of the queue used in the thread pool.
     *
     * @return queue instance
     */
    protected abstract BlockingQueue<Runnable> createWorkQueue();

    /**
     * Calculate current cpu time. Various frameworks may be used here, depending on the operating system in use. (e.g.
     * http://www.hyperic.com/products/sigar). The more accurate the CPU time measurement, the more accurate the results
     * for thread count boundaries.
     *
     * @return current cpu time of current thread
     */
    protected abstract long getCurrentThreadCPUTime();

}

The implementation:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.lang.management.ManagementFactory;
import java.math.BigDecimal;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MyPoolSizeCalculator extends PoolSizeCalculator {

    public static void main(String[] args) throws InterruptedException,
            InstantiationException,
            IllegalAccessException,
            ClassNotFoundException {
        MyPoolSizeCalculator calculator = new MyPoolSizeCalculator();
        calculator.calculateBoundaries(new BigDecimal(1.0), // target CPU utilization: 1.0 means 100%
                new BigDecimal(100000));                     // target work queue memory footprint in bytes
    }

    // Measures the CPU time of the current thread via the JVM's ThreadMXBean.
    @Override
    protected long getCurrentThreadCPUTime() {
        return ManagementFactory.getThreadMXBean().getCurrentThreadCpuTime();
    }

    // The task under investigation: an IO-bound HTTP request.
    @Override
    protected Runnable createTask() {
        return new AsyncIOTask();
    }

    // The work queue the thread pool will use.
    @Override
    protected BlockingQueue<Runnable> createWorkQueue() {
        return new LinkedBlockingQueue<>();
    }

}


class AsyncIOTask implements Runnable {

    @Override
    public void run() {
        HttpURLConnection connection = null;
        BufferedReader reader = null;
        try {
            String getURL = "http://baidu.com";
            URL getUrl = new URL(getURL);

            connection = (HttpURLConnection) getUrl.openConnection();
            connection.connect();
            reader = new BufferedReader(new InputStreamReader(
                    connection.getInputStream()));

            String line;
            while ((line = reader.readLine()) != null) {
                // drain the response; only the IO time matters here
            }
        } catch (IOException e) {
            // ignored: failures do not matter for the timing measurement
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (Exception e) {
                    // ignored
                }
            }
            if (connection != null) {
                connection.disconnect();
            }
        }
    }
}

One parameter that may look obscure is targetQueueSizeBytes: judging from calculateOptimalCapacity(), it is the desired maximum memory footprint of the thread pool's work queue in bytes, which the calculator divides by the measured size of one queued task to recommend a queue capacity.
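Once the calculator has printed its recommendations, the two numbers plug straight into a ThreadPoolExecutor. A minimal sketch, assuming the calculator reported 32 threads and a queue capacity of 1000 (both placeholders) and reusing the AsyncIOTask defined above:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TunedPool {
    public static void main(String[] args) {
        int optimalThreads = 32;   // value reported by PoolSizeCalculator (placeholder)
        int queueCapacity = 1000;  // value reported by PoolSizeCalculator (placeholder)
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                optimalThreads, optimalThreads,    // fixed-size pool
                0L, TimeUnit.MILLISECONDS,         // keep-alive is irrelevant for a fixed pool
                new LinkedBlockingQueue<>(queueCapacity));
        pool.execute(new AsyncIOTask());           // submit the same kind of task that was measured
        pool.shutdown();
    }
}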

2. How do multiple threads synchronize with each other? There are several options.

   The first is synchronized.

   The second is Lock.

Both of these are lock-based. Are there lock-free approaches as well? Of course there are:

the atomic classes, copy-on-write collections, and so on.
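A minimal sketch of the lock-free flavour, using AtomicInteger and CopyOnWriteArrayList from java.util.concurrent (the class and field names here are made up for illustration):

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

public class LockFreeCounters {
    // incrementAndGet() uses a CAS loop under the hood instead of a monitor lock
    private final AtomicInteger hits = new AtomicInteger();

    // writers copy the underlying array on each mutation; readers never block
    private final List<String> listeners = new CopyOnWriteArrayList<>();

    public void record(String listener) {
        hits.incrementAndGet();
        listeners.add(listener);
    }

    public int hitCount() {
        return hits.get();
    }
}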

Writes from multiple threads obviously need a lock, but do reads and writes need one too? To answer that, ask why threads need to synchronize in the first place. The answer is simple: to keep the data intact. Imagine one thread writing data to a file over roughly a whole day, and another thread reading that file, which also takes about a day. If the reader is only partway through when the writer modifies the part it has not yet read, the reader ends up with a mix of old and new data. Is that data worth anything? It is completely useless. That is why reader/writer and writer/writer pairs must be synchronized: to preserve data integrity.
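For the read-mostly case the JDK also offers ReentrantReadWriteLock: many readers may proceed in parallel, while a writer gets exclusive access. A minimal sketch (the cache class is illustrative, not from the code above):

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadMostlyCache {
    private final Map<String, String> data = new HashMap<>();
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    public String get(String key) {
        rwLock.readLock().lock();      // many readers may hold the read lock at once
        try {
            return data.get(key);
        } finally {
            rwLock.readLock().unlock();
        }
    }

    public void put(String key, String value) {
        rwLock.writeLock().lock();     // exclusive: blocks readers and other writers
        try {
            data.put(key, value);
        } finally {
            rwLock.writeLock().unlock();
        }
    }
}

With data integrity as the motivation, this raises a few questions: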

1. How does synchronized work, and how does it perform?

2. What does volatile do?

3. How is Lock implemented?

4. What is the difference between synchronized and Lock?


Let's take the first question: how synchronized works.

Synchronized

 

This is the Java keyword, which is implemented using monitors. Each Java object is associated with a monitor, which a thread can lock or unlock.

Only one thread at a time may hold a lock on a monitor. Any other threads attempting to lock that monitor are blocked until they can obtain a lock on it. A thread t may lock a particular monitor multiple times; each unlock reverses the effect of one lock operation.

The synchronized statement computes a reference to an object; it then attempts to perform a lock action on that object's monitor and does not proceed further until the lock action has successfully completed. After the lock action has been performed, the body of the synchronized statement is executed. If execution of the body is ever completed, either normally or abruptly, an unlock action is automatically performed on that same monitor.

Other mechanisms, such as reads and writes of volatile variables and the use of classes in the java.util.concurrent package, provide alternative ways of synchronizing.

From the text above we can see how synchronized is implemented: a monitor lock is attached to the object, and whichever thread acquires that lock may proceed. Once the code inside the synchronized region has finished executing, the monitor lock is released automatically, and a thread waiting in the wait set, once notified, starts competing for the monitor lock again.

At the start and end of the synchronized block the compiler emits two bytecode instructions, monitorenter and monitorexit. According to the JVM specification, when monitorenter executes the thread first tries to acquire the object's lock; if it succeeds, the lock counter is incremented by 1. Correspondingly, monitorexit decrements the counter by 1, and when the counter reaches 0 the lock is released.
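To make this concrete, compiling the small class below and disassembling it with javap -c should show a monitorenter at the start of the synchronized block and monitorexit on both the normal and the exception exit paths (the class is purely illustrative):

public class MonitorDemo {
    private final Object lock = new Object();
    private int counter;

    public void increment() {
        synchronized (lock) {   // javap -c shows monitorenter here
            counter++;
        }                       // ...and monitorexit here, plus one more on the exception path
    }
}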

In Java 1.6 a synchronized lock can be in one of four states: no lock, biased lock, lightweight lock, and heavyweight lock. The lock escalates through these states as contention grows. A lock can be upgraded but never downgraded: once a biased lock has been inflated into a lightweight lock it cannot fall back to being a biased lock.


Biased locking:


The HotSpot engineers observed that in most cases a lock is not only free of multi-thread contention but is also acquired repeatedly by the same thread, so biased locking was introduced to make acquiring the lock cheaper for that thread. When a thread enters a synchronized block and acquires the lock, the ID of the biased thread is stored in the object header and in the lock record in the thread's stack frame. Afterwards, entering and exiting that block no longer requires CAS operations to lock and unlock; the thread simply checks whether the object header's Mark Word still holds a bias towards the current thread. If that check succeeds, the thread already owns the lock. If it fails, the thread checks whether the biased-lock flag in the Mark Word is set to 1 (meaning biased locking is still in effect): if not, it competes for the lock with CAS; if so, it attempts to use CAS to point the bias in the object header at the current thread.

Revoking a biased lock: biased locking releases the lock only when contention actually appears, so the thread holding the biased lock gives it up only when another thread tries to compete for it. Revocation has to wait for a global safepoint (a point in time at which no bytecode is executing). The thread owning the biased lock is paused first, and the JVM checks whether it is still alive. If it is not active, the object header is reset to the lock-free state. If it is still alive, the stack owning the bias is walked and the lock records for the biased object are traversed; the lock records in the stack and the object header's Mark Word are then either re-biased to another thread, restored to the lock-free state, or the object is marked as unsuitable for biased locking. Finally the paused thread is woken up.


Lightweight locking:

Locking: before executing the synchronized block, the JVM first creates space for a lock record in the current thread's stack frame and copies the Mark Word from the object header into it, officially called the Displaced Mark Word. The thread then tries to use CAS to replace the Mark Word in the object header with a pointer to the lock record. If this succeeds, the current thread owns the lock; if it fails, other threads are competing for the lock and the current thread tries to acquire it by spinning.

Unlocking: the thread uses an atomic CAS operation to swap the Displaced Mark Word back into the object header. If it succeeds, no contention occurred. If it fails, the lock is contended and it inflates into a heavyweight lock.



How the lock states compare:

Biased lock
  Pros: locking and unlocking add almost no extra cost; compared with running the same method unsynchronized, the difference is on the order of nanoseconds.
  Cons: if threads do contend for the lock, revoking the bias adds extra cost.
  Best for: synchronized blocks that only a single thread ever enters.

Lightweight lock
  Pros: competing threads do not block, which keeps response times low.
  Cons: a thread that never manages to acquire the lock burns CPU while spinning.
  Best for: latency-sensitive code whose synchronized blocks execute very quickly.

Heavyweight lock
  Pros: competing threads do not spin, so no CPU is wasted on spinning.
  Cons: threads block, so response times are slow.
  Best for: throughput-oriented code whose synchronized blocks take a long time to execute.


Now for the second question:

Volatile

If you are working with multi-threaded code, the volatile keyword becomes useful. When multiple threads use the same variable, each thread may keep its own cached copy of that variable.

So when a thread updates the value, the update may only land in its local cache rather than in main memory, and another thread using the same variable does not see the change.

To avoid this problem, declare the variable volatile. Its value is then not served from a stale local cache: whenever a thread updates it, the value is written back to main memory, so other threads can read the updated value.
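The classic use is a stop flag shared between a worker thread and the thread that asks it to stop. A minimal sketch (the Worker class is illustrative):

public class Worker implements Runnable {
    // Without volatile, the worker thread might never observe the write made by stop().
    private volatile boolean running = true;

    @Override
    public void run() {
        while (running) {
            // do one unit of work
        }
    }

    public void stop() {
        running = false;   // the write becomes visible to the worker thread
    }

    public static void main(String[] args) throws InterruptedException {
        Worker worker = new Worker();
        Thread t = new Thread(worker);
        t.start();
        Thread.sleep(100);
        worker.stop();     // the worker sees the flag flip and leaves its loop
        t.join();
    }
}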

 

The third question:

 Lock


A lock is a thread synchronization mechanism, like a synchronized block, except that a Lock can be more sophisticated than Java's synchronized blocks; it is not as if we can get rid of the synchronized keyword entirely.

 

From Java 5 the package java.util.concurrent.locks contains several Lock implementations, so you may not have to implement your own locks. But you will still need to know how to use them, and it can still be useful to know the theory behind their implementation.
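The canonical usage pattern for ReentrantLock, the most commonly used implementation, is to acquire the lock explicitly and always release it in a finally block. A minimal sketch (the Counter class is illustrative):

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class Counter {
    private final Lock lock = new ReentrantLock();
    private int count;

    public void increment() {
        lock.lock();          // unlike synchronized, acquisition and release are explicit
        try {
            count++;
        } finally {
            lock.unlock();    // must always run, even if the critical section throws
        }
    }

    public int get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }
}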

 

 

Lock Reentrance

Synchronized blocks in Java are reentrant. This means that if a Java thread enters a synchronized block of code, and thereby takes the lock on the monitor object the block is synchronized on, the thread can enter other Java code blocks synchronized on the same monitor object. Here is an example:

 

public class Reentrant {

    public synchronized void outer() {
        inner();
    }

    public synchronized void inner() {
        // do something
    }
}

 

Notice how both outer() and inner() are declared synchronized, which in Java is equivalent to a synchronized(this) block. If a thread calls outer() there is no problem calling inner() from inside outer(), since both methods are synchronized on the same monitor object ('this'). If a thread already holds the lock on a monitor object, it has access to all blocks synchronized on the same monitor object. This is called reentrance: the thread can reenter any block of code for which it already holds the lock.
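ReentrantLock is reentrant in the same sense: the thread that holds it may call lock() again, as long as every lock() is paired with an unlock(). A minimal sketch (the class is illustrative):

import java.util.concurrent.locks.ReentrantLock;

public class ReentrantLockDemo {
    private final ReentrantLock lock = new ReentrantLock();

    public void outer() {
        lock.lock();           // hold count becomes 1
        try {
            inner();           // the same thread may re-acquire the lock
        } finally {
            lock.unlock();     // hold count back to 0, lock released
        }
    }

    public void inner() {
        lock.lock();           // hold count becomes 2
        try {
            // do something
        } finally {
            lock.unlock();     // hold count back to 1
        }
    }
}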

 

 

Lock Fairness

Java's synchronized blocks make no guarantees about the sequence in which threads trying to enter them are granted access. Therefore, if many threads are constantly competing for access to the same synchronized block, there is a risk that one or more of them is never granted access, because access is always granted to other threads. This is called starvation. To avoid it, a Lock should be fair. A Lock implemented on top of synchronized blocks would inherit this behaviour and therefore would not guarantee fairness either.
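ReentrantLock lets you ask for fairness at construction time: a fair lock hands itself to the longest-waiting thread, trading some throughput for protection against starvation. A minimal sketch (the class name is illustrative):

import java.util.concurrent.locks.ReentrantLock;

public class FairCounter {
    // true requests a fair lock: waiting threads acquire it roughly in FIFO order,
    // at the cost of lower throughput than the default non-fair lock
    private final ReentrantLock lock = new ReentrantLock(true);
    private long count;

    public void increment() {
        lock.lock();
        try {
            count++;
        } finally {
            lock.unlock();
        }
    }
}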

 


The fourth question:

From the analysis above we can see that Lock is the more flexible of the two to use, and it also introduces the concept of lock fairness, which synchronized does not provide.





References:

https://my.oschina.net/u/557580/blog/225554

https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html


     
