今天面试过程中碰到一个简单的多线程面试题目,竟然一时钻了牛角尖,没有回答上来,结束面试立刻醒悟过来,想想真丢人。
面试题目如下:如何多线程计算 1+2+3+……n,其中n是一个很大的数值,不使用直接的求职公式。
因为总是碰到类似于计数器的问题,(多个线程进行计数),所以思路不自觉的就转到了计数器的处理思路上去了:设置多个线程共享的一个 Integer sum,然后多个线程瓜分 1到n 的整数值,并分别增加共享的sum。当时写的实现思路(只是核心部分):
/**
*简单示例程序,说明算法,不能编译
*/
public class MultiSum implements Runnable {
int min;
int max;
Integer sum;
MultiSum(int min, int max, Integer sum) {
this.min = min;
this.max = max;
this.sum = sum;
}
public void run() {
synchronized (sum) {
for (int i = min; i <= max; i++) {
sum += i;
}
}
}
public static void main(String[] args) {
Integer sum = 0;
//简单使用3个线程计算从1到300的加和
new Thread(new MultiSum(1, 100, sum) ).start();
new Thread(new MultiSum(101, 200, sum)).start();
new Thread(new MultiSum(201, 300, sum)).start();
Thread.sleep(200);
System.out.println("sum from 1 to 300 is:"+sum);
}
}
因为对共享的 sum 变量进行加和,需要同步,这里对 sum 的加和就成为了一个瓶颈--多线程完全无法并行计算,只能等待拿到Sum的锁。其实这就变成了一个需要执行线程调度的串行操作,无论如何效率都不会超过单线程执行。
上面的错误的原因就在于片面的理解了使用多线程的原因:多线程不仅可以在阻塞场景下的可以允许调度其他线程执行以便获得更多的 throughput, 而且可以充分的利用现在多核技术(包括分布式环境下),使得多个线程可以并行的计算。
应用在此加和的场景下,其实就是一个简单的分治算法,将 1到n 的加和平均分配到多个线程计算,然后再将加和的结果汇总。下面是写出的程序(注意分割时候边界,注意 Integer的范围:2^(31)-1):
package com.llq.TestJava;
/**
* 类MultiThreadSum.java的实现描述:多线程执行加和操作
*
* @author llq 2014-3-5 下午10:32:10
*/
public class MultiThreadSum implements Runnable {
private Integer sum;
private final int fromInt;
private final int toInt;
private final int threadNo;
public MultiThreadSum(Integer sum, int fromInt, int toInt, int threadNo) {
this.sum = sum;
this.fromInt = fromInt;
this.toInt = toInt;
this.threadNo = threadNo;
}
/***
* 对sum进行加和计算,在sum原始值的基础上从 fromInt 开始加和,一直到 toInt 结束(包含fromInt 和 toInt)的数值
*/
public void run() {
long current = System.currentTimeMillis();
for (int i = fromInt; i <= toInt; i++) {
this.sum += i;
}
current = System.currentTimeMillis() - current;
System.out.println("Thread." + threadNo + " executes sum from " + fromInt + " to " + toInt + " in " + current
+ " milseconds. Sum is " + sum);
}
public static void main(String[] args) {
Integer toMax = 20000; //对从1到20,000进行加和
Integer sumInteger = 0;
int threads = 8; //计算线程数
//每个线程计算一段连续的加和,并将加和结果保存在数组中。
Integer[] subSum = new Integer[threads];
for (int i = 0; i < threads; i++) {
subSum[i] = new Integer(0);
}
for (int i = 0; i < threads; i++) {
int fromInt = toMax * i / threads + 1; //边界条件
int toInt = toMax * (i + 1) / threads; //边界条件
new Thread(new MultiThreadSum(subSum[i], fromInt, toInt, i)).start();
}
try {
Thread.sleep(10); //等待加和程序执行结束
for (int i = 0; i < threads; i++) {
sumInteger += subSum[i];
}
System.out.println("The sum is :" + sumInteger);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
大功告成!执行,结果却出乎意料:
Thread.0 executes sum from 1 to 2500 in 1 milseconds. Sum is 3126250
Thread.1 executes sum from 2501 to 5000 in 1 milseconds. Sum is 9376250
Thread.2 executes sum from 5001 to 7500 in 0 milseconds. Sum is 15626250
Thread.4 executes sum from 10001 to 12500 in 0 milseconds. Sum is 28126250
Thread.6 executes sum from 15001 to 17500 in 0 milseconds. Sum is 40626250
Thread.5 executes sum from 12501 to 15000 in 0 milseconds. Sum is 34376250
Thread.3 executes sum from 7501 to 10000 in 0 milseconds. Sum is 21876250
Thread.7 executes sum from 17501 to 20000 in 0 milseconds. Sum is 46876250
The sum is :0
每个线程的执行都是OK的,但是最后给出的结果是: 0 !注意了:Integer 等java基本类型的包装类虽然是传递的对象引用,但对象本身是不可变的。String同样如此。对这些对象值的改变,其实改变了引用本身。即变量保存了一个新的引用。 可见写程序一定要小心谨慎,打好基础,否则处处是坑啊。
这种问题的解决非常方便了,只要包装一个类就OK了。修改后的程序:
package com.llq.TestJava;
/**
* 定义简单的可变int包装类,事例用
*/
class MutableInteger {
int value;
MutableInteger(int v) {
this.value = v;
}
}
package com.llq.TestJava;
/**
* 类MultiThreadSum.java的实现描述:多线程执行加和操作
*
* @author llq 2014-3-5 下午10:32:10
*/
public class MultiThreadSum implements Runnable {
private final MutableInteger sum;
private final int fromInt;
private final int toInt;
private final int threadNo;
public MultiThreadSum(MutableInteger sum, int fromInt, int toInt, int threadNo) {
this.sum = sum;
this.fromInt = fromInt;
this.toInt = toInt;
this.threadNo = threadNo;
}
/***
* 对sum进行加和计算,在sum原始值的基础上从 fromInt 开始加和,一直到 toInt 结束(包含fromInt 和 toInt)的数值
*/
public void run() {
long current = System.currentTimeMillis();
for (int i = fromInt; i <= toInt; i++) {
this.sum.value += i;
}
current = System.currentTimeMillis() - current;
System.out.println("Thread." + threadNo + " executes sum from " + fromInt + " to " + toInt + " in " + current
+ " milseconds. Sum is " + sum.value);
}
public static void main(String[] args) {
Integer toMax = 20000; //对从1到20,000进行加和
Integer sumInteger = 0;
int threads = 8; //计算线程数
//每个线程计算一段连续的加和,并将加和结果保存在数组中。
MutableInteger[] subSum = new MutableInteger[threads];
for (int i = 0; i < threads; i++) {
subSum[i] = new MutableInteger(0);
}
for (int i = 0; i < threads; i++) {
int fromInt = toMax * i / threads + 1; //边界条件
int toInt = toMax * (i + 1) / threads; //边界条件
new Thread(new MultiThreadSum(subSum[i], fromInt, toInt, i)).start();
}
try {
Thread.sleep(10); //等待加和程序执行结束
for (int i = 0; i < threads; i++) {
sumInteger += subSum[i].value;
}
System.out.println("The sum is :" + sumInteger);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
结果如下,OK!
Thread.1 executes sum from 2501 to 5000 in 0 milseconds. Sum is 9376250
Thread.3 executes sum from 7501 to 10000 in 0 milseconds. Sum is 21876250
Thread.5 executes sum from 12501 to 15000 in 0 milseconds. Sum is 34376250
Thread.7 executes sum from 17501 to 20000 in 0 milseconds. Sum is 46876250
Thread.0 executes sum from 1 to 2500 in 0 milseconds. Sum is 3126250
Thread.2 executes sum from 5001 to 7500 in 0 milseconds. Sum is 15626250
Thread.4 executes sum from 10001 to 12500 in 0 milseconds. Sum is 28126250
Thread.6 executes sum from 15001 to 17500 in 0 milseconds. Sum is 40626250
The sum is :200010000
几个需要注意的地方:
- 这种通过分治算法,充分利用多核提高性能的多线程程序,在归并的时候,需要进行线程间的同步,同步的策略还需要在研究下(wait,notify, notifyAll 机制是否能满足同步要求?)。这里为了简单,主线程使用了Sleep 的模式,在耗时或者无法估算时间的情况下,肯定是不行的。
- 如何扩展到多个JVM的场景,即分布式场景下。
- 如何确定线程的数量,从而达到最高的执行效率。(这个是面试的时候着重问的,完全没有思路。)