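Running Fork/Join tasks asynchronously
When you execute a ForkJoinTask in a ForkJoinPool, you can do so synchronously or asynchronously. With synchronous execution, the method that sends the task to the Fork/Join pool does not return until the task has finished. With asynchronous execution, the method that sends the task to the executor returns immediately, and the task continues to run.
There is an important difference between the two approaches. When you use a synchronous method (for example, invokeAll()), the calling task is suspended until the tasks it sent to the pool have finished. This lets the ForkJoinPool class use the work-stealing algorithm to assign a new task to the worker thread that was running the suspended task. In contrast, when you use an asynchronous method (for example, fork()), the calling task keeps running, so the ForkJoinPool class cannot use work stealing to improve performance at that point. In that case, the pool can apply work stealing only once the task calls join() or get() to wait for a subtask to finish.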
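To make the distinction concrete, here is a minimal sketch that computes the same result in both styles. The RangeSum task, its threshold, and the sum() helper are hypothetical names invented for this illustration. Only invokeAll(), fork(), join(), and ForkJoinPool.invoke() come from the JDK.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SyncVsAsyncDemo {

    // Hypothetical task for illustration: sums the integers in [from, to).
    static class RangeSum extends RecursiveTask<Long> {
        private static final long serialVersionUID = 1L;
        private static final int THRESHOLD = 1_000; // arbitrary cut-off for splitting

        private final int from;
        private final int to;
        private final boolean async;

        RangeSum(int from, int to, boolean async) {
            this.from = from;
            this.to = to;
            this.async = async;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += i;
                }
                return sum;
            }
            int mid = from + (to - from) / 2;
            RangeSum left = new RangeSum(from, mid, async);
            RangeSum right = new RangeSum(mid, to, async);
            if (async) {
                // Asynchronous style: fork() returns immediately, so this task
                // keeps working on the right half before waiting with join().
                left.fork();
                long rightResult = right.compute();
                return left.join() + rightResult;
            }
            // Synchronous style: invokeAll() returns only after both subtasks finish.
            invokeAll(left, right);
            return left.join() + right.join(); // both already done, join() just reads the result
        }
    }

    // Runs the task in a fresh pool and returns the sum of 0 .. n-1.
    static long sum(int n, boolean async) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new RangeSum(0, n, async));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("invokeAll: " + sum(100_000, false));
        System.out.println("fork/join: " + sum(100_000, true));
    }
}
```

Both calls produce the same sum. The difference is only in when the parent task stops to wait for its children.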
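Code example
This example uses the asynchronous methods provided by the ForkJoinPool and ForkJoinTask classes to manage tasks. We will implement a program that searches a folder and its subfolders for files with a given extension.
The ForkJoinTask implementation processes the contents of one folder. For each subfolder it finds, the task asynchronously sends a new task to the ForkJoinPool. For each file in the folder, the task checks the file's extension and, if it matches, adds the file to the result list.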
Main.java
package com.packtpub.java7.concurrency.chapter5.recipe03.core;

import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

import com.packtpub.java7.concurrency.chapter5.recipe03.task.FolderProcessor;

public class Main {

    /**
     * Main method of the example
     */
    public static void main(String[] args) {
        // Create the pool
        ForkJoinPool pool = new ForkJoinPool();

        // Create three FolderProcessor tasks for three different folders
        FolderProcessor system = new FolderProcessor("C:\\Windows", "log");
        FolderProcessor apps = new FolderProcessor("C:\\Program Files", "log");
        FolderProcessor documents = new FolderProcessor("C:\\Documents And Settings", "log");

        // Execute the three tasks in the pool; execute() returns immediately
        pool.execute(system);
        pool.execute(apps);
        pool.execute(documents);

        // Write statistics of the pool every second until the three tasks end
        do {
            System.out.printf("******************************************\n");
            System.out.printf("Main: Parallelism: %d\n", pool.getParallelism());
            System.out.printf("Main: Active Threads: %d\n", pool.getActiveThreadCount());
            System.out.printf("Main: Task Count: %d\n", pool.getQueuedTaskCount());
            System.out.printf("Main: Steal Count: %d\n", pool.getStealCount());
            System.out.printf("******************************************\n");
            try {
                TimeUnit.SECONDS.sleep(1);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        } while ((!system.isDone()) || (!apps.isDone()) || (!documents.isDone()));

        // Shut down the pool
        pool.shutdown();

        // Write the number of results calculated by each task
        List<String> results;
        results = system.join();
        System.out.printf("System: %d files found.\n", results.size());
        results = apps.join();
        System.out.printf("Apps: %d files found.\n", results.size());
        results = documents.join();
        System.out.printf("Documents: %d files found.\n", results.size());
    }
}
FolderProcessor.java
package com.packtpub.java7.concurrency.chapter5.recipe03.task;

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.RecursiveTask;

/**
 * Task that processes a folder. It launches a new FolderProcessor task for each
 * subfolder. For each file in the folder, it checks whether the file has the
 * extension it is looking for. If so, it adds the file's path to the list of results.
 */
public class FolderProcessor extends RecursiveTask<List<String>> {

    /**
     * Serial version of the class. You have to add it because the
     * ForkJoinTask class implements the Serializable interface
     */
    private static final long serialVersionUID = 1L;

    /**
     * Path of the folder this task is going to process
     */
    private String path;

    /**
     * Extension of the files the task is looking for
     */
    private String extension;

    /**
     * Constructor of the class
     * @param path Path of the folder this task is going to process
     * @param extension Extension of the files this task is looking for
     */
    public FolderProcessor(String path, String extension) {
        this.path = path;
        this.extension = extension;
    }

    /**
     * Main method of the task. It launches an additional FolderProcessor task
     * for each folder in this folder. For each file in the folder, it compares
     * its extension with the extension it is looking for. If they are equal, it
     * adds the full path of the file to the list of results
     */
    @Override
    protected List<String> compute() {
        List<String> list = new ArrayList<>();
        List<FolderProcessor> tasks = new ArrayList<>();
        File file = new File(path);
        // listFiles() returns null if the path is not a readable directory
        File[] content = file.listFiles();
        if (content != null) {
            for (int i = 0; i < content.length; i++) {
                if (content[i].isDirectory()) {
                    // If it is a directory, fork a new task to process it asynchronously
                    FolderProcessor task = new FolderProcessor(content[i].getAbsolutePath(), extension);
                    task.fork();
                    tasks.add(task);
                } else {
                    // If it is a file, compare its extension with the one
                    // the task is looking for
                    if (checkFile(content[i].getName())) {
                        list.add(content[i].getAbsolutePath());
                    }
                }
            }
            // If this task launched more than 50 subtasks, write a message
            if (tasks.size() > 50) {
                System.out.printf("%s: %d tasks ran.\n", file.getAbsolutePath(), tasks.size());
            }
            // Include the results of the subtasks
            addResultsFromTasks(list, tasks);
        }
        return list;
    }

    /**
     * Adds the results of the tasks launched by this task to the list this
     * task will return. Uses the join() method to wait for each task to finish
     * @param list List of results
     * @param tasks List of tasks
     */
    private void addResultsFromTasks(List<String> list, List<FolderProcessor> tasks) {
        for (FolderProcessor item : tasks) {
            list.addAll(item.join());
        }
    }

    /**
     * Checks whether a file name has the extension the task is looking for
     * @param name name of the file
     * @return true if the name ends with the extension, false otherwise
     */
    private boolean checkFile(String name) {
        return name.endsWith(extension);
    }
}
Output
Main: Parallelism: 4
Main: Active Threads: 4
C:\Windows: 59 tasks ran.
Main: Task Count: 91
Main: Steal Count: 0
C:\Windows\assembly\GAC_MSIL: 294 tasks ran.
C:\Windows\assembly\NativeImages_v2.0.50727_32: 122 tasks ran.
C:\Windows\assembly\NativeImages_v2.0.50727_64: 114 tasks ran.
C:\Windows\assembly\NativeImages_v4.0.30319_32: 147 tasks ran.
C:\Windows\assembly\NativeImages_v4.0.30319_64: 141 tasks ran.
C:\Windows\Microsoft.NET\assembly\GAC_MSIL: 193 tasks ran.
Main: Parallelism: 4
Main: Active Threads: 38
Main: Task Count: 1030
Main: Steal Count: 615
C:\Windows\SysWOW64: 88 tasks ran.
C:\Windows\System32: 91 tasks ran.
Main: Parallelism: 4
Main: Active Threads: 24
Main: Task Count: 521
Main: Steal Count: 1362
C:\Windows\System32\DriverStore\FileRepository: 264 tasks ran.
Main: Parallelism: 4
Main: Active Threads: 9
Main: Task Count: 1852
Main: Steal Count: 2113
Main: Parallelism: 4
Main: Active Threads: 4
Main: Task Count: 1341
Main: Steal Count: 2268
Main: Parallelism: 4
Main: Active Threads: 4
Main: Task Count: 1313
Main: Steal Count: 4623
C:\Windows\winsxs: 13303 tasks ran.
Main: Parallelism: 4
Main: Active Threads: 5
Main: Task Count: 870
Main: Steal Count: 4623
Main: Parallelism: 4
Main: Active Threads: 2
Main: Task Count: 0
Main: Steal Count: 13058
Main: Parallelism: 4
Main: Active Threads: 2
Main: Task Count: 0
Main: Steal Count: 13058
System: 21 files found.
Apps: 6 files found.
Documents: 0 files found.
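How it works
The key to this example is the FolderProcessor class. Each task processes the contents of one folder, which can contain two kinds of elements:
- files;
- other folders.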
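When a task finds a folder, it creates another FolderProcessor object to process that folder and sends it to the pool with the fork() method. The pool runs the new task if it has a free worker thread or can create a new one. The fork() method returns immediately, so the parent task can continue processing the rest of the folder. For each file, the task compares the file's extension with the one it is looking for and, if they match, adds the file's full path to the result list.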
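Once a task has processed all the contents of its folder, it calls join() on every subtask it sent to the pool. The join() method waits for the subtask to finish and returns the value produced by that subtask's compute() method. The parent task merges the result lists of all its subtasks into its own list and returns that list from its own compute() method.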
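The ForkJoinPool class is also used asynchronously here: the execute() method sends the three initial tasks to the pool and returns without waiting for them. While the tasks run, the Main class prints information about the state of the pool and how it evolves, using several of the methods that ForkJoinPool provides for this purpose. When the tasks have finished, it calls shutdown() to close the pool.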
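The submit-then-monitor pattern used by Main can be reduced to a few lines. The sketch below is an illustration only: SlowAnswer, pause(), and runAndMonitor() are hypothetical names, and the sleep inside the task merely stands in for real work.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;

public class ExecuteDemo {

    // Hypothetical task for illustration: simulates work, then returns 42.
    static class SlowAnswer extends RecursiveTask<Integer> {
        private static final long serialVersionUID = 1L;

        @Override
        protected Integer compute() {
            pause(200); // stand-in for real work
            return 42;
        }
    }

    // Sleeps for the given time; restores the interrupt flag if interrupted.
    static void pause(long millis) {
        try {
            TimeUnit.MILLISECONDS.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Submits the task asynchronously, polls the pool until the task is done,
    // then shuts the pool down and returns the task's result.
    static int runAndMonitor() {
        ForkJoinPool pool = new ForkJoinPool();
        SlowAnswer task = new SlowAnswer();

        pool.execute(task); // returns immediately; the task runs in the pool

        while (!task.isDone()) {
            System.out.printf("Active threads: %d, queued tasks: %d, steals: %d%n",
                    pool.getActiveThreadCount(),
                    pool.getQueuedTaskCount(),
                    pool.getStealCount());
            pause(50);
        }

        pool.shutdown();
        return task.join(); // the task is done, so join() returns at once
    }

    public static void main(String[] args) {
        System.out.println("Result: " + runAndMonitor());
    }
}
```

The number of status lines printed depends on timing, so it will vary from run to run.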
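Going further
This example uses the join() method to wait for tasks to finish and to retrieve their results. You can also use one of the following two versions of the get() method: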
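- get(): returns the value produced by compute() if the ForkJoinTask has finished. Otherwise it waits until the task finishes and then returns that value.
- get(long timeout, TimeUnit unit): if the result is not yet available, this version waits for at most the specified time. If the time elapses and the result is still not available, the method throws a TimeoutException.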
TimeUnit is an enum with the following constants: DAYS, HOURS, MICROSECONDS, MILLISECONDS, MINUTES, NANOSECONDS, and SECONDS.
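As a rough illustration of the timed version, the following sketch waits for a task with a deadline. In the JDK's ForkJoinTask, an expired wait is reported through a TimeoutException, which the sketch converts into a plain string. Sleeper and waitFor() are hypothetical names used only for this example.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedGetDemo {

    // Hypothetical task for illustration: sleeps for a while, then returns "done".
    static class Sleeper extends RecursiveTask<String> {
        private static final long serialVersionUID = 1L;
        private final long millis;

        Sleeper(long millis) {
            this.millis = millis;
        }

        @Override
        protected String compute() {
            try {
                TimeUnit.MILLISECONDS.sleep(millis); // stand-in for real work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "interrupted";
            }
            return "done";
        }
    }

    // Runs a task lasting taskMillis and waits for it for at most timeoutMillis.
    static String waitFor(long taskMillis, long timeoutMillis) {
        ForkJoinPool pool = new ForkJoinPool();
        Sleeper task = new Sleeper(taskMillis);
        pool.execute(task);
        try {
            return task.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timed out"; // the result was not ready in time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted"; // the waiting thread was interrupted
        } catch (ExecutionException e) {
            return "failed"; // the task itself threw an exception
        } finally {
            pool.shutdownNow(); // do not keep waiting for an abandoned task
        }
    }

    public static void main(String[] args) {
        System.out.println(waitFor(10, 2_000));  // short task, generous timeout
        System.out.println(waitFor(2_000, 50));  // long task, short timeout
    }
}
```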
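The get() and join() methods also differ in two main ways:
- join() does not throw InterruptedException, whereas get() does: if the thread waiting in get() is interrupted, get() throws an InterruptedException.
- If the task throws an unchecked exception, get() throws an ExecutionException that wraps it, whereas join() rethrows a RuntimeException.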
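The second difference can be observed with a task that always fails. This is a minimal sketch under the assumption that the task throws an IllegalStateException. Failing, viaJoin(), and viaGet() are hypothetical names introduced for the example.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ExceptionDemo {

    // Hypothetical task for illustration: always fails with an unchecked exception.
    static class Failing extends RecursiveTask<Integer> {
        private static final long serialVersionUID = 1L;

        @Override
        protected Integer compute() {
            throw new IllegalStateException("boom");
        }
    }

    // Waits with join() and reports the type of exception it rethrows.
    static String viaJoin() {
        ForkJoinPool pool = new ForkJoinPool();
        Failing task = new Failing();
        pool.execute(task);
        try {
            task.join();
            return "no exception";
        } catch (RuntimeException e) {
            // join() rethrows an unchecked exception; no checked exception to handle
            return e.getClass().getSimpleName();
        } finally {
            pool.shutdown();
        }
    }

    // Waits with get() and reports the checked exception it throws.
    static String viaGet() {
        ForkJoinPool pool = new ForkJoinPool();
        Failing task = new Failing();
        pool.execute(task);
        try {
            task.get();
            return "no exception";
        } catch (ExecutionException e) {
            // get() wraps the task's exception; the original is available as the cause
            return "ExecutionException caused by " + e.getCause().getClass().getSimpleName();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("join(): " + viaJoin());
        System.out.println("get():  " + viaGet());
    }
}
```

Note that get() forces the caller to handle two checked exceptions, while join() compiles without any try block. That is often the practical reason for choosing join() inside compute().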