Using JDK 7's Fork/Join Framework

hanxinis

June 27, 2011

What Is Fork/Join?

Fork/Join is an addition to the java.util.concurrent package, built around ForkJoinPool (an ExecutorService implementation), that lets you break up processing to be executed concurrently, and recursively, with little effort on your part. It's based on the work of Doug Lea, a thought leader on Java concurrency, at SUNY Oswego. Fork/Join deals with the threading hassles; you just indicate to the framework which portions of the work can be broken apart and handled recursively. It employs a divide-and-conquer algorithm that works like this in pseudocode (as taken from Doug Lea's paper on the subject):

 
    Result doWork(Work work) {
        if (work is small) {
            process the work
        }
        else {
            split up work
            invoke framework to solve both parts
        }
    }

It's your job to determine the amount of work to process before splitting it up. If the pieces are too small, the overhead of the Fork/Join framework may hurt performance. But if the size is about right, the advantage of parallelism will increase performance. For instance, the sample application we'll examine will look for XML files to process in a set of directories. If there are too many files, the code will use the Fork/Join framework to recursively break down the workload across multiple threads. Since XML file processing involves a combination of I/O and CPU work, it is a reasonable candidate for Fork/Join, although the framework pays off most on the CPU-bound parsing portion, because blocking I/O ties up worker threads.
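To make the threshold idea concrete, here is a minimal sketch that maps the pseudocode onto a RecursiveAction. The class names (ThresholdSketch, SquareTask), the work itself (squaring array elements in place), and the cutoff of 1000 are all assumptions chosen for illustration; none of them come from the sample application, and a real threshold should be found by measuring.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ThresholdSketch {

    // One pool shared by all calls (JDK 7 API, no common pool assumed)
    private static final ForkJoinPool POOL = new ForkJoinPool();

    // Hypothetical task: squares every element of data[start..end) in place
    static class SquareTask extends RecursiveAction {
        // Illustrative cutoff only; tune it by measuring a real workload
        static final int THRESHOLD = 1000;

        private final long[] data;
        private final int start; // inclusive
        private final int end;   // exclusive

        SquareTask(long[] data, int start, int end) {
            this.data = data;
            this.start = start;
            this.end = end;
        }

        @Override
        protected void compute() {
            if (end - start <= THRESHOLD) {
                // "work is small": process it directly on this thread
                for (int i = start; i < end; i++) {
                    data[i] = data[i] * data[i];
                }
            } else {
                // "split up work" and let the framework solve both halves
                int mid = start + (end - start) / 2;
                invokeAll(new SquareTask(data, start, mid),
                          new SquareTask(data, mid, end));
            }
        }
    }

    // Squares the whole array in place and returns it for convenience
    static long[] squareAll(long[] data) {
        POOL.invoke(new SquareTask(data, 0, data.length));
        return data;
    }

    public static void main(String[] args) {
        long[] data = new long[10000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        squareAll(data);
        System.out.println(data[3] + " " + data[9999]);
    }
}
```

Raising THRESHOLD produces fewer, larger leaf tasks (less framework overhead, less parallelism); lowering it does the opposite.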

The framework handles the threads based on available resources. It also employs a second algorithm called work stealing, where idle threads can steal work from busy threads to help spread the load around without spawning new threads. The same type of algorithm is often used in garbage collectors that use parallel worker threads to walk the heap.
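One way to observe this behavior is to ask the pool for its statistics after a run. The sketch below is an illustration under assumptions: WorkStealingSketch and RangeTask are hypothetical names, and the "work" is just counting items. It relies only on ForkJoinPool.getParallelism() and ForkJoinPool.getStealCount(); the steal count is an estimate that varies from run to run, so no particular value should be expected.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.atomic.AtomicLong;

public class WorkStealingSketch {

    // Hypothetical task: "processes" the range [lo, hi) by counting its items
    static class RangeTask extends RecursiveAction {
        static final int THRESHOLD = 1000; // illustrative cutoff

        private final int lo;
        private final int hi;
        private final AtomicLong processed;

        RangeTask(int lo, int hi, AtomicLong processed) {
            this.lo = lo;
            this.hi = hi;
            this.processed = processed;
        }

        @Override
        protected void compute() {
            if (hi - lo <= THRESHOLD) {
                processed.addAndGet(hi - lo);
            } else {
                int mid = lo + (hi - lo) / 2;
                // Subtasks land on this worker's queue; idle workers may steal them
                invokeAll(new RangeTask(lo, mid, processed),
                          new RangeTask(mid, hi, processed));
            }
        }
    }

    // Runs the task on a pool of the given size and returns the item count
    static long run(int parallelism, int items) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        AtomicLong processed = new AtomicLong();
        try {
            pool.invoke(new RangeTask(0, items, processed));
            System.out.println("parallelism = " + pool.getParallelism()
                    + ", steals so far = " + pool.getStealCount());
        } finally {
            pool.shutdown();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        run(Runtime.getRuntime().availableProcessors(), 1000000);
    }
}
```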

Java 7 Fork/Join Processing Example

Let's explore a sample application that checks a set of work directories for new XML files. As the files are processed, they're moved out of the work directories and into a special "processed" directory. This sample is loosely based on a news processing system I worked on years ago, where news articles were written to the appropriate directories as they were published. Then, a worker process that periodically checked the directories would process the files, and make them available on a website.

The code below is the complete Fork/Join XML processing application (minus the actual XML processing details). The main class, XMLProcessingForkJoin, kicks off the parsing of files within a directory; in a real system this step would be invoked periodically. It uses the ProcessXMLFiles class, which extends the Fork/Join framework's java.util.concurrent.RecursiveAction base class, to recursively split up and process all the files in the source directory.

 
    import java.io.File;
    import java.util.Arrays;
    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveAction;

    public class XMLProcessingForkJoin {

        class ProcessXMLFiles extends RecursiveAction {
            static final int FILE_COUNT_THRESHOLD = 2;

            String sourceDirPath;
            String targetDirPath;
            File[] xmlFiles = null;

            public ProcessXMLFiles(String sourceDirPath, String targetDirPath, File[] xmlFiles) {
                this.sourceDirPath = sourceDirPath;
                this.targetDirPath = targetDirPath;
                this.xmlFiles = xmlFiles;
            }

            @Override
            protected void compute() {
                try {
                    // Make sure the directory has been scanned
                    if (xmlFiles == null) {
                        File sourceDir = new File(sourceDirPath);
                        if (sourceDir.isDirectory()) {
                            xmlFiles = sourceDir.listFiles();
                        }
                    }

                    // Nothing to do if the source path is not a readable directory
                    if (xmlFiles == null) {
                        return;
                    }

                    // Check the number of files
                    if (xmlFiles.length <= FILE_COUNT_THRESHOLD) {
                        parseXMLFiles(xmlFiles);
                    }
                    else {
                        // Split the array of XML files into two roughly equal parts
                        int center = xmlFiles.length / 2;
                        File[] part1 = splitArray(xmlFiles, 0, center);
                        File[] part2 = splitArray(xmlFiles, center, xmlFiles.length);

                        invokeAll(new ProcessXMLFiles(sourceDirPath, targetDirPath, part1),
                                  new ProcessXMLFiles(sourceDirPath, targetDirPath, part2));
                    }
                }
                catch (Exception e) {
                    e.printStackTrace();
                }
            }

            // Copies array[start..end) into a new File[]. Copying into a plain
            // Object[] and casting it to File[] would throw ClassCastException.
            protected File[] splitArray(File[] array, int start, int end) {
                return Arrays.copyOfRange(array, start, end);
            }

            protected void parseXMLFiles(File[] filesToParse) {
                // Parse and copy the given set of XML files
                // ...
            }
        }

        public XMLProcessingForkJoin(String source, String target) {
            // Periodically invoke the following lines of code:
            ProcessXMLFiles process = new ProcessXMLFiles(source, target, null);
            ForkJoinPool pool = new ForkJoinPool();
            pool.invoke(process);
        }

        // Start the XML file parsing process with the Java SE 7 Fork/Join framework
        public static void main(String[] args) {
            if (args.length < 2) {
                System.out.println("args - please specify source and target dirs");
                System.exit(-1);
            }

            String source = args[0];
            String target = args[1];
            new XMLProcessingForkJoin(source, target);
        }
    }

It starts with the main class's constructor, XMLProcessingForkJoin, where a new ProcessXMLFiles object is created and handed off to the Fork/Join framework via a call to ForkJoinPool.invoke(). The framework then calls the object's compute() method. First, if the task was not given a list of files, it scans the source directory to build one. Next, if the number of files to process is at or below a threshold (two files in this case), the files are processed and we're done. Otherwise, the array of files is split into two halves, and two new Fork/Join tasks are created, one per half, and so on, recursively, until all the files are parsed and processed.
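The sample leaves the "periodically invoke" part as a comment. One possible way to drive it, sketched here under assumptions, is a ScheduledExecutorService that submits a fresh task to a single long-lived ForkJoinPool on each pass. PeriodicScanSketch, ScanTask, and the timing values are hypothetical; ScanTask is only a stand-in that counts passes, where the real application would create a new ProcessXMLFiles instead. Anonymous classes are used rather than lambdas to stay within JDK 7.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PeriodicScanSketch {

    // Stand-in for the article's ProcessXMLFiles task; it only counts passes
    static class ScanTask extends RecursiveAction {
        private final AtomicInteger passes;

        ScanTask(AtomicInteger passes) {
            this.passes = passes;
        }

        @Override
        protected void compute() {
            passes.incrementAndGet();
        }
    }

    // Schedules one scan per period, reusing the same pool for every pass
    static ScheduledExecutorService startPeriodicScan(final ForkJoinPool pool,
                                                      final AtomicInteger passes,
                                                      long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(new Runnable() {
            @Override
            public void run() {
                // Create a new task for each pass instead of reusing a finished one
                pool.invoke(new ScanTask(passes));
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    // Runs the periodic scan for roughly durationMillis and returns the pass count
    static int runFor(long periodMillis, long durationMillis) {
        ForkJoinPool pool = new ForkJoinPool();
        AtomicInteger passes = new AtomicInteger();
        ScheduledExecutorService scheduler = startPeriodicScan(pool, passes, periodMillis);
        try {
            Thread.sleep(durationMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
            pool.shutdown();
        }
        return passes.get();
    }

    public static void main(String[] args) {
        System.out.println("passes completed: " + runFor(100, 350));
    }
}
```

scheduleWithFixedDelay waits for one pass to finish before timing the next, so a slow scan never overlaps the following one; scheduleAtFixedRate would be the alternative if a steady cadence matters more.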

Since the code just parses XML files and returns nothing, I chose to extend RecursiveAction in this application. If your processing returns a result that needs to be combined with the results of other Fork/Join subtasks (e.g., sorting, compressing data, tallying numbers, and so on), then you can extend RecursiveTask instead. I'll take a closer look at this and other changes to the concurrent classes in Java SE 7 in a future blog post.
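As a rough illustration of the RecursiveTask variant, the sketch below tallies the values of an int array. It is not part of the XML application: SumSketch, SumTask, and the threshold of 1000 are assumptions made for the example. Each task forks its left half, computes the right half on the current thread, then joins and combines the two partial sums.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class SumSketch {

    private static final ForkJoinPool POOL = new ForkJoinPool();

    // Hypothetical task: returns the sum of values[start..end)
    static class SumTask extends RecursiveTask<Long> {
        static final int THRESHOLD = 1000; // illustrative cutoff

        private final int[] values;
        private final int start; // inclusive
        private final int end;   // exclusive

        SumTask(int[] values, int start, int end) {
            this.values = values;
            this.start = start;
            this.end = end;
        }

        @Override
        protected Long compute() {
            if (end - start <= THRESHOLD) {
                // Small enough: add the slice up directly
                long sum = 0;
                for (int i = start; i < end; i++) {
                    sum += values[i];
                }
                return sum;
            }
            int mid = start + (end - start) / 2;
            SumTask left = new SumTask(values, start, mid);
            SumTask right = new SumTask(values, mid, end);
            left.fork();                      // run the left half asynchronously
            long rightSum = right.compute();  // do the right half on this thread
            long leftSum = left.join();       // wait for the left half's result
            return leftSum + rightSum;
        }
    }

    static long sum(int[] values) {
        return POOL.invoke(new SumTask(values, 0, values.length));
    }

    public static void main(String[] args) {
        int[] values = new int[10000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i + 1;
        }
        System.out.println(sum(values));
    }
}
```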

Happy coding!
-EJB