Java parallel programming to calculate PI

code_tailor

Category: Parallel Computing. Tags: multithreading

Original link: https://blog.csdn.net/billgates10001/article/details/50911452

/**
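I ran into some problems while studying parallel computing in Java, although I still think the best choice for parallel computing is C or C++: from MPI to OpenMP, and then CUDA. I tried to compute PI in Java. The serial code works fine (we use the dart method, ported from C++). But when I applied C++-style parallel thinking in Java, there was a problem: no speedup at all. The speedup ratio was always below 1, hovering around 0.6.

I later found the code below, which gave me a hint. Thanks to its author, BillGates10001:
https://blog.csdn.net/billgates10001/article/details/50911452
I hope the author becomes the second Gates, although I have no such plan myself.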
**/

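// The class header, fields and constructor were missing from the excerpt;
// they are reconstructed here from how the names are used below.
public class Pi_thread extends Thread {

    // Number of integration steps. Assumed value: the excerpt does not show it.
    static final int num_steps_wy = 1000000000;
    int start_wy;               // 1-based thread id passed to the constructor
    double step, x, sum = 0.0;

    public Pi_thread(int start) {
        start_wy = start;
    }

    // Parallel part: each of the two threads sums every second term.
    @Override
    public void run() {
        step = 1.0 / (double) num_steps_wy;
        // start_wy - 1 so that thread 1 begins at i = 0; starting at
        // start_wy itself would leave the i = 0 term out of the sum
        for (int i = start_wy - 1; i < num_steps_wy; i += 2) {
            x = (i + 0.5) * step;
            sum = sum + 4.0 / (1.0 + x * x);
        }
    }

    // Serial reference: the same sum computed on the calling thread.
    public void seril_pi() {
        step = 1.0 / (double) num_steps_wy;
        for (int i = 0; i < num_steps_wy; i++) {   // from 0, so no term is skipped
            x = (i + 0.5) * step;
            sum = sum + 4.0 / (1.0 + x * x);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        double pi_wy, sum_wy = 0.0, seri_t_wy, para_t_wy;

        Pi_thread thread1 = new Pi_thread(1);
        Pi_thread thread2 = new Pi_thread(2);

        double t1 = System.currentTimeMillis();

        // start both threads before joining either, so they run concurrently
        thread1.start();
        thread2.start();
        thread1.join();
        thread2.join();

        double t2 = System.currentTimeMillis();

        sum_wy = thread1.sum + thread2.sum;
        pi_wy = thread1.step * sum_wy;
        para_t_wy = t2 - t1;
        System.out.println("Parallel result: " + pi_wy);
        System.out.println("Parallel time (ms): " + para_t_wy);

        Pi_thread thread3 = new Pi_thread(3);   // used only for the serial run

        t1 = System.currentTimeMillis();
        thread3.seril_pi();
        t2 = System.currentTimeMillis();

        pi_wy = thread3.sum * thread3.step;
        seri_t_wy = t2 - t1;
        System.out.println("Serial result: " + pi_wy);
        System.out.println("Serial time (ms): " + seri_t_wy);
        System.out.println("Speedup: " + (seri_t_wy / para_t_wy));   // serial time / parallel time
    }
}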

A small problem with this code

The problem is:

  Pi_thread thread1=new Pi_thread(1);  
  Pi_thread thread2=new Pi_thread(2);

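We cannot create the threads automatically from the number of CPU cores on the host; for now they have to be created by hand. I tried to automate it, but the speedup was only 0.6, which is pointless.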

public static int cpu_num = Runtime.getRuntime().availableProcessors();

List<Pi_thread> thread_list = new ArrayList<Pi_thread>();

for (int i = 0; i < cpu_num; i++) {
    Pi_thread thread1 = new Pi_thread(i + 1);
    thread1.start();
    // join() right after start() waits for this thread to finish before
    // the next one is created, so the threads run one at a time
    thread1.join();

    thread_list.add(thread1);
}

for (int i = 0; i < thread_list.size(); i++) {
    Pi_thread thread = thread_list.get(i);
    sum_wy += thread.sum;
}

/**
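The above is my attempt, and it failed. If you have a better approach, please reply or contact me.
My guess at the reason for the failure:
For numerically intensive work, such as adding up 1 billion terms, the serial code is acceptable, but the parallel, multithreaded code is actually somewhat slower. I suspect the cost is mainly thread management (adding threads to a list and fetching them back), not thread communication. With a thread pool it might be even slower. Parallel computing places requirements on the workload and the data size. Only when those are met does it show its power; otherwise it is pointless and just burns more electricity.
Two things in the attempt above are worth checking before blaming overhead, though. The loop calls join() immediately after start(), so each thread finishes before the next one begins and nothing runs concurrently. And run() in Pi_thread always advances by 2, so with more than two threads the same terms are summed several times.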

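The author uses the leap-frog method: each thread jumps ahead by a fixed stride, so each CPU core handles only its own share of the work, and the per-core results are added together at the end.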

Leap frog, an example:
We have 4 threads and want to sum the integers from 1 to 16:
for(int i=1;i<17;i++){ sum+=i; }
Each thread starts from a different value: thread 1: i = 1, thread 2: i = 2, ..., thread 4: i = 4. Each thread then advances i by 4 instead of 1.
The result is:
thread 1: i=1,5,9,13
thread 2: i=2,6,10,14
thread 3: i=3,7,11,15
thread 4: i=4,8,12,16
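
The 1-to-16 example can be written out as a small Java program. This is only a sketch under stated assumptions: the names `LeapFrogSum`, `Worker` and `leapFrogSum` are made up for illustration and do not come from the original code. Each worker starts at its own index and advances by the number of threads; all workers are started first and joined afterwards.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; a sketch of the leap-frog split, not the original author's code.
class LeapFrogSum {

    // One worker adds first, first + stride, first + 2 * stride, ... up to last.
    static class Worker extends Thread {
        private final int first, last, stride;
        long partial = 0;   // read by the caller only after join()

        Worker(int first, int last, int stride) {
            this.first = first;
            this.last = last;
            this.stride = stride;
        }

        @Override
        public void run() {
            for (int i = first; i <= last; i += stride) {
                partial += i;
            }
        }
    }

    // Sums 1..n using nThreads workers; worker t starts at t and strides by nThreads.
    static long leapFrogSum(int n, int nThreads) {
        List<Worker> workers = new ArrayList<>();
        for (int t = 1; t <= nThreads; t++) {
            Worker w = new Worker(t, n, nThreads);
            workers.add(w);
            w.start();              // start every worker first ...
        }
        long total = 0;
        try {
            for (Worker w : workers) {
                w.join();           // ... and only then wait for each one
                total += w.partial;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return total;
    }

    public static void main(String[] args) {
        // thread 1: 1+5+9+13 = 28, thread 2: 32, thread 3: 36, thread 4: 40
        System.out.println(leapFrogSum(16, 4));   // prints 136
    }
}
```

With only 16 terms the threads cost far more than they save; the point here is the index pattern, not the timing.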

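Of course, you have to adjust this dynamically to the machine's CPU core count and the amount of computation. Remember: each thread keeps going only while its own for-loop condition holds.

This is only an example, pseudocode, and readers need to try it out themselves. In fact, the best case is to split the work by CPU cores: one task per core, or two.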
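
As one possible way to try this, the sketch below applies the same leap-frog split to the PI sum, with one thread per core as reported by `Runtime.getRuntime().availableProcessors()`. The names `LeapFrogPi`, `Worker` and `computePi`, and the step count of 100,000,000, are assumptions chosen for illustration. The stride equals the number of threads, and every thread is started before any is joined.

```java
// Hypothetical names throughout; a sketch under the assumptions stated above.
class LeapFrogPi {

    // Worker t sums the midpoint-rule terms i = t, t + stride, t + 2 * stride, ...
    static class Worker extends Thread {
        private final int first, stride;
        private final long numSteps;
        double partial = 0.0;   // read by the caller only after join()

        Worker(int first, int stride, long numSteps) {
            this.first = first;
            this.stride = stride;
            this.numSteps = numSteps;
        }

        @Override
        public void run() {
            double step = 1.0 / numSteps;
            double local = 0.0;             // accumulate locally, store once at the end
            for (long i = first; i < numSteps; i += stride) {
                double x = (i + 0.5) * step;
                local += 4.0 / (1.0 + x * x);
            }
            partial = local;
        }
    }

    // Approximates PI as the integral of 4 / (1 + x^2) over [0, 1].
    static double computePi(long numSteps, int nThreads) {
        Worker[] workers = new Worker[nThreads];
        for (int t = 0; t < nThreads; t++) {
            workers[t] = new Worker(t, nThreads, numSteps);   // 0-based start, stride = thread count
            workers[t].start();
        }
        double sum = 0.0;
        try {
            for (Worker w : workers) {
                w.join();                   // join only after all threads have been started
                sum += w.partial;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return sum / numSteps;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        long numSteps = 100_000_000L;       // assumed workload size
        long t0 = System.nanoTime();
        double pi = computePi(numSteps, cores);
        long t1 = System.nanoTime();
        System.out.println("threads: " + cores);
        System.out.println("pi ~ " + pi);
        System.out.println("time (ms): " + (t1 - t0) / 1_000_000);
    }
}
```

Calling `computePi(numSteps, 1)` gives a serial baseline for comparison. Whether the speedup ends up above 1 depends on the machine and on the step count, so it has to be measured rather than assumed.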