Computing Pi in Parallel

Latest recommended article published 2023-10-13 17:30:48

judyge

Category: Advanced Computing and Engineering

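Yes, the topic is a cliché: everyone computes pi. The goal here, though, is to walk through the basic workflow of a parallel computation.

Textbooks compute PI with a deterministic numerical method. I also give a probabilistic (Monte Carlo) method here.

Both OpenMP and MPI make an appearance.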

Methods for computing PI

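1. tan(PI/4) = 1, so PI = 4*arctan(1). Written as a definite integral, arctan(1) is the integral of 1/(1+x^2) from 0 to 1, so PI is the integral of 4/(1+x^2) over [0,1].

The power series expansion of arctan(x) also lets you compute PI by hand.

Other formulas can be used to compute PI by hand as well.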
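To make the power-series remark concrete, here is a minimal sketch in C. The function names `arctan_series` and `machin_pi` are made up for this illustration and do not come from the original post. The series is arctan(x) = x - x^3/3 + x^5/5 - ..., and Machin's formula PI = 16*arctan(1/5) - 4*arctan(1/239) is included only as one well-known example of a faster-converging alternative; the post does not say which formula it had in mind.

```c
#include <assert.h>

/* Sum the first `terms` terms of arctan(x) = x - x^3/3 + x^5/5 - ...
   (hypothetical helper, valid for |x| <= 1). */
double arctan_series(double x, long terms) {
    double sum = 0.0;
    double power = x;          /* holds x^(2k+1) */
    long k;
    assert(terms > 0);
    for (k = 0; k < terms; k++) {
        double term = power / (2 * k + 1);
        sum += (k % 2 == 0) ? term : -term;
        power *= x * x;
    }
    return sum;
}

/* Machin's formula: PI = 16*arctan(1/5) - 4*arctan(1/239). */
double machin_pi(long terms) {
    return 16.0 * arctan_series(1.0 / 5.0, terms)
         - 4.0 * arctan_series(1.0 / 239.0, terms);
}
```

Calling `4.0 * arctan_series(1.0, terms)` gives the series for PI = 4*arctan(1) directly. It converges slowly, since the error shrinks roughly like 1/terms, whereas `machin_pi` needs only a handful of terms because its arguments are much smaller than 1.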

 
    #include <stdio.h>
    #include <time.h>

    #define N 1000000

    int main(void) {
        double local, pi = 0.0, w;
        long i;

        w = 1.0 / N;                  /* width of each sub-interval */
        clock_t t1 = clock();
        for (i = 0; i < N; i++) {
            local = (i + 0.5) * w;    /* midpoint of the i-th sub-interval */
            pi = pi + 4.0 / (1.0 + local * local);
        }
        clock_t t2 = clock();
        printf("PI is %.20f\n", pi * w);
        printf("Time: %.2f seconds\n", (float)(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }

orisun@orisun-desktop:~/Program$ ./PI1

PI is 3.14159265358976336202

Time: 0.02 seconds
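For reference, the loop in method 1 is the midpoint rule applied to the integral form of 4*arctan(1). Written out with the program's own names (`w` is the strip width, `local` is the midpoint x_i):

```latex
\pi = 4\arctan 1 = \int_0^1 \frac{4}{1+x^2}\,dx
    \approx w \sum_{i=0}^{N-1} \frac{4}{1+x_i^2},
\qquad x_i = \left(i + \tfrac{1}{2}\right) w,\quad w = \frac{1}{N}.
```

Each term of the sum depends only on `i`, which is what makes the loop easy to split across threads or processes later on.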

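2. Centre a circle of radius 1 and a square of side 2 at the origin. The ratio of the circle's area to the square's area is PI/4, so if points are drawn uniformly at random inside the square, about PI/4 of them land inside the circle. The code below samples only the first quadrant, where the ratio is the same.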

 
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <math.h>

    #define N 1000000

    int main(void) {
        long i, sum;
        double x, y;

        srand((unsigned)time(NULL));
        sum = 0;                      /* number of points inside the quarter circle */
        clock_t t1 = clock();
        for (i = 0; i < N; i++) {
            x = (double)rand() / RAND_MAX;
            y = (double)rand() / RAND_MAX;
            if (x * x + y * y < 1)
                sum++;
        }
        clock_t t2 = clock();
        printf("PI is %.20f\n", 4 * (double)sum / N);
        printf("Time: %.2f\n", (float)(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }

orisun@orisun-desktop:~$ ./PI0

PI is 3.14301599999999980994

Time: 0.16

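The comparison shows that method 1 is far ahead in both accuracy and speed: with the same N it matches PI to 12 decimal places in 0.02 s, while method 2 gets only 2 decimal places right in 0.16 s. The OpenMP and MPI versions below both use method 1.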
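The accuracy gap can be estimated with a short calculation. This is a standard binomial argument, not something taken from the original post: each random point lands inside the quarter circle with probability p = PI/4, so the count `sum` is binomial and the estimate 4*sum/N has standard deviation

```latex
\sigma = 4\sqrt{\frac{p(1-p)}{N}}, \quad p = \frac{\pi}{4}
\;\;\Longrightarrow\;\;
\sigma \approx \frac{1.64}{\sqrt{N}} \approx 1.6\times 10^{-3}
\quad\text{for } N = 10^{6}.
```

That is consistent with the run above, whose result is off by about 1.4e-3. Because the error shrinks only like 1/sqrt(N), each extra correct digit costs roughly 100 times more samples.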

OpenMP

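OpenMP [OMP] is a set of compiler directives and library routines (supported by gcc, enabled with the -fopenmp flag) for writing parallel programs on shared-memory computers. It is available for C, C++, and Fortran.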

 
    #include <stdio.h>
    #include <time.h>
    #include <omp.h>

    #define N 1000000

    /* Build with OpenMP enabled, e.g. gcc -fopenmp */
    int main(void) {
        double local, pi = 0.0, w;
        long i;

        w = 1.0 / N;
        clock_t t1 = clock();         /* note: clock() reports CPU time, not wall-clock time */
        #pragma omp parallel for private(local) reduction(+:pi)
        for (i = 0; i < N; i++) {
            local = (i + 0.5) * w;
            pi = pi + 4.0 / (1.0 + local * local);
        }
        clock_t t2 = clock();
        printf("PI is %.20f\n", pi * w);
        printf("Time: %.2f seconds\n", (float)(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }

orisun@orisun-desktop:~/Program$ ./PI2

PI is 3.14159265358976336202

Time: 0.02 seconds

The result is exactly the same as the serial one.
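A note on the timing: `clock()` returns the processor time used by the program, and on common Linux systems that is summed over all threads, so it may not show any gain from running in parallel. The sketch below uses `omp_get_wtime()`, the OpenMP wall-clock timer, instead. The names `wall_seconds` and `timed_pi` are made up for this illustration, and the `#ifdef _OPENMP` fallback is there only so the code still compiles when OpenMP is switched off.

```c
#include <assert.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds; falls back to clock() when built without OpenMP. */
static double wall_seconds(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Midpoint-rule estimate of PI over n strips (hypothetical helper).
   Stores the elapsed time in seconds in *elapsed. */
double timed_pi(long n, double *elapsed) {
    double w, pi = 0.0;
    long i;
    double t1;
    assert(n > 0);
    w = 1.0 / n;
    t1 = wall_seconds();
    #pragma omp parallel for reduction(+:pi)
    for (i = 0; i < n; i++) {
        double x = (i + 0.5) * w;   /* declared in the loop body, so private to each thread */
        pi += 4.0 / (1.0 + x * x);
    }
    *elapsed = wall_seconds() - t1;
    return pi * w;
}
```

Built with `gcc -fopenmp`, the number of threads can be set through the `OMP_NUM_THREADS` environment variable, which makes it easy to compare elapsed times for 1, 2, and 4 threads.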

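#pragma omp parallel runs the statement or block that follows it on a team of threads, the units of execution in OpenMP.

#pragma omp parallel for goes directly before a for loop and divides the loop's iterations among those threads.

private(local): by default, variables declared outside the parallel region are shared by all threads. The private clause gives each thread its own copy of the variable.

reduction(+:pi): each thread accumulates into its own copy of pi, and when the parallel region finishes the copies are added together into the original pi.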
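What the reduction clause does can be spelled out by hand. The following is a rough sketch of an equivalent, not what the compiler literally generates: each thread keeps a private partial sum, and the partial sums are added into the shared total one thread at a time. The function name `pi_manual_reduction` is made up for this illustration.

```c
#include <assert.h>

/* Midpoint-rule PI with the reduction written out explicitly. */
double pi_manual_reduction(long n) {
    double w, pi = 0.0;
    assert(n > 0);
    w = 1.0 / n;
    #pragma omp parallel
    {
        double partial = 0.0;       /* private: declared inside the parallel region */
        long i;
        #pragma omp for
        for (i = 0; i < n; i++) {
            double x = (i + 0.5) * w;
            partial += 4.0 / (1.0 + x * x);
        }
        /* Only one thread at a time may update the shared total. */
        #pragma omp critical
        pi += partial;
    }
    return pi * w;
}
```

Without the private partial sums, every thread would update the shared `pi` at the same time and the result would be unreliable. The `reduction(+:pi)` clause gives the same protection with less code.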

MPICH2

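On Ubuntu, first download mpich.tar.gz, then install it the usual way (configure, make, make install).

MPI (Message Passing Interface) is the standard programming environment for distributed-memory parallel computers. Its core construct is message passing: one process packs information into a message and sends that message to other processes. The two most commonly used MPI implementations are LAM/MPI [LAM] and MPICH [MPI].

In MPI, a unit of execution (UE) is a process.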

 
    #include <stdio.h>
    #include <mpi.h>
    #include <math.h>

    int main(int argc, char *argv[]) {
        int my_rank, num_procs;
        int i, n = 0;
        double sum, width, local, mypi, pi;
        double start = 0.0, stop = 0.0;
        int proc_len;
        char processor_name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);                             // initialize the MPI environment
        MPI_Comm_size(MPI_COMM_WORLD, &num_procs);          // number of processes running in parallel
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);            // rank of this process among all processes
        MPI_Get_processor_name(processor_name, &proc_len);  // name of the processor this process runs on
        printf("Processor %d of %d on %s\n", my_rank, num_procs, processor_name);

        if (my_rank == 0) {
            printf("please give n=");
            scanf("%d", &n);
            start = MPI_Wtime();                            // MPI wall-clock timer
        }
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);       // broadcast n from rank 0 to every process in the communicator

        width = 1.0 / n;
        sum = 0.0;
        for (i = my_rank; i < n; i += num_procs) {          // each process takes every num_procs-th strip
            local = width * ((double)i + 0.5);
            sum += 4.0 / (1.0 + local * local);
        }
        mypi = width * sum;

        // Reduce onto rank 0: add up (MPI_SUM) the mypi of every process and store the total in pi
        MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (my_rank == 0) {
            printf("PI is %.20f\n", pi);
            stop = MPI_Wtime();
            printf("Time: %f\n", stop - start);
            fflush(stdout);
        }
        MPI_Finalize();
        return 0;
    }

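MPI_REDUCE(sendbuf,recvbuf,count,datatype,op,root,comm)
 IN   sendbuf   starting address of the send buffer (choice)
 OUT  recvbuf   starting address of the receive buffer (choice, significant only at the root)
 IN   count     number of elements in the send buffer (integer)
 IN   datatype  data type of the elements in the send buffer (handle)
 IN   op        reduction operation (handle)
 IN   root      rank of the root process (integer)
 IN   comm      communicator (handle)

MPI_BCAST(buffer,count,datatype,root,comm)
 IN/OUT  buffer    starting address of the message buffer (choice)
 IN      count     number of elements in the message buffer (integer)
 IN      datatype  data type of the elements in the message buffer (handle)
 IN      root      rank of the root process that sends the broadcast (integer)
 IN      comm      communicator (handle)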
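The decomposition used by the MPI program can be checked without an MPI installation. The sketch below is a serial simulation, not MPI code: `partial_pi` reproduces the loop `for (i = my_rank; i < n; i += num_procs)` for one rank, and `simulated_reduce` stands in for `MPI_Reduce` with `MPI_SUM` by adding up the partial results of all ranks. Both function names are made up for this illustration.

```c
#include <assert.h>

/* The mypi value that the process with the given rank would compute. */
double partial_pi(int rank, int num_procs, int n) {
    double width, sum = 0.0;
    int i;
    assert(n > 0 && num_procs > 0);
    width = 1.0 / n;
    for (i = rank; i < n; i += num_procs) {   /* cyclic distribution of strips */
        double local = width * ((double)i + 0.5);
        sum += 4.0 / (1.0 + local * local);
    }
    return width * sum;
}

/* Serial stand-in for MPI_Reduce with MPI_SUM: add every rank's mypi. */
double simulated_reduce(int num_procs, int n) {
    double pi = 0.0;
    int rank;
    for (rank = 0; rank < num_procs; rank++)
        pi += partial_pi(rank, num_procs, n);
    return pi;
}
```

Every strip index from 0 to n-1 belongs to exactly one rank, so the sum over ranks covers the whole interval once. The totals for different process counts can still differ in the last few digits, because floating-point addition is carried out in a different order.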

orisun@orisun-desktop:~/Program$ mpicc -o PI3 PI3.c            % compile with mpicc

orisun@orisun-desktop:~/Program$ mpirun -np 4 ./PI3            % run with 4 processes

Processor 0 of 4 on orisun-desktop

please give n=Processor 2 of 4 on orisun-desktop

Processor 1 of 4 on orisun-desktop

Processor 3 of 4 on orisun-desktop

1000000

PI is 3.14159465358887635134

Time: 0.012510

orisun@orisun-desktop:~/Program$ mpdcleanup

The run took 0.01251 seconds, noticeably less than the 0.02 seconds of the serial version.
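As a rough worked figure, using the usual definitions of speedup and parallel efficiency (these definitions are standard but are not stated in the original post):

```latex
S = \frac{T_{\text{serial}}}{T_{\text{parallel}}} = \frac{0.02}{0.01251} \approx 1.6,
\qquad
E = \frac{S}{p} \approx \frac{1.6}{4} = 0.4 .
```

Treat these numbers as indicative only: the serial time comes from `clock()` and is printed to two decimal places, while the MPI time comes from `MPI_Wtime()`, so the two measurements are not strictly comparable.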

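Note this line in the output: please give n=Processor 2 of 4 on orisun-desktop. Rank 0 prints its prompt without a trailing newline while the other processes are still printing their greetings, and MPI does not order the output of different processes, so the two messages end up on the same line.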