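Setting Up an MPI Parallel Programming Environment on Linux

MPI stands for Message Passing Interface, a standard for message passing that is used in parallel computing. Several implementations exist, such as MPICH, CHIMP, and Open MPI. This guide uses MPICH.

I. Installing MPICH

Download: http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz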

tar -xzvf soft/mpich-3.0.4.tar.gz
cd mpich-3.0.4/
./configure --prefix=/usr/local/mpich
make && make install

After installation, add the following environment variables to /etc/profile, then run source /etc/profile:

PATH=$PATH:/usr/local/mpich/bin
MANPATH=$MANPATH:/usr/local/mpich/man
export PATH MANPATH

II. Single-Node Test

Copy the examples directory from the source tree into the install directory:

cp -r examples/ /usr/local/mpich

Then, from /usr/local/mpich, run:

mpirun -np 10 ./examples/cpi

The output looks like this:

Process 0 of 10 is on server150
Process 9 of 10 is on server150
Process 1 of 10 is on server150
Process 4 of 10 is on server150
Process 5 of 10 is on server150
Process 7 of 10 is on server150
Process 2 of 10 is on server150
Process 3 of 10 is on server150
Process 6 of 10 is on server150
Process 8 of 10 is on server150

pi is approximately 3.1415926544231256, Error is 0.0000000008333325
wall clock time = 0.020644

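Suppose we now want to compile a program of our own. Running mpicc -o hello hello.c in /home/houqingdong may fail with: -bash: mpicc: command not found. This happens when the MPICH bin directory is not yet on PATH in the current shell.

On the command line, run: export PATH=/home/houqingdong/mpiexe/bin:$PATH (substitute the bin directory of your own install; with the prefix used in Section I, that is /usr/local/mpich/bin). Note: this sets the path only for the current shell session and is lost once the session ends. For a permanent setting, add the line to a startup file such as /etc/profile, as in Section I.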

To check that the setting took effect, run echo $PATH and confirm that the directory appears in the output.
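
The PATH check can also be scripted. The following is a minimal sketch and not part of the original setup: the function names path_contains and path_export_line are hypothetical, and /usr/local/mpich/bin assumes the install prefix from Section I.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated, PATH-style string $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the export line that could be appended
# to a startup file such as /etc/profile or ~/.bashrc.
path_export_line() {
    printf 'export PATH=%s:$PATH\n' "$1"
}

# Assumed install location (prefix from Section I); adjust as needed.
MPI_BIN=/usr/local/mpich/bin

if path_contains "$MPI_BIN" "$PATH"; then
    echo "$MPI_BIN is already on PATH"
else
    echo "$MPI_BIN is missing from PATH; a line such as this would add it:"
    path_export_line "$MPI_BIN"
fi
```

The script only prints a suggestion and does not modify any startup file, so it is safe to run repeatedly.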

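III. Cluster Configuration

1. Passwordless SSH login must be configured between the cluster machines. See the SSH configuration (key-based passwordless login) part of the article "Hadoop-0.21.0在linux分布式集群配置" (Hadoop-0.21.0 distributed cluster configuration on Linux).

2. Copy the compiled installation to the other machines (run from /usr/local, where the mpich directory lives):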

scp -r mpich server140:/usr/local/
scp -r mpich server151:/usr/local/
scp -r mpich server130:/usr/local/
scp -r mpich server143:/usr/local/
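
With more hosts, the scp lines above become repetitive. As a rough sketch, a small loop can generate them; the function name print_copy_commands is hypothetical, and it only prints the commands instead of running them, so the list can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print one "scp -r" command per host.
# Usage: print_copy_commands SOURCE_DIR DEST_DIR HOST...
print_copy_commands() {
    src=$1
    dest=$2
    shift 2
    for host in "$@"; do
        printf 'scp -r %s %s:%s\n' "$src" "$host" "$dest"
    done
}

# Host names taken from the commands above.
print_copy_commands mpich /usr/local/ server140 server151 server130 server143
```

Once the printed commands look right, piping the output to sh from /usr/local would execute them.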

Also add the same environment variables on each machine.

3. Create a file named servers under /usr/local/mpich with the following contents:

server150:2 # run 2 processes
server140:2
server130:2
server143:2
server151:2
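
A file in this host:count form can also be generated instead of typed by hand. This is only a sketch; the function name make_hostfile is hypothetical, and it assumes every host gets the same process count.

```shell
#!/bin/sh
# Hypothetical helper: print one "host:count" line per host.
# Usage: make_hostfile COUNT HOST...
make_hostfile() {
    slots=$1
    shift
    for host in "$@"; do
        printf '%s:%s\n' "$host" "$slots"
    done
}

# Prints the five entries listed above (without the comment).
make_hostfile 2 server150 server140 server130 server143 server151
```

Redirecting the output (for example, > servers while in /usr/local/mpich) would write the file.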

Run the following command from /usr/local/mpich, passing the servers file with -f:

mpiexec -n 10 -f servers ./examples/cpi

Output:

Process 0 of 10 is on server150
Process 1 of 10 is on server150
Process 4 of 10 is on server140
Process 5 of 10 is on server140
Process 6 of 10 is on server143
Process 7 of 10 is on server143
Process 8 of 10 is on server130
Process 9 of 10 is on server130
Process 2 of 10 is on server151
Process 3 of 10 is on server151
pi is approximately 3.1415926544231256, Error is 0.0000000008333325
wall clock time = 0.018768
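
In the run above, -n 10 matches the ten process slots listed in servers (five hosts, two each). A quick way to add up the slots in such a file is sketched below; the function name total_slots is hypothetical, and counting an entry without a ":count" suffix as one slot is an assumption of this sketch.

```shell
#!/bin/sh
# Hypothetical helper: sum the per-host process counts in a host file.
# Comments starting with '#' are stripped and blank lines are skipped.
# Assumption: an entry without ":count" is counted as one slot.
total_slots() {
    sed 's/#.*//' "$1" |
        awk -F: '/[^ \t]/ { n += (NF > 1 ? $2 : 1) } END { print n + 0 }'
}

# Report the total only if a servers file exists in the current directory.
if [ -f servers ]; then
    echo "total slots: $(total_slots servers)"
fi
```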
