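RELION on K1 Power Linux in Practice
This article describes how to deploy RELION on K1 Power Linux as part of a cryo-EM solution. It records the whole process and also lists where to download several commonly used HPC packages. XLC, ESSL, and Spectrum MPI are all community editions and can be downloaded after registering an IBM ID.
Test environment
Hardware
Server model: FP5468G2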
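Component | Description | Quantity |
---|---|---|
CPU | IBM OpenPOWER Sforza, 22 cores, 2.75 GHz, 190 W | 2 |
Memory | 32 GB DDR4 RDIMM, 2666 MHz | 16 |
Disk | SATA 3.5-inch, 4 TB | 20 |
Disk | NVMe, 960 GB | 1 |
RAID1 | 16-channel high-performance SAS RAID card, 2 GB cache, supports RAID 0, 1, 5, 6, 10, 50, 60 | 1 |
RAID2 | 8-channel high-performance SAS RAID card, 2 GB cache, supports RAID 0, 1, 5, 6, 10, 50, 60 | 1 |
Standard PCIe card | 10 dual optical port | 2 |
GPU | NVIDIA T4 | 16 |
Power supply | 2000 W Gold-rated PSU, dual supplies with 2+2 redundancy | 1 |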
Software
OS | CentOS-76-power9-Minimal-1810.iso |
---|---|
kernel | 4.14.0-115.el7a.0.1.ppc64le |
GPU-driver | 440.33.01 |
CUDA | cuda-repo-rhel7-10-2-local-10.2.89-440.33.01-1.0-1.ppc64le.rpm |
RELION | 3.07 |
ESSL | essl.community.6.2.0-0.tar.gz |
cmake3 | version 3.13.5 |
cmake | version 2.8.12.2 |
Preparing the installation packages:
Install the operating system and CUDA (which includes the NVIDIA driver).
Download RELION:
git clone https://github.com/3dem/relion.git
Download the RELION benchmark data set (50 GB):
wget -c ftp://ftp.mrc-lmb.cam.ac.uk/pub/scheres/relion_benchmark.tar.gz
Download FFTW:
wget http://www.fftw.org/fftw-3.3.8.tar.gz
Download XLC (registration required):
https://www.ibm.com/us-en/marketplace/xl-fortran-linux-compiler-power
IBM_XL_FORTRAN_V16.1.1.3_LINUX_COMMUNITY.tar.gz
https://www.ibm.com/us-en/marketplace/xl-cpp-linux-compiler-power
IBM_XL_C_CPP_V16.1.1.3_LINUX_COMMUNITY.tar.gz
Download Spectrum MPI (can be downloaded directly from outside China, but not from inside China):
https://www.ibm.com/us-en/marketplace/spectrum-mpi
Download ESSL (can be downloaded directly from outside China, but not from inside China):
https://developer.ibm.com/answers/questions/506654/essl-for-linux-on-power-community-edition-v62-avai/
Download OpenMPI:
wget https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.3.tar.gz
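Before starting, it can help to confirm that every archive listed above is actually present, since some of them need registration or cannot be fetched from every network. The following is a minimal sketch, assuming all archives were collected in one directory; the helper name `missing_files` and the directory used in the example are illustrative and not part of the original procedure.

```shell
#!/bin/sh
# Print each expected file that is absent from a directory.
# Usage: missing_files DIR FILE...
# (missing_files is a hypothetical helper, not part of any package.)
missing_files() {
    dir=$1
    shift
    for f in "$@"; do
        [ -f "$dir/$f" ] || printf '%s\n' "$f"
    done
}

# Example (archive names are the ones listed in this article;
# /data/sunjian as the collection directory is an assumption):
# missing_files /data/sunjian \
#     openmpi-4.0.3.tar.gz fftw-3.3.8.tar.gz relion_benchmark.tar.gz \
#     IBM_XL_C_CPP_V16.1.1.3_LINUX_COMMUNITY.tar.gz \
#     IBM_XL_FORTRAN_V16.1.1.3_LINUX_COMMUNITY.tar.gz \
#     essl.community.6.2.0-0.tar.gz
```

An empty result means nothing is missing; any printed name still has to be downloaded.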
# relion_on_OpenPOWER_test
Part 1: test with OpenMPI and GCC
############################create home folder###################
mkdir /data
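fdisk /dev/nvme0n1    # interactive: n (new partition), p (primary), 1 (partition number), then w to write
# After the partition is formatted as ext4 and mounted, `mount` shows:
# /dev/nvme0n1p1 on /data type ext4 (rw,relatime,data=ordered)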
yum install cmake cmake3
#################################compile openmpi####################
mkdir /data/sunjian
cd /data/sunjian
wget https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.3.tar.gz
tar zxvf openmpi-4.0.3.tar.gz
cd openmpi-4.0.3
./configure --prefix=/usr/local    # /usr/local is the default prefix
make -j32
make install
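After `make install`, a quick check that the MPI wrappers are visible on PATH can save a failed RELION configure step later. The following is a minimal sketch; `require_cmds` is a hypothetical helper, and the command names in the usage line are the ones an OpenMPI installation normally provides.

```shell
#!/bin/sh
# Report every command that cannot be found on PATH.
# Returns 0 when all commands are found, 1 otherwise.
# (require_cmds is a hypothetical helper name.)
require_cmds() {
    rc=0
    for c in "$@"; do
        if ! command -v "$c" >/dev/null 2>&1; then
            printf 'missing: %s\n' "$c" >&2
            rc=1
        fi
    done
    return $rc
}

# Example: require_cmds mpirun mpicc mpicxx cmake3
```

If `mpirun` is reported as missing, `/usr/local/bin` is probably not on PATH for the current shell.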
###########################install fftw ###########################
cd /data/sunjian
tar -zxvf fftw-3.3.8.tar.gz
rm -rf fftw-3.3.8.tar.gz
cd fftw-3.3.8
./configure --prefix /home/sunjian/relion/FFTW --enable-float --enable-openmp
make -j32
make install
#################################compile relion#####################
git clone https://github.com/3dem/relion.git
cd relion
# The quoted 'EOF' keeps $FFTW_HOME from being expanded while the script is written.
cat > compile_p100.sh << 'EOF'
# FFTW_HOME must match the prefix FFTW was installed to
export FFTW_HOME=/data/sunjian/FFTW3
export FFTW_LIB=$FFTW_HOME/lib
export FFTW_INCLUDE=$FFTW_HOME/include
# -DGUI=OFF and -DFORCE_OWN_FFTW=OFF
# The -lessl linker flag requires ESSL to be installed (see part 2)
CC=gcc CXX=g++ cmake3 -DBUILD_SHARED_LIBS=OFF \
    -DCMAKE_C_FLAGS="-O3 -ftree-vectorize -ffast-math" \
    -DCMAKE_CXX_FLAGS="-O3 -ftree-vectorize -ffast-math" \
    -DCMAKE_EXE_LINKER_FLAGS="-L/opt/ibmmath/essl/6.2/lib64 -lessl" \
    -DTIMING=ON -DTIMING_FFTW=ON -DCudaTexture=ON -DCUDA_ARCH=75 \
    -DGUI=OFF -DCMAKE_BUILD_TYPE=release \
    -DFORCE_OWN_FFTW=OFF -DFORCE_OWN_FLTK=OFF ..
EOF
yum install -y libtiff-devel.ppc64le libtiff-static.ppc64le
mkdir build_p100
cd build_p100
sh ../compile_p100.sh
make
ldd bin/relion_refine_mpi
bin/relion_refine_mpi --version
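The `ldd` step is mainly useful for spotting shared libraries that the runtime linker cannot resolve; `ldd` marks those with `=> not found`. The following is a minimal sketch that filters such lines out of `ldd` output; `unresolved_libs` is a hypothetical helper name.

```shell
#!/bin/sh
# Read ldd output on stdin and print the names of libraries
# that ldd reported as "not found".
# (unresolved_libs is a hypothetical helper name.)
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example: ldd bin/relion_refine_mpi | unresolved_libs
```

If a library such as the one pulled in by `-lessl` shows up here, adding its directory to `LD_LIBRARY_PATH` is one common remedy.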
###############################relion_benchmark###################
tar -zxvf relion_benchmark.tar.gz
cd relion_benchmark
mkdir class3d
yum -y install time
cat > run.sh << 'EOF'
# Point PATH at the bin directory of the RELION build created above
export PATH=/home/sunjian/relion/relion/build_P100/bin/:$PATH
#export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"
(time -p mpirun -bind-to none -n 16 relion_refine_mpi --j 11 --gpu 0:1:2:3 --pool 100 \
--dont_combine_weights_via_disc --i Particles/shiny_2sets.star --ref emd_2660.map:mrc --firstiter_cc \
--ini_high 60 --ctf --ctf_corrected_ref --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 \
--flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5 --offset_step 2 --sym C1 \
--norm --scale --random_seed 0 --maxsig 500 --fast_subsets --o class3d/test01) >run_GPU4_9x22.log 2>&1
EOF
sh run.sh &
tail -f run_GPU4_9x22.log    # the final "real" line is the program's wall-clock run time
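`time -p` writes its timings as `real`, `user`, and `sys` lines at the end of the log, so the run time can be pulled out without reading the whole file. The following is a minimal sketch; `real_seconds` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the wall-clock seconds from the last "real" line of a log
# produced with `time -p`. Prints nothing if no such line exists.
# (real_seconds is a hypothetical helper name.)
real_seconds() {
    awk '$1 == "real" { t = $2 } END { if (t != "") print t }' "$1"
}

# Example: real_seconds run_GPU4_9x22.log
```

This makes it easy to tabulate several benchmark runs by looping over their log files.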
Part 2: install and test RELION with Spectrum MPI and ESSL
############################create home folder###################
mkdir /data
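fdisk /dev/nvme0n1    # interactive: n (new partition), p (primary), 1 (partition number), then w to write
# After the partition is formatted as ext4 and mounted, `mount` shows:
# /dev/nvme0n1p1 on /data type ext4 (rw,relatime,data=ordered)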
yum install cmake cmake3
mkdir /data/sunjian
#############################install XLC########################
tar -zxvf IBM_XL_C_CPP_V16.1.1.3_LINUX_COMMUNITY.tar.gz
cd images/littleEndian/rhel
sudo yum -y install *.rpm
#############################install XL FORTRAN#################
tar -zxvf IBM_XL_FORTRAN_V16.1.1.3_LINUX_COMMUNITY.tar.gz
cd images/littleEndian/rhel
sudo yum -y install *.rpm
rpm -qlp xlf.16.1.1-16.1.1.3-190426.ppc64le.rpm
#############################install SpectrumMPI################
cd /data/sunjian
# RPMs placed in this directory:
#   ibm_smpi-10.03.01.00rtm5-rh7_20191114.ppc64le.rpm
#   ibm_smpi-devel-10.03.01.00rtm5-rh7_20191114.ppc64le.rpm
#   ibm_smpi_gpusupport-10.03.01.00rtm5-rh7_20191114.ppc64le.rpm
#   ibm_smpi_lic_c-10.03.00rtm5-rh7_20191114.ppc64le.rpm
sudo yum install -y *.rpm
export IBM_SPECTRUM_MPI_LICENSE_ACCEPT=yes    # run as root
sh /opt/ibm/spectrum_mpi/lap_ce/bin/accept_spectrum_mpi_license.sh
export PATH=/opt/ibm/spectrum_mpi/bin:$PATH
which mpirun
mpirun -np 1 date
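If the OpenMPI build from part 1 is still installed, the order of the entries in PATH decides which `mpirun` is picked up. The following is a minimal sketch that walks a PATH-style string and prints the first directory containing an executable `mpirun`; `first_mpirun_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the first directory in a colon-separated path list that
# contains an executable mpirun. Returns 1 if none does.
# (first_mpirun_dir is a hypothetical helper name.)
first_mpirun_dir() {
    old_ifs=$IFS
    IFS=:
    for d in $1; do
        if [ -x "$d/mpirun" ]; then
            IFS=$old_ifs
            printf '%s\n' "$d"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example: first_mpirun_dir "$PATH"
```

For the Spectrum MPI runs in this part, the expected answer is `/opt/ibm/spectrum_mpi/bin`.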
cd /data/sunjian
#############################install ESSL ######################
cd /data/sunjian
tar -zxvf essl.community.6.2.0-0.tar.gz
export IBM_ESSL_LICENSE_ACCEPT=yes    # run as root
rpm -ivh essl.license.community-6.2.0-0.ppc64le.rpm    # run as root
cd RHEL/RHEL7
sudo yum -y install *.rpm
###########################install fftw with ESSL###########################
cd /data/sunjian
cp -r /opt/ibmmath/essl/6.2/FFTW3 .
cd FFTW3/src
make
mv lib64 ../
cd ../lib64
ln -fs libfftw3_essl_gcc.a libfftw3f.a
ln -fs libfftw3_essl_gcc.a libfftw3.a
cd ../include
ln -fs fftw3_essl.h fftw3.h
# Note: to build with GCC rather than XLC, use Makefile.gcc in the src directory instead of Makefile.
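The RELION build looks for the standard FFTW file names, so it is worth checking that the links created above resolve before running cmake. The following is a minimal sketch, assuming the layout used in this part (`lib64` and `include` under the FFTW3 directory); `check_fftw_layout` is a hypothetical helper name.

```shell
#!/bin/sh
# Check that the FFTW-compatible file names exist under ROOT.
# [ -e ] follows symlinks, so a dangling link counts as missing.
# (check_fftw_layout is a hypothetical helper name.)
check_fftw_layout() {
    root=$1
    rc=0
    for f in lib64/libfftw3.a lib64/libfftw3f.a include/fftw3.h; do
        if [ ! -e "$root/$f" ]; then
            printf 'missing or dangling: %s\n' "$root/$f" >&2
            rc=1
        fi
    done
    return $rc
}

# Example: check_fftw_layout /data/sunjian/FFTW3
```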
#################################compile relion SPECMPI#####################
git clone https://github.com/3dem/relion.git
cd relion
# The quoted 'EOF' keeps $FFTW_HOME from being expanded while the script is written.
cat > compile_p100.sh << 'EOF'
export FFTW_HOME=/data/sunjian/FFTW3
export FFTW_LIB=$FFTW_HOME/lib64
export FFTW_INCLUDE=$FFTW_HOME/include
# -DGUI=OFF and -DFORCE_OWN_FFTW=OFF
CC=gcc CXX=g++ cmake3 -DBUILD_SHARED_LIBS=OFF \
    -DCMAKE_C_FLAGS="-O3 -ftree-vectorize -ffast-math" \
    -DCMAKE_CXX_FLAGS="-O3 -ftree-vectorize -ffast-math" \
    -DCMAKE_EXE_LINKER_FLAGS="-L/opt/ibmmath/essl/6.2/lib64 -lessl" \
    -DTIMING=ON -DTIMING_FFTW=ON -DCudaTexture=ON -DCUDA_ARCH=75 \
    -DGUI=OFF -DCMAKE_BUILD_TYPE=release \
    -DFORCE_OWN_FFTW=OFF -DFORCE_OWN_FLTK=OFF ..
EOF
yum install -y libtiff-devel.ppc64le libtiff-static.ppc64le
mkdir build_p100
cd build_p100
sh ../compile_p100.sh
make
ldd bin/relion_refine_mpi
bin/relion_refine_mpi --help
###############################relion_benchmark###################
tar -zxvf relion_benchmark.tar.gz
cd relion_benchmark
mkdir class3d
yum -y install time
cat > run.sh << 'EOF'
# Point PATH at the bin directory of the RELION build created above
export PATH=/home/sunjian/relion/relion/build_essl_P100/bin:/opt/ibm/spectrum_mpi/bin:$PATH
#export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"
(time -p mpirun -pami_noib -bind-to none -n 16 relion_refine_mpi --j 11 --gpu 0:1:2:3 --pool 100 \
--dont_combine_weights_via_disc --i Particles/shiny_2sets.star --ref emd_2660.map:mrc --firstiter_cc \
--ini_high 60 --ctf --ctf_corrected_ref --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 \
--flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5 --offset_step 2 --sym C1 \
--norm --scale --random_seed 0 --maxsig 500 --fast_subsets --o class3d/test01) >run_GPU2_essl_9x22.log 2>&1
EOF
Appendix: checking the RELION version
[root@kmaster ~/relion/relion_benchmark_shell]# ../relion/build_P100/bin/relion_refine_mpi --help
RELION version: 3.0.8
Precision: BASE=double, CUDA-ACC=single
=== RELION MPI setup ===
+ Number of MPI processes = 1
+ Master (0) runs on host = kmaster
=================
Building for the T4 in the relion directory (contents of the build script, as shown by cat):
export FFTW_HOME=/data/rchen/FFTW3
export FFTW_LIB=$FFTW_HOME/lib
export FFTW_INCLUDE=$FFTW_HOME/include
export OMPI_CC=gcc
export OMPI_CXX=g++
# -DGUI=OFF and -DFORCE_OWN_FFTW=OFF
CC=gcc CXX=g++ cmake3 -DBUILD_SHARED_LIBS=OFF -DCMAKE_C_FLAGS="-O3 -ftree-vectorize -ffast-math" -DCMAKE_CXX_FLAGS="-O3 -ftree-vectorize -ffast-math" -DCMAKE_EXE_LINKER_FLAGS="-L/opt/ibmmath/essl/6.2/lib64 -lessl" -DTIMING=ON -DTIMING_FFTW=ON -DCudaTexture=ON -DCUDA_ARCH=75 -DGUI=OFF -DCMAKE_BUILD_TYPE=release -DFORCE_OWN_FFTW=OFF -DFORCE_OWN_FLTK=OFF ..
Unpack the relion_benchmark archive and start the run with nohup sh run.sh &. Contents of run.sh:
export PATH=/data/rchen/relion/build-essl-T4/bin:/opt/ibm/spectrum_mpi/bin:$PATH
#export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"
(time -p mpirun -pami_noib -bind-to none -n 17 relion_refine_mpi --j 11 --gpu 0:1 --pool 100 \
--dont_combine_weights_via_disc --i Particles/shiny_2sets.star --ref emd_2660.map:mrc --firstiter_cc \
--ini_high 60 --ctf --ctf_corrected_ref --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 \
--flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5 --offset_step 2 --sym C1 \
--norm --scale --random_seed 0 --maxsig 500 --fast_subsets --o class3d/test01) >run_GPU2_9x22.log 2>&1
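The `--gpu` argument in these commands is a colon-separated list of device indices, one entry per worker rank. Typing it by hand gets error-prone with 16 cards, so the list can be generated instead. The following is a minimal sketch; `gpu_list` is a hypothetical helper name.

```shell
#!/bin/sh
# Print device indices 0..N-1 joined by colons, in the form used
# by the --gpu option in this article.
# (gpu_list is a hypothetical helper name.)
gpu_list() {
    n=$1
    i=0
    list=
    while [ "$i" -lt "$n" ]; do
        list="${list:+$list:}$i"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

# Example: relion_refine_mpi ... --gpu "$(gpu_list 16)" ...
```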
[root@relion /data/rchen/relion_benchmark]# cat run16.sh
export PATH=/data/rchen/relion/build-essl-T4/bin:/opt/ibm/spectrum_mpi/bin:$PATH
#export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"
#(time -p mpirun -pami_noib -bind-to none -n 33 relion_refine_mpi --j 5 --gpu 0:1:2:3:4:5:6:7:8:9:10:11:12:13:14:15 --pool 100 \
# --dont_combine_weights_via_disc --i Particles/shiny_2sets.star --ref emd_2660.map:mrc --firstiter_cc \
# --ini_high 60 --ctf --ctf_corrected_ref --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 \
# --flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5 --offset_step 2 --sym C1 \
# --norm --scale --random_seed 0 --maxsig 500 --fast_subsets --o class3d/test01) >run_GPU16_33x5.log 2>&1
(time -p mpirun -pami_noib -bind-to none -n 17 relion_refine_mpi --j 11 --gpu 0:1:2:3:4:5:6:7:8:9:10:11:12:13:14:15 --pool 100 \
--dont_combine_weights_via_disc --i Particles/shiny_2sets.star --ref emd_2660.map:mrc --firstiter_cc \
--ini_high 60 --ctf --ctf_corrected_ref --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 \
--flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5 --offset_step 2 --sym C1 \
--norm --scale --random_seed 0 --maxsig 500 --fast_subsets --o class3d/test01) >run_GPU16_17x11.log 2>&1
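The two commands in run16.sh use `-n 17 --j 11` and `-n 33 --j 5`. Those numbers are consistent with one master rank plus one or two worker ranks per GPU, with the threads per rank chosen so that workers times threads stays within 176 hardware threads (2 sockets x 22 cores x SMT4; the SMT4 setting is an assumption and is not stated in the article). The following is a minimal sketch of that arithmetic; `plan_ranks` is a hypothetical helper name.

```shell
#!/bin/sh
# Print "<mpi_ranks> <threads_per_rank>" for a given number of GPUs,
# worker ranks per GPU, and available hardware threads.
# One extra rank is added for the master.
# (plan_ranks is a hypothetical helper name.)
plan_ranks() {
    gpus=$1
    per_gpu=$2
    hw_threads=$3
    workers=$((gpus * per_gpu))
    printf '%s %s\n' "$((workers + 1))" "$((hw_threads / workers))"
}

# Example: plan_ranks 16 1 176   # values for -n and --j
```

Whether one or two ranks per GPU is faster depends on the data set and GPU memory, which is why the script keeps both variants.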
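The only difference when building for another GPU architecture is the -DCUDA_ARCH parameter; each architecture has its own value. For the T4 it is 75.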
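The value of `-DCUDA_ARCH` is the GPU's CUDA compute capability without the dot. The following is a minimal sketch of a lookup for a few common data-center cards; `cuda_arch_for` is a hypothetical helper name, and only the T4 value comes from this article (the P100 and V100 values are their published compute capabilities, 6.0 and 7.0).

```shell
#!/bin/sh
# Map a GPU model name to the value passed as -DCUDA_ARCH.
# (cuda_arch_for is a hypothetical helper name.)
cuda_arch_for() {
    case $1 in
        P100) echo 60 ;;
        V100) echo 70 ;;
        T4)   echo 75 ;;
        *)    echo "unknown GPU model: $1" >&2; return 1 ;;
    esac
}

# Example: cmake3 -DCUDA_ARCH="$(cuda_arch_for T4)" ...
```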