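Porting the CLM5.0 Model: A Certain Donkey's Self-Redemption

About this article

First, a disclaimer: this is the first time I have worked systematically with Ubuntu and numerical models, so this article will contain many basic mistakes. Corrections are welcome.
Its main purpose is to record the porting process so that I do not forget it the next time I have to port the model.
I have written down my experience step by step in the hope that it also helps other complete beginners like me get the model running quickly.
The article covers the porting steps, the problems I ran into, and the pitfalls I fell into in detail. It is rather long-winded, but it has sentimental value for me, so please be kind.
This is my first post. If you find slips or incomplete descriptions, please point them out; I look forward to discussing and learning together: hanlzh@hrbnu.edu.cn.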

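Starting from scratch

Thanks to our lab's build-out, the lab's small server cluster was put to use, and Ubuntu 18.04 was reinstalled on every node at every tier. This port therefore truly started from scratch (no prior experience, blank machines).

Special thanks to @creative_peng, @mxj_Bruce, and others, whose articles gave me some understanding before I started and were key to the successful port.
The following related articles are offered for reference: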

  1. https://escomp.github.io/CESM/versions/cesm2.1/html/index.html # official documentation
  2. https://blog.csdn.net/mayubins/article/details/122190826?utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2aggregatepagefirst_rank_ecpm_v1~rank_v31_ecpm-3-122190826.pc_agg_new_rank&utm_term=CESM2%E5%AE%89%E8%A3%85&spm=1000.2123.3001.4430
  3. https://blog.csdn.net/m0_37388053/article/details/104080143?utm_term=CESM2%E5%AE%89%E8%A3%85&utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2allsobaiduweb~default-2-104080143&spm=3001.4430

There are many similar articles; thanks again to everyone who shared their experience.

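Installing CESM2

Platform overview

Three compute nodes and one control (management) node.
Configuration approach: install the CESM2 main program and its environment files only on the control node, and share them with the compute nodes in real time through NFS shared folders.
PS: the nodes on this platform already have static IPs and communicate through a switch, i.e. they can ping one another.

First create the storage directories and share them over the LAN

mkdir /app/ # holds the installed environment files
mkdir /BIGDATA/ # holds the CLM main program and CESM2_inputdata

To enable passwordless SSH login between the nodes and to set up NFS sharing on the LAN, see: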

https://www.cnblogs.com/xiaohuiji190/p/14973563.html
https://blog.csdn.net/linzhiji/article/details/122539768
https://blog.csdn.net/weixin_32630003/article/details/119447126
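
As a rough sketch of the sharing step, the function below prints one /etc/exports entry per shared directory. The subnet 192.168.1.0/24 is a placeholder rather than the network used here, and the option list is a common default, not something this article specifies.

```shell
# exports_line DIR SUBNET: print one /etc/exports entry that shares DIR
# read-write with every host in SUBNET.
exports_line() {
    printf '%s %s(rw,sync,no_subtree_check)\n' "$1" "$2"
}

# Placeholder subnet; replace it with the cluster's real address range.
SUBNET=192.168.1.0/24
for dir in /app /BIGDATA; do
    exports_line "$dir" "$SUBNET"
done
```

On the control node the printed lines would go into /etc/exports, followed by `sudo exportfs -ra`; each compute node then mounts the same paths over NFS.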

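Environment setup

Switching to a domestic (China) apt mirror, running update, setting static IPs, and similar tasks are not covered here.

  1. Install the base environment
    Because the system starts from scratch, some basic programs such as python, CMake, and gcc have to be set up first; the default apt-get install packages are sufficient.
    (I was not sure which of these packages include or depend on one another, so I installed everything I could think of. The price of not knowing better >_<||| )
sudo apt-get install build-essential
sudo apt-get install libc6-dev
sudo apt-get install python
sudo apt-get install cmake
sudo apt-get install libxml2-utils
and so on...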
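
To see what is still missing after the apt-get installs, a small check like the one below can help. It is only a sketch: the tool list is illustrative (taken from the tools this walkthrough uses), not an official list of CESM prerequisites.

```shell
# missing_tools NAME...: print every NAME that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Illustrative list of tools used later in this walkthrough.
missing=$(missing_tools gcc g++ make cmake python git xmllint m4 bison flex)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All listed tools are available."
fi
```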
  2. Install the GNU gcc8 compiler
Unpack the source and enter the gcc source directory to build
cd gcc-8.3.0
First install the libraries gcc depends on
sudo apt-get install m4
sudo apt-get install texinfo bison flex
---------------------------------------
./contrib/download_prerequisites # downloads over the network and is a bit slow; afterwards enter each directory in turn and configure it.
---------------------------------------
cd gmp-6.1.0
./configure --prefix=/usr/local/gmp-6.1.0 # --prefix= sets the install path; not explained again below
make 
sudo make install
cd .. # back to the top of the gcc source tree
---------------------------------------
cd mpfr-3.1.4
./configure --prefix=/usr/local/mpfr-3.1.4 --with-gmp=/usr/local/gmp-6.1.0
make
sudo make install
cd ..
---------------------------------------
cd mpc-1.0.3
./configure --prefix=/usr/local/mpc-1.0.3 --with-gmp=/usr/local/gmp-6.1.0 --with-mpfr=/usr/local/mpfr-3.1.4
make
sudo make install
cd ..
---------------------------------------
cd isl-0.18
./configure --prefix=/usr/local/isl-0.18 --with-gmp-prefix=/usr/local/gmp-6.1.0
make
sudo make install
cd ..
Set the environment variable for the dependency libraries
export LD_LIBRARY_PATH=/usr/local/gmp-6.1.0/lib:/usr/local/mpfr-3.1.4/lib:/usr/local/mpc-1.0.3/lib:/usr/local/isl-0.18/lib:$LD_LIBRARY_PATH
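
A long colon-separated path is easy to mistype, for example by dropping a `/lib` suffix or a `:` separator. The helper below is a minimal sketch of building the value piece by piece; `join_lib_path` is a name made up for this illustration.

```shell
# join_lib_path DIR...: join the given directories with ':' and append the
# current LD_LIBRARY_PATH only when it is non-empty, so no stray ':' is left.
join_lib_path() {
    joined=""
    for dir in "$@"; do
        joined="${joined:+$joined:}$dir"
    done
    printf '%s\n' "${joined}${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

new_path=$(join_lib_path \
    /usr/local/gmp-6.1.0/lib \
    /usr/local/mpfr-3.1.4/lib \
    /usr/local/mpc-1.0.3/lib \
    /usr/local/isl-0.18/lib)
echo "$new_path"
# To apply it in the current shell: export LD_LIBRARY_PATH="$new_path"
```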

Install gcc itself and make it the default

cd gcc-8.3.0 # skip if you are already at the top of the gcc-8.3.0 source tree
./configure --prefix=/app/gcc8 --enable-checking=release --enable-languages=c,c++,fortran,obj-c++ --disable-multilib --with-gmp=/usr/local/gmp-6.1.0 --with-mpfr=/usr/local/mpfr-3.1.4 --with-mpc=/usr/local/mpc-1.0.3
**Be sure to enable fortran**
make # extremely slow; you can try -j4. Some people advise against it, but I have not run into problems so far.
sudo make install # once this finishes the files are installed, and the gcc that apt-get installed by default can be removed.
Remove the older compilers
sudo apt-get remove gcc
sudo apt-get remove g++
sudo apt-get remove gfortran
Edit the environment variables
vi ~/.bashrc # append the following at the end; editing your own ~/.bashrc does not need sudo
### gcc8
export GCC_ROOT=/app/gcc8
export PATH=$GCC_ROOT/bin:$PATH
export LD_LIBRARY_PATH=/app/gcc8/lib64/:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/app/gcc8/lib/:$LD_LIBRARY_PATH
export MANPATH=/app/gcc8/share/man:$MANPATH
:wq # save and quit the editor
source ~/.bashrc # reload the settings in the current shell
Link the new compilers into /usr/bin
sudo rm -rf /usr/bin/gcc
sudo rm -rf /usr/bin/g++
sudo rm -rf /usr/bin/gfortran
sudo ln -s /app/gcc8/bin/gcc /usr/bin/gcc
sudo ln -s /app/gcc8/bin/g++ /usr/bin/g++
sudo ln -s /app/gcc8/bin/gfortran /usr/bin/gfortran
GCC 8.3 is now installed
gcc --version # run in a terminal
-----------------------
gcc (GCC) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
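
Since the later libraries are built with all three compilers, it is worth confirming that gcc, g++, and gfortran resolve to the same installation. The loop below is a sketch of such a check; `major_version` is a helper invented for this example, and the reported version string depends on how each compiler was configured.

```shell
# major_version VERSION: print the part before the first dot ("8.3.0" -> "8").
major_version() {
    printf '%s\n' "${1%%.*}"
}

# Report where each compiler on PATH lives and which version it reports.
for cc in gcc g++ gfortran; do
    if cc_path=$(command -v "$cc" 2>/dev/null); then
        ver=$("$cc" -dumpversion 2>/dev/null || echo unknown)
        echo "$cc -> $cc_path (version $ver, major $(major_version "$ver"))"
    else
        echo "$cc -> not found on PATH"
    fi
done
```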
  3. Install MPICH 3.3.1
Enter the MPICH source directory and build
./configure --prefix=/app/mpich3
make -j4
sudo make install
Edit the environment
vi ~/.bashrc
## MPICH3
export MPI_ROOT=/app/mpich3
export PATH=$MPI_ROOT/bin:$PATH
export LD_LIBRARY_PATH=/app/mpich3/lib/:$LD_LIBRARY_PATH
export MANPATH=$MPI_ROOT/man:$MANPATH
:wq
source ~/.bashrc

After MPI is configured you can try a parallel test across the cluster; see @海岛Blog

https://blog.csdn.net/tigerisland45/article/details/53893350

(Figure: parallel test)
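
For a cross-node test, MPICH's Hydra launcher reads a host file with one `host:slots` entry per line. The sketch below generates such a file; the names node1 to node3 are placeholders for the three compute nodes, and 24 slots per node simply mirrors the MAX_TASKS_PER_NODE value used in config_machines.xml.

```shell
# hostfile_lines SLOTS HOST...: print one "host:slots" line per host, the
# format read by MPICH's Hydra launcher through "mpiexec -f <file>".
hostfile_lines() {
    slots=$1
    shift
    for host in "$@"; do
        printf '%s:%s\n' "$host" "$slots"
    done
}

# node1..node3 are placeholder host names for the three compute nodes.
hostfile_lines 24 node1 node2 node3
```

Saved to a file such as `hosts`, the list can be tried with `mpiexec -f hosts -n 6 hostname`, which should print names from more than one node if passwordless SSH and the shared /app directory are working.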
  4. Install zlib

cd zlib
./configure --prefix=/app/netcdf
make
sudo make install
  5. Install curl
cd curl
./configure --prefix=/app/netcdf
make -j4
sudo make install
Edit the environment
vi ~/.bashrc
## netcdf
export NETCDF_ROOT=/app/netcdf
export PATH=$NETCDF_ROOT/bin:$PATH
export LD_LIBRARY_PATH=/app/netcdf/lib/:$LD_LIBRARY_PATH
export MANPATH=/app/netcdf/share/man:$MANPATH
:wq
source ~/.bashrc

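!!! Before installing HDF5, PnetCDF, and NetCDF, set up the compiler environment first, and do not change the installation order !!!

Some of the settings below may be unnecessary, but doing them does no harm and guards against files not being found.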
sudo rm -rf /usr/bin/mpicc
sudo rm -rf /usr/bin/mpicxx
sudo rm -rf /usr/bin/mpif77
sudo rm -rf /usr/bin/mpif90
sudo rm -rf /usr/bin/mpifort
sudo ln -s /app/mpich3/bin/mpicc /usr/bin/mpicc
sudo ln -s /app/mpich3/bin/mpicxx /usr/bin/mpicxx
sudo ln -s /app/mpich3/bin/mpif77 /usr/bin/mpif77
sudo ln -s /app/mpich3/bin/mpif90 /usr/bin/mpif90
sudo ln -s /app/mpich3/bin/mpifort /usr/bin/mpifort
export CC=/app/mpich3/bin/mpicc
export CXX=/app/mpich3/bin/mpicxx
export FC=/app/mpich3/bin/mpifort
export F77=/app/mpich3/bin/mpifort
export F90=/app/mpich3/bin/mpifort
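
Before building HDF5 and NetCDF with these wrappers, it can be useful to confirm which compiler each wrapper actually calls. MPICH wrappers print their underlying command with `-show`; the sketch below extracts the compiler name from that output. `backend_of` is a helper made up for this example, and other MPI implementations may use a different flag.

```shell
# backend_of "SHOW_OUTPUT": print the first word of a wrapper's "-show"
# output, which is the underlying compiler the wrapper invokes.
backend_of() {
    printf '%s\n' "${1%% *}"
}

for wrapper in mpicc mpicxx mpifort; do
    if command -v "$wrapper" >/dev/null 2>&1; then
        shown=$("$wrapper" -show 2>/dev/null || true)
        echo "$wrapper wraps: $(backend_of "$shown")"
    else
        echo "$wrapper: not found on PATH"
    fi
done
```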
  6. Install HDF5
cd hdf5
./configure --prefix=/app/netcdf --with-zlib=/app/netcdf --enable-fortran --enable-fortran2003 --enable-parallel --with-pic
make -j4
sudo make install
source ~/.bashrc # zlib, curl, hdf5, netcdf-c, and netcdf-fortran all go into the same directory, so sourcing again is enough
  7. Install PnetCDF
cd pnetcdf
./configure --prefix=/app/pnetcdf
make -j4
sudo make install
Edit the environment
vi ~/.bashrc
## pnetcdf
export Pnetcdf_ROOT=/app/pnetcdf
export PATH=$Pnetcdf_ROOT/bin:$PATH
export LD_LIBRARY_PATH=/app/pnetcdf/lib/:$LD_LIBRARY_PATH
export MANPATH=/app/pnetcdf/share/man:$MANPATH
:wq
source ~/.bashrc
  8. Install netcdf-c
cd netcdf-c
CFLAGS="-O3 -fPIC -I/app/netcdf/include" CPPFLAGS="-O3 -fPIC -I/app/netcdf/include" FFLAGS="-O3 -fPIC" LDFLAGS=-L/app/netcdf/lib ./configure --prefix=/app/netcdf --enable-static --enable-shared --enable-netcdf4 --enable-largefile --enable-large-file-tests --enable-diskless --enable-mmap --with-zlib=/app/netcdf
make -j4
sudo make install

On success you will see:

+-------------------------------------------------------------+
| Congratulations! You have successfully installed netCDF!    |
|                                                             |
| You can use script "nc-config" to find out the relevant     |
| compiler options to build your application. Enter           |
|                                                             |
|     nc-config --help                                        |
|                                                             |
| for additional information.                                 |
|                                                             |
| CAUTION:                                                    |
|                                                             |
| If you have not already run "make check", then we strongly  |
| recommend you do so. It does not take very long.            |
|                                                             |
| Before using netCDF to store important data, test your      |
| build with "make check".                                    |
|                                                             |
| NetCDF is tested nightly on many platforms at Unidata       |
| but your platform is probably different in some ways.       |
|                                                             |
| If any tests fail, please see the netCDF web site:          |
| http://www.unidata.ucar.edu/software/netcdf/                |
|                                                             |
| NetCDF is developed and maintained at the Unidata Program   |
| Center. Unidata provides a broad array of data and software |
| tools for use in geoscience education and research.         |
| http://www.unidata.ucar.edu                                 |
+-------------------------------------------------------------+
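  9. Install netcdf-fortran
    The core (and most sickening) part is building the Fortran library. I was stuck here for a long time with constant errors, all caused by mistakes in the earlier configuration. As a beginner who understood neither library files nor environment variables, I kept hovering on the edge of a breakdown...
cd netcdf-f
./configure --prefix=/app/netcdf --with-netCDF=/app/netcdf --enable-pnetcdf --disable-shared CPPFLAGS="-I/app/netcdf/include" LDFLAGS="-L/app/netcdf/lib" 
make -j4 
sudo make install

At last, at long last..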

+-------------------------------------------------------------+
| Congratulations! You have successfully installed the netCDF |
| Fortran libraries.                                          |
|                                                             |
| You can use script "nf-config" to find out the relevant     |
| compiler options to build your application. Enter           |
|                                                             |
|     nf-config --help                                        |
|                                                             |
| for additional information.                                 |
|                                                             |
| CAUTION:                                                    |
|                                                             |
| If you have not already run "make check", then we strongly  |
| recommend you do so. It does not take very long.            |
|                                                             |
| Before using netCDF to store important data, test your      |
| build with "make check".                                    |
|                                                             |
| NetCDF is tested nightly on many platforms at Unidata       |
| but your platform is probably different in some ways.       |
|                                                             |
| If any tests fail, please see the netCDF web site:          |
| http://www.unidata.ucar.edu/software/netcdf/                |
|                                                             |
| NetCDF is developed and maintained at the Unidata Program   |
| Center. Unidata provides a broad array of data and software |
| tools for use in geoscience education and research.         |
| http://www.unidata.ucar.edu                                 |
+-------------------------------------------------------------+
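
Both banners point to nc-config and nf-config, which can also confirm that the build picked up the features CESM relies on. The check below is a sketch under the assumption that both scripts ended up on PATH (here, in /app/netcdf/bin); `has_feature` is a helper invented for this example.

```shell
# has_feature TOOL FLAG: succeed only when TOOL can be run and
# "TOOL FLAG" prints exactly "yes".
has_feature() {
    command -v "$1" >/dev/null 2>&1 || return 1
    [ "$("$1" "$2" 2>/dev/null)" = "yes" ]
}

for flag in --has-nc4 --has-hdf5 --has-pnetcdf; do
    if has_feature nc-config "$flag"; then
        echo "nc-config $flag: yes"
    else
        echo "nc-config $flag: no, or nc-config is not on PATH"
    fi
done

# nf-config reports the link flags for the Fortran library, when installed.
if command -v nf-config >/dev/null 2>&1; then
    nf-config --flibs || true
fi
```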

  10. Install the BLAS, CBLAS, and LAPACK libraries

http://www.netlib.org/lapack/index.html#_lapack_version_3_5_0

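cd lapack-3.5.0
After entering the unpacked directory, copy make.inc.example to make.inc
cp make.inc.example make.inc
Then edit the Makefile so that it reads as follows, moving the comment marker from the second line to the first
#lib: lapacklib tmglib
lib: blaslib variants lapacklib tmglib
CFLAGS = -O3 -I/home/hanlzh/Desktop/lapack-3.5.0/INCLUDE -fno-stack-protector
# the source tree was unpacked and built on the Desktop, hence the Desktop path
:wq
##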
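
The same Makefile change can be made without opening an editor. The sketch below prints the edited file to standard output with sed; `switch_lib_target` is a name made up for this example, and the demo runs on a temporary two-line file rather than on the real Makefile.

```shell
# switch_lib_target FILE: print FILE with the short "lib:" rule commented out
# and the rule that also builds blaslib and variants enabled.
switch_lib_target() {
    sed -e 's/^lib: lapacklib tmglib$/#lib: lapacklib tmglib/' \
        -e 's/^#lib: blaslib variants lapacklib tmglib$/lib: blaslib variants lapacklib tmglib/' \
        "$1"
}

# Demo on a temporary copy of the two relevant lines.
demo=$(mktemp)
printf '%s\n' 'lib: lapacklib tmglib' \
    '#lib: blaslib variants lapacklib tmglib' > "$demo"
switch_lib_target "$demo"
rm -f "$demo"
```

To apply it for real, redirect the output to a new file and move that file over the Makefile once it looks right.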
If make reports:
--------------------------------------------------------
NEP: Testing Nonsymmetric Eigenvalue Problem routines
./EIG/xeigtstz < nep.in > znep.out 2>&1
Makefile:463: recipe for target 'znep.out' failed
make[1]: *** [znep.out] Error 139
make[1]: Leaving directory '/home/xfbupt/project/other/lapack-3.7.1/TESTING'
Makefile:42: recipe for target 'lapack_testing' failed
make: *** [lapack_testing] Error 2
--------------------------------------------------------
This is most likely a failure in the test stage; the build itself is essentially complete.
In that case, run the commands below to raise the stack size limit, and the build will succeed.
ulimit -s 100000
make clean
make
--------------------------------------------------------
Copy the three generated libraries liblapack.a, librefblas.a, and libtmglib.a to the target location
sudo mkdir -p /app/lapack-3.5.0 # create the target directory if it does not exist yet
sudo cp liblapack.a /app/lapack-3.5.0/liblapack.a
sudo cp librefblas.a /app/lapack-3.5.0/libblas.a # note the file name; different systems expect slightly different names
sudo cp libtmglib.a /app/lapack-3.5.0/libtmglib.a
--------------------------------------------------------
vi ~/.bashrc
## lapack-3.5.0
export PATH=/app/lapack-3.5.0/lapacke/include:$PATH
export LD_LIBRARY_PATH=/app/lapack-3.5.0/:$LD_LIBRARY_PATH
:wq
source ~/.bashrc
  11. Install ESMF
git clone https://github.com/esmf-org/esmf.git
cp -r /esmf /app/ # /esmf is where the clone was made here; use the path of your own clone

vi ~/.bashrc # edit ~/.bashrc and add the following
#---------------ESMF environment variables begin-----------
export ESMF_DIR=/app/esmf
export ESMF_BOPT=g
export ESMF_COMM=mpiuni
export ESMF_COMPILER=gfortran
export ESMF_ABI=64
export ESMF_INSTALL_PREFIX=/app/esmf/esmf_install
#export ESMF_NETCDF=($your path )/esmf/netcdf
export ESMF_NETCDF_INCLUDE=/app/netcdf/include
#export ESMF_NETCDF_LIBPATH=($your path )/netcdf-4.6.1/lib
export ESMF_NETCDF_LIBPATH=/app/netcdf/lib
export ESMF_NETCDF_LIBS="-lnetcdf -lnetcdff"

export ESMF_OS=Linux
export ESMF_TESTMPMD=ON
export ESMF_PTHREADS=ON
export ESMF_OPENMP=ON
export ESMF_TESTEXHAUSTIVE=ON
export ESMF_TESTHARNESS_ARRAY=RUN_ESMF_TestHarnessArrayUNI_2
export ESMF_TESTHARNESS_FIELD=RUN_ESMF_TestHarnessFieldUNI_1
export ESMF_NO_INTEGER_1_BYTE=FALSE
export ESMF_NO_INTEGER_2_BYTE=FALSE
export ESMF_FORTRANSYMBOLS=default
export ESMF_DEFER_LIB_BUILD=ON
export ESMF_TESTWITHTHREADS=OFF
export ESMF_CXXCOMPILER=g++
export ESMF_CXXLINKER=g++
export ESMF_F90COMPILER=gfortran
export ESMF_F90LINKER=gfortran
export ESMF_INSTALL_BINDIR=bin/bing/Linux.gfortran.64.mpiuni.default
export ESMF_INSTALL_MODDIR=mod/modg/Linux.gfortran.64.mpiuni.default
export ESMF_INSTALL_LIBDIR=lib/libg/Linux.gfortran.64.mpiuni.default
export ESMF_INSTALL_HEADERDIR=include
export ESMF_INSTALL_DOCDIR=doc

export ESMFBIN_PATH=/app/esmf/bin/bing/Linux.gfortran.64.mpiuni.default
export ESMFLIB_PATH=/app/esmf/lib/libg/Linux.gfortran.64.mpiuni.default
export MPIEXEC=/app/mpich3/bin/mpiexec
export MY_ESMF_REGRID=/app/esmf/bin/bing/Linux.gfortran.64.mpiuni.default/ESMF_RegridWeightGen
#----------ESMF environment variables end-------------

:wq
source ~/.bashrc

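sudo chmod -R 777 /app/esmf # grant access to the directory; otherwise later steps cannot create files
cd /app/esmf # build from ESMF_DIR
make  # do not use sudo make, or the environment variables from ~/.bashrc will not be read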
make check
make install
---------------------------- 
ESMF installation complete.
----------------------------
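
The long `Linux.gfortran.64.mpiuni.default` directory names in the variables above are not arbitrary: ESMF composes them from ESMF_OS, ESMF_COMPILER, ESMF_ABI, ESMF_COMM, and the site name, under a `lib<BOPT>`, `bin<BOPT>`, or `mod<BOPT>` folder. The sketch below rebuilds the name from those settings so it can be compared with what was installed. The fallback values are the ones used in this article, not ESMF's own defaults, and `esmf_subdir` is a helper made up for this example.

```shell
# esmf_subdir KIND: print the build-specific subdirectory for KIND
# (bin, lib or mod), composed from the ESMF_* settings.
esmf_subdir() {
    printf '%s/%s%s/%s.%s.%s.%s.%s\n' "$1" "$1" "${ESMF_BOPT:-g}" \
        "${ESMF_OS:-Linux}" "${ESMF_COMPILER:-gfortran}" "${ESMF_ABI:-64}" \
        "${ESMF_COMM:-mpiuni}" "${ESMF_SITE:-default}"
}

libdir="${ESMF_INSTALL_PREFIX:-/app/esmf/esmf_install}/$(esmf_subdir lib)"
echo "Expected ESMF library directory: $libdir"
if [ -f "$libdir/esmf.mk" ]; then
    echo "esmf.mk found"
else
    echo "esmf.mk not found (ESMF may not be installed on this machine)"
fi
```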

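At this point most of the environment CESM2 needs is in place.

Machine configuration

First download the CESM2 framework

See the official guide: https://escomp.github.io/CESM/versions/cesm2.1/html/downloading_cesm.html

git clone -b release-cesm2.1.3 https://github.com/ESCOMP/CESM.git my_cesm_sandbox
cd my_cesm_sandbox
./manage_externals/checkout_externals 
# This step stalled for me at first; I later learned that a classmate abroad can download the externals for you.

Copy the main program to /BIGDATA and configure the machine

sudo cp -r /my_cesm_sandbox /BIGDATA/clm5.0 # renamed to suit my own habits; use the path of your own clone as the source
cd /BIGDATA/clm5.0/cime/config/cesm/machines # machine settings for the model framework

Here config_batch.xml configures the job submission system, config_compilers.xml configures the compilers, and config_machines.xml sets paths and other key information.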

  1. config_batch.xml settings
Add the following with vi
  <!-- hanlzh -->
  <batch_system MACH="lzh" type="none">
  </batch_system>
  2. config_compilers.xml settings
<compiler MACH="lzh" COMPILER="gnu">
  <CFLAGS>
    <append DEBUG="FALSE"> -O2 </append>
  </CFLAGS>
  <CONFIG_ARGS>
    <append> --host=Linux </append>
  </CONFIG_ARGS>
  <CPPDEFS>
    <append> -DLINUX -DFORTRANUNDERSCORE -DNO_R16 -DCPRGNU </append>
  </CPPDEFS>
  <FFLAGS>
    <append DEBUG="FALSE"> -O2 </append>
    <append> -ffree-line-length-none </append>
  </FFLAGS>
  <MPICC> mpicc </MPICC>
  <MPICXX> mpicxx </MPICXX>
  <MPIFC> mpif90 </MPIFC>
  <SCC> gcc </SCC>
  <SFC> gfortran </SFC>
  <ESMF_LIBDIR>/app/esmf/esmf_install/lib/libg/Linux.gfortran.64.mpiuni.default</ESMF_LIBDIR>
  <MPI_LIB_NAME>mpich</MPI_LIB_NAME>
  <MPI_PATH>/app/mpich3</MPI_PATH>
  <NETCDF_PATH>/app/netcdf</NETCDF_PATH>
  <PNETCDF_PATH>/app/pnetcdf</PNETCDF_PATH>
  <LAPACK_LIBDIR>/app/lapack-3.5.0</LAPACK_LIBDIR>
  <SLIBS>
          <append>-L/app/netcdf/lib -lnetcdf -lnetcdff -lhdf5 -lhdf5_hl -lz -lpnetcdf -L/app/lapack-3.5.0 -llapack -lblas</append>
  </SLIBS>
</compiler>

</config_compilers>
  3. config_machines.xml settings
<machine MACH="lzh">
    <DESC> Ubuntu gcc8.3 mpich3 </DESC>
    <NODENAME_REGEX>regex.expression.matching.your.machine</NODENAME_REGEX>
    <OS>LINUX</OS>
    <PROXY> https://howto.get.out </PROXY>
    <COMPILERS>gnu</COMPILERS>
    <MPILIBS>mpich</MPILIBS>
    <PROJECT>none</PROJECT>
    <SAVE_TIMING_DIR> </SAVE_TIMING_DIR>
    <CIME_OUTPUT_ROOT>/BIGDATA/clm5.0/cime/scripts/$CASE</CIME_OUTPUT_ROOT>
    <DIN_LOC_ROOT>/BIGDATA/cesm/inputdata</DIN_LOC_ROOT>
    <DIN_LOC_ROOT_CLMFORC>/BIGDATA/cesm/inputdata/atm/datm7</DIN_LOC_ROOT_CLMFORC>
    <DOUT_S_ROOT>/BIGDATA/clm5.0/cime/scripts/output/$CASE</DOUT_S_ROOT>
    <BASELINE_ROOT>/BIGDATA/clm5.0/cime/scripts/$CASE</BASELINE_ROOT>
    <CCSM_CPRNC>/BIGDATA/clm5.0/cime/scripts/$CASE</CCSM_CPRNC>
    <GMAKE>make</GMAKE>
    <GMAKE_J>4</GMAKE_J>
    <BATCH_SYSTEM>none</BATCH_SYSTEM>
    <SUPPORTED_BY>hanlzh@hrbnu.edu.cn</SUPPORTED_BY>
    <MAX_TASKS_PER_NODE>24</MAX_TASKS_PER_NODE>
    <MAX_MPITASKS_PER_NODE>24</MAX_MPITASKS_PER_NODE>
    <PROJECT_REQUIRED>FALSE</PROJECT_REQUIRED>
    <mpirun mpilib="default">
      <executable>mpirun</executable>
      <arguments>
      <arg name="ntasks"> -n {{ total_tasks }} </arg>
      </arguments>
    </mpirun>
    <module_system type="none"/>
    <environment_variables>
      <env name="OMP_STACKSIZE">256M</env>
      <env name="NETCDF_PATH">/app/netcdf/</env>
    </environment_variables>
    <resource_limits>
    <resource name="RLIMIT_STACK">-1</resource>
    </resource_limits>
  </machine>
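
A mismatch in the machine name between the three files is an easy mistake to make, so a quick consistency check may save a failed create_newcase. The sketch below only looks for the `MACH="lzh"` attribute in each file; `has_machine` is a helper made up for this example, and it does not validate the XML itself.

```shell
# has_machine FILE NAME: succeed when FILE contains an entry with MACH="NAME".
has_machine() {
    grep -q "MACH=\"$2\"" "$1" 2>/dev/null
}

machines_dir=/BIGDATA/clm5.0/cime/config/cesm/machines
for f in config_batch.xml config_compilers.xml config_machines.xml; do
    if has_machine "$machines_dir/$f" lzh; then
        echo "$f: entry for lzh found"
    else
        echo "$f: no entry for lzh (or file missing)"
    fi
done
```

Well-formedness can be checked separately with `xmllint --noout <file>`, which comes from the libxml2-utils package installed earlier.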

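Running CESM2

Create inputdata and the other configured paths (mkdir) before running the model

cd /BIGDATA/clm5.0/cime/scripts
./create_newcase --case mycase1 --res f09_g16 --compset I2000Clm50BgcCru --mach lzh --run-unsupported
# Creates a case named mycase1 at resolution f09_g16 with the I2000Clm50BgcCru compset on machine lzh.
cd mycase1
./case.setup # set up the case after adjusting env_mach_pes.xml and env_run.xml.
./case.build --skip-provenance-check # build the case after setting the output variables, output frequency, and time interval.
This step takes a long time and is where most errors appear. If you hit an error, check the log file named in the message; the vast majority are caused by missing library files or by read/write permissions.
# For example: sudo ln -s /usr/lib/x86_64-linux-gnu/libmpfr.so.6 /usr/lib/x86_64-linux-gnu/libmpfr.so.4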
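
When the build fails in several components at once, listing every build log that mentions an error is quicker than opening them one by one. The sketch below assumes the logs follow the usual `<component>.bldlog.<timestamp>` naming and that BLD_DIR, a variable made up for this example, points at the case's build directory.

```shell
# logs_with_errors DIR: print the build logs under DIR that mention "error"
# (case-insensitive). Prints nothing when no log matches.
logs_with_errors() {
    find "$1" -type f -name '*.bldlog.*' \
        -exec grep -il 'error' {} + 2>/dev/null || true
}

# BLD_DIR is a placeholder; the search only runs when it has been set.
if [ -n "${BLD_DIR:-}" ]; then
    logs_with_errors "$BLD_DIR"
fi
```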

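Porting succeeded

./check_input_data # check the input data the case needs; missing files can be downloaded and placed in the configured directory.
./case.submit # submit and run the job.

If many .nc files appear under the case's run directory (/run), the port has succeeded.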
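
A quick way to confirm that output is being written is to count the NetCDF files in the run directory. This is a minimal sketch: RUN_DIR is a placeholder variable, and `count_nc` is a helper made up for this example.

```shell
# count_nc DIR: print how many NetCDF (*.nc) files DIR contains.
count_nc() {
    find "$1" -type f -name '*.nc' 2>/dev/null | wc -l | tr -d ' '
}

# RUN_DIR is a placeholder; inside a case directory the real path can be
# obtained with: ./xmlquery RUNDIR --value
RUN_DIR=${RUN_DIR:-./run}
echo "$(count_nc "$RUN_DIR") NetCDF file(s) under $RUN_DIR"
```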

This completes the port of CLM, the land surface model of CESM2.
