CIME Framework Reading and Translation Notes: Chapter 1, Section 5 (Case Control System Part 1: Basic Usage, Running a Case)

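The official CIME user's guide is available at http://esmci.github.io/cime/versions/master/html/users_guide/index.html
This installment covers Chapter 1, Section 5 (Case Control System Part 1: Basic Usage, Running a Case). It spans seven subsections, including calling case.submit, input data, and restarting a run.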

5.1. Calling case.submit

The script case.submit will submit your run to the batch queueing system on your machine. If you do not have a batch queueing system, case.submit will start the job interactively, given that you have a proper MPI environment defined. Running case.submit is the ONLY way you should start a job.
To see the options to case.submit, issue the command:

./case.submit --help

A good way to see what case.submit will do is to first call preview_run:

./preview_run

which will output the environment for your run along with the batch submit and mpirun commands. As an example, on the NCAR machine cheyenne, for an A compset at the f19_g17_rx1 resolution, the following is output from preview_run:

CASE INFO:
nodes: 1
total tasks: 36
tasks per node: 36
thread count: 1

BATCH INFO:
FOR JOB: case.run
ENV:
module command is /glade/u/apps/ch/opt/lmod/7.5.3/lmod/lmod/libexec/lmod python purge
module command is /glade/u/apps/ch/opt/lmod/7.5.3/lmod/lmod/libexec/lmod python load ncarenv/1.2 intel/17.0.1 esmf_libs mkl esmf-7.0.0-defio-mpi-O mpt/2.16 netcdf-mpi/4.5.0 pnetcdf/1.9.0 ncarcompilers/0.4.1
Setting Environment OMP_STACKSIZE=256M
Setting Environment TMPDIR=/glade/scratch/mvertens
Setting Environment MPI_TYPE_DEPTH=16
SUBMIT CMD:
qsub -q regular -l walltime=12:00:00 -A P93300606 .case.run

FOR JOB: case.st_archive
ENV:
module command is /glade/u/apps/ch/opt/lmod/7.5.3/lmod/lmod/libexec/lmod python purge
module command is /glade/u/apps/ch/opt/lmod/7.5.3/lmod/lmod/libexec/lmod python load ncarenv/1.2 intel/17.0.1 esmf_libs mkl esmf-7.0.0-defio-mpi-O mpt/2.16 netcdf-mpi/4.5.0 pnetcdf/1.9.0 ncarcompilers/0.4.1
Setting Environment OMP_STACKSIZE=256M
Setting Environment TMPDIR=/glade/scratch/mvertens
Setting Environment MPI_TYPE_DEPTH=16
Setting Environment TMPDIR=/glade/scratch/mvertens
Setting Environment MPI_USE_ARRAY=false
SUBMIT CMD:
qsub -q share -l walltime=0:20:00 -A P93300606 -W depend=afterok:0 case.st_archive

MPIRUN:
mpiexec_mpt -np 36 -p "%g:" omplace -tm open64 /glade/scratch/mvertens/jim/bld/cesm.exe >> cesm.log.$LID 2>&1

Each of the above sections is defined in the various $CASEROOT xml files, and the associated variables can be modified using the xmlchange command (or, in the case of tasks and threads, with the pelayout command).

The PE layout is set by the xml variables NTASKS, NTHRDS and ROOTPE. To see the exact settings for each component, issue the command:

./xmlquery NTASKS,NTHRDS,ROOTPE

To change all of the NTASKS settings to, say, 30 and all of the NTHRDS settings to 4, you can call:

./xmlchange NTASKS=30,NTHRDS=4

To change JUST the ATM NTASKS to 8, you can call:

./xmlchange NTASKS_ATM=8

Submit parameters are set by the xml variables in the file env_batch.xml. This file is special in that certain xml variables can appear in more than one group. NOTE: The groups are the list of jobs that are submittable for a case. Normally, the minimum set of groups is case.run and case.st_archive. We will illustrate how to change an xml variable in env_batch.xml using the xml variable JOB_WALLCLOCK_TIME.

To change JOB_WALLCLOCK_TIME for all groups to 2 hours for cheyenne, use:

./xmlchange JOB_WALLCLOCK_TIME=02:00:00

To change JOB_WALLCLOCK_TIME to 20 minutes for cheyenne for just case.run, use:

./xmlchange JOB_WALLCLOCK_TIME=00:20:00 --subgroup case.run
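
When several cases need the same batch settings, the xmlchange calls shown above can be scripted. The following is only a minimal sketch: xmlchange_command is a hypothetical helper (not part of CIME) that assembles the argument list from the comma-separated VAR=value form and the --subgroup option used in the examples above.

```python
import shlex

def xmlchange_command(settings, subgroup=None):
    """Build the argument list for one ./xmlchange call (hypothetical helper)."""
    # Join the settings into the comma-separated VAR=value form used above.
    pairs = ",".join(f"{name}={value}" for name, value in settings.items())
    args = ["./xmlchange", pairs]
    if subgroup is not None:
        # Restrict the change to a single job group, for example case.run.
        args += ["--subgroup", subgroup]
    return args

all_groups = xmlchange_command({"JOB_WALLCLOCK_TIME": "02:00:00"})
run_only = xmlchange_command({"JOB_WALLCLOCK_TIME": "00:20:00"}, subgroup="case.run")
print(shlex.join(all_groups))
print(shlex.join(run_only))
```

The resulting list could then be passed to subprocess.run(args, cwd=caseroot), where caseroot is assumed to be the path of an existing case directory.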

Before you submit the case using case.submit, make sure the batch queue variables are set correctly for your run. In particular, make sure that you have an appropriate account number (PROJECT), time limit (JOB_WALLCLOCK_TIME), and queue (JOB_QUEUE).

Also modify $CASEROOT/env_run.xml for your case using xmlchange.
Once you have executed case.setup and case.build, call case.submit to submit the run to your machine’s batch queue system.

cd $CASEROOT
./case.submit

5.1.1. Result of running case.submit

When called, the case.submit script will:

  • Load the necessary environment.
  • Confirm that locked files are consistent with the current xml files.
  • Run preview_namelists, which in turn will run each component’s cime_config/buildnml script.
  • Run check_input_data to verify that the required data are present.
  • Submit the job to the batch queue, which in turn will run the case.run script.
  • Upon successful completion of the run, case.run will:
    Put timing information in $CASEROOT/timing. See model timing data for details.
    Submit the short-term archiver script case.st_archive to the batch queue if $DOUT_S is TRUE. Short-term archiving will copy and move component history, log, diagnostic, and restart files from $RUNDIR to the short-term archive directory $DOUT_S_ROOT.
    Resubmit case.run if $RESUBMIT > 0.

5.1.2. Monitoring case job statuses

The $CASEROOT/CaseStatus file contains a log of all the job states and xmlchange commands in chronological order.

Note:

After a successful first run, set the env_run.xml variable $CONTINUE_RUN to TRUE before resubmitting or the job will not progress.

You may also need to modify the env_run.xml variables $STOP_OPTION, $STOP_N and/or $STOP_DATE, as well as $REST_OPTION, $REST_N and/or $REST_DATE, and $RESUBMIT before resubmitting.

See the basic example for a complete example of how to run a case.

5.1.3. Troubleshooting a job that fails

There are several places to look for information if a job fails. Start with the STDOUT and STDERR file(s) in $CASEROOT. If you don’t find an obvious error message there, the $RUNDIR/$model.log.$datestamp files will probably give you a hint.

First, check cpl.log.$datestamp, which will often tell you when the model failed. Then check the rest of the component log files. See troubleshooting run-time problems for more information.
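
The order of inspection described above (coupler log first, then the other components) can be sketched in a few lines. This is a minimal illustration, not a CIME tool: latest_run_logs is a hypothetical helper, and it assumes that the datestamp suffix sorts chronologically as text, which holds for the YYMMDD-hhmmss form the manual describes for log files.

```python
def latest_run_logs(filenames):
    """Return the log files from the most recent run, coupler log first."""
    logs = [name for name in filenames if ".log." in name]
    if not logs:
        return []

    def stamp(name):
        # Everything after ".log." is treated as the datestamp.
        return name.split(".log.", 1)[1]

    newest = max(stamp(name) for name in logs)
    current = [name for name in logs if stamp(name) == newest]
    # Put cpl.log.* first, then the other components alphabetically.
    return sorted(current, key=lambda name: (not name.startswith("cpl.log."), name))

names = ["atm.log.040526-082714", "cpl.log.040525-010101",
         "cpl.log.040526-082714", "ocn.log.040526-082714"]
print(latest_run_logs(names))
```

In practice the list of names could come from os.listdir on the run directory.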

5.2. Input data

The check_input_data script determines if the required data files for your case exist on local disk in the appropriate subdirectory of $DIN_LOC_ROOT. It automatically downloads missing data required for your simulation.

Note:
It is recommended that users on a given system share a common $DIN_LOC_ROOT directory to avoid duplication on disk of large amounts of input data. You may need to talk to your system administrator in order to set this up.

The lists of input data sets required by each component are found in the $CASEROOT/Buildconf directory. These files are generated by a call to preview_namelists and are in turn created by each component’s buildnml script. For example, for compsets consisting only of data models (i.e. A compsets), the following files are created:

cpl.input_data_list
datm.input_data_list
dice.input_data_list
docn.input_data_list
drof.input_data_list

You can independently verify the presence of the required data by using the following commands:

cd $CASEROOT
./check_input_data --help
./check_input_data

If data sets are missing, obtain them from the input data server(s) via the commands:

cd $CASEROOT
./check_input_data --download

check_input_data is automatically called by the case control system when the case is built and submitted, so manual usage of this script is optional.

5.2.1. Distributed Input Data Repositories

CIME has the ability to utilize multiple input data repositories, with potentially different protocols. The repositories are defined in the file $CIMEROOT/config/$model/config_inputdata.xml. The currently supported server protocols are: gridftp, subversion, ftp and wget. These protocols may not all be supported on your machine, depending on software configuration.

Note:

You now have the ability to create your own input data repository and add it to config_inputdata.xml. This lets you collaborate easily by sharing your required input data with others.

5.3. Starting, Stopping and Restarting a Run

The file env_run.xml contains variables that may be modified at initialization or any time during the course of a model run. Among other features, the variables comprise coupler namelist settings for the model stop time, restart frequency, coupler history frequency, and a flag to determine if the run should be flagged as a continuation run.

At a minimum, you will need to set the variables $STOP_OPTION and $STOP_N. Other driver namelist settings then will have consistent and reasonable default values. The default settings guarantee that restart files are produced at the end of the model run.

By default, the stop time settings are:

STOP_OPTION = ndays
STOP_N = 5
STOP_DATE = -999

The default settings are appropriate only for initial testing. Before starting a longer run, update the stop times based on the case throughput and batch queue limits. For example, if the model runs 5 model years/day, set RESUBMIT=30, STOP_OPTION=nyears, and STOP_N=5. The model will then run in five-year increments and stop after 30 submissions.
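
The arithmetic behind the example above can be written out as a small sketch. plan_run is a hypothetical helper, and it takes the manual's statement of 30 submissions at face value; the wallclock estimate assumes the throughput stays constant and ignores queue wait time.

```python
def plan_run(years_per_day, stop_n_years, submissions):
    """Estimate wallclock hours per submission and the total simulated years."""
    # One submission simulates stop_n_years at the given throughput.
    wallclock_hours = 24.0 * stop_n_years / years_per_day
    total_years = stop_n_years * submissions
    return wallclock_hours, total_years

hours, years = plan_run(years_per_day=5, stop_n_years=5, submissions=30)
print(f"{hours:.1f} wallclock hours per submission, {years} model years in total")
```

The per-submission estimate is the number to compare against the batch queue limit when choosing JOB_WALLCLOCK_TIME and STOP_N.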

5.3.1. Run-type initialization

The case initialization type is set using the $RUN_TYPE variable in env_run.xml. A CIME run can be initialized in one of three ways:

startup

In a startup run (the default), all components are initialized using baseline states. These states are set independently by each component and can include the use of restart files, initial files, external observed data files, or internal initialization (that is, a “cold start”). In a startup run, the coupler sends the start date to the components at initialization. In addition, the coupler does not need an input data file. In a startup initialization, the ocean model does not start until the second ocean coupling step.

branch

In a branch run, all components are initialized using a consistent set of restart files from a previous run (determined by the $RUN_REFCASE and $RUN_REFDATE variables in env_run.xml). The case name generally is changed for a branch run, but it does not have to be. In a branch run, the $RUN_STARTDATE setting is ignored because the model components obtain the start date from their restart data sets. Therefore, the start date cannot be changed for a branch run. This is the same mechanism that is used for performing a restart run (where $CONTINUE_RUN is set to TRUE in the env_run.xml file).

Branch runs typically are used when sensitivity or parameter studies are required, or when settings for history file output streams need to be modified while still maintaining bit-for-bit reproducibility. Under this scenario, the new case is able to produce an exact bit-for-bit restart in the same manner as a continuation run if no source code or component namelist inputs are modified. All models use restart files to perform this type of run. $RUN_REFCASE and $RUN_REFDATE are required for branch runs. To set up a branch run, locate the restart tar file or restart directory for $RUN_REFCASE and $RUN_REFDATE from a previous run, then place those files in the $RUNDIR directory. See Starting from a reference case.

hybrid

A hybrid run is initialized like a startup but it uses initialization data sets from a previous case. It is similar to a branch run with relaxed restart constraints. A hybrid run allows users to bring together combinations of initial/restart files from a previous case (specified by $RUN_REFCASE) at a given model output date (specified by $RUN_REFDATE). Unlike a branch run, the starting date of a hybrid run (specified by $RUN_STARTDATE) can be modified relative to the reference case. In a hybrid run, the model does not continue in a bit-for-bit fashion with respect to the reference case. The resulting climate, however, should be continuous provided that no model source code or namelists are changed in the hybrid run. In a hybrid initialization, the ocean model does not start until the second ocean coupling step, and the coupler does a “cold start” without a restart file.

The variable $RUN_TYPE determines the initialization type. This setting is only important for the initial production run when the $CONTINUE_RUN variable is set to FALSE. After the initial run, the $CONTINUE_RUN variable is set to TRUE, and the model restarts exactly using input files in a case, date, and bit-for-bit continuous fashion.

The variable $RUN_STARTDATE is the start date (in yyyy-mm-dd format) for either a startup run or a hybrid run. If the run is targeted to be a hybrid or branch run, you must specify values for $RUN_REFCASE and $RUN_REFDATE.
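
The rules for the three run types can be summarized as a small consistency check. This is a sketch of the rules as stated in this section only; check_run_type and honors_run_startdate are hypothetical helpers and do not reflect how CIME itself validates these variables.

```python
VALID_RUN_TYPES = ("startup", "branch", "hybrid")

def check_run_type(run_type, run_refcase="", run_refdate=""):
    """Return a list of problems with the given settings (empty if none)."""
    if run_type not in VALID_RUN_TYPES:
        return [f"RUN_TYPE must be one of {VALID_RUN_TYPES}, got {run_type!r}"]
    problems = []
    if run_type in ("branch", "hybrid"):
        # Both reference variables are required for branch and hybrid runs.
        if not run_refcase:
            problems.append("RUN_REFCASE is required")
        if not run_refdate:
            problems.append("RUN_REFDATE is required")
    return problems

def honors_run_startdate(run_type):
    """RUN_STARTDATE applies to startup and hybrid runs; branch runs ignore it."""
    return run_type in ("startup", "hybrid")

print(check_run_type("branch", run_refcase="myrefcase"))
print(honors_run_startdate("branch"))
```

The case name myrefcase is a placeholder chosen for illustration.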

5.3.2. Starting from a reference case (REFCASE)

There are several xml variables that control how either a branch or a hybrid case can start up from another case. The initial/restart files needed to start up a run from another case are required to be in $RUNDIR. The xml variable $GET_REFCASE is a flag that, if set, will automatically prestage the refcase restart data.

If $GET_REFCASE is TRUE, then the values set by $RUN_REFDIR, $RUN_REFCASE, $RUN_REFDATE and $RUN_TOD are used to prestage the data by symbolic links to the appropriate path.

The location of the necessary data to start up from another case is controlled by the xml variable $RUN_REFDIR.

If $RUN_REFDIR is an absolute pathname, then it is expected that initial/restart files needed to start up a model run are in $RUN_REFDIR.

If $RUN_REFDIR is a relative pathname, then it is expected that initial/restart files needed to start up a model run are in a path relative to $DIN_LOC_ROOT with the absolute pathname $DIN_LOC_ROOT/$RUN_REFDIR/$RUN_REFCASE/$RUN_REFDATE.

If $RUN_REFDIR is a relative pathname AND is not available in $DIN_LOC_ROOT then CIME will attempt to download the data from the input data repositories.

If $GET_REFCASE is FALSE then the data is assumed to already exist in $RUNDIR.
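
The absolute versus relative rule for $RUN_REFDIR can be illustrated with a short sketch. refcase_source_dir is a hypothetical helper that mirrors the two cases described above; it is not CIME's own implementation, and the directory and case names in the example are placeholders.

```python
import posixpath

def refcase_source_dir(run_refdir, run_refcase, run_refdate, din_loc_root):
    """Return the directory expected to hold the reference case files."""
    if posixpath.isabs(run_refdir):
        # Absolute RUN_REFDIR: the files are expected directly in that directory.
        return run_refdir
    # Relative RUN_REFDIR: resolved under DIN_LOC_ROOT, then case, then date.
    return posixpath.join(din_loc_root, run_refdir, run_refcase, run_refdate)

print(refcase_source_dir("/scratch/user/restarts", "myrefcase", "0001-01-01", "/data/inputdata"))
print(refcase_source_dir("refcases", "myrefcase", "0001-01-01", "/data/inputdata"))
```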

5.4. Controlling output data

During a model run, each model component produces its own output data sets in $RUNDIR consisting of history, initial, restart, diagnostics, output log and rpointer files. Component history files and restart files are in netCDF format. Restart files are used to either restart the same model or to serve as initial conditions for other model cases. The rpointer files are ASCII text files that list the component history and restart files that are required for restart.

Archiving (referred to as short-term archiving here) is the phase of a model run when output data are moved from $RUNDIR to a local disk area. It has no impact on the production run except to clean up disk space in $RUNDIR, which can help manage user disk quotas.

Several variables in env_run.xml control the behavior of short-term archiving. This is an example of how to control the data output flow with two variable settings:

DOUT_S = TRUE
DOUT_S_ROOT = /$SCRATCH/$user/$CASE/archive

The first setting above is the default, so short-term archiving is enabled. The second sets where to move files at the end of a successful run.

Also:

All output data is initially written to $RUNDIR.

Unless you explicitly turn off short-term archiving, files are moved to $DOUT_S_ROOT at the end of a successful model run.

Users generally should turn off short-term archiving when developing new code.

Standard output generated from each component is saved in $RUNDIR in a log file. Each time the model is run, a single coordinated datestamp is incorporated into the filename of each output log file. The run script generates the datestamp in the form YYMMDD-hhmmss, indicating the year, month, day, hour, minute and second that the run began (ocn.log.040526-082714, for example).
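
The YYMMDD-hhmmss form described above maps directly onto standard strftime codes. The sketch below only reproduces the naming pattern from the manual's example; log_datestamp is a hypothetical helper, not the code the run script actually uses.

```python
from datetime import datetime

def log_datestamp(start):
    """Format a run start time as YYMMDD-hhmmss."""
    # %y is the two-digit year; the remaining codes are zero-padded fields.
    return start.strftime("%y%m%d-%H%M%S")

# A run that began on 26 May 2004 at 08:27:14.
print("ocn.log." + log_datestamp(datetime(2004, 5, 26, 8, 27, 14)))
```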

By default, each component also periodically writes history files (usually monthly) in netCDF format and also writes netCDF or binary restart files in the $RUNDIR directory. The history and log files are controlled independently by each component. History output control (for example, output fields and frequency) is set in each component’s namelists.

The raw history data does not lend itself well to easy time-series analysis. For example, CAM writes one or more large netCDF history file(s) at each requested output period. While this behavior is optimal for model execution, it makes it difficult to analyze time series of individual variables without having to access the entire data volume. Thus, the raw data from major model integrations usually is post-processed into more user-friendly configurations, such as single files containing a long time series of each output field, and made available to the community.

For CESM, refer to the CESM2 Output Filename Conventions for a description of output data filenames.

5.5. Restarting a run

Active components (and some data components) write restart files at intervals that are dictated by the driver via the setting of the $REST_OPTION and $REST_N variables in env_run.xml. Restart files allow the model to stop and then start again with bit-for-bit exact capability; the model output is exactly the same as if the model had not stopped. The driver coordinates the writing of restart files as well as the time evolution of the model.

Runs that are initialized as branch or hybrid runs require restart/initial files from previous model runs (as specified by the variables $RUN_REFCASE and $RUN_REFDATE). Pre-stage these files to the case $RUNDIR (normally $EXEROOT/…/run) before the model run starts. Normally this is done by copying the contents of the relevant $RUN_REFCASE/rest/$RUN_REFDATE.00000 directory.

Whenever a component writes a restart file, it also writes a restart pointer file in the format rpointer.$component. Upon a restart, each component reads the pointer file to determine which file to read in order to continue the run. These are examples of pointer files created for a component set using fully active model components.

  • rpointer.atm
  • rpointer.drv
  • rpointer.ice
  • rpointer.lnd
  • rpointer.rof
  • rpointer.cism
  • rpointer.ocn.ovf
  • rpointer.ocn.restart

If short-term archiving is turned on, the model archives the component restart data sets and pointer files into $DOUT_S_ROOT/rest/yyyy-mm-dd-sssss, where yyyy-mm-dd-sssss is the model date at the time of the restart. (See below for more details.)

5.5.1. Backing up to a previous restart

If a run encounters problems and crashes, you will normally have to back up to a previous restart. If short-term archiving is enabled, find the latest $DOUT_S_ROOT/rest/yyyy-mm-dd-sssss/ directory and copy its contents into your run directory ($RUNDIR).

Make sure that the new restart pointer files overwrite older files in $RUNDIR or the job may not restart in the correct place. You can then continue the run using the new restarts.
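
The recovery procedure above (find the latest rest/yyyy-mm-dd-sssss directory, copy its contents into the run directory, and let the pointer files overwrite the old ones) could be sketched as follows. Both functions are hypothetical helpers written for illustration; they assume the directory names are zero-padded so that text order matches date order.

```python
import shutil
from pathlib import Path

def latest_restart_name(names):
    """Pick the most recent yyyy-mm-dd-sssss directory name from a list."""
    # Zero-padded dates sort chronologically when compared as text.
    return max(names) if names else None

def restore_latest_restart(dout_s_root, rundir):
    """Copy the newest archived restart set into rundir, overwriting older files."""
    rest = Path(dout_s_root) / "rest"
    newest = latest_restart_name([p.name for p in rest.iterdir() if p.is_dir()])
    if newest is None:
        raise FileNotFoundError(f"no restart directories found under {rest}")
    for item in (rest / newest).iterdir():
        if item.is_file():
            # copy2 replaces an existing file of the same name, rpointer files included.
            shutil.copy2(item, Path(rundir) / item.name)
    return rest / newest

print(latest_restart_name(["0001-01-01-00000", "0002-01-01-00000", "0001-07-01-00000"]))
```

Copying rather than moving leaves the archived restart set intact in case it is needed again.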

Occasionally, when a run has problems restarting, it is because the pointer and restart files are out of sync. The pointer files are text files that can be edited to match the correct dates of the restart and history files. All of the restart files should have the same date.

5.6. Archiving model output data

The output data flow from a successful run depends on whether or not short-term archiving is enabled, as it is by default.

5.6.1. No archiving

If no short-term archiving is performed, model output data remain in the run directory as specified by $RUNDIR.

5.6.2. Short-term archiving

If short-term archiving is enabled, component output files are moved to the short-term archiving area on local disk, as specified by $DOUT_S_ROOT. The directory normally is $EXEROOT/…/…/archive/$CASE and contains subdirectories such as logs/ and rest/, which are described below.

The logs/ subdirectory contains component log files that were created during the run. Log files are also copied to the short-term archiving directory and therefore are available for long-term archiving.

The rest/ subdirectory contains a set of subdirectories, each of which holds a consistent set of restart files, initial files and rpointer files. Each subdirectory has a unique name corresponding to the model year, month, day and seconds into the day when the files were created. The contents of any restart directory can be used to create a branch run or a hybrid run or to back up to a previous restart date.

5.6.3. Long-term archiving

Users may choose to follow their institution’s preferred method for long-term archiving of model output. Previous releases of CESM provided an external long-term archiver tool that supported mass tape storage and HPSS systems. However, with the industry migration away from tape archives, it is no longer feasible for CIME to support all the possible archival schemes available.

5.7. Data Assimilation and other External Processing

CIME provides a capability to run a task on the compute nodes either before or after the model run. CIME also provides a data assimilation capability that alternates between running the model and running a user-defined task for a user-specified number of cycles.

5.7.1. Pre and Post run scripts

Variables PRERUN_SCRIPT and POSTRUN_SCRIPT can each be used to name a script which should be executed immediately prior to starting or following completion of the CESM executable within the batch environment. The script is expected to be found in the case directory and will receive one argument, which is the full path to that directory. If the script is written in python and contains a subroutine with the same name as the script, it will be called as a subroutine rather than as an external shell script.

5.7.2. Data Assimilation scripts

Variables DATA_ASSIMILATION, DATA_ASSIMILATION_SCRIPT, and DATA_ASSIMILATION_CYCLES may also be used to externally control model evolution. If DATA_ASSIMILATION is true, then after the model completes, the DATA_ASSIMILATION_SCRIPT will be run and then the model will be started again, DATA_ASSIMILATION_CYCLES times. The script is expected to be found in the case directory and will receive two arguments, the full path to that directory and the cycle number. If the script is written in python and contains a subroutine with the same name as the script, it will be called as a subroutine rather than as an external shell script.

A simple example pre-run script:

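#!/usr/bin/env python3
# Example pre-run script. Save it as myprerun.py in the case directory so the
# function name matches the script name and it can be called as a subroutine.
import sys
from CIME.case import Case

def myprerun(caseroot):
    # Open the case and report its run directory.
    with Case(caseroot) as case:
        print("rundir is ", case.get_value("RUNDIR"))

if __name__ == "__main__":
    # When run as an external script, the case directory is the only argument.
    caseroot = sys.argv[1]
    myprerun(caseroot)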
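
By analogy with the pre-run script above, a data assimilation script receives two arguments: the case directory and the cycle number. The following is a minimal sketch only. The file name my_da_script.py and the function body are hypothetical, and whether the cycle number arrives as a string or an integer when called as a subroutine is an assumption, so the sketch converts it explicitly.

```python
#!/usr/bin/env python3
import sys

def my_da_script(caseroot, cycle):
    """Hypothetical assimilation step; the function name matches my_da_script.py."""
    # int() accepts both "2" (command line) and 2 (direct call).
    message = f"data assimilation cycle {int(cycle)} for case in {caseroot}"
    print(message)
    return message

# Run as an external script only when both arguments are present.
if __name__ == "__main__" and len(sys.argv) == 3:
    my_da_script(sys.argv[1], sys.argv[2])
```

A real script would replace the print call with the actual processing of the model output between cycles.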