Batch-downloading ERA5 daily data on a Linux system

Lwei_bear

Last modified 2022-11-12 20:28:56

First published 2022-11-11 18:00:00

Article link: https://blog.csdn.net/LHgwei/article/details/127807676

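The ERA5 website offers a page for downloading daily data one month at a time, but doing this by hand for every month is tedious. This post works out how to automate the batch download on a server.

The ERA5 user manual is linked below. It contains a lot of useful information and is a handy reference to look things up in.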

https://datastore.copernicus-climate.eu/documents/app-c3s-daily-era5-statistics/C3S_Application-Documentation_ERA5-daily-statistics-v2.pdf

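Background: my supervisor gave me access to a server, and I want to download ERA5 daily data on it automatically, looping over many years and months, i.e. batch-download it with a Python script.

1. Preparation

Create your own user folder liuhw and a data folder data inside it; the downloaded dataset will be saved under liuhw/data.

Set up a Python runtime, create your own virtual environment lhw (see my other article), and install the required libraries.

2. Setting up the CDS API

First, create the .cdsapirc file on the server.

Log in to the server. You do not need to be inside your own folder; under home or users is fine. Enter the following command at the prompt, which creates the .cdsapirc file in your HOME directory: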

vim $HOME/.cdsapirc

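Next, copy your own url and key.

The url and key can only be found and copied from the ERA5 website, and only after you have registered and logged in, because they are generated specifically for your account. The url tells the client which API endpoint to contact and the key identifies your account, so together they play roughly the role of a username and password.

They are shown on the "How to use the CDS API | Copernicus Climate Data Store" page, which also has detailed instructions. Copy the url and key from the black box on the right (you must be logged in for your own url and key to appear).

Finally, put your url and key into the .cdsapirc file.

Follow the steps below to write the copied content into the .cdsapirc file, then save and exit (MobaXterm makes copying and pasting easier):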

vi $HOME/.cdsapirc (open the file for editing)

i (enter insert mode)

paste the url and key

Esc (leave insert mode)

:wq (save the content and quit)
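Before going further it can be worth checking that the file really contains both entries. The sketch below assumes the file uses the two-line `url: ...` / `key: ...` layout shown on the CDS page; the helper names `parse_cdsapirc` and `missing_fields` are made up for this illustration and are not part of cdsapi.

```python
from pathlib import Path


def parse_cdsapirc(text):
    """Parse 'name: value' lines (assumed .cdsapirc layout) into a dict."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only, because the url and key contain colons too.
        name, value = line.split(":", 1)
        config[name.strip()] = value.strip()
    return config


def missing_fields(config, required=("url", "key")):
    """Return the required field names that are absent or empty."""
    return [name for name in required if not config.get(name)]


if __name__ == "__main__":
    rc_path = Path.home() / ".cdsapirc"
    if not rc_path.exists():
        print("No file found at", rc_path)
    else:
        missing = missing_fields(parse_cdsapirc(rc_path.read_text(encoding="utf-8")))
        print("Missing fields:", missing if missing else "none")
```

This only checks that the two entries are present; it cannot tell whether the key itself is valid.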

3. Installing the cdsapi package

Activate your lhw environment and install the package with pip:

pip install cdsapi
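To confirm that the package landed in the environment you are actually using, a quick check like the following can help. It is a small sketch using only the standard library; the function name `is_installed` is my own.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be found in this environment."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_installed("cdsapi"):
        print("cdsapi is available")
    else:
        print("cdsapi not found; run 'pip install cdsapi' inside this environment")
```

Run it with the same python interpreter that will later run the download script, otherwise the result says nothing about that environment.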

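4. Writing the Python script

With the steps above done, the preparation is complete. Now for the core part, the script that tells the computer what to download.

(Leave the server aside for now. I opened Spyder, wrote the code below, and saved it as download.py somewhere on my own computer.)

The template below can be reused, but adapt it to your needs. (My earlier attempt with prelev had no effect at all and kept downloading the wrong data. After a lot of painful trial and error it finally works: daily data for any variable, any pressure level, any region and any period can now be downloaded!)

Thanks go to the code examples on the Copernicus website, which prove that the official material really is the best.

Daily data download using the API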

# -*- coding: utf-8 -*-
"""
Created on Fri Nov 11 15:55:04 2022

@author: lhw
"""

import time
import cdsapi
import requests
 
# CDS API script to use CDS service to retrieve daily ERA5* variables and iterate over
# all months in the specified years.
 
# Requires:
# 1) the CDS API to be installed and working on your system
# 2) You have agreed to the ERA5 Licence (via the CDS web page)
# 3) Selection of required variable, daily statistic, etc
 
# Output:
# 1) separate netCDF file for chosen daily statistic/variable for each month
 
# Uncomment years as required
# For valid keywords, see Table 2 of:
# https://datastore.copernicus-climate.eu/documents/app-c3s-daily-era5-statistics/C3S_Application-Documentation_ERA5-daily-statistics-v2.pdf
# select your variable; name must be a valid ERA5 CDS API name.
# Select the required statistic, valid names given in link above
 
 
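def Download(iyear, imonth):
    """Request daily statistics for one month and save them as a netCDF file."""
    # Create the client with a long timeout and many retries, since large requests can take a while.
    c = cdsapi.Client(timeout=600, retry_max=1000, quiet=False, debug=True)
    t000 = time.time()  # start time of this month's download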
    result = c.service(
        "tool.toolbox.orchestrator.workflow",
        params={
            "realm": "user-apps",
            "project": "app-c3s-daily-era5-statistics",
            "version": "master",
            "kwargs": {
                "dataset": "reanalysis-era5-pressure-levels",
                "product_type": "reanalysis",
                "variable": "u_component_of_wind",
                "statistic":"daily_mean",
                "year": iyear,
                "month": imonth,
                "pressure_level": "10",
                "time_zone": "UTC+00:0",
                "frequency": "1-hourly",
                #
                # Users can change the output grid resolution and selected area
                #
                "grid": "1.0/1.0",
                "area":{"lat": [0, 90], "lon": [-180, 180]}
 
            },
            "workflow_name": "application"
        })
 
    # Set the name of the output file for each month (statistic, variable, year, month).
 
    file_name = "daily_mean" + "_" + "u_component_of_wind" + iyear + imonth + ".nc"
 
    location = result[0]['location']
    res = requests.get(location, stream=True)
    res.raise_for_status()  # stop here if the download URL returned an HTTP error
    print("Writing data to " + file_name)
    # The with-block closes the file automatically, so no explicit close() is needed.
    with open(file_name, 'wb') as fh:
        for r in res.iter_content(chunk_size=1024):
            fh.write(r)
    elapsed = time.time() - t000
    print('*** Saved %s, elapsed: %.3f s / %.3f min ***' % (file_name, elapsed, elapsed / 60))
    
    
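years = [str(id1) for id1 in range(1950, 2021)]  # '1950' through '2020'
months = ['%02d' % id2 for id2 in range(1, 13)]  # '01' through '12'
if __name__ == "__main__":
    t0 = time.time()
    # ===================================================================================
    print('***************** Start *********************')
    # Loop over every year and month, downloading one file per month.
    for yr in years:
        for mn in months:
            Download(yr, mn)
    elapsed = time.time() - t0
    print('*********************** Done, elapsed: %.3f s / %.3f min ****************' % (elapsed, elapsed / 60))

The code above loops over all months and downloads the ERA5 daily data files for one specific variable. Remember to adjust it to what you need: the dataset type, the variable name (it must match what the API expects; pick it from the variable names listed for the dataset on the ERA5 website), the grid resolution, the latitude/longitude range, the frequency, and so on.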
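One way to make those adjustments less error-prone is to build the request settings in a single helper instead of editing literals scattered through the script. The sketch below only reuses the keys and values that already appear in the script above; the function names `build_kwargs` and `output_name` are hypothetical, and the returned dict is meant to be passed as the "kwargs" entry of the request.

```python
def build_kwargs(year, month,
                 dataset="reanalysis-era5-pressure-levels",
                 variable="u_component_of_wind",
                 statistic="daily_mean",
                 pressure_level="10",
                 grid="1.0/1.0",
                 lat=(0, 90), lon=(-180, 180)):
    """Return the "kwargs" dict for one month, mirroring the keys used in the script."""
    return {
        "dataset": dataset,
        "product_type": "reanalysis",
        "variable": variable,
        "statistic": statistic,
        "year": year,
        "month": month,
        "pressure_level": pressure_level,
        "time_zone": "UTC+00:0",
        "frequency": "1-hourly",
        "grid": grid,
        "area": {"lat": list(lat), "lon": list(lon)},
    }


def output_name(year, month,
                variable="u_component_of_wind", statistic="daily_mean"):
    """Build the output file name the same way the script does."""
    return statistic + "_" + variable + year + month + ".nc"


if __name__ == "__main__":
    print(build_kwargs("1950", "01", pressure_level="500"))
    print(output_name("1950", "01"))
```

Valid values for each key still have to be checked against the user manual linked at the top; the helper does not validate them.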

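5. Running download.py on the server

Log in to the server again and drag download.py from your computer into the liuhw/data directory on the server.

(Note: the downloaded data is saved wherever download.py is located.)

Activate the lhw virtual environment, type python download.py, and press Enter to start the download!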

Writing this up took some effort. That's a wrap!