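Competition Overview
The competition data consists of historical simulation data from CMIP5/6 models and more than 100 years of historical observation-assimilation data reconstructed by the US SODA (Simple Ocean Data Assimilation) model. Each sample contains the following meteorological and spatio-temporal variables: sea surface temperature anomaly (SST), heat content anomaly (T300), zonal wind anomaly (Ua), and meridional wind anomaly (Va), each with dimensions (year, month, lat, lon). For the training data, the Nino3.4 index of the corresponding months is provided as the label.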
https://tianchi.aliyun.com/competition/entrance/531871/information
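To make the (year, month, lat, lon) layout concrete, here is a minimal NumPy sketch that stacks the four variables into one array with a trailing channel axis. The arrays are synthetic, and the 12 x 24 x 72 sizes are an assumption taken from the baseline's input shape further down, not from the data files themselves. `stack_variables` is a hypothetical helper name.

```python
import numpy as np

# Synthetic stand-ins for the four variables. The sizes are assumptions chosen
# to match the baseline's Input(shape=(12, 24, 72, 4)).
n_years, n_months, n_lat, n_lon = 3, 12, 24, 72
rng = np.random.default_rng(0)
variables = {
    name: rng.standard_normal((n_years, n_months, n_lat, n_lon)).astype(np.float32)
    for name in ("sst", "t300", "ua", "va")
}


def stack_variables(variables, order=("sst", "t300", "ua", "va")):
    """Stack (year, month, lat, lon) arrays into (year, month, lat, lon, channel)."""
    return np.stack([variables[name] for name in order], axis=-1)


features = stack_variables(variables)
print(features.shape)  # (3, 12, 24, 72, 4)
```

With this layout, `features[i]` is one sample of shape (12, 24, 72, 4), and channel 0 is SST.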
Data Processing Notes
Data Description
Archive: enso_round1_train_20210201.zip
    Length  Method        Size  Cmpr  Date        Time   CRC-32    Name
----------  ------  ----------  ----  ----------  -----  --------  ----
         0  Stored           0    0%  2021-01-25  19:12  00000000  enso_round1_train_20210201/
9246636414  Defl:N  4282415484   54%  2021-01-11  12:33  e0da2886  enso_round1_train_20210201/CMIP_train.nc
      6148  Defl:N         178   97%  2021-01-25  18:51  6d88006a  enso_round1_train_20210201/.DS_Store
       726  Defl:N         406   44%  2021-01-25  19:12  d05d941d  enso_round1_train_20210201/readme.txt
 149310874  Defl:N    91379918   39%  2021-01-07  10:20  b99c0d6e  enso_round1_train_20210201/SODA_train.nc
     37536  Defl:N        7164   81%  2021-01-07  10:19  2c600d55  enso_round1_train_20210201/SODA_label.nc
   1364676  Defl:N      277603   80%  2021-01-07  10:19  d040d038  enso_round1_train_20210201/CMIP_label.nc
----------          ----------  ----                               -------
9397356374          4374080753   54%                               7 files
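readme:
CMIP_train.nc: CMIP model data containing sst, t300, ua and va, which are sea temperature, heat content, the east-west component of the surface wind (zonal wind), and the north-south component of the surface wind (meridional wind).
CMIP_label.nc: the corresponding monthly Nino3.4 index labels.
SODA_train.nc: observational data containing sst, t300, ua and va, which are sea temperature, heat content, the east-west component of the surface wind (zonal wind), and the north-south component of the surface wind (meridional wind).
SODA_label.nc: the corresponding monthly Nino3.4 index labels.
External data allowed: No
Pretrained weights allowed: No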
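For readers unfamiliar with the label, the Nino3.4 index is conventionally the SST anomaly averaged over 5°S to 5°N, 170°W to 120°W, usually smoothed with a 3-month running mean. The readme does not say exactly how the organisers computed the label files, so the following is only a sketch of the conventional definition. It assumes a 5-degree grid with 24 latitudes and 72 longitudes and uses an unweighted area mean for simplicity; `nino34_series` is a hypothetical helper.

```python
import numpy as np

# Assumed 5-degree grid: 24 latitudes (55S..60N) and 72 longitudes (0..355E).
lats = np.arange(-55, 61, 5)
lons = np.arange(0, 360, 5)


def nino34_series(sst_anom, lats, lons):
    """Mean SST anomaly over 5S-5N, 170W-120W, then a 3-month running mean.

    sst_anom has shape (time, lat, lon); the result has shape (time - 2,).
    """
    lat_mask = (lats >= -5) & (lats <= 5)
    lon_mask = (lons >= 190) & (lons <= 240)  # 170W..120W in degrees east
    region = sst_anom[:, lat_mask][:, :, lon_mask]
    monthly = region.mean(axis=(1, 2))  # unweighted mean (a simplification)
    return np.convolve(monthly, np.ones(3) / 3.0, mode="valid")


# Synthetic check: a uniform +1 anomaly in the equatorial band gives an index of 1.
sst = np.zeros((6, lats.size, lons.size))
sst[:, (lats >= -5) & (lats <= 5), :] = 1.0
index = nino34_series(sst, lats, lons)
print(index.shape)
```

The running mean shortens the series by two months, which is worth keeping in mind when aligning a recomputed index with the provided labels.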
Competition Baseline
The baseline, quoted here:
import os
import zipfile

import numpy as np
import tensorflow as tf
# Import only the names the script uses instead of wildcard imports.
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.models import Model

def RMSE(y_true, y_pred):
    """Root-mean-square error, used as the training loss."""
    return tf.sqrt(tf.reduce_mean(tf.square(y_true - y_pred)))


def build_model():
    # Input axes: (month, lat, lon, variable).
    inp = Input(shape=(12, 24, 72, 4))
    # Each Dense(1) reduces the last axis to size 1; the reshape then drops it.
    x_4 = Dense(1, activation='relu')(inp)
    x_3 = Dense(1, activation='relu')(tf.reshape(x_4, [-1, 12, 24, 72]))
    x_2 = Dense(1, activation='relu')(tf.reshape(x_3, [-1, 12, 24]))
    x_1 = Dense(1, activation='relu')(tf.reshape(x_2, [-1, 12]))
    x = Dense(64, activation='relu')(x_1)
    x = Dropout(0.25)(x)
    x = Dense(32, activation='relu')(x)
    x = Dropout(0.25)(x)
    # One linear output per predicted month (24 values).
    output = Dense(24, activation='linear')(x)
    model = Model(inputs=inp, outputs=output)
    # `lr` is a deprecated alias; `learning_rate` is the current argument name.
    adam = tf.optimizers.Adam(learning_rate=1e-3, beta_1=0.99, beta_2=0.99)
    model.compile(optimizer=adam, loss=RMSE)
    return model


model = build_model()
model.load_weights('./user_data/model_data/model_mlp_baseline.h5')

test_path = './tcdata/enso_round1_test_20210201/'

### 1. Read the test data
files = os.listdir(test_path)
test_feas_dict = {}
for file in files:
    if not file.endswith('.npy'):
        continue  # skip stray files such as .DS_Store
    test_feas_dict[file] = np.load(os.path.join(test_path, file))

### 2. Predict
test_predicts_dict = {}
for file_name, val in test_feas_dict.items():
    # The model expects a leading batch axis; add one when a file holds a
    # single (12, 24, 72, 4) sample.
    batch = val if val.ndim == 5 else val[np.newaxis, ...]
    test_predicts_dict[file_name] = model.predict(batch).reshape(-1,)

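### 3. Save the predictions
os.makedirs('./result/', exist_ok=True)  # np.save does not create missing directories
for file_name, val in test_predicts_dict.items():
    np.save('./result/' + file_name, val)
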
# Pack the directory into a zip file (stored, not compressed)
def make_zip(source_dir='./result/', output_filename='result.zip'):
    pre_len = len(os.path.dirname(source_dir))
    # ZipFile defaults to ZIP_STORED, so members are written uncompressed.
    with zipfile.ZipFile(output_filename, 'w') as zipf:
        for parent, dirnames, filenames in os.walk(source_dir):
            print(parent, dirnames)
            for filename in filenames:
                if not filename.endswith('.npy'):
                    continue
                pathfile = os.path.join(parent, filename)
                arcname = pathfile[pre_len:].strip(os.path.sep)  # path relative to source_dir
                zipf.write(pathfile, arcname)


make_zip()
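Before submitting, it can help to confirm what actually ended up in the archive. The sketch below uses only the standard library to list the members of a zip file and flag anything that is not a top-level .npy file. Whether the platform strictly requires a flat layout is an assumption here; the check simply mirrors what `make_zip` is written to produce. `list_archive`, `find_problems`, and the demo file names are hypothetical.

```python
import os
import tempfile
import zipfile


def list_archive(zip_path):
    """Return the member names stored in a zip archive, sorted."""
    with zipfile.ZipFile(zip_path) as zf:
        return sorted(zf.namelist())


def find_problems(names):
    """Return members that are not top-level .npy files."""
    return [n for n in names if not n.endswith(".npy") or "/" in n]


# Demo on a throwaway archive so the sketch runs without real predictions.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "result.zip")
    with zipfile.ZipFile(demo_zip, "w") as zf:
        zf.writestr("test_00001_01_12.npy", b"placeholder")  # hypothetical name
        zf.writestr("notes/readme.txt", b"should not be here")
    names = list_archive(demo_zip)
    problems = find_problems(names)

print(names)
print(problems)
```

On a real run, calling `find_problems(list_archive('result.zip'))` after `make_zip()` should return an empty list.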
Docker submission:
# Base Images
## Build from the Tianchi base image
# FROM registry.cn-shanghai.aliyuncs.com/tcc-public/python:3
FROM registry.cn-shanghai.aliyuncs.com/tcc-public/tensorflow:latest-cuda10.0-py3
## Copy the files in the current folder into the image's root directory (there must be a space after the ".", it cannot be followed directly by "/")
ADD . /
## Set the default working directory to the root directory (run.sh and the generated result files must be in this directory for the submission to run)
WORKDIR /
## Install requirements (requirements.txt lists the Python package versions)
## The Tsinghua mirror is used here to speed up installation
RUN pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade pip
RUN pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt
## After the container starts, it always runs: sh run.sh
CMD ["sh", "run.sh"]
For an explanation of these instructions, see "Docker Study Notes (3): Dockerfile Explained".
Commands for uploading to Alibaba Cloud:
docker build -t submarineas/test_for_tianchi_submit:v1.0 .
sudo docker tag e4a98cf5ad64 registry.cn-shenzhen.aliyuncs.com/submarineas/qqq:v1.0
sudo docker push registry.cn-shenzhen.aliyuncs.com/submarineas/qqq:v1.0
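First, run the build. Because the Dockerfile contains only six instructions, the build runs in six steps.
Once a "successful" message appears, the build is complete, and the image shows up in the output of docker images.
Then, following Alibaba Cloud's requirements, re-tag the image with the registry name and push it. After that, it can be submitted.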
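The naming pattern behind the tag and push commands can be summarised in a few lines of Python. This is only a sketch of the pattern `<registry>/<namespace>/<repository>:<tag>` as it appears in the commands above. `registry_image` and `submit_commands` are hypothetical helpers, and the sketch tags by image name rather than by image ID, which is a substitution on my part. It only builds the command lists and does not run Docker.

```python
def registry_image(registry, namespace, repo, tag):
    """Compose a full remote image reference: registry/namespace/repo:tag."""
    return f"{registry}/{namespace}/{repo}:{tag}"


def submit_commands(local_image, remote_image):
    """Return the build, tag, and push commands in the order they are run."""
    return [
        ["docker", "build", "-t", local_image, "."],
        ["docker", "tag", local_image, remote_image],
        ["docker", "push", remote_image],
    ]


remote = registry_image(
    "registry.cn-shenzhen.aliyuncs.com", "submarineas", "qqq", "v1.0"
)
commands = submit_commands("submarineas/test_for_tianchi_submit:v1.0", remote)
for cmd in commands:
    print(" ".join(cmd))
```

Pushing to a private registry normally also requires a prior `docker login` against that registry, a step the original commands do not show.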