A Deep Learning Model for Predicting Fluxes

Contents

0. Preparation

1. Choosing the Factors

2. Network Structure

3. Optimization Method

4. Results



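0. Preparation

Oh, one thing first.

When I looked through the joint observation data, I found a lot of missing measurements, so I decided to use the COARE data instead.

First, load some packages.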


# Data handling
import csv
import os
import numpy as np
import pandas as pd
from scipy.io import loadmat

# PyTorch
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# Plotting
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure

Then load the data:

data = np.loadtxt('coare.txt')

The data looks roughly like this:

 

'''
-------------------------------------------------------------------------------
 0  Date: YYMMDDHHmmss, YY=year, MM=month, DD=day, HH=hour, mm=minute, ss=sec
 1  Us:   ship speed (as described above)
 2  U:    true wind speed at 15-m height
 3  Tru:  true wind direction rel. to N (meteorological convention)
 4  Rel:  relative wind direction
 5  Hed:  the direction the ship's bow is pointing
 6  Ts:   sea surface temp (no cool skin correction)
 7  T:    Vaisala air temperature (about 15 m)
 8  qs:   sea surface specific humidity (g/kg) (no cool skin correction)
 9  q:    Vaisala air specific humidity (about 15 m)
-------------------------------------------------------------------------------
10  Hsc:  covariance sensible heat flux
11  Hsi:  inertial sensible heat flux
12  Hsb:  bulk sensible heat flux
13  Hlc:  covariance latent heat flux
14  Hli:  inertial latent heat flux
15  Hlb:  bulk latent heat flux
16  Tuc:  covariance surface stress (-wu part only)
17  Tui:  inertial-dissipation surface stress
18  Tub:  bulk surface stress
-------------------------------------------------------------------------------
19  Rs:   solar irradiance
20  Rl:   longwave irradiance
21  Rain: precipitation (mm/hr)
22  J:    ship plume/contamination index (0 implies good conditions)
23  Oph:  standard deviation of OPHIR hygrometer clear channel counts (<15 implies
          reasonably clean optics).
24  Tlt:  mean wind vector tilt, degrees (<10 ok covariances)
25  Jm:   ship maneuver/contamination index, m/s (<2 implies good conditions)
-------------------------------------------------------------------------------
26  Ct:   sonic temperature structure function parameter (K^2/m^.667)
27  Cq:   water vapor structure function parameter ((g/m^3)/m^.667)
28  Cu:   streamwise velocity structure function parameter ((m/s)^2/m^.667)
29  Cw:   vertical velocity structure function parameter ((m/s)^2/m^.667)
30  Hr:   sensible heat flux due to precipitation at droplet wet-bulb T
31  To:   OPHIR air temperature
32  Qo:   OPHIR specific humidity
-------------------------------------------------------------------------------
33  Lat:  Latitude
34  Lon:  Longitude
-------------------------------------------------------------------------------
These are the variables in each of the 35 columns.
'''

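I will need to refer to this later, so I put it straight into the code.


Next comes choosing which variables to use for the regression,

and getting the data organized for training later on.

I need to prepare a training set and a test set.

The validation set will be split off from the training set later.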

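# Input columns: U, Ts, qs, q, Rs, Rain, Lat (indices from the column listing above)
feature = [2,6,8,9,19,21,33]
# Same inputs plus the target as the last column: 13 = Hlc (covariance latent heat flux)
feature_for_train = [2,6,8,9,19,21,33,13]
feature_count = np.size(feature)
#feature = list(range(1,10))
#feature.extend([19,20,21,22,23,24,25,33,34])
#feature_for_train = list(range(1,10))
#feature_for_train.extend([19,20,21,22,23,24,25,33,34,13])
# Rows 0-3999 go to training, rows 4000-4805 to testing
data_train1 = data[0:4000,:]
data_test1 = data[4000:4806,:]
hlc = data[:,13]  # covariance latent heat flux
hlb = data[:,15]  # bulk latent heat flux

data_train = data_train1[:,feature_for_train]
data_test = data_test1[:,feature]
data_train = pd.DataFrame(data_train)
data_test = pd.DataFrame(data_test)
data_train.to_csv('train.csv',index=False)
data_test.to_csv('test.csv',index=False)
# Save as CSV after all. Start by regressing latent heat.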
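
The column listing above carries its own quality flags with suggested thresholds (J = 0, Tlt < 10, Jm < 2). Here is a minimal sketch of screening rows with those flags before splitting. This is an assumption on my part about a reasonable preprocessing step, not something done in the code above; `quality_mask` is a hypothetical helper name and the array is a toy stand-in, not COARE data.

```python
import numpy as np

# Column indices taken from the variable listing above
COL_J, COL_TLT, COL_JM = 22, 24, 25

def quality_mask(data):
    """Boolean mask of rows that pass the thresholds quoted in the listing."""
    good_plume = data[:, COL_J] == 0         # 0 implies good conditions
    good_tilt = data[:, COL_TLT] < 10        # <10 degrees is ok for covariances
    good_maneuver = data[:, COL_JM] < 2      # <2 m/s implies good conditions
    finite = np.isfinite(data).all(axis=1)   # drop rows containing NaN or inf
    return good_plume & good_tilt & good_maneuver & finite

# Toy stand-in for the 35-column array (not real data)
toy = np.zeros((4, 35))
toy[1, COL_J] = 1       # flagged by the plume index
toy[2, COL_TLT] = 12    # tilt too large
toy[3, COL_JM] = 0.5    # still within the maneuver threshold

mask = quality_mask(toy)
clean = toy[mask]
print(mask, clean.shape)
```

With the real array, one would apply `data = data[quality_mask(data)]` before slicing into training and test rows, keeping in mind that the row counts 4000 and 4806 would then no longer match.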

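I chose latent heat because it is the flux I find most interesting; the other two have already been handled very well by other people, so there is not much left to do there.

I will still model them later, though.

I just like latent heat more.

Next comes debugging the deep learning code. I have half-finished code from a course I took earlier, so I do not have to type everything from scratch, which makes things easier.

Below I will show only the key parts.

I plan to cover:

1. Choosing the Factors

2. Network Structure

3. Optimization Method

4. Results

These are also the most critical, most central points in deep learning.

1. Choosing the Factors

First, choosing the parameters.

We have to focus on what matters: the biggest mistake is to compute unimportant variables in great detail while dropping the important parameters. That is roughly what Landau said.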
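
One crude way to check which candidate columns matter is to rank them by the absolute Pearson correlation with the target. The sketch below assumes that kind of linear screening is acceptable as a first pass; it will miss nonlinear relationships. `rank_by_correlation` is a hypothetical helper and the data are synthetic.

```python
import numpy as np

def rank_by_correlation(x, y):
    """Return (column order, |r| per column), strongest correlation first."""
    r = np.array([np.corrcoef(x[:, j], y)[0, 1] for j in range(x.shape[1])])
    order = np.argsort(-np.abs(r))
    return order, np.abs(r)

# Synthetic example: column 0 dominates, column 2 matters less, column 1 is noise
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 3))
y = 3.0 * x[:, 0] - 1.5 * x[:, 2] + rng.normal(scale=0.1, size=500)

order, abs_r = rank_by_correlation(x, y)
print(order, abs_r.round(2))
```

On the real data, `x` would be the candidate columns of `data` and `y` would be column 13 (Hlc).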

feature = [2,6,8,9,19,21,33]
feature_for_train = [2,6,8,9,19,21,33,13]

feature_count = np.size(feature)

class myDataset(Dataset):

    def __init__(self,
                 path,
                 mode='train',
                 target_only=False):
        self.mode = mode

        # Read data into numpy arrays
        with open(path, 'r') as fp:
            data = list(csv.reader(fp))
            data = np.array(data[1:])[:, :].astype(float)
        
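        # Both branches currently select the same columns; target_only is a
        # leftover hook for choosing a hand-picked subset of features.
        if not target_only:
            feats = list(range(feature_count))
        else:
            feats = list(range(feature_count))  # TODO: pick a subset of features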

        if mode == 'test':
            # Testing data    
            data = data[:, feats]
            self.data = torch.FloatTensor(data)
        else:
            # Training data (train/dev sets)

            target = data[:, -1]
            data = data[:, feats]
            
            # Splitting training data into train & dev sets
            if mode == 'train':
                indices = [i for i in range(len(data)) if i % 10 != 0]
            elif mode == 'dev':
                indices = [i for i in range(len(data)) if i % 10 == 0]
            
            # Convert data into PyTorch tensors
            self.data = torch.FloatTensor(data[indices])
            self.target = torch.FloatTensor(target[indices])

        # Normalize features (you may remove this part to see what will happen)
        # Each split is standardized with its own per-column mean and std.
        mean = self.data.mean(dim=0, keepdim=True)
        std = self.data.std(dim=0, keepdim=True)
        self.data = (self.data - mean) / std

        self.dim = self.data.shape[1]

        print('Finished reading the {} set of COARE Dataset ({} samples found, each dim = {})'
              .format(mode, len(self.data), self.dim))

    def __getitem__(self, index):
        # Returns one sample at a time
        if self.mode in ['train', 'dev']:
            # For training
            return self.data[index], self.target[index]
        else:
            # For testing (no target)
            return self.data[index]

    def __len__(self):
        # Returns the size of the dataset
        return len(self.data)

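Here I wrote a class to handle the data.

For choosing factors, what really matters is the two lines at the top; the rest hardly needs to be touched.

2. Network Structure

The network structure is the core of the model. All later tuning modifies details within this framework, so the structure matters a great deal.

Which structure a given problem needs has to be decided case by case, though, and I cannot explain that clearly.

For this problem,

this is what I have so far, though I am still tuning it.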
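
One detail worth noting in the class: each split is standardized with its own mean and standard deviation. A common alternative, sketched here as an assumption and not what the class does, is to compute the statistics on the training split only and reuse them for dev and test, with a guard for constant columns (a column with zero spread would otherwise produce NaN). `fit_scaler` and `apply_scaler` are hypothetical helper names.

```python
import numpy as np

def fit_scaler(x_train, eps=1e-8):
    """Per-column mean and std from the training split only."""
    mean = x_train.mean(axis=0, keepdims=True)
    std = x_train.std(axis=0, ddof=1, keepdims=True)  # ddof=1 matches torch's default std
    std = np.where(std < eps, 1.0, std)               # avoid dividing by zero
    return mean, std

def apply_scaler(x, mean, std):
    """Standardize any split with statistics fitted elsewhere."""
    return (x - mean) / std

# Synthetic example with one constant column
rng = np.random.default_rng(0)
x_train = rng.normal(5.0, 2.0, size=(100, 3))
x_test = rng.normal(5.0, 2.0, size=(20, 3))
x_train[:, 2] = 0.0
x_test[:, 2] = 0.0

mean, std = fit_scaler(x_train)
z_train = apply_scaler(x_train, mean, std)
z_test = apply_scaler(x_test, mean, std)
print(z_train.mean(axis=0).round(6), np.isfinite(z_test).all())
```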

class NeuralNet(nn.Module):
    ''' A simple fully-connected deep neural network '''
    def __init__(self, input_dim):
        super(NeuralNet, self).__init__()

        # Define your neural network here
        # TODO: How to modify this model to achieve better performance?
        self.net = nn.Sequential(
            nn.Linear(input_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1)
        )

        # Mean squared error loss
        self.criterion = nn.MSELoss(reduction='mean')

    def forward(self, x):
        ''' Given input of size (batch_size x input_dim), compute output of the network '''
        return self.net(x).squeeze(1)

    def cal_loss(self, pred, target):
        ''' Calculate loss '''
        # TODO: you may implement L2 regularization here
        return self.criterion(pred, target)

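The details need more study and more reading of the literature.

3. Optimization Method

There are many optimization methods; the ones that come to mind right away are SGD and Adam.

I tried both.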
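
For experimenting with the structure, it can help to build the layers from a list of widths instead of editing the class each time. This is a minimal sketch under my own assumptions: `build_mlp`, the widths (64, 32), and the dropout rate are illustrative choices, not settings used above.

```python
import torch
import torch.nn as nn

def build_mlp(input_dim, hidden_dims=(64, 32), dropout=0.1):
    """Fully-connected regressor: Linear -> ReLU -> Dropout per hidden layer."""
    layers = []
    prev = input_dim
    for width in hidden_dims:
        layers += [nn.Linear(prev, width), nn.ReLU(), nn.Dropout(dropout)]
        prev = width
    layers.append(nn.Linear(prev, 1))  # single output: the predicted flux
    return nn.Sequential(*layers)

net = build_mlp(input_dim=7)           # 7 matches the number of selected features
x = torch.randn(5, 7)
out = net(x).squeeze(1)                # shape (batch_size,)
n_params = sum(p.numel() for p in net.parameters())
print(out.shape, n_params)
```

The returned module could replace `self.net` in `NeuralNet` without changing `forward` or `cal_loss`.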

config = {
    'n_epochs': 3000,                # maximum number of epochs
    'batch_size': 500,               # mini-batch size for dataloader
    'optimizer': 'Adam',             # optimization algorithm (optimizer in torch.optim)
    'optim_hparas': {                # hyper-parameters for the optimizer (depends on which optimizer you are using)
        # 'lr': 0.001,               # learning rate of SGD
        # 'momentum': 0.09           # momentum for SGD
    },
    'early_stop': 10000,             # early stopping epochs (the number epochs since your model's last improvement)
}

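My bigger problem right now is model bias,

so I think this is good enough on the optimization side.

4. Results

Then I trained the model, got some results, kept adjusting, and now have results that are worth looking at.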
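
To show how a config like the one above is typically consumed, here is a minimal sketch: the optimizer class is looked up by name in `torch.optim`, and training stops once the dev loss has not improved for `early_stop` epochs. Note that with `early_stop` at 10000 and `n_epochs` at 3000, as in the config above, early stopping can never trigger. The data, the smaller numbers, and the full-batch updates below are my assumptions for a quick toy run, not the training code used for the results.

```python
import torch
import torch.nn as nn

config = {
    'n_epochs': 200,
    'optimizer': 'Adam',              # any optimizer name found in torch.optim
    'optim_hparas': {'lr': 1e-2},
    'early_stop': 20,
}

# Toy regression data (not COARE): 7 inputs, linear target
torch.manual_seed(0)
x = torch.randn(200, 7)
y = x @ torch.arange(1.0, 8.0)
x_tr, y_tr, x_dev, y_dev = x[:160], y[:160], x[160:], y[160:]

model = nn.Sequential(nn.Linear(7, 64), nn.ReLU(), nn.Linear(64, 1))
criterion = nn.MSELoss(reduction='mean')
optimizer = getattr(torch.optim, config['optimizer'])(
    model.parameters(), **config['optim_hparas'])

best_dev, first_dev, stale, epochs_run = float('inf'), None, 0, 0
for epoch in range(config['n_epochs']):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(x_tr).squeeze(1), y_tr)
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        dev_loss = criterion(model(x_dev).squeeze(1), y_dev).item()
    if first_dev is None:
        first_dev = dev_loss
    epochs_run = epoch + 1

    if dev_loss < best_dev:
        best_dev, stale = dev_loss, 0    # improvement: reset the counter
    else:
        stale += 1
        if stale >= config['early_stop']:
            break                        # no improvement for too long

print(epochs_run, first_dev, best_dev)
```

Switching to SGD would only require `'optimizer': 'SGD'` together with `'lr'` (and optionally `'momentum'`) in `optim_hparas`.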

 

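As you can see, the predictions are quite good, and it took me only one day to deal with a problem that other scientists worked on for many years. Of course, that is an unfair comparison, because times have changed.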
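
To put a number on "quite good", one option is to compute RMSE, mean bias, and correlation between the predicted and observed covariance latent heat flux, and to compute the same scores for the bulk estimate (column 15, Hlb) as a baseline. The sketch below uses made-up numbers purely to show the calculation; `flux_scores` is a hypothetical helper and none of the values are results from this post.

```python
import numpy as np

def flux_scores(pred, obs):
    """RMSE, mean bias, and Pearson correlation of predictions against observations."""
    err = pred - obs
    return {
        'rmse': float(np.sqrt(np.mean(err ** 2))),
        'bias': float(np.mean(err)),
        'r': float(np.corrcoef(pred, obs)[0, 1]),
    }

# Made-up values for illustration only
obs = np.array([100.0, 120.0, 80.0, 150.0])
pred = obs + np.array([5.0, -5.0, 5.0, -5.0])

scores = flux_scores(pred, obs)
print(scores)
```

On the real test rows, `obs` would be `hlc[4000:4806]`, and calling `flux_scores(hlb[4000:4806], obs)` would give the bulk baseline to compare against.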
