AI in Practice: Building a seq2seq Model with Attention for Numerical Prediction

seq2seq framework diagram

(figure omitted)

Environment

  • Linux
  • Python 3.6
  • tensorflow.keras

Building the Model: Source Code and Notes

  • Dependencies
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Dense, Attention, GRU, Dropout
from tensorflow.keras.models import Model
  • Encoder
    A GRU is used as the feature extractor (it suits relatively small datasets; with a large dataset, an LSTM is recommended instead).
class Encoder(keras.Model):
    def __init__(self, hidden_units):
        super(Encoder, self).__init__()
        # Encoder GRU layer
        self.encoder_gru = GRU(hidden_units, return_sequences=True, return_state=True, name="encode_gru")
        self.dropout = Dropout(rate=0.5)

    def call(self, inputs):
        encoder_outputs, state_h = self.encoder_gru(inputs)
        encoder_outputs = self.dropout(encoder_outputs)
        return encoder_outputs, state_h
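For intuition, the recurrence the GRU layer applies at each timestep can be sketched in plain numpy. The weights here are random stand-ins, not the layer's actual parameters, and the sketch omits biases and Keras's `reset_after` detail:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU recurrence step: x is the input at time t, h the previous state."""
    z = sigmoid(x @ Wz + h @ Uz)               # update gate
    r = sigmoid(x @ Wr + h @ Ur)               # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate state
    return (1 - z) * h + z * h_tilde           # new hidden state

# Toy dimensions matching the model: 21 input features, 50 hidden units
rng = np.random.default_rng(0)
input_dim, units = 21, 50
x = rng.normal(size=(1, input_dim))
h = np.zeros((1, units))
Wz, Wr, Wh = (rng.normal(size=(input_dim, units)) for _ in range(3))
Uz, Ur, Uh = (rng.normal(size=(units, units)) for _ in range(3))
h_next = gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh)
print(h_next.shape)  # (1, 50)
```

With `return_sequences=True` the layer stacks one such state per timestep, which is what the attention layer later attends over.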
  • Decoder
    The decoder adds an attention mechanism on top of the GRU.
class Decoder(keras.Model):
    def __init__(self, hidden_units):
        super(Decoder, self).__init__()
        # Decoder GRU layer
        self.decoder_gru = GRU(hidden_units, return_sequences=True, return_state=True, name="decode_gru")
        # Attention layer (dot-product attention over the encoder outputs)
        self.attention = Attention()
        self.dropout = Dropout(rate=0.5)

    def call(self, enc_outputs, dec_inputs, states_inputs):
        dec_outputs, dec_state_h = self.decoder_gru(dec_inputs, initial_state=states_inputs)
        dec_outputs = self.dropout(dec_outputs)
        attention_output = self.attention([dec_outputs, enc_outputs])
        return attention_output, dec_state_h
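`Attention()` with `[query, value]` inputs is Luong-style dot-product attention: it scores each decoder step against every encoder step, softmaxes over the encoder axis, and returns a weighted sum of encoder outputs. A minimal numpy sketch of that computation for a single sequence (batch dimension dropped):

```python
import numpy as np

def dot_product_attention(query, value):
    """query: (dec_len, units), value: (enc_len, units).
    Mirrors keras.layers.Attention([query, value]) for one sequence."""
    scores = query @ value.T                        # (dec_len, enc_len)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over encoder steps
    context = weights @ value                       # (dec_len, units)
    return context, weights

rng = np.random.default_rng(1)
dec_out = rng.normal(size=(24, 50))   # decoder GRU outputs
enc_out = rng.normal(size=(72, 50))   # encoder GRU outputs
context, w = dot_product_attention(dec_out, enc_out)
print(context.shape, w.shape)  # (24, 50) (24, 72)
```

Each row of `w` sums to 1, i.e. every decoder step distributes its attention across the 72 encoder steps.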
        
  • Building the seq2seq model with attention
    The loss can be one of tf's built-in losses or a custom function; pick either of the following:

def mae(y_true, y_pred):
    return K.mean(K.abs(y_pred - y_true))

loss_fn = tf.keras.losses.MeanAbsoluteError()  # built-in
loss_fn = mae                                  # or the custom function above
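Both choices compute the same mean absolute error. A quick sanity check of the custom formula, with numpy standing in for the Keras backend `K`:

```python
import numpy as np

def mae_np(y_true, y_pred):
    # Same formula as the custom Keras mae above, in numpy
    return np.mean(np.abs(y_pred - y_true))

y_true = np.array([0.0, 1.0, 2.0])
y_pred = np.array([0.5, 1.0, 3.0])
print(mae_np(y_true, y_pred))  # 0.5  (mean of [0.5, 0.0, 1.0])
```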

The full code:

def seq2seq_attention(encode_shape, decode_shape, hidden_units, output_dim):
    """
    seq2seq model with attention
    """
    # Input layers
    encoder_inputs = Input(shape=encode_shape, name="encode_input")
    decoder_inputs = Input(shape=decode_shape, name="decode_input")

    # Encoder
    encoder = Encoder(hidden_units)
    enc_outputs, enc_state_h = encoder(encoder_inputs)
    dec_states_inputs = enc_state_h

    # Decoder
    decoder = Decoder(hidden_units)
    attention_output, dec_state_h = decoder(enc_outputs, decoder_inputs, dec_states_inputs)

    # Output dense layer
    dense_outputs = Dense(output_dim, activation='sigmoid', name="dense")(Dropout(rate=0.5)(attention_output))

    # seq2seq model
    model = Model(inputs=[encoder_inputs, decoder_inputs], outputs=dense_outputs)
    model.summary()

    opt = keras.optimizers.Adam(learning_rate=0.0005)
    #loss_fn = tf.keras.losses.MeanAbsolutePercentageError()
    #loss_fn = tf.keras.losses.MeanAbsoluteError()
    loss_fn = mae
    model.compile(loss=loss_fn, optimizer=opt)

    return model
  • Testing the model
if __name__ == '__main__':
    seq2seq_attention((72, 21), (24, 20), 50, 1)
  • Finally
    Generate the data X1, X2, Y according to your own task; they are the encoder input, the decoder input, and the target output, respectively.
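As a concrete, hypothetical example of shaping such data for the shapes used above (72 encoder steps of 21 features, 24 decoder steps of 20 features, one target per decoder step), a sliding-window split over a multivariate series might look like this; the column layout is an assumption to adapt to your own task:

```python
import numpy as np

def make_windows(series, enc_len=72, dec_len=24):
    """series: (T, 21) array; column 0 is the target, columns 1: are covariates.
    Hypothetical windowing -- adapt the split to your own data."""
    X1, X2, Y = [], [], []
    for start in range(len(series) - enc_len - dec_len + 1):
        enc = series[start:start + enc_len]                      # (72, 21)
        dec = series[start + enc_len:start + enc_len + dec_len]  # (24, 21)
        X1.append(enc)
        X2.append(dec[:, 1:])   # decoder input: the 20 covariates only
        Y.append(dec[:, :1])    # target values: (24, 1)
    return np.array(X1), np.array(X2), np.array(Y)

series = np.random.default_rng(2).normal(size=(200, 21))
X1, X2, Y = make_windows(series)
print(X1.shape, X2.shape, Y.shape)  # (105, 72, 21) (105, 24, 20) (105, 24, 1)
```

Note that with a sigmoid output layer the targets should be scaled into (0, 1) before training.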

  • Training function

def train(train_data_path, test_data_path, model_path):
    batch_size = 512
    epochs = 1000

    # Prepare training / validation data
    X1, X2, Y = create_dataset(train_data_path)
    train_data, eval_data, y_train, y_eval = split_data(X1, X2, Y, test_size=0.2, shuffle=True)

    # Build the model (shapes must match the data produced by create_dataset)
    model = seq2seq_attention(encode_shape=(72, 21), decode_shape=(24, 20), hidden_units=50, output_dim=1)

    # Train with early stopping and best-checkpoint saving
    callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20)
    checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
                                filepath=model_path,
                                save_weights_only=True,
                                monitor='val_loss',
                                mode='min',
                                save_best_only=True)
    model.fit(x=train_data,
              y=y_train,
              batch_size=batch_size,
              epochs=epochs,
              callbacks=[callback, checkpoint_callback],
              verbose=2,
              shuffle=True,
              validation_data=(eval_data, y_eval))

    # Evaluate on the test set
    X1, X2, Y = create_dataset(test_data_path)
    test_data = [X1, X2]
    y_test = Y
    scores = model.evaluate(test_data, y_test, verbose=0)
    print(model.metrics_names, scores)
	