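Multi-dimensional data augmentation with TimeGAN (Python; mine gas concentration dataset)

The experimental data come from working face 15103 of a mine in Shanxi and span 16 September 2020 to 31 December 2021. They cover 8 variables: gas concentration, CO concentration, temperature, wind speed, working-face dust, gas flow, negative pressure, and in-pipe gas concentration. The sampling interval is 1 day.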

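1. Dataset introduction

Gas concentration is the prediction target; the other columns are feature columns. The raw data contain only 472 rows, which is a small sample, so all 8 columns of the raw data are augmented.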
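
Before a table like this can be fed to a sequence model such as TimeGAN, it is usually scaled and cut into fixed-length windows. The following is a minimal sketch of that step, not the code from gan_code.py: the helper name `make_windows`, the window length of 24, and the random stand-in table are all assumptions made for illustration.

```python
import numpy as np

def make_windows(data: np.ndarray, seq_len: int = 24) -> np.ndarray:
    """Min-max scale each column to [0, 1], then cut overlapping windows."""
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # guard against constant columns
    scaled = (data - col_min) / col_range
    # One window per possible start index, each of shape (seq_len, n_features)
    windows = [scaled[i:i + seq_len] for i in range(len(scaled) - seq_len + 1)]
    return np.stack(windows)

# Stand-in for the real table: 472 daily rows x 8 variables (random values, NOT the mine data)
rng = np.random.default_rng(0)
raw = rng.random((472, 8))
windows = make_windows(raw, seq_len=24)
print(windows.shape)  # (449, 24, 8)
```

With 472 rows and a window length of 24, sliding one step at a time yields 472 - 24 + 1 = 449 windows.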

Screenshot of the first rows of the data

Screenshot of the last rows of the data

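2. Folder contents

lstm.py trains and tests on the data without augmentation.

gan_code.py is the data augmentation script.

gan_data1.npy stores the synthetic data produced by the augmentation.

gan_lstm.py uses the augmented data together with part of the original data as the training set, then evaluates on the test set.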
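
The idea behind gan_lstm.py, training on real plus synthetic data while testing on real data only, can be sketched as follows. This is an illustration under assumptions, not the project's code: `chronological_split` and `add_synthetic` are hypothetical helpers, the arrays are random stand-ins, and the actual shape of gan_data1.npy is not stated in this post.

```python
import numpy as np

def chronological_split(windows, test_fraction):
    """Keep time order: earliest windows for training, latest for testing."""
    n_test = int(len(windows) * test_fraction)
    split = len(windows) - n_test
    return windows[:split], windows[split:]

def add_synthetic(real_train, synthetic, seed=0):
    """Append synthetic windows to the real training windows and shuffle."""
    combined = np.concatenate([real_train, synthetic], axis=0)
    order = np.random.default_rng(seed).permutation(len(combined))
    return combined[order]

rng = np.random.default_rng(0)
real = rng.random((400, 24, 8))        # stand-in for windows cut from the real data
synthetic = rng.random((1000, 24, 8))  # stand-in for np.load("gan_data1.npy"); shape assumed
train_real, test = chronological_split(real, test_fraction=0.5)
train_all = add_synthetic(train_real, synthetic)
print(train_real.shape, train_all.shape, test.shape)
```

Only the training set is enlarged here. The test set stays purely real, so the reported errors measure performance on real data.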

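Video of the augmentation program running (to keep the video short, the number of training steps in the video is set to 2):

"Multi-dimensional data augmentation based on TimeGAN" (Bilibili)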

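The augmentation program was run for 5000 train_steps, i.e. 5000 training iterations. The augmented data and the real data were then visualized with PCA and t-SNE. The plots show that the synthetic data follow a feature distribution close to that of the original data, which indicates that the augmentation works well.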
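
A PCA comparison of this kind can be sketched with NumPy alone. The version below assumes one common convention: each window is first averaged over its feature axis, PCA is fitted on the real windows only, and both sets are projected onto the same two axes. `pca_project` is a hypothetical helper and the arrays are random stand-ins, so this is not the plotting code used in the project.

```python
import numpy as np

def pca_project(real, synthetic, n_components=2):
    """Fit PCA on the real windows and project both sets onto the same axes."""
    # Collapse (n, seq_len, n_features) to (n, seq_len) by averaging over features
    real_flat = real.mean(axis=2)
    synth_flat = synthetic.mean(axis=2)
    mean = real_flat.mean(axis=0)
    # Rows of vt are the principal directions of the centred real data
    _, _, vt = np.linalg.svd(real_flat - mean, full_matrices=False)
    components = vt[:n_components]
    return (real_flat - mean) @ components.T, (synth_flat - mean) @ components.T

rng = np.random.default_rng(0)
real = rng.random((300, 24, 8))       # stand-in for real windows
synthetic = rng.random((300, 24, 8))  # stand-in for generated windows
real_2d, synth_2d = pca_project(real, synthetic)
print(real_2d.shape, synth_2d.shape)
```

Plotting `real_2d` and `synth_2d` as two scatter layers shows whether the point clouds overlap. t-SNE has no separate transform step, so a t-SNE view would instead be fitted once on the concatenated real and synthetic rows, for example with scikit-learn's `TSNE.fit_transform`.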

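For each of the 8 columns, 24 consecutive points were selected at random, and the real values are compared with the generated data:

3. Prediction with LSTM

The last half of the original dataset (236 rows, i.e. 236 samples) is used as the test set, and the first half alone is used as the training set. After training, the model performs on the test set as follows: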

RMSE: 0.18512594640992197

MAE: 0.11461186704684799

MSE: 0.0342716160341693
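
The three metrics are closely related: MSE is the mean of the squared errors, RMSE is its square root, and MAE is the mean of the absolute errors. A minimal sketch of the computation is shown below. `regression_metrics` is a hypothetical helper written for this illustration and the input values are made up; it is not taken from lstm.py.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and MSE for two equally long sequences."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    mse = float(np.mean(err ** 2))
    return {
        "RMSE": mse ** 0.5,
        "MAE": float(np.mean(np.abs(err))),
        "MSE": mse,
    }

# Toy values for illustration only
metrics = regression_metrics([0.2, 0.4, 0.6], [0.1, 0.4, 0.8])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The figures reported above are consistent with this relationship: squaring the RMSE of 0.18512594640992197 gives the reported MSE of 0.0342716160341693.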

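With the same test set (the last half of the original dataset, 236 rows, i.e. 236 samples), the first half and the augmented data together form the training set. After training, the model performs on the test set as follows: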

RMSE: 0.1454103476829004

MAE: 0.05941093629294589

MSE: 0.02114416921326198

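The last 80% of the original dataset (378 rows, i.e. 378 samples) is used as the test set, and the first 20% (95 samples) alone is used as the training set. After training, the model performs on the test set as follows: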

RMSE: 0.18795273726595538

MAE: 0.12856747175336741

MSE: 0.03532623144576525

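With the same test set (the last 80% of the original dataset, 378 rows, i.e. 378 samples), the first 20% (95 samples) and the augmented data together form the training set. After training, the model performs on the test set as follows: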

RMSE: 0.13263138712145212

MAE: 0.07024818880211464

MSE: 0.017591084849760494
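
To put the four experiments side by side, the relative RMSE reduction from adding synthetic data can be computed directly from the values reported above. The dictionary layout and the helper `relative_reduction` are assumptions made for this illustration; the numbers are copied from the results in this section.

```python
# RMSE values copied from the four experiments reported in this section
results = {
    "50% train / 50% test": {"real only": 0.18512594640992197,
                             "real + synthetic": 0.1454103476829004},
    "20% train / 80% test": {"real only": 0.18795273726595538,
                             "real + synthetic": 0.13263138712145212},
}

def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the error dropped, relative to the baseline."""
    return (before - after) / before

reductions = {
    name: relative_reduction(r["real only"], r["real + synthetic"])
    for name, r in results.items()
}
for name, value in reductions.items():
    print(f"{name}: RMSE reduced by {value:.1%}")
```

On these numbers the RMSE drops by roughly 21% with half of the data used for training and by roughly 29% with only 20% used for training, so the gain from augmentation is larger when less real training data is available.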

If you are interested in the project, see the last line of the snippet below.

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

from tensorflow import function, GradientTape, sqrt, abs, reduce_mean, ones_like, zeros_like, convert_to_tensor,float32
from tensorflow import data as tfdata
from tensorflow import config as tfconfig

# Archive with code and dataset: https://mbd.pub/o/bread/mbd-ZpaXmJdv

If you only need the dataset, see the last line below.

# Dataset: https://mbd.pub/o/bread/ZZ6Zlp9r

Time series GAN (Generative Adversarial Network) is a type of GAN that is specifically designed for generating time series data. Time series data is a type of data that is collected over a period of time, such as stock prices, weather data, or medical data. Time series GANs use a combination of deep learning techniques and generative models to learn the patterns and relationships in time series data, and then generate new time series data that is similar to the original data.

The basic architecture of a time series GAN consists of two networks: a generator network and a discriminator network. The generator network takes random noise as input and generates a time series. The discriminator network takes both real and generated time series data as input and tries to distinguish between them. The two networks are trained together in an adversarial manner, with the generator network trying to fool the discriminator network into thinking that its generated time series data is real, and the discriminator network trying to correctly identify the real time series data from the generated data.

One of the challenges in developing time series GANs is ensuring that the generated data is realistic and retains the structure and patterns of the original data. This can be achieved by incorporating additional constraints and objectives into the training process, such as ensuring that the generated data has similar statistical properties and autocorrelation to the original data.

Time series GANs have applications in a variety of fields, including finance, climate modeling, and healthcare.
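
One simple way to check whether generated series keep the temporal structure of the originals is to compare their autocorrelation at a few lags. The sketch below is an illustrative diagnostic only, not a training objective from any particular library: `autocorrelation` is a hypothetical helper and the three series are artificial stand-ins.

```python
import numpy as np

def autocorrelation(series, lag: int) -> float:
    """Sample autocorrelation of a 1-D series at a positive lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = float(np.dot(x, x))
    if denom == 0.0:
        return 0.0  # constant series: define the autocorrelation as 0
    return float(np.dot(x[:-lag], x[lag:]) / denom)

rng = np.random.default_rng(0)
t = np.linspace(0, 8 * np.pi, 500)
real_like = np.sin(t)                                  # smooth stand-in for a real series
good_fake = np.sin(t) + 0.1 * rng.standard_normal(500)  # keeps the temporal structure
bad_fake = rng.standard_normal(500)                    # white noise, no temporal structure

for name, s in [("real", real_like), ("good fake", good_fake), ("bad fake", bad_fake)]:
    print(name, [round(autocorrelation(s, k), 3) for k in (1, 5, 10)])
```

A synthetic series whose autocorrelation values sit close to those of the real series at each lag has preserved at least this aspect of the temporal pattern; white noise stays near zero at every lag.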
