Highly Comparative Time-Series Analysis: A Paper Review

Data Science, Machine Learning

In machine learning with time series, using features extracted from series is more powerful than simply treating a time series in a tabular form, with each date/timestamp in a separate column. Such features can capture the characteristics of series, such as trend and autocorrelations.
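
To make the idea concrete, here is a minimal sketch of turning one series into a handful of features. The feature choices (mean, standard deviation, linear trend slope, lag-1 autocorrelation) and the function names are my own illustrative assumptions, not taken from the papers.

```python
import numpy as np

def trend_slope(x):
    """Slope of a least-squares line fitted to the series (per time step)."""
    t = np.arange(len(x))
    slope, _intercept = np.polyfit(t, x, 1)
    return slope

def autocorr(x, lag=1):
    """Sample autocorrelation of the series at the given lag."""
    centered = x - x.mean()
    denom = np.dot(centered, centered)
    if denom == 0:
        return 0.0  # constant series: autocorrelation is undefined, report 0
    return np.dot(centered[:-lag], centered[lag:]) / denom

def extract_features(x):
    """Summarize a series of any length as a small, fixed set of features."""
    x = np.asarray(x, dtype=float)
    return {
        "mean": x.mean(),
        "std": x.std(),
        "trend_slope": trend_slope(x),
        "acf_lag1": autocorr(x, lag=1),
    }

# Synthetic example: a linear trend (slope 0.5) plus Gaussian noise.
rng = np.random.default_rng(0)
noisy_trend = 0.5 * np.arange(200) + rng.normal(size=200)
features = extract_features(noisy_trend)
print(features)
```

Each series becomes one row of a feature table, whatever its length, instead of one column per timestamp.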

But… what sorts of features can you extract and how do you select among them?

In this article, I discuss the findings of two papers that analyze feature-based representations of time series. The papers conduct comprehensive work to collect thousands of time series feature extractors and evaluate which features capture the most useful information from a series.

The papers show how to compare time series by extracting features that describe each series' behavior, and they suggest a pipeline for identifying an “optimal” subset of time series features.

Why Is This Important?

There are two basic ways to compare time series:

  1. A similarity measure, such as Dynamic Time Warping, that quantifies whether two time series are close (on average) across time. These measures are typically best for short, aligned series of equal length. They tend to scale poorly, with computation that is quadratic in both the number of time series and the series length, because distances must be computed between all pairs.

  2. A feature-based representation that defines similarity between series in terms of features extracted using time series analysis algorithms. Feature extractors do not require series to be of equal length. The result is an interpretable summary of the dynamical characteristics of each series. These features can then be used for machine learning.
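
The two approaches can be sketched side by side. This is a minimal illustration under my own assumptions: `dtw_distance` is the textbook dynamic-programming form of Dynamic Time Warping with an absolute-difference cost, and `feature_vector` uses four features I picked for the example, not a set recommended by the papers.

```python
import numpy as np

def dtw_distance(a, b):
    """Textbook DTW with absolute-difference cost.

    Fills an (n+1) x (m+1) table, so one comparison costs O(n * m);
    comparing every pair in a collection multiplies that by the number of pairs.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def feature_vector(x):
    """Fixed-length summary: mean, std, trend slope, lag-1 autocorrelation."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.dot(centered, centered)
    acf1 = np.dot(centered[:-1], centered[1:]) / denom if denom > 0 else 0.0
    slope = np.polyfit(np.arange(len(x)), x, 1)[0]
    return np.array([x.mean(), x.std(), slope, acf1])

# Two sine waves sampled at different lengths (80 and 300 points).
short = np.sin(np.linspace(0, 4 * np.pi, 80))
long = np.sin(np.linspace(0, 4 * np.pi, 300))

# Approach 1: pointwise alignment of the raw values.
print("DTW distance:", dtw_distance(short, long))

# Approach 2: distance between fixed-length feature vectors.
feature_distance = np.linalg.norm(feature_vector(short) - feature_vector(long))
print("Feature distance:", feature_distance)
```

The feature vectors have the same shape regardless of series length, and each coordinate has a name you can interpret. In practice the features would be rescaled before taking a distance so that no single feature dominates.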

Interpretability is another key: time series features can capture complex, time-varying patterns in a set of interpretable characteristics.

The problem is that there are a vast number of methods for extracting interpretable features from time series. Further, feature selection is often done manually and subjectively.

What sorts of features can be extracted from a series, and how can you select among them?

hctsa: A Computational Framework for Automated Time-Series Phenotyping Using Massive Feature Extraction

Highly comparative time-series analysis: the empirical structure of time series and their methods

Paper motivation: although time series are studied across scientific disciplines (e.g. stock prices in finance, human heartbeats in medicine), different methods for time series analysis have been developed separately in different disciplines.

Given the great number of methods, it is difficult to determine how methods developed by different disciplines are related. As a result, how can a practitioner select the optimal method for their data?

To address this challenge, the HCTSA paper…

  • Assembles an extensive annotated library of time series data and methods for time series analysis.

  • Models time series methods according to their behavior on the data and groups time series by their measured properties.

  • Introduces a range of comparative analysis techniques for series and their methods: first, linking a given time series to similar real-world and model-generated series; second, linking a specific time series analysis method to a range of alternatives across the literature.
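
A toy version of these ideas can be sketched in a few lines. It does not use hctsa itself; the three features, the synthetic series, and names such as `describe` are hypothetical choices of mine for illustration. Series are compared through their rows of a series-by-feature matrix, and methods are compared through how their columns behave across the data.

```python
import numpy as np

def describe(x):
    """Three simple scale-free features (illustrative, not from the paper)."""
    x = np.asarray(x, dtype=float)
    c = x - x.mean()
    acf1 = np.dot(c[:-1], c[1:]) / np.dot(c, c)         # lag-1 autocorrelation
    diff_ratio = np.diff(x).std() / x.std()             # roughness of increments
    crossing_rate = np.mean(np.sign(c[:-1]) != np.sign(c[1:]))  # mean crossings
    return np.array([acf1, diff_ratio, crossing_rate])

# A tiny "library" of series with different lengths and dynamics.
rng = np.random.default_rng(42)
series = {
    "noise_a": rng.normal(size=150),
    "noise_b": rng.normal(size=400),
    "sine_a": np.sin(np.linspace(0, 4 * np.pi, 100)),
    "sine_b": np.sin(np.linspace(0, 6 * np.pi, 250)),
    "walk_a": np.cumsum(rng.normal(size=200)),
    "walk_b": np.cumsum(rng.normal(size=500)),
}
names = list(series)

# Rows are series, columns are features; z-score each column.
F = np.vstack([describe(x) for x in series.values()])
Z = (F - F.mean(axis=0)) / F.std(axis=0)

# Link each series to its most similar series in feature space.
D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
np.fill_diagonal(D, np.inf)  # a series should not match itself
nearest = {names[i]: names[j] for i, j in enumerate(D.argmin(axis=1))}
print(nearest)

# Link methods to each other: correlate feature columns across the data.
feature_corr = np.corrcoef(Z, rowvar=False)
print(feature_corr.round(2))
```

In this sketch the lag-1 autocorrelation and the increment ratio are strongly anti-correlated across the data, even though they are defined differently. That is a small-scale picture of how methods from different fields can turn out to measure nearly the same property.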
