Time-Series Classification with COTE: The Collective of Transformation-Based Ensembles

1 INTRODUCTION

Our second hypothesis was that we can improve TSC performance through ensembling.
Although the value of ensembling is well known, our approach is unusual in that we inject diversity by adopting a heterogeneous ensemble rather than by using resampling schemes with weak learners. Our approach is in fact a meta-ensemble, since two of the components (random forest and rotation forest) are themselves ensembles.

In total, 35 classifiers are used (here "we" refers to the paper's authors). This approach is the most accurate of those compared, but the least interpretable.
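COTE combines the outputs of its constituents with a vote weighted in proportion to each classifier's cross-validated training accuracy. A minimal sketch of that proportional voting scheme (the function name and the example accuracies are illustrative, not from the paper):

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Combine one prediction per classifier into a single label,
    weighting each vote by that classifier's (cross-validated)
    training accuracy."""
    scores = defaultdict(float)
    for pred, w in zip(predictions, weights):
        scores[pred] += w
    # Return the label with the highest total weight.
    return max(scores, key=scores.get)

# Hypothetical example: three classifiers with training accuracies
# 0.9, 0.6 and 0.55 vote on one test case.
label = weighted_vote(["a", "b", "b"], [0.9, 0.6, 0.55])
print(label)  # "b": 0.6 + 0.55 outweighs 0.9
```

Two weaker classifiers can thus overrule a single strong one, which is exactly how a heterogeneous ensemble exploits diversity.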

We investigate ways of forming hierarchical ensembles through choosing subsets of data representations to use based on training set performance.

2 TIME SERIES CLASSIFICATION BACKGROUND

We have a set of n time series,
T = {T1, T2, ..., Tn},
where each time series has m ordered real-valued observations,
Ti = <ti1, ti2, ..., tim>,
and a class value ci.
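In code, this layout is just an n × m array of observations plus one class value per series; a minimal sketch (the values are made up for illustration):

```python
import numpy as np

# n = 3 series, each with m = 4 ordered real-valued observations.
T = np.array([[0.1, 0.5, 0.9, 0.4],
              [0.2, 0.4, 0.8, 0.5],
              [0.9, 0.1, 0.2, 0.1]])   # shape (n, m)

# One class value c_i per series T_i.
c = np.array([0, 0, 1])
```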

What distinguishes TSC from general classification problems is that the ordering of the attributes matters.

The best discriminatory features for classification might be masked by the length of the series, confounded by noise in the phase of the series or embedded in the interaction of observations.

Hence, TSC usually requires techniques tailored to the nature of the problem.

The alternative approaches to TSC are best understood by considering how the data is represented or, equivalently, how similarity between series is quantified.

Similarity between series can be based on several discriminatory criteria, for example: similarity in time, in spectral or autocorrelation structure; global or local similarity; and data-driven or model-based similarity.

2.1 Similarity in the Time Domain

1-NN DTW, with the warping window size set through cross-validation on the training data, is surprisingly hard to beat.
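A minimal sketch of DTW constrained to a Sakoe-Chiba band (the `window` parameter is what the baseline tunes by cross-validation; this implementation is illustrative, not the paper's code):

```python
import numpy as np

def dtw(a, b, window):
    """Dynamic time warping distance between two equal-length series,
    restricted to a warping window of the given width."""
    m = len(a)
    D = np.full((m + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        # Only cells within `window` of the diagonal are reachable.
        lo, hi = max(1, i - window), min(m, i + window)
        for j in range(lo, hi + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[m, m]
```

With `window = 0` this reduces to the squared Euclidean distance; a 1-NN classifier then simply predicts the label of the training series with the smallest `dtw` value.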
Shapelets: we discuss recent shapelet research in more detail in Section 3.1.

The second popular localised approach involves deriving features from varying size intervals of the series.
Examples include BoP, SAX, and methods that build distances or trees over such discretised representations.

Baydogan et al. [3] describe a bag-of-features approach that combines interval and frequency features: the time series bag-of-features representation (TSBF).

3 DATA TRANSFORMATIONS

  1. Shapelet Transform
  2. Frequency Domain: Periodogram Transform
  3. Autocorrelation-Based Transform (ACF etc.)
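Sketches of transforms 2 and 3, a power spectrum via the FFT and the sample autocorrelation function (the exact normalisations used in the paper are an assumption):

```python
import numpy as np

def periodogram(x):
    """Power spectrum of a series via the real FFT --
    the frequency-domain (periodogram) transform."""
    f = np.fft.rfft(x - np.mean(x))
    return (f.real ** 2 + f.imag ** 2) / len(x)

def acf(x, max_lag):
    """Sample autocorrelation at lags 1..max_lag --
    the autocorrelation-based transform."""
    x = x - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom
                     for k in range(1, max_lag + 1)])
```

Each transform maps a length-m series to a new feature vector on which the standard classifiers of Section 4 can be trained directly.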

4 CLASSIFIERS

4.1 Heterogeneous Ensemble

The classifiers used are the WEKA [24] implementations of

  • k Nearest Neighbour (where k is set through cross validation),
  • Naive Bayes,
  • C4.5 decision tree,
  • SVM with linear basis function kernel,
  • SVM with quadratic basis function kernel,
  • Random Forest (with 100 trees),
  • Rotation Forest (with 10 trees), and
  • Bayesian network.
4.2 Elastic Ensemble

We use the heterogeneous ensemble of eight classifiers for datasets in the frequency, change, and shapelet transformation domains; in the time domain we instead use the Elastic Ensemble (EE), an ensemble of nearest-neighbour classifiers with elastic distance measures.

The 11 classifiers in EE are

  • 1-NN with Euclidean distance (ED),
  • full dynamic time warping (DTW),
  • DTW with window size set through cross-validation (DTWCV),
  • derivative DTW with full window and with window set through cross-validation (DDTW and DDTWCV),
  • weighted DTW (WDTW) and derivative weighted DTW (WDDTW),
  • longest common subsequence (LCSS),
  • Edit Distance with Real Penalty (ERP) [29],
  • Time Warp Edit distance (TWE) [9],
  • and the Move-Split-Merge (MSM) distance metric [4].
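The derivative variants (DDTW, WDDTW) apply their distance to an estimated derivative of the series rather than the raw values. A sketch of the Keogh-style derivative estimate commonly used for DDTW (that EE uses exactly this estimate is an assumption here):

```python
import numpy as np

def derivative(x):
    """Derivative estimate for DDTW: the average of the left
    difference and half the central difference, with the end
    values padded by repetition to keep the original length."""
    d = ((x[1:-1] - x[:-2]) + (x[2:] - x[:-2]) / 2.0) / 2.0
    return np.concatenate(([d[0]], d, [d[-1]]))
```

DDTW is then just DTW computed on `derivative(a)` and `derivative(b)`, which makes the distance sensitive to shape rather than to absolute level.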

Results comparing the two ensembles in the time domain:

[Figure: pairwise comparison of EE and the time-domain heterogeneous ensemble]

EE is clearly superior to the time-domain heterogeneous ensemble, winning on 46 datasets, losing on 23, and tying on 3.

5 DATASETS

That's all for now.
[To be continued...]
