Notes: Deep learning for time series classification: a review

1 Introduction

  • This paper targets the following open questions:
  1. What is the current state-of-the-art DNN for TSC?
  2. Is there a current DNN approach that reaches state-of-the-art performance for TSC and is less complex than HIVE-COTE?
  3. What type of DNN architecture works best for the TSC task?
  4. How does the random initialization affect the performance of deep learning classifiers?
  5. Could the black-box effect of DNNs be avoided to provide interpretability?


  • The main contributions of this paper can be summarized as follows:
  1. We explain, with practical examples, how deep learning can be adapted to one-dimensional time series data.
  2. We propose a unified taxonomy that regroups the recent applications of DNNs for TSC in various domains under two main categories: generative and discriminative models.
  3. We detail the architecture of nine end-to-end deep learning models designed specifically for TSC.
  4. We evaluate these models on the univariate UCR/UEA archive benchmark and 12 MTS classification datasets.
  5. We provide the community with an open source deep learning framework for TSC in which we have implemented all nine approaches.
  6. We investigate the use of Class Activation Map (CAM) in order to reduce DNNs’ black-box effect and explain the different decisions taken by various models.

2 Background

[Fig. 1: A general deep learning framework for TSC]

  • A general deep learning framework for TSC is depicted in Fig. 1. These networks are designed to learn hierarchical representations of the data.
  • In this review we focus on three main DNN architectures used for the TSC task: Multi Layer Perceptron (MLP), Convolutional Neural Network (CNN) and Echo State Network (ESN).

2.2 Deep learning for time series classification

2.2.1 Multi layer perceptrons
  • An MLP constitutes the simplest and most traditional architecture for deep learning models.
  • One impediment to adopting MLPs for time series data is that they do not exhibit any spatial invariance. In other words, each time stamp has its own weight and the temporal information is lost: time series elements are treated independently of each other.
2.2.2 Convolutional neural networks
  • A convolution can be seen as applying and sliding a filter over the time series. Unlike images, the filters exhibit only one dimension (time) instead of two dimensions (width and height). The filter can also be seen as a generic non-linear transformation of a time series.
  • Unlike MLPs, the same convolution (the same filter values w and b) will be used to find the result for all time stamps t ∈ [1, T]. This is a very powerful property (called weight sharing) of the CNNs which enables them to learn filters that are invariant across the time dimension.
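To make the weight-sharing idea concrete, here is a minimal NumPy sketch of a single filter sliding over a univariate series; the filter values and the tanh activation are illustrative assumptions, not values from the paper.

```python
import numpy as np

def conv1d(x, w, b, activation=np.tanh):
    """Apply one filter (w, b) at every time stamp of a univariate series x.

    The same weights w and bias b are reused for all t, which is the
    weight-sharing property that makes the filter time-invariant.
    """
    l = len(w)
    T = len(x) - l + 1  # 'valid' convolution: no padding at the borders
    return np.array([activation(np.dot(w, x[t:t + l]) + b) for t in range(T)])

x = np.sin(np.linspace(0, 10, 100))   # toy univariate time series
w = np.array([1.0, 0.0, -1.0])        # hypothetical edge-like filter
result = conv1d(x, w, b=0.0)          # one activation per valid time stamp
```

A CNN learns many such filters jointly; stacking convolutional layers then yields the hierarchical representations sketched in Fig. 1.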


2.2.3 Echo state networks
  • Another popular type of architecture for deep learning models is the Recurrent Neural Network (RNN).
  • Apart from time series forecasting, we found that these neural networks are rarely applied to time series classification, mainly due to three factors: (1) this type of architecture is designed mainly to predict an output for each element (time stamp) of the time series (Längkvist et al. 2014); (2) RNNs typically suffer from the vanishing gradient problem when trained on long time series (Pascanu et al. 2012); (3) RNNs are considered hard to train and parallelize, which has led researchers to avoid them for computational reasons (Pascanu et al. 2013).


  • Given the aforementioned limitations, a relatively recent type of recurrent architecture was proposed for time series: Echo State Networks (ESNs) (Gallicchio and Micheli 2017).

[Fig. 4: An ESN with a univariate input time series classified into K classes]

  • To better understand the mechanism of these networks, consider an ESN with input dimensionality M, a reservoir of Nr neurons and an output dimensionality K equal to the number of classes in the dataset. Let X(t) ∈ R^M, I(t) ∈ R^Nr and Y(t) ∈ R^K denote respectively the M-dimensional input, the internal (or hidden) state and the output unit activity at time t. Further, let Win ∈ R^(Nr×M), W ∈ R^(Nr×Nr) and Wout ∈ R^(K×Nr) denote respectively the weight matrices for the input time series, the internal connections and the output connections, as seen in Fig. 4. The internal unit activity I(t) at time t is updated using the internal state at time step t−1 and the input time series element at time t. Formally, the hidden state can be computed using the following recurrence:
    I(t) = f(Win · X(t) + W · I(t−1))
    with f denoting an activation function of the neurons, a common choice is tanh(·) applied element-wise (Tanisaro and Heidemann 2016). The output can be computed according to the following equation:
    Y(t) = Wout · I(t)
    thus classifying each time series element X(t). Note that ESNs depend highly on the random initialization of the reservoir, whose weights should satisfy a pre-determined hyperparameter: the spectral radius. Fig. 4 shows an example of an ESN with a univariate input time series to be classified into K classes.
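The following NumPy sketch implements the recurrence above. The uniform initialization, reservoir size and spectral radius of 0.9 are assumed hyperparameters; in practice Win and W stay fixed and only Wout is trained (typically by ridge regression), which is what makes ESNs cheap to train compared to back-propagated RNNs.

```python
import numpy as np

rng = np.random.default_rng(0)
M, Nr, K, T = 1, 100, 2, 150   # input dim, reservoir size, classes, series length

Win = rng.uniform(-0.5, 0.5, (Nr, M))            # fixed random input weights
W = rng.uniform(-0.5, 0.5, (Nr, Nr))             # fixed random reservoir weights
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # enforce spectral radius 0.9

def reservoir_states(X):
    """Compute I(t) = tanh(Win X(t) + W I(t-1)) for every time stamp t."""
    I = np.zeros(Nr)
    states = []
    for t in range(X.shape[0]):
        I = np.tanh(Win @ X[t] + W @ I)
        states.append(I.copy())
    return np.stack(states)                      # shape (T, Nr)

X = rng.standard_normal((T, M))                  # toy input time series
S = reservoir_states(X)
Wout = rng.standard_normal((K, Nr))              # placeholder; learned in practice
Y = S @ Wout.T                                   # Y(t) = Wout I(t): per-time-stamp scores
```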

2.3 Generative or discriminative approaches

[Figure: taxonomy of deep learning approaches for TSC]

2.3.1 Generative models
  • Generative models usually exhibit an unsupervised training step that precedes the learning phase of the classifier (Längkvist et al. 2014). This type of network has been referred to as model-based classifiers in the TSC community (Bagnall et al. 2017).
  • For all generative approaches, the goal is to find a good representation of time series prior to training a classifier (Längkvist et al. 2014).
2.3.2 Discriminative models
  • A discriminative deep learning model is a classifier (or regressor) that directly learns the mapping from the raw input of a time series (or its hand-engineered features) to a probability distribution over the class variables in a dataset.
  • This type of model could be further sub-divided into two groups: (1) deep learning models with hand engineered features and (2) end-to-end deep learning models.

  • The most frequently encountered, computer-vision-inspired feature extraction method for hand-engineering approaches is the transformation of time series into images using specific imaging methods such as Gramian fields (Wang and Oates 2015a, b), recurrence plots (Hatami et al. 2017; Tripathy and Acharya 2018) and Markov transition fields (Wang and Oates 2015c); a sketch of one such transformation follows this list.
  • Unlike image transformation, other feature extraction methods are not domain agnostic. These features are first hand-engineered using some domain knowledge, then fed to a deep learning discriminative classifier.
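As an example of the imaging methods mentioned above, here is a minimal sketch of a Gramian Angular Summation Field, one of the Gramian-field transformations of Wang and Oates: the series is rescaled to [−1, 1], each value is mapped to an angle, and pairwise cosines of angle sums form a T × T image that a standard 2D CNN can then classify.

```python
import numpy as np

def gramian_angular_field(x):
    """Gramian Angular Summation Field of a univariate series x."""
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1   # rescale to [-1, 1]
    phi = np.arccos(x)                                # one angle per value
    return np.cos(phi[:, None] + phi[None, :])        # T x T 'image'

img = gramian_angular_field(np.sin(np.linspace(0, 6, 64)))
print(img.shape)   # (64, 64): ready to be fed to a 2D convolutional network
```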

  • In contrast to feature engineering, end-to-end deep learning aims to incorporate the feature learning process while fine-tuning the discriminative classifier (Nweke et al. 2018). Since this type of deep learning approach is domain agnostic and does not include any domain specific pre-processing steps, we decided to further separate these end-to-end approaches using their neural network architectures.
  • During our study, we found that CNNs are the most widely applied architecture for the TSC problem, probably due to their robustness and their relatively short training time compared to complex architectures such as RNNs or MLPs.

3 Approaches

  • The main goal of deep learning approaches is to remove the bias due to manually designed features (Ordóñez and Roggen 2016), thus enabling the network to learn the most discriminative features for the classification task.

3.2.1 Multi layer perceptron

3.2.2 Fully convolutional neural network

3.2.3 Residual network

[Figure: the Residual network architecture for TSC]

3.2.4 Encoder

  • Originally proposed by Serrà et al. (2018), Encoder is a hybrid deep CNN whose architecture is inspired by FCN (Wang et al. 2017b), with the main difference that the GAP layer is replaced with an attention layer.
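The difference can be sketched in a few lines of NumPy. The channel-splitting scheme below follows our reading of Serrà et al. (2018), where one half of the final feature channels produces per-time-step softmax weights for the other half; all shapes and the random feature map are illustrative.

```python
import numpy as np

def softmax(z, axis=0):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

T, C = 128, 256                       # time steps and channels, illustrative
H = np.random.default_rng(0).standard_normal((T, C))  # last conv feature map

gap = H.mean(axis=0)                  # FCN: uniform average over time (GAP)

scores, values = H[:, :C // 2], H[:, C // 2:]
weights = softmax(scores, axis=0)     # attention: learned weights over time
att = (weights * values).sum(axis=0)  # weighted sum instead of uniform average
```

Where GAP weights every time stamp equally, the attention layer lets the network focus on the regions of the series that matter for the decision.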

3.2.5 Multi-scale convolutional neural network

3.2.6 Time Le-Net

3.2.7 Multi channel deep convolutional neural network

3.2.8 Time convolutional neural network

3.2.9 Time warping invariant echo state network

4 Experimental setup

4.1 Datasets

4.1.1 Univariate archive
  • The experiments use the UCR/UEA archive (Chen et al. 2015b; Bagnall et al. 2017), which contains 85 univariate time series datasets.
  • The datasets have varying characteristics, such as series length, which ranges from 24 for the ItalyPowerDemand dataset to 2709 for the HandOutlines dataset.
4.1.2 Multivariate archive
  • The multivariate experiments use Baydogan’s archive (Baydogan 2015), which contains 13 MTS classification datasets.

5 Results

Overall, ResNet achieves the best performance among the nine evaluated end-to-end approaches on the univariate benchmark, followed by FCN.


6 Visualization

6.1 Class activation map

  • We investigate the use of the Class Activation Map (CAM), first introduced by Zhou et al. (2016) to highlight the parts of an image that contributed the most to the identification of a given class. Wang et al. (2017b) later introduced a one-dimensional CAM with an application to TSC. This method explains the classification of a certain deep learning model by highlighting the subsequences that contributed the most to a certain classification.
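For a GAP-based network such as FCN or ResNet, the one-dimensional CAM reduces to a weighted sum of the last convolutional feature maps. The sketch below assumes access to those activations and to the softmax weights of the class of interest; the variable names are hypothetical.

```python
import numpy as np

def class_activation_map(A, w_c):
    """One-dimensional CAM for class c.

    A   : (T, M) activations of the last conv layer (M filters over T steps)
    w_c : (M,)   weights connecting each GAP-pooled filter to class c

    Returns one score per time stamp; high values mark the subsequences
    that contributed most to predicting class c.
    """
    cam = A @ w_c                                              # weighted filter sum
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize for plotting
```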


6.2 Multi-dimensional scaling

  • We propose the use of Multi-Dimensional Scaling (MDS) (Kruskal and Wish 1978) with the objective of gaining some insight into the spatial distribution of the input time series belonging to different classes in the dataset.
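A minimal sketch with scikit-learn, assuming Euclidean distances between raw series (the paper also applies MDS to the features learned by the GAP layer); the data here is a random stand-in.

```python
import numpy as np
from sklearn.manifold import MDS

X = np.random.default_rng(0).standard_normal((50, 128))  # stand-in: 50 series of length 128

# Project each time series to a 2-D point while preserving pairwise
# Euclidean distances as well as possible, for visual inspection.
embedding = MDS(n_components=2, dissimilarity="euclidean",
                random_state=0).fit_transform(X)
print(embedding.shape)   # (50, 2): one 2-D point per time series
```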
