EEG-Based Emotion Recognition via Channel-Wise Attention and Self Attention

This paper proposes a deep learning model named ACRNN, which combines channel-wise attention and self-attention mechanisms to extract emotion-related features from raw EEG signals. ACRNN achieves recognition accuracies above 92% on the DEAP and DREAMER databases, demonstrating superior performance in EEG emotion recognition. The channel-wise attention mechanism accounts for the importance of different channels and their spatial information, while the extended self-attention focuses on the similarity between samples and on temporal information. Experiments show that combining the two attention mechanisms improves recognition accuracy.

Shi et al. first proposed differential entropy (DE) features from five frequency bands and validated that DE features are superior for representing EEG signals [17].

Differential entropy (DE) features have been shown to be the most effective.

Yang et al. combined the DE of multiple bands as EEG features and employed a continuous convolutional neural network as a classifier [24]. Song et al. designed DE features according to the electrode position relationship and adopted a graph convolutional network as a classifier [25].

Before, whenever I saw all these hand-crafted EEG feature-extraction methods, I wondered whether I really had to learn them all; I had no idea how.

Not anymore: the trend now is to design end-to-end networks that automatically learn good features from raw EEG and then perform classification.

Generally, most EEG emotion recognition methods first design features from EEG signals and adopt classifiers to classify the emotion features. For example, Li et al. extracted features from the gamma frequency band and used a linear support vector machine (SVM) to classify the extracted features [13]. Patil et al. adopted higher-order crossings as features, which are better than other statistical features for classifying emotions [16]. Shi et al. first proposed differential entropy (DE) features from five frequency bands and validated that DE features are superior for representing EEG signals [17]. In addition, Duan et al. extracted DE features from multichannel EEG data and combined an SVM and k-Nearest Neighbor (KNN) to classify the extracted features [18].

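As a concrete illustration of this classical feature-design step, the sketch below computes band-wise differential entropy, i.e. 0.5·log(2πe·σ²) for an approximately Gaussian band-pass-filtered signal. The sampling rate, band boundaries, and the random placeholder signal are illustrative assumptions, not the exact pipeline of [17] or [18].

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(x, low, high, fs, order=4):
    """Zero-phase Butterworth band-pass filter for one EEG channel."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def differential_entropy(x):
    """DE of an (approximately Gaussian) signal: 0.5 * log(2*pi*e*sigma^2)."""
    return 0.5 * np.log(2 * np.pi * np.e * np.var(x))

# Hypothetical usage: one EEG channel sampled at 128 Hz,
# DE computed per classical frequency band (band edges are illustrative).
fs = 128
bands = {"theta": (4, 8), "alpha": (8, 14), "beta": (14, 31), "gamma": (31, 45)}
eeg_channel = np.random.randn(fs * 60)  # placeholder signal, 60 s
de_features = {name: differential_entropy(bandpass(eeg_channel, lo, hi, fs))
               for name, (lo, hi) in bands.items()}
print(de_features)
```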

Recently, deep learning has been demonstrated to outperform traditional machine learning in many fields, e.g., computer vision [19], natural language processing [20] and biomedical signal processing [21]–[23]. In addition, many deep learning-based methods have been widely used for EEG-based emotion recognition. On one hand, deep learning methods can be considered as classifiers after feature extraction. For example, Yang et al. combined the DE of multiple bands as EEG features and employed a continuous convolutional neural network as a classifier [24]. Song et al. designed DE features according to the electrode position relationship and adopted a graph convolutional neural network as a classifier [25]. On the other hand, many deep learning methods are data-driven and function in an end-to-end manner, which does not require handcrafted features from EEG signals.


For example, Alhagry et al. proposed an end-to-end deep learning neural network to recognize emotion from raw EEG signals, which used an LSTM-RNN to learn features from EEG signals and used the dense layer for classification [15]. Yang et al. proposed a parallel convolutional recurrent neural network for EEG emotion recognition and achieved good performance [7]. However, it still remains challenging to extract more discriminative features for EEG emotion recognition. Therefore, it is important to design an effective deep learning framework that can extract features and perform classification directly from raw EEG signals.


Inspired by the cascade convolutional recurrent network (CRNN), which combines CNN and RNN to extract spatial and temporal features from EEG signals [26], we use a CNN to extract the spatial information of EEG signals. Then, we employ two long short-term memory (LSTM) layers to extract temporal information, which is better at storing and accessing information than a standard RNN [27]. Different from a traditional CRNN, we employ a framework to extract more discriminative spatiotemporal information using two attention mechanisms, i.e., a channel-wise attention mechanism [28] and an extended self-attention mechanism [29].


Generally, CNNs are used to extract the spatial features of EEG signals [7]; however, this ignores the relative importance of features from different channels. To extract more discriminative features from the spatial information, some methods adopt channel selection to choose the more relevant channels [30].


Different from traditional methods that first need to select the relevant channels manually [31], in this study we first adopt an adaptive channel-wise attention mechanism that transforms the channels into a probability distribution used as weights and recodes the EEG signals according to these weights. A CNN is then employed to extract discriminative spatial features from the recoded signals.

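A minimal sketch of this channel-wise recoding idea, written in PyTorch. It summarizes each channel over time, maps the summaries through a small fully connected network, applies a softmax to obtain a probability distribution over channels, and re-weights the raw signal accordingly. The module name, layer sizes, and temporal-average summary are my own assumptions for illustration; the paper's exact parametrization may differ.

```python
import torch
import torch.nn as nn

class ChannelWiseAttention(nn.Module):
    """Sketch of channel-wise attention for EEG.

    Input x: (batch, channels, time). Each channel is summarized, the
    summaries are mapped to a probability distribution over channels via a
    softmax, and the signal is recoded by re-weighting every channel.
    """
    def __init__(self, n_channels, hidden=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(n_channels, hidden),
            nn.Tanh(),
            nn.Linear(hidden, n_channels),
        )

    def forward(self, x):                                    # x: (B, C, T)
        summary = x.mean(dim=-1)                             # per-channel temporal average: (B, C)
        weights = torch.softmax(self.fc(summary), dim=-1)    # probability distribution over channels
        return x * weights.unsqueeze(-1), weights            # recoded signal, channel weights

# Hypothetical usage with 32 DEAP channels and 384 samples (3 s at 128 Hz)
attn = ChannelWiseAttention(n_channels=32)
recoded, w = attn(torch.randn(8, 32, 384))
print(recoded.shape, w.shape)   # torch.Size([8, 32, 384]) torch.Size([8, 32])
```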

In addition, an RNN is employed to explore the temporal information of EEG signals; however, this also ignores the relative importance of different EEG samples. Note that extended self-attention can be applied to LSTM to exploit long-range dependencies [32]. We integrate the extended self-attention mechanism into the RNN to explore the importance of different EEG samples, because this mechanism can update the weights according to the similarity of EEG signals. As a result, more discriminative temporal and spatial characteristics of EEG signals can be obtained by integrating the two attention mechanisms in our framework.

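To make the idea of re-weighting temporal slices by similarity concrete, here is a sketch of self-attention applied to LSTM hidden states. It uses a generic scaled dot-product formulation rather than the paper's exact extended self-attention, so the class name, projection layers, and scaling are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class SelfAttentionOverTime(nn.Module):
    """Sketch of a self-attention layer over LSTM hidden states.

    Attention weights are derived from pairwise similarities between
    temporal slices, so more informative slices receive larger weights.
    """
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.scale = dim ** 0.5

    def forward(self, h):                                                # h: (B, T, D)
        q, k = self.query(h), self.key(h)
        sim = torch.softmax(q @ k.transpose(1, 2) / self.scale, dim=-1)  # (B, T, T) similarities
        return sim @ h                                                   # re-weighted states (B, T, D)

# Hypothetical usage: hidden states of a 2-layer, 64-unit LSTM over 20 time steps
lstm = nn.LSTM(input_size=32, hidden_size=64, num_layers=2, batch_first=True)
h, _ = lstm(torch.randn(8, 20, 32))
z = SelfAttentionOverTime(64)(h)
print(z.shape)   # torch.Size([8, 20, 64])
```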

In this paper, we propose the attention-based convolutional recurrent neural network (ACRNN) for EEG-based emotion recognition. Raw EEG signals carry spatial information through the intrinsic relationships among different channels, as well as temporal dependence among temporal slices; thus, the proposed ACRNN can learn the spatial features of multichannel EEG in the convolutional layer and explore the temporal features of different temporal slices using LSTM networks.

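Putting the pieces together, the skeleton below follows the pipeline just described: channel-wise attention, a convolutional layer for spatial features, two LSTM layers for temporal features, self-attention over the temporal slices, and a dense classifier. It reuses the ChannelWiseAttention and SelfAttentionOverTime sketches from above; the class name, kernel sizes, hidden sizes, and the pooling into a fixed number of temporal slices are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ACRNNSketch(nn.Module):
    """High-level sketch of the ACRNN pipeline: channel-wise attention ->
    CNN (spatial) -> 2-layer LSTM (temporal) -> self-attention -> classifier."""
    def __init__(self, n_channels=32, n_classes=2, conv_feats=40, hidden=64):
        super().__init__()
        self.channel_attn = ChannelWiseAttention(n_channels)
        self.cnn = nn.Sequential(
            nn.Conv2d(1, conv_feats, kernel_size=(n_channels, 5)),  # spatial conv across all channels
            nn.ELU(),
            nn.AdaptiveAvgPool2d((1, 64)),                          # fixed-length feature map
        )
        self.lstm = nn.LSTM(conv_feats, hidden, num_layers=2, batch_first=True)
        self.self_attn = SelfAttentionOverTime(hidden)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, x):                        # x: (B, C, T) raw EEG segment
        x, _ = self.channel_attn(x)              # recode channels by importance
        x = self.cnn(x.unsqueeze(1))             # (B, F, 1, 64) spatial features
        x = x.squeeze(2).transpose(1, 2)         # (B, 64, F) sequence of temporal slices
        h, _ = self.lstm(x)                      # temporal dependencies
        h = self.self_attn(h)                    # re-weight slices by similarity
        return self.classifier(h[:, -1])         # logits, e.g. high/low valence

logits = ACRNNSketch()(torch.randn(8, 32, 384))
print(logits.shape)   # torch.Size([8, 2])
```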

In addition, the channel-wise attention and extended self-attention mechanisms can extract more discriminative spatial and temporal features, respectively. The proposed model was evaluated on two publicly available databases, i.e., DEAP [2] and DREAMER [3], and demonstrated superior performance in terms of recognition accuracy on both databases.


1) We have developed a data-driven ACRNN framework for EEG-based emotion recognition. The framework integrates a channel-wise attention mechanism into a CNN to explore spatial information, so that both the importance of different channels (via channel-wise attention) and the spatial information of multichannel EEG signals (via the CNN) are taken into consideration. In addition, ACRNN integrates an extended self-attention mechanism into the RNN to explore the temporal information of EEG signals, taking into account both the temporal dynamics captured by the LSTM and the intrinsic similarity between EEG samples captured by extended self-attention.


2) We conducted experiments on the DEAP and DREAMER databases, and the experimental results indicate average emotion recognition accuracies of 92.74% and 93.14% in the valence and arousal classification tasks of the DEAP database, respectively. In addition, the proposed method achieved mean accuracies of 97.79%, 97.98% and 97.67% in the valence, arousal and dominance classification tasks of the DREAMER database, respectively.


The remainder of this paper is organized as follows. Section II introduces related work, and Section III presents the proposed method. Section IV discusses extensive experiments conducted to demonstrate the effectiveness of the proposed ACRNN. Finally, a discussion is given in Section V, and the paper is concluded in Section VI.


2 RELATED WORK

Here, we introduce the general flow of the traditional EEG emotion recognition framework. We then introduce the channel-wise attention and self-attention mechanisms.


2.1 EEG Emotion Recognition

Recently, emotion recognition from EEG signals has received significant attention. The general flow of an EEG emotion recognition framework is summarized as follows (Fig. 1).


(i) Test protocol: First, the type of stimulus used, trial duration, the number of subjects, their gender, and the emotions to be recognized are recorded. Then, the subjects are exposed to the stimulus, e.g., music or a movie [2], [3].


(ii) EEG recordings: The number of electrodes and test duration are recorded, and then EEG signals are recorded by electrodes. The subjects then assess their emotional state by labeling the EEG recording after each trial [2], [3].


(iii) Preprocessing: To avoid artifacts in the EEG signals, e.g., eye blinks, the EEG signals should be preprocessed using artifact removal methods, e.g., blind source separation and independent component analysis [33]; a minimal ICA sketch is given after this list.


(iv) Feature extraction: To extract relevant emotion features from EEG signals, information about the EEG signals is explored, e.g., the EEG characteristics in the time, frequency, and spatial domains [9].


(v) Classification: Various classifiers can be used to classify the extracted features, e.g., Bayesian classifiers, support vector machines, decision trees, and deep learning classifiers [34]. Depending on whether the classifier is trained on data from the test user, EEG emotion recognition can also be divided into user-dependent and user-independent tasks; both split schemes are sketched after this list.

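Two of the steps above can be illustrated with short sketches. First, for step (iii), ICA-based artifact removal is commonly done with MNE-Python; the snippet below assumes a raw recording that contains EOG channels, and the filename is purely hypothetical (DEAP and DREAMER ship their own documented preprocessing), so treat it as a sketch rather than the databases' actual pipeline.

```python
import mne
from mne.preprocessing import ICA

# Hypothetical continuous recording; DEAP/DREAMER use their own formats.
raw = mne.io.read_raw_fif("subject01_raw.fif", preload=True)
raw.filter(l_freq=1.0, h_freq=45.0)          # band-pass before ICA

ica = ICA(n_components=20, random_state=0)
ica.fit(raw)
eog_inds, _ = ica.find_bads_eog(raw)         # components correlated with eye blinks
ica.exclude = eog_inds
clean = ica.apply(raw.copy())                # artifact-corrected copy
```

Second, for step (v), the user-dependent versus user-independent distinction corresponds to how the data are split for training and testing. The sketch below uses scikit-learn; the feature dimensions, subject counts, and segment counts are made-up placeholders.

```python
import numpy as np
from sklearn.model_selection import KFold, LeaveOneGroupOut

# Hypothetical data: per-segment feature vectors, labels, and subject IDs.
X = np.random.randn(1200, 160)            # e.g. 32 channels x 5 bands of DE features
y = np.random.randint(0, 2, size=1200)    # high/low valence
subjects = np.repeat(np.arange(30), 40)   # 30 subjects, 40 segments each

# User-dependent: train and test within each subject's own data.
for s in np.unique(subjects):
    idx = np.where(subjects == s)[0]
    for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(idx):
        train_idx, test_idx = idx[train], idx[test]
        # fit and evaluate a classifier on this subject's folds here

# User-independent: leave one subject out entirely.
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    pass  # fit on 29 subjects, evaluate on the held-out one
```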

2.2 Channel-wise Attention

Attention plays an important role in human perception [35], [36]. For example, humans can exploit a sequence of partial glimpses and selectively focus on salient parts to better capture visual structure.
