[Paper Close Reading] Spatio-temporal directed acyclic graph learning with attention mechanisms on brain functional time series and connectivity

Paper link: Spatio-temporal directed acyclic graph learning with attention mechanisms on brain functional time series and connectivity - ScienceDirect

Full title: Spatio-temporal directed acyclic graph learning with attention mechanisms on brain functional time series and connectivity

Contents

1. TL;DR

1.1. Takeaways

1.2. Paper framework figure

2. Section-by-section close reading

2.1. Abstract

2.2. Introduction

2.3. Related work

2.3.1. Deep learning on functional time series

2.3.2. Deep learning on functional connectivity

2.4. Methods

2.4.1. Spatio-temporal graph convolutional network (ST-graph-conv)

2.4.2. Functional connectivity convolutional (FC-conv) network

2.4.3. Functional connectivity based spatial attention (FC-SAtt)

2.4.4. Spatial attention graph pooling

2.4.5. Directed acyclic graph for multi-scale analysis on functional signals and connectivity

2.4.6. Implementation

2.4.7. Evaluation metrics and cross-validation

2.5. Datasets and MRI preprocessing

2.5.1. Adolescent brain cognitive development (ABCD)

2.5.2. Open access series of imaging study-3 (OASIS-3)

2.5.3. MRI preprocessing

2.6. Results

2.6.1. Spatio-temporal directed acyclic graph learning

2.6.2. Fluid intelligence prediction via leave-one-site-out cross-validation

2.6.3. Age prediction

2.6.4. Comparisons with BrainNetCNN and SVR in the prediction of fluid intelligence and age

2.6.5. Comparisons with elastic net’s mixture with random forest, spatio-temporal graph convolution, and BrainNetCNN

2.7. Discussion

3. Reference List


1. TL;DR

1.1. Takeaways

        ①The model combines time series and functional connectivity matrices, but as a whole it feels overly complex and cluttered; it is not an easy model to learn from or borrow

        ②A knowledge-plus-data-driven paper

1.2. Paper framework figure

2. Section-by-section close reading

2.1. Abstract

        ①They developed a spatio-temporal directed acyclic graph with attention mechanisms (ST-DAG-Att)

        ②The authors adopt this model in functional magnetic resonance imaging (fMRI)

        ③ST-DAG-Att has a feed-forward structure organized as a directed acyclic graph (DAG), with two parts: a spatio-temporal graph convolutional network (ST-graph-conv) and a functional connectivity convolutional network (FC-conv)

        ④This framework also contains functional connectivity-based spatial attention (FC-SAtt)

        ⑤They used two large datasets: Adolescent Brain Cognitive Development (ABCD, n=7693) and Open Access Series of Imaging Study-3 (OASIS-3, n=1786)

        ⑥Task: generalizing from cognition prediction to age prediction

2.2. Introduction

        ①Brief introduction to fMRI and its use in disease diagnosis and in predicting individual demographic information and cognitive ability

        ②Models like RNN, LSTM, GRU are for temporal analysis. (They list others as well)

        ③Information and connections between brain regions may be ignored when using only functional time series

        ④Their model is based on directed acyclic graph (DAG)

        ⑤ST-DAG-Att outperforms other models in accuracy

        ⑥This model contains a) signal and network processing, b) spatial, temporal and functional connectivity information, c) spatial attention pooling

Vocabulary: schizophrenia (n.)

2.3. Related work

2.3.1. Deep learning on functional time series

        ①RNN, LSTM and GRU all operate on time series data

        ②GCNs can also be used for fMRI analysis; there, the time series serve as node features while the functional connections (FC) define the graph

(My guess at the difference: ① feeds an ROI × time-points matrix as input, while ② builds the graph from the ROI × ROI matrix and uses the ROI × time-points matrix as node features)

2.3.2. Deep learning on functional connectivity

        ①In CNNs and DNNs, FC matrices are treated as images

        ②BrainNetCNN includes edge-to-edge, edge-to-node, and node-to-graph layers to represent topological relationships

        ③In such GCNs, each node is a subject and each edge encodes inter-subject similarity

2.4. Methods

        ①ST-DAG-Att framework figure, where the blue blocks are ST-graph-conv networks and the green blocks are FC-conv networks

        ②They define the node as ROI and edge as functional connection

2.4.1. Spatio-temporal graph convolutional network (ST-graph-conv)

        ①G=\left \{ V,E \right \} is the brain graph, where V represents the set of nodes and E the set of edges

        ②For node x and time point t, f\left ( x,t \right ) denotes the functional time series (why represent it as a function?)

        ③ST-graph-conv figure:

where n denotes the number of ROIs;

T denotes the number of time points;

C', C'', \ldots denote the numbers of filter channels;

w denotes the kernel size of the filters;

p_{t} and p_{s} denote the temporal and spatial pooling strides respectively

        ④There are "8 filters in the temporal convolution and 8 spectral filters designed by the Chebyshev polynomials of order 4 in the spectral graph convolution" in each ST-graph-conv layer

        ⑤The stride of temporal and spatial pooling is 2

(1)Temporal convolution

        ①The function of temporal convolution:

f_j^{\prime}\left(x,\frac{t-w+1}{p_t}\right)=Tpool\left(\sigma\left(\sum_{i=1}^Ch_j\left(t,i\right)*f_i\left(x,t\right)\right)\right)

where w denotes the temporal filter size;

Tpool denotes temporal average pooling;

h_{j} represents the j-th filter with the size of 1\times w;

\sigma denotes leaky ReLU

        ②Then \mathbf{f}=\left\{f_i\left(x,t\right)\right\}_{i=1,2,\ldots,C}\in\mathbb{R}^{n\times T\times C} is transformed into \mathbf{f}^{\prime}=\left\{f_i^{\prime}\left(x,t\right)\right\}_{i=1,2,\ldots,C^{\prime}}\in\mathbb{R}^{n\times\frac{T-w+1}{p_t}\times C^{\prime}}

Vocabulary: entangle (v.)
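As a minimal sketch (not the authors' implementation), the temporal convolution with leaky ReLU and temporal average pooling above might look as follows in NumPy; the filter layout h of shape (C', C, w) and the function names are assumptions for illustration:

```python
import numpy as np

def leaky_relu(x, alpha=0.33):
    # the paper uses leaky ReLU with leak rate 0.33
    return np.where(x > 0, x, alpha * x)

def temporal_conv(f, h, p_t=2, alpha=0.33):
    """Per-ROI temporal convolution followed by temporal average pooling.
    f: (n, T, C) functional time series
    h: (C_out, C, w) temporal filters (hypothetical layout)
    returns: (n, (T - w + 1) // p_t, C_out)
    """
    n, T, C = f.shape
    C_out, _, w = h.shape
    L = T - w + 1                       # "valid" convolution length
    out = np.zeros((n, L, C_out))
    for j in range(C_out):
        for t in range(L):
            # sum over input channels i and the w filter taps
            out[:, t, j] = sum(f[:, t:t + w, i] @ h[j, i] for i in range(C))
    out = leaky_relu(out, alpha)
    # temporal average pooling with stride p_t
    L2 = L // p_t
    return out[:, :L2 * p_t].reshape(n, L2, p_t, C_out).mean(axis=2)
```

With n=4 ROIs, T=10 time points, C=2 channels, w=3 and p_t=2, the output has shape (4, 4, C_out), matching the shape change described above.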

(2)Spatial graph convolution

        ①They adopt spectral filter g in the graph Fourier domain:

g\left(\lambda\right)=\sum_{k=0}^{K-1}\theta_{k}T_{k}\left(\lambda\right)

where K is the order of Chebyshev polynomials;

\lambda denotes an eigenvalue of the graph Laplacian \Delta, which represents the brain functional network of graph G;

\theta _{k} denotes the shape parameter;

T_{k} denotes the Chebyshev polynomial T_k\left(\lambda\right)=\cos\left(k\cos^{-1}\lambda\right);

        ②The spectral graph convolution is then applied at each time point:

f_j''\left(x,t\right)=\sum_{i=1}^{C'}\sum_{k=0}^{K-1}\theta_k^{ij}T_k\left(\Delta\right)f_i'\left(x,t\right)

where all time points share the same filters;

\Delta=I-D^{-\frac12}AD^{-\frac12}, where I denotes the identity matrix, D the degree matrix, and A the adjacency matrix (?? aren't the degree matrix and the adjacency matrix of a directed graph different things?)

        ③The authors say that \sum_{k=0}^{K-1}\theta_{k}^{ij}T_{k}\left(\Delta\right) transforms f_{i}^{\prime}(x,t) into the graph Fourier domain, filters it there according to the shape of the Chebyshev polynomials, and transforms it back into the time domain (sounds impressive, though I don't fully follow)

        ④Then, there is a transform from \mathbf{f'}=\begin{Bmatrix}f'_i\left(x,t\right)\end{Bmatrix}\in\mathbb{R}^{n\times\frac{T-w+1}{p_t}\times C'} to \mathbf{f}^{\prime\prime}\in\mathbb{R}^{n\times\frac{T-w+1}{p_t}\times C^{\prime\prime}}
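The Chebyshev filtering in ① and ② can be sketched as below (an assumed illustration, not the authors' code); note that in practice the Laplacian is often rescaled so its eigenvalues fall in [-1, 1] before the polynomials are applied, a detail the notes above do not cover. A single output channel is shown for brevity:

```python
import numpy as np

def normalized_laplacian(A):
    """Delta = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix A."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.where(d > 0, d ** -0.5, 0.0)
    return np.eye(len(A)) - (A * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]

def cheb_graph_conv(fp, A, theta):
    """Chebyshev spectral graph convolution, single output channel sketch.
    fp: (n, T, C') temporally convolved features; A: (n, n) adjacency
    theta: (C', K) coefficients theta_k^{i}
    returns: (n, T); the same filter is shared by all time points
    """
    n = A.shape[0]
    C_in, K = theta.shape
    L = normalized_laplacian(A)
    # Chebyshev recurrence: T_0 = I, T_1 = L, T_k = 2 L T_{k-1} - T_{k-2}
    Tk = [np.eye(n), L]
    for _ in range(2, K):
        Tk.append(2 * L @ Tk[-1] - Tk[-2])
    out = np.zeros(fp.shape[:2])
    for i in range(C_in):
        for k in range(K):
            out += theta[i, k] * (Tk[k] @ fp[:, :, i])
    return out
```

Because T_k(Δ) is a polynomial in the Laplacian, the filter is K-hop localized on the graph, which is the usual motivation for the Chebyshev parameterization.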

        ⑤Furthermore, spatial pooling is taken into account:

\mathbf{f}^{^{\prime\prime\prime}}=Spool\left(\mathbf{f}^{\prime\prime}\otimes\mathbf{s}\right)

where \otimes represents element-wise multiplication;

\mathbf{s}\in{\mathbb{R}}^{n\times1\times1} is the spatial attention map (?? where does this come from; it is produced by the FC-SAtt module in Section 2.4.3);

Spool denotes the spatial pooling operator;

        ⑥Lastly, \mathbf{f}^{\prime\prime}\in\mathbb{R}^{n\times\frac{T-w+1}{p_t}\times C^{\prime\prime}} is converted to \mathbf{f}^{\prime\prime\prime}\in\mathbb{R}^{\frac n{p_s}\times\frac{T-w+1}{p_t}\times C^{\prime\prime}}

(3)Spatio-temporal aggregation

        ①To expand the area from local to global, they aggregate global spatial and temporal information:

\mathbf{y}=\sigma\left(h_s*\begin{bmatrix}Tavg(\mathbf{f}^{\prime\prime\prime})\\Tsd(\mathbf{f}^{\prime\prime\prime})\end{bmatrix}\right)

where Tavg denotes the temporal global average and Tsd the temporal standard deviation;

h_{s} denotes the spatial filters, whose kernel size is \frac{n}{p_{s}}\times1;

\sigma denotes leaky ReLU;

        ②Then, the \mathbf{y}\in{\mathbb{R}}^{1\times1\times C^{\prime\prime}}
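A rough NumPy sketch of this aggregation step, under the assumption that the spatial filter tensor Ws acts on the stacked [Tavg; Tsd] maps (the layout and names are hypothetical):

```python
import numpy as np

def st_aggregate(f3, Ws, alpha=0.33):
    """Global spatio-temporal aggregation sketch.
    f3: (m, L, C) spatially pooled features, m = n / p_s nodes
    Ws: (C_out, 2 * m, C) spatial filters acting on the stacked
        [Tavg; Tsd] maps (hypothetical layout)
    returns: y of shape (C_out,)
    """
    tavg = f3.mean(axis=1)                         # (m, C) temporal global average
    tsd = f3.std(axis=1)                           # (m, C) temporal standard deviation
    stacked = np.concatenate([tavg, tsd], axis=0)  # (2m, C)
    y = np.einsum('omc,mc->o', Ws, stacked)        # collapse space and channels
    return np.where(y > 0, y, alpha * y)           # leaky ReLU
```

Using both the mean and the standard deviation keeps second-order temporal information that a plain global average would discard.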

2.4.2. Functional connectivity convolutional (FC-conv) network

        ①FC-conv network figure:

where the input is the functional time series \mathbf{f}\in\mathbb{R}^{n\times T\times C} ;

Pearson's correlation yields the functional connectivity matrix \mathbf{F}\in\mathbb{R}^{n\times n\times C} ;

the edge convolution is \mathbf{Z}=\sigma\left(h_e*\mathbf{F}\right) with filters h_{e} of kernel size 1\times n ;

the node convolution is \mathbf{Z}^{\prime}=\sigma\left(h_n*\left(\mathbf{Z}\otimes\mathbf{s}\right)\right) with filters h_{n} of kernel size n\times 1 ;

the output is \mathbf{Z}^{\prime}\in\mathbb{R}^{1\times1\times C^{\prime\prime}} .

        ②There are 128 edge filters and 256 node filters in each layer

        ③The bottleneck ratio in MLP is 4
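The FC-conv pipeline above (Pearson FC, then edge and node convolutions) might be sketched as follows for a single input channel; the weight layouts We and Wn are assumptions for illustration, not the paper's exact parameterization:

```python
import numpy as np

def leaky_relu(x, alpha=0.33):
    return np.where(x > 0, x, alpha * x)

def fc_conv(f, We, Wn, s=None, alpha=0.33):
    """FC-conv sketch: Pearson FC, then row-wise (edge) and column-wise (node) convs.
    f:  (n, T) one channel of functional time series
    We: (C_e, n) 1-by-n edge filters; Wn: (C_n, C_e, n) n-by-1 node filters
    s:  optional (n,) spatial attention map applied before the node conv
    returns: (C_n,) graph-level features
    """
    F = np.corrcoef(f)                  # (n, n) Pearson functional connectivity
    # edge conv: each 1-by-n filter slides over the rows of F -> (n, C_e)
    Z = leaky_relu(F @ We.T, alpha)
    if s is not None:
        Z = Z * s[:, None]              # element-wise spatial attention
    # node conv: each n-by-1 filter collapses the node dimension -> (C_n,)
    return leaky_relu(np.einsum('kcn,nc->k', Wn, Z), alpha)
```

The edge filters mix all connections of one ROI, and the node filters then summarize across ROIs, mirroring BrainNetCNN's edge-to-node and node-to-graph idea.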

2.4.3. Functional connectivity based spatial attention (FC-SAtt)

        ①Functional connectivity based spatial attention (FC-SAtt) figure:

        ②They first apply a channel average pooling layer (Cavg):

Cavg\left(\mathbf{Z}\right)=\frac{1}{C^{\prime}}\sum_{i=1}^{C^{\prime}}\mathbf{Z}_i\in\mathbb{R}^{n\times1\times1}

which generates channel-wise statistics

        ③Then do a series of operators:

\mathbf{s}=Sigmoid[Fully_2(ReLU(Fully_1(Cavg(\mathbf{Z}))))]

where r denotes the bottleneck (reduction) ratio of the two fully connected layers (set to 4, per Section 2.4.2)
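FC-SAtt is essentially squeeze-and-excitation style attention over nodes. A sketch, assuming channel average pooling divides by C' and omitting biases (W1 and W2 are hypothetical weight names):

```python
import numpy as np

def fc_satt(Z, W1, W2):
    """FC-SAtt sketch: squeeze-and-excitation style spatial attention.
    Z:  (n, C) node features from the FC-conv edge layer
    W1: (n // r, n) and W2: (n, n // r) fully connected weights,
        with bottleneck ratio r (the paper uses r = 4); biases omitted
    returns: s in (0, 1)^n, one attention weight per node
    """
    cavg = Z.mean(axis=1)                       # (n,) channel average pooling
    hidden = np.maximum(W1 @ cavg, 0)           # ReLU after the bottleneck layer
    return 1.0 / (1.0 + np.exp(-(W2 @ hidden))) # sigmoid -> attention map s
```

The sigmoid keeps every weight in (0, 1), so s rescales rather than zeroes out node features before the pooling step.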

2.4.4. Spatial attention graph pooling

        ①Spatial attention pooling operation figure:

it may generate a common spatial mask to apply to all the samples

        ②In the binary masks, the top M=\frac{n}{p_{s}} nodes with the highest attention values are set to 1 and the others to 0

        ③Then the new nodes will be:

V'=\left\{x:TopM\left(\sum_{i=1}^N\mathbf{m}_i\right)\right\}\subset V

where N denotes the number of samples

        ④The graph is then updated to G{}'=\left \{ V' ,E'\right \} . In this new graph, redundant nodes and their corresponding edges are removed, reducing dimensionality and computational cost and increasing the proportion of informative signal
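The top-M mask construction in ② and ③ can be sketched as:

```python
import numpy as np

def top_m_pool(attn_maps, M):
    """Spatial attention graph pooling sketch.
    attn_maps: (N, n) attention values for N samples over n nodes
    M: number of nodes to keep (n / p_s)
    returns: sorted indices of the kept nodes, the common mask for all samples
    """
    summed = attn_maps.sum(axis=0)           # sum the N per-sample attention maps
    keep = np.argsort(summed)[::-1][:M]      # top-M nodes by summed attention
    return np.sort(keep)

def subgraph(A, keep):
    """Restrict adjacency A to the kept nodes, dropping redundant edges."""
    return A[np.ix_(keep, keep)]
```

Because the mask is computed from attention summed over all samples, every subject shares the same pruned graph G', which is what makes the pooled node set interpretable as "the most relevant brain regions".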

2.4.5. Directed acyclic graph for multi-scale analysis on functional signals and connectivity

        They use both ROI signals and FC matrices to learn temporal-sequence and connectivity features

2.4.6. Implementation

        ①⭐For FC, they threshold the connectivity matrix to keep the top 20% of connections

        ②The numbers of filters in the convolutional layers are selected from \left \{ 8,16,32,64,128,256 \right \}

        ③The output head has three fully connected layers, with 256, 256 and 1 hidden nodes respectively

        ④Dropout rate: 0.2

        ⑤Leak rate in Leaky ReLU: 0.33 (the authors say this is because negative values in both the time series and the functional connectivity are meaningful. Then the question: is the top-20% thresholding based on absolute values? If not, wouldn't most negative connections be discarded?)

        ⑥Batch size: 32

        ⑦Stochastic gradient descent is adopted

        ⑧Training usually converges after about 10 epochs

2.4.7. Evaluation metrics and cross-validation

        ①Root mean square error (RMSE) is their quantitative comparison method

        ②Mean absolute error (MAE) and Pearson's correlation between ground truth and predicted values are adopted as well
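The three metrics can be computed directly with NumPy:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """RMSE, MAE, and Pearson's r between ground truth and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    rmse = np.sqrt(np.mean(err ** 2))        # root mean square error
    mae = np.mean(np.abs(err))               # mean absolute error
    r = np.corrcoef(y_true, y_pred)[0, 1]    # Pearson's correlation
    return rmse, mae, r
```

Note that a constant offset in the predictions inflates RMSE and MAE but leaves Pearson's r unchanged, which is why the paper reports all three.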

2.5. Datasets and MRI preprocessing

        ①Adolescent Brain Cognitive Development (ABCD) dataset is for predicting fluid intelligence

        ②Open Access Series of Imaging Study-3 (OASIS-3) dataset is for predicting age

2.5.1. Adolescent brain cognitive development (ABCD)

        ①Site: ABCD Study (abcdstudy.org)

        ②The authors select T1 and rs-fMRI images with 2.4 mm isotropic voxels and an 800 ms TR

        ③They exclude erroneously scanned images and one site with only 24 subjects, leaving 18 sites with 7693 subjects

        ④The fluid intelligence scores of the 7693 subjects range from 64 to 123, with a mean ± standard deviation of 95.3±7.3

Vocabulary: isotropic (adj.)

2.5.2. Open access series of imaging study-3 (OASIS-3)

        ①It is a dataset of Alzheimer’s disease (AD)

        ②Site: OASIS Brains - Open Access Series of Imaging Studies

        ③Samples: 468

2.5.3. MRI preprocessing

        ①They use FreeSurfer 5.3.0 to segment brain images into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF)

        ②rs-fMRI scans whose mean framewise displacement (FD) from head motion exceeds 0.5 mm are excluded

Vocabulary: pediatric (adj.)

2.6. Results

        ①5-fold and leave-one-site-out cross-validation are used on ABCD, and 5-fold cross-validation is used on OASIS-3

        ②They compared their model with BrainNetCNN and support vector regression (SVR) on both datasets

2.6.1. Spatio-temporal directed acyclic graph learning

        ①They adopt 5-fold cross-validation, 4 folds for training and 1 fold for validation

        ②Learning rate: 1e-3

        ③5-fold cross-validation is repeated 10 times

        ④Figure of (A) the ST-graph-conv network and (B) the FC-conv network:

        ⑤Then, they get scatter plots in fluid intelligence prediction

        ⑥The accuracy of the three models:

2.6.2. Fluid intelligence prediction via leave-one-site-out cross-validation

        ①The ABCD is used for predicting fluid intelligence via leave-one-site-out cross-validation

        ②Correlation, MAE and RMSE between actual and predicted fluid intelligence:

        ③Attention maps for fluid intelligence and age, built from the first block after the computation of the spatial attention graph pooling module; they show the brain regions most relevant to the prediction

2.6.3. Age prediction

        ①OASIS-3 dataset is used for predicting age

        ②In 5-fold cross-validation, 4 for training and 1 for validation as well

        ③Learning rate: 1e-2

        ④l_{2}-norm regularization rate: 1e-4

        ⑤Correlation, MAE and RMSE between actual and predicted age:

2.6.4. Comparisons with BrainNetCNN and SVR in the prediction of fluid intelligence and age

        They analyse the performance of the different models

2.6.5. Comparisons with elastic net’s mixture with random forest, spatio-temporal graph convolution, and BrainNetCNN

        ①Their model aims to analyse functional time signals

        ②BrainNetCNN is for FC networks

2.7. Discussion

        ①They proposed ST-DAG-Att to predict cognition and age from functional time series and connectivity

        ②They analyzed the brain regions that play a major role in both predictions

3. Reference List

Huang, S. et al. (2022) 'Spatio-temporal directed acyclic graph learning with attention mechanisms on brain functional time series and connectivity', Medical Image Analysis, vol. 77.
