Full title of the paper: Spatio-temporal directed acyclic graph learning with attention mechanisms on brain functional time series and connectivity
Table of contents
2.3.1. Deep learning on functional time series
2.3.2. Deep learning on functional connectivity
2.4.1. Spatio-temporal graph convolutional network (ST-graph-conv)
2.4.2. Functional connectivity convolutional (FC-conv) network
2.4.3. Functional connectivity based spatial attention (FC-SAtt)
2.4.4. Spatial attention graph pooling
2.4.5. Directed acyclic graph for multi-scale analysis on functional signals and connectivity
2.4.7. Evaluation metrics and cross-validation
2.5. Datasets and MRI preprocessing
2.5.1. Adolescent brain cognitive development (ABCD)
2.5.2. Open access series of imaging study-3 (OASIS-3)
2.6.1. Spatio-temporal directed acyclic graph learning
2.6.2. Fluid intelligence prediction via leave-one-site-out cross-validation
2.6.4. Comparisons with BrainNetCNN and SVR in the prediction of fluid intelligence and age
1. TL;DR
1.1. Takeaways
①The paper combines functional time series and connectivity matrices, but the overall model feels too complicated and messy; it is not an easy model to learn from or borrow ideas from
②A knowledge-plus-data-driven paper
1.2. Framework figure of the paper
2. Section-by-section close reading
2.1. Abstract
①They developed a spatio-temporal directed acyclic graph with attention mechanisms (ST-DAG-Att)
②The authors apply this model to functional magnetic resonance imaging (fMRI) data
③ST-DAG-Att consists of two parts arranged as a feed-forward structure in a directed acyclic graph (DAG): a spatio-temporal graph convolutional network (ST-graph-conv) and a functional connectivity convolutional network (FC-conv)
④This framework also contains functional connectivity-based spatial attention (FC-SAtt)
⑤They used two large datasets: Adolescent Brain Cognitive Development (ABCD, n=7693) and Open Access Series of Imaging Study-3 (OASIS-3, n=1786)
⑥Task: generalizing from cognition prediction to age prediction
2.2. Introduction
①Briefly introduces fMRI and its use in disease diagnosis and in predicting individual demographic information and cognitive ability
②Models like RNN, LSTM, GRU are for temporal analysis. (They list others as well)
③Information and connections between brain regions might be ignored when using functional time series only
④Their model is based on directed acyclic graph (DAG)
⑤ST-DAG-Att outperforms other models in accuracy
⑥This model contains a) signal and network processing, b) spatial, temporal and functional connectivity information, c) spatial attention pooling
schizophrenia n. a chronic psychotic mental disorder (vocabulary note)
2.3. Related work
2.3.1. Deep learning on functional time series
①RNN, LSTM and GRU all take functional time-series data as input
②GCNs can also be used in fMRI analysis; there, the functional time series serve as the node signals and the functional connections (FC) define the graph
(A guess at the difference between ① and ②: ① takes the ROI × time-points matrix directly as input, whereas ② builds the graph from the ROI × ROI connectivity and attaches the ROI × time-points signals to the nodes)
2.3.2. Deep learning on functional connectivity
①In CNNs and DNNs, FC matrices are treated as images
②BrainNetCNN includes edge-to-edge layers, edge-to-node layers, and node-to-graph layers to represent topological relationships
③Some GCN approaches take each subject as a node and the similarity between subjects as the edges
2.4. Methods
①ST-DAG-Att framework figure, where the blue blocks are ST-graph-conv networks and the green blocks are FC-conv networks
②They define the nodes as ROIs and the edges as functional connections
2.4.1. Spatio-temporal graph convolutional network (ST-graph-conv)
①G = (V, E) is the brain graph, where V represents the set of nodes and E represents the set of edges
②For N nodes and T time points, X ∈ R^{N×T} is the functional time series (why represent it with a function-style notation?)
③ST-graph-conv figure:
where N denotes the number of ROIs;
T denotes the number of time points;
a series of channel counts C denotes the numbers of filter channels per layer;
K denotes the kernel size of the filters;
and the temporal and spatial pooling strides are given last, respectively
④There are "8 filters in the temporal convolution and 8 spectral filters designed by the Chebyshev polynomials of order 4 in the spectral graph convolution" in each ST-graph-conv layer
⑤The stride of both temporal and spatial pooling is 2
(1)Temporal convolution
①The temporal convolution combines 1-D temporal filtering with temporal average pooling and a leaky-ReLU activation:
where K denotes the temporal filter size;
AvgPool_t denotes temporal average pooling;
w_i represents the i-th temporal filter of size K;
σ denotes leaky ReLU
②The feature map size is then changed accordingly: the time dimension is halved by the stride-2 pooling and the channel dimension equals the number of temporal filters (a minimal sketch follows below)
entangle v. to twist together; to involve or ensnare (vocabulary note)
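A minimal PyTorch sketch of this temporal-convolution step, assuming 1×K filters applied along the time axis, stride-2 temporal average pooling and a leaky ReLU with slope 0.33 (the 8 filters, the stride and the slope follow the notes above; the kernel size of 5, the `TemporalConv` name and the toy shapes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class TemporalConv(nn.Module):
    """Temporal convolution over (batch, channels, ROIs, time) fMRI features:
    1 x K filters shared across ROIs, stride-2 temporal average pooling,
    leaky-ReLU activation."""
    def __init__(self, in_channels=1, out_channels=8, kernel_size=5):
        super().__init__()
        # convolve only along the time axis, keep the ROI axis intact
        self.conv = nn.Conv2d(in_channels, out_channels,
                              kernel_size=(1, kernel_size),
                              padding=(0, kernel_size // 2))
        self.pool = nn.AvgPool2d(kernel_size=(1, 2), stride=(1, 2))  # halves the time axis
        self.act = nn.LeakyReLU(negative_slope=0.33)

    def forward(self, x):  # x: (batch, in_channels, N_rois, T)
        return self.pool(self.act(self.conv(x)))

# toy usage: 2 subjects, 360 ROIs, 128 time points
x = torch.randn(2, 1, 360, 128)
print(TemporalConv()(x).shape)  # torch.Size([2, 8, 360, 64])
```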
(2)Spatial graph convolution
①They adopt spectral filters in the graph Fourier domain, g_θ(Λ) = Σ_{k=0}^{K-1} θ_k T_k(Λ̃):
where K is the order of the Chebyshev polynomials;
Λ denotes the eigenvalues of the graph Laplacian L, which represents the brain functional network of graph G;
θ_k denotes the shape parameters;
T_k denotes the k-th Chebyshev polynomial (a code sketch of this filtering is given at the end of this subsection)
②This graph convolution is then applied across the whole time series, with all time points sharing the same filters;
the graph Laplacian is built as L = I − D^{−1/2} A D^{−1/2}, where I denotes the identity matrix, D the degree matrix and A the adjacency matrix (the degree matrix is the diagonal matrix of the row sums of A, so the two are not the same thing)
③The authors explain that the features are mapped into the graph Fourier domain via the graph Fourier transform (the eigenvectors of the Laplacian), filtered there according to the shape of the Chebyshev-polynomial filter, and then transformed back into the node domain (hard to follow, but it sounds impressive)
④The feature map is then transformed again, its channel dimension now given by the 8 spectral filters
⑤Furthermore, spatial pooling takes the spatial attention into account: the features are weighted by the spatial attention map before pooling,
where ⊙ represents element-wise multiplication;
the spatial attention map comes from the FC-SAtt module described in Section 2.4.3;
the pooling itself is a computational unit that downsamples the nodes
⑥Lastly, the number of nodes is halved by the stride-2 spatial pooling
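A minimal sketch of order-K Chebyshev spectral graph filtering as described in this subsection, shared across all time points. The normalized-Laplacian construction and the Chebyshev recurrence are standard; the λ_max ≈ 2 rescaling, the function names and the toy shapes are illustrative assumptions rather than the paper's exact implementation:

```python
import torch

def normalized_laplacian(adj):
    """L = I - D^{-1/2} A D^{-1/2} for a (weighted) adjacency matrix A."""
    deg = adj.sum(dim=-1)
    d_inv_sqrt = torch.where(deg > 0, deg.pow(-0.5), torch.zeros_like(deg))
    D = torch.diag(d_inv_sqrt)
    n = adj.shape[0]
    return torch.eye(n) - D @ adj @ D

def chebyshev_graph_conv(x, adj, theta):
    """Order-K Chebyshev filtering, shared over the time axis.

    x:     (N, T) node signals (one channel, all time points share the filter)
    adj:   (N, N) functional connectivity graph
    theta: (K,)   Chebyshev coefficients (the filter's "shape parameters")
    """
    L = normalized_laplacian(adj)
    # rescale eigenvalues into [-1, 1]; lambda_max = 2 is a common approximation
    L_tilde = L - torch.eye(adj.shape[0])
    Tk_prev, Tk = torch.eye(adj.shape[0]), L_tilde
    out = theta[0] * (Tk_prev @ x)
    if len(theta) > 1:
        out = out + theta[1] * (Tk @ x)
    for k in range(2, len(theta)):
        Tk_next = 2 * L_tilde @ Tk - Tk_prev     # Chebyshev recurrence
        out = out + theta[k] * (Tk_next @ x)
        Tk_prev, Tk = Tk, Tk_next
    return out

# toy usage: 6 ROIs, 10 time points, order-4 filter
adj = torch.rand(6, 6)
adj = (adj + adj.T) / 2
x = torch.randn(6, 10)
print(chebyshev_graph_conv(x, adj, torch.randn(4)).shape)  # torch.Size([6, 10])
```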
(3)Spatio-temporal aggregation
①To expand the receptive field from local to global, they aggregate global spatial and temporal information (a small sketch follows below):
where t-avg is the abbreviation of temporal global average and t-std of temporal standard deviation;
spatial filters of a fixed kernel size are applied to these statistics;
σ denotes leaky ReLU
②The feature map size changes accordingly
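A small sketch of the global-aggregation idea: per-node temporal mean and standard deviation are computed and mixed by a learned spatial filter with a leaky-ReLU activation. Concatenating the two statistics and using a 1×1 kernel are assumptions for illustration; the paper's exact kernel size is not reproduced in these notes:

```python
import torch
import torch.nn as nn

class SpatioTemporalAggregation(nn.Module):
    """Aggregate global temporal statistics (mean and std) per node."""
    def __init__(self, in_channels=8, out_channels=8):
        super().__init__()
        # spatial filter over the concatenated [t-avg, t-std] statistics;
        # a 1x1 kernel is assumed here for illustration
        self.spatial_filter = nn.Conv2d(2 * in_channels, out_channels, kernel_size=1)
        self.act = nn.LeakyReLU(0.33)

    def forward(self, x):  # x: (batch, channels, N_rois, T)
        t_avg = x.mean(dim=-1, keepdim=True)        # temporal global average
        t_std = x.std(dim=-1, keepdim=True)         # temporal standard deviation
        stats = torch.cat([t_avg, t_std], dim=1)    # (batch, 2C, N, 1)
        return self.act(self.spatial_filter(stats)) # (batch, C_out, N, 1)

x = torch.randn(2, 8, 360, 64)
print(SpatioTemporalAggregation()(x).shape)  # torch.Size([2, 8, 360, 1])
```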
2.4.2. Functional connectivity convolutional (FC-conv) network
①FC-conv network figure (a code sketch follows after this list):
the input is the functional time series X;
Pearson's correlation then gives the functional connectivity matrix;
an edge convolution with the filter kernel size shown in the figure follows;
then a node convolution with its own filter kernel size;
the output is a connectivity feature vector
②There are 128 edge filters and 256 node filters in each layer
③The bottleneck ratio in MLP is 4
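A minimal sketch of the FC-conv branch: the Pearson-correlation connectivity matrix is computed from the ROI time series, then an edge convolution and a node convolution reduce it to a feature vector. The 128 edge filters and 256 node filters follow the notes; the cross-shaped 1×N and N×1 kernels are an assumption borrowed from the BrainNetCNN-style layers mentioned in the related work, not necessarily the paper's exact kernels:

```python
import torch
import torch.nn as nn

def pearson_fc(x):
    """Pearson-correlation functional connectivity for ROI time series x: (N, T)."""
    return torch.corrcoef(x)      # (N, N) correlation matrix

class FCConv(nn.Module):
    """Edge conv (assumed 1 x N kernel) followed by node conv (assumed N x 1 kernel)."""
    def __init__(self, n_rois, edge_filters=128, node_filters=256):
        super().__init__()
        self.edge_conv = nn.Conv2d(1, edge_filters, kernel_size=(1, n_rois))
        self.node_conv = nn.Conv2d(edge_filters, node_filters, kernel_size=(n_rois, 1))
        self.act = nn.LeakyReLU(0.33)

    def forward(self, fc):                 # fc: (batch, 1, N, N)
        e = self.act(self.edge_conv(fc))   # (batch, 128, N, 1)
        v = self.act(self.node_conv(e))    # (batch, 256, 1, 1)
        return v.flatten(1)                # (batch, 256) connectivity feature vector

# toy usage: one subject, 6 ROIs, 100 time points
ts = torch.randn(6, 100)
fc = pearson_fc(ts).unsqueeze(0).unsqueeze(0)   # (1, 1, 6, 6)
print(FCConv(n_rois=6)(fc).shape)               # torch.Size([1, 256])
```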
2.4.3. Functional connectivity based spatial attention (FC-SAtt)
①Functional connectivity based spatial attention (FC-SAtt) figure:
②They first apply a channel average pooling layer (Cavg),
which generates channel-wise statistics
③A series of operations (fully connected layers with dropout) then follows to produce the spatial attention map,
where the drop rate of the fully connected layers is a hyperparameter (a sketch of this computation is given below)
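A sketch of the attention computation this subsection describes: channel average pooling summarizes each node, and a small bottleneck MLP with dropout maps the result to one attention weight per node. The reduction ratio of 4, the final sigmoid and the `FCSAtt` class name are illustrative assumptions:

```python
import torch
import torch.nn as nn

class FCSAtt(nn.Module):
    """Functional-connectivity-based spatial attention over N nodes.

    Channel average pooling (Cavg) summarizes each node's feature channels,
    then a bottleneck MLP produces one attention weight per node.
    The reduction ratio and the final sigmoid are illustrative assumptions.
    """
    def __init__(self, n_rois, reduction=4, drop_rate=0.2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(n_rois, n_rois // reduction),
            nn.LeakyReLU(0.33),
            nn.Dropout(drop_rate),      # drop rate of the fully connected layers
            nn.Linear(n_rois // reduction, n_rois),
            nn.Sigmoid(),
        )

    def forward(self, x):         # x: (batch, channels, N_rois)
        cavg = x.mean(dim=1)      # channel average pooling -> (batch, N_rois)
        return self.mlp(cavg)     # spatial attention map   -> (batch, N_rois)

x = torch.randn(2, 256, 360)
print(FCSAtt(n_rois=360)(x).shape)  # torch.Size([2, 360])
```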
2.4.4. Spatial attention graph pooling
①Spatial attention pooling operation figure:
it may generate a common spatial mask to apply to all the samples
②In the binary masks, the top nodes with the highest attention values are set to 1 and the others to 0
③The retained node set is then obtained by applying this mask,
where M denotes the number of samples
④The graph is updated accordingly: redundant nodes and their corresponding edges are removed, which reduces the computational cost and increases the proportion of effective signal (see the sketch below)
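A sketch of the pooling step: the attention values are averaged over all samples to form one common mask, the top-k nodes are kept (mask = 1) and the remaining nodes and their edges are dropped from the graph. Averaging over the batch to obtain the common mask and the exact ranking rule are my reading of the notes, not confirmed details of the paper:

```python
import torch

def spatial_attention_pooling(x, adj, attention, k):
    """Keep the top-k nodes by (sample-averaged) attention value.

    x:         (M, C, N)  node features for M samples
    adj:       (N, N)     adjacency / functional connectivity graph
    attention: (M, N)     spatial attention map per sample
    k:         number of nodes to retain
    """
    common_att = attention.mean(dim=0)          # common mask over all M samples
    keep = torch.topk(common_att, k).indices    # indices of the top-k nodes (mask = 1)
    x_new = x[:, :, keep]                       # drop redundant nodes
    adj_new = adj[keep][:, keep]                # drop their edges as well
    return x_new, adj_new

# toy usage: 4 samples, 8 channels, 10 nodes -> keep 5 nodes
x = torch.randn(4, 8, 10)
adj = torch.rand(10, 10)
att = torch.rand(4, 10)
x_new, adj_new = spatial_attention_pooling(x, adj, att, k=5)
print(x_new.shape, adj_new.shape)   # torch.Size([4, 8, 5]) torch.Size([5, 5])
```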
2.4.5. Directed acyclic graph for multi-scale analysis on functional signals and connectivity
They use both the ROI signals and the FC matrices to learn temporal-sequence and connectivity features; a sketch of a possible fusion head follows
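The notes do not spell out how the DAG fuses the two branches; a plausible minimal sketch, assuming the ST-graph-conv and FC-conv feature vectors are simply concatenated and fed to the 3-layer output head with 256, 256 and 1 hidden nodes mentioned in Section 2.4.6 (the concatenation, the input dimensions and the class name are assumptions):

```python
import torch
import torch.nn as nn

class STDAGAttHead(nn.Module):
    """Illustrative fusion head: concatenate branch features, regress one score.

    How the DAG actually merges the ST-graph-conv and FC-conv branches is not
    detailed in these notes; concatenation here is only an assumption.
    The 256-256-1 fully connected output layers follow Section 2.4.6.
    """
    def __init__(self, st_dim, fc_dim, drop_rate=0.2):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(st_dim + fc_dim, 256), nn.LeakyReLU(0.33), nn.Dropout(drop_rate),
            nn.Linear(256, 256),             nn.LeakyReLU(0.33), nn.Dropout(drop_rate),
            nn.Linear(256, 1),
        )

    def forward(self, st_feat, fc_feat):   # (batch, st_dim), (batch, fc_dim)
        return self.head(torch.cat([st_feat, fc_feat], dim=1)).squeeze(1)

print(STDAGAttHead(128, 256)(torch.randn(2, 128), torch.randn(2, 256)).shape)  # torch.Size([2])
```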
2.4.6. Implementation
①⭐For FC, they threshold the functional connectivity matrix, keeping the top 20% of connections
②Filters in the convolutional layers are selected from a fixed candidate set
③There are 3 fully connected layers in the output layer, with 256, 256 and 1 hidden nodes respectively
④Dropout rate: 0.2
⑤Leak rate in Leaky ReLU: 0.33 (the authors say this is because negative values in both the time series and the functional connectivity are meaningful; that raises a question: is the top-20% threshold applied to absolute values? If not, most negative connections would presumably be discarded)
⑥Batch size: 32
⑦Stochastic gradient descent is adopted
⑧Training usually converges after 10 epochs (a minimal training-loop sketch wiring these settings together follows)
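A minimal training-loop sketch tying together the listed hyperparameters (stochastic gradient descent, batch size 32, dropout 0.2, leaky-ReLU slope 0.33, roughly 10 epochs). The stand-in model, the random data, the momentum of 0.9 and the weight decay are illustrative assumptions; the learning rates actually used are given in Section 2.6:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# toy stand-in for ST-DAG-Att: any regression model with dropout 0.2 would do here
model = nn.Sequential(nn.Linear(64, 256), nn.LeakyReLU(0.33), nn.Dropout(0.2), nn.Linear(256, 1))

# stochastic gradient descent as stated in the notes; momentum and weight decay are assumptions
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9, weight_decay=1e-4)
criterion = nn.MSELoss()

# random toy data: 256 subjects with 64-dim features and one target score
loader = DataLoader(TensorDataset(torch.randn(256, 64), torch.randn(256, 1)),
                    batch_size=32, shuffle=True)

for epoch in range(10):                  # "usually convergent after 10 epochs"
    for features, target in loader:
        optimizer.zero_grad()
        loss = criterion(model(features), target)
        loss.backward()
        optimizer.step()
```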
2.4.7. Evaluation metrics and cross-validation
①Root mean square error (RMSE) is their main quantitative comparison metric
②Mean absolute error (MAE) and Pearson's correlation between ground-truth and predicted values are used as well (a small helper computing all three is sketched below)
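A small helper computing the three reported metrics, RMSE, MAE and Pearson's correlation, between ground-truth and predicted scores; the toy numbers are made up for illustration:

```python
import numpy as np
from scipy.stats import pearsonr

def evaluate(y_true, y_pred):
    """RMSE, MAE and Pearson's correlation between ground truth and predictions."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    mae = np.mean(np.abs(y_true - y_pred))
    r, _ = pearsonr(y_true, y_pred)
    return {"RMSE": rmse, "MAE": mae, "r": r}

print(evaluate([95, 100, 88, 110], [97, 98, 90, 105]))
```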
2.5. Datasets and MRI preprocessing
①Adolescent Brain Cognitive Development (ABCD) dataset is for predicting fluid intelligence
②Open Access Series of Imaging Study-3 (OASIS-3) dataset is for predicting age
2.5.1. Adolescent brain cognitive development (ABCD)
①Website: ABCD Study (abcdstudy.org)
②The authors select T1 images and rs-fMRI images with 2.4 mm isotropic voxels and an 800 ms TR
③They exclude scans with errors and one site with only 24 subjects, leaving 18 sites with 7693 subjects
④The fluid intelligence scores of the 7693 subjects range from 64 to 123, with mean ± standard deviation of 95.3 ± 7.3
isotropic adj. having the same properties in every direction (vocabulary note)
2.5.2. Open access series of imaging study-3 (OASIS-3)
①It is a dataset of Alzheimer’s disease (AD)
②Website: OASIS Brains (Open Access Series of Imaging Studies)
③Samples: 468
2.5.3. MRI preprocessing
①They use FreeSurfer 5.3.0 to segment the brain images into gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF)
②rs-fMRI scans whose mean framewise displacement (FD) of head motion exceeds 0.5 mm are excluded
pediatric adj. relating to the medical care of children (vocabulary note)
2.6. Results
①5-fold and leave-one-site-out cross-validation are used on ABCD, and 5-fold cross-validation is used on OASIS-3
②They compare their model with BrainNetCNN and support vector regression (SVR) on both datasets
2.6.1. Spatio-temporal directed acyclic graph learning
①They adopt 5-fold cross-validation, 4 folds for training and 1 fold for validation
②Learning rate: 1e-3
③The 5-fold cross-validation is repeated 10 times
④figure of (A) ST-graph-conv network and (B) FC-conv network:
⑤Then, they get scatter plots in fluid intelligence prediction
⑥The accuracy of three models:
2.6.2. Fluid intelligence prediction via leave-one-site-out cross-validation
①The ABCD dataset is used for predicting fluid intelligence via leave-one-site-out cross-validation (a sketch of this splitting scheme is given at the end of this subsection)
②Correlation, MAE and RMSE between actual and predicted fluid intelligence:
③Attention maps for fluid intelligence and age are built from the first block, after the computation of the spatial attention graph pooling module; they show the brain regions most relevant to each prediction
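A sketch of leave-one-site-out cross-validation using scikit-learn's LeaveOneGroupOut, treating the 18 ABCD acquisition sites as groups; the random features, scores and site labels are stand-ins for illustration:

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
X = rng.normal(size=(180, 64))                 # toy features for 180 subjects
y = rng.normal(loc=95, scale=7, size=180)      # toy fluid-intelligence scores
sites = rng.integers(0, 18, size=180)          # 18 acquisition sites as groups

for fold, (train_idx, test_idx) in enumerate(LeaveOneGroupOut().split(X, y, groups=sites)):
    held_out_site = sites[test_idx][0]
    # train on the remaining sites, evaluate on the held-out site
    print(f"fold {fold}: held-out site {held_out_site}, "
          f"{len(train_idx)} train / {len(test_idx)} test subjects")
```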
2.6.3. Age prediction
①OASIS-3 dataset is used for predicting age
②In 5-fold cross-validation, 4 for training and 1 for validation as well
③Learning rate: 1e-2
④Norm regularization rate: 1e-4
⑤Correlation, MAE and RMSE between actual and predicted age:
2.6.4. Comparisons with BrainNetCNN and SVR in the prediction of fluid intelligence and age
This subsection analyses the performance of the different models
2.6.5. Comparisons with elastic net’s mixture with random forest, spatio-temporal graph convolution, and BrainNetCNN
①Their model aims to analyse functional time signals
②BrainNetCNN is for FC networks
2.7. Discussion
①They propose ST-DAG-Att to predict cognition and age from functional time series and connectivity
②They analyzed the brain regions that play a major role in both predictions
3. Reference List
Huang, S. et al. (2022) 'Spatio-temporal directed acyclic graph learning with attention mechanisms on brain functional time series and connectivity', Medical Image Analysis, vol. 77.