Paper title: Characterizing functional brain networks via Spatio-Temporal Attention 4D Convolutional Neural Networks (STA-4DCNNs)
Paper code: GitHub - 西江实验室UESTC/STA-4DCNN
The English here is all hand-typed; it is my summarizing and paraphrasing of the original paper. Some unavoidable typos and grammar slips may appear, and corrections in the comments are welcome! This post leans toward personal notes, so read with caution!
Table of contents
2.3.2. Data description and preprocessing
2.3.3. 4D convolution/deconvolution and attention copy
2.3.4. Model architecture of STA-4DCNN
2.3.5. Model setting and training scheme
2.3.6. Model evaluation and validation
2.4.1. Spatio-temporal pattern characterization of DMN in emotion T-fMRI
2.4.3. Effectiveness of spatio-temporal pattern characterization of other FBNs
2.4.5. Characterization of abnormal spatio-temporal patterns of FBNs in ASD patients
3.2. Deconvolution/Transposed convolution/Fractionally-strided convolution
1. TL;DR
1.1. Takeaways
(1) The stated contributions read as if nothing was contributed; everything went into "novelty", and I can hardly point to a single concrete contribution
(2) Only individual-level patterns are considered, not group-level ones
(3) Huh, so what exactly counts as shallow-layer data and what counts as deep-layer data?
1.2. Paper framework diagram
2. Close reading of the paper, section by section
2.1. Abstract
①There is still broad unexplored space in the analysis of 4D fMRI
②They proposed a Spatio-Temporal Attention 4D Convolutional Neural Network (STA-4DCNN) model for functional brain networks (FBNs) which includes Spatial Attention 4D CNN (SA-4DCNN) and Temporal Guided Attention Network (T-GANet) subnetworks.
③Introduce the dataset they adopted
2.2. Introduction
①4D fMRI comes from combining 3D brain images with a 1D time series
②Characterization methods for FBNs, such as the general linear model (GLM) in 1D, principal component analysis (PCA), sparse representation (SR), and independent component analysis (ICA) in 2D, and 3D convolution in 3D, are all limited for spatio-temporal analysis
③Briefly introduce how they experimented
④Fine, they regard their contributions as combining U-Net with the attention mechanism, plus the fact that the model works...
2.3. Method and materials
2.3.1. Method overview
The overview of SA-4DCNN:
where SA-4DCNN receives the 4D data and converts it into a 3D spatial output, and T-GANet receives both that 3D spatial output and the 4D data and converts them into a 1D temporal output.
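As a shape-level sketch of these two interfaces (the function bodies below are my own placeholders, not the paper's 4D U-Net or T-GANet; only the input/output shapes follow the description above):

```python
import numpy as np

def sa_4dcnn(fmri_4d):
    """(D, H, W, T) 4D fMRI -> (D, H, W) spatial pattern. Stand-in body."""
    return fmri_4d.mean(axis=-1)  # placeholder for the real 4D U-Net

def t_ganet(spatial_3d, fmri_4d):
    """(D, H, W) spatial pattern + 4D fMRI -> (T,) temporal pattern."""
    w = spatial_3d / spatial_3d.sum()  # spatial map as voxel weights
    # weighted-average time series over voxels (placeholder for T-GANet)
    return np.tensordot(fmri_4d, w, axes=([0, 1, 2], [0, 1, 2]))
```

With the HCP preprocessing sizes, the shapes would be 48\*56\*48\*88 in and 48\*56\*48 / 88 out.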
2.3.2. Data description and preprocessing
(1)Data selection
①Types of data: seven t-fMRI tasks (emotion, gambling, language, motor, relational, social, and working memory) and resting-state fMRI (rs-fMRI)
②Dataset: HCP S900 release (both types) and ABIDE I (only rs-fMRI, only for evaluating generalizability)
③Sample: 200 randomly selected subjects in HCP; 64 ASD and 83 typically developing (TD) subjects in ABIDE I
④Preprocessing: FSL FEAT toolbox, intensities normalized to 0–1, down-sampled to 48*56*48 (spatial size) and 88 (temporal size) for HCP data; the standard Configurable Pipeline for the Analysis of Connectomes (CPAC) for ABIDE I
(2)Data processing
①Training labels: dictionary learning and sparse representation (SR)
②Presenting the 4D fMRI as a 2D matrix X of size t×n, where t denotes the number of time points and n denotes the total number of brain voxels
③Decompose X as X = D×α + ε with dictionary learning and sparse representation (SR), where D is the t×k dictionary matrix, α is the k×n sparse coefficient matrix, ε is the error term, and k is the predefined dictionary size
④Each column of the dictionary matrix D is the temporal pattern of an FBN
⑤Each row of the sparse coefficient matrix α is the spatial pattern of an FBN
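The label generation above can be sketched with scikit-learn's `DictionaryLearning`; every size and hyperparameter below is a toy assumption, not the paper's setting:

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

t, n, k = 30, 200, 5                      # toy sizes: time points, voxels, atoms
rng = np.random.default_rng(0)
X = rng.standard_normal((t, n))           # 2D matrix form of the 4D fMRI

# sklearn treats rows as samples, so voxels (columns of X) become samples
dl = DictionaryLearning(n_components=k, alpha=1.0, max_iter=20, random_state=0)
code = dl.fit_transform(X.T)              # (n, k): sparse coefficients, transposed
D = dl.components_.T                      # (t, k): columns are temporal patterns
A = code.T                                # (k, n): rows are spatial patterns

X_hat = D @ A                             # X ≈ D·α + error term
```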
2.3.3. 4D convolution/deconvolution and attention copy
(1)Convolution
①4D convolution (a) and 4D deconvolution (b) figure:
②The authors explain that they decompose the 4D filter into separate 3D convolution kernels along the temporal dimension, use these 3D kernels to perform 3D convolutions on the input 4D data, and sum the 3D results to obtain the final 4D output.
③⭐To turn a D*H*W*C (spatial size * temporal size) 4D volume into 2D*2H*2W*2C, the first large "bandage" block (2D*2H*2W) is obtained by padding a small block (the authors don't explain how the padding is done, they only cite another paper). Then each large block is in turn padded with a new blank block, which yields the 2D*2H*2W*2C array. (I think this matters, so I'm explaining it in my own words so that it jumps out next time I reread these notes.)
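The temporal decomposition described above can be sketched naively as below, assuming "same" zero-padded 3D convolutions spatially and no temporal padding (the paper does not spell out its padding scheme):

```python
import numpy as np
from scipy.ndimage import convolve  # 3D convolution of each temporal slice

def conv4d(x, kernel):
    """Naive 4D convolution: split the 4D filter into 3D kernels along the
    temporal axis, 3D-convolve the matching temporal slices, sum the results.
    x: (D, H, W, T) input; kernel: (kd, kh, kw, kt) 4D filter."""
    kt, T = kernel.shape[-1], x.shape[-1]
    out = np.zeros(x.shape[:3] + (T - kt + 1,))
    for t in range(T - kt + 1):
        for j in range(kt):  # sum of kt 3D convolutions
            out[..., t] += convolve(x[..., t + j], kernel[..., j], mode="constant")
    return out
```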
(2)Attention
①They concatenate the shallow- and deep-layer features into a 5D tensor (D*H*W*C*L, where L is the channel dimension):
②The specific operation is:
where the reshape operation re-orders the 4D matrix of size D*H*W*C into a 2D one of size (D*H*W)*C
③The 4D attention output is:
where the scaling factor denotes the number of features
④Then restore the dimension through reverse operations:
where the 4D matrices are recovered by splitting along the last dimension (honestly I'm not sure about the wording here; I think it just means the output of attention()?)
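The attention copy can be sketched as scaled dot-product attention over the reshaped features; which of the shallow/deep features supply the query, key, and value is my own assumption, while scaling by the number of features follows the note above:

```python
import numpy as np

def attention_copy(shallow, deep):
    """Reorder D*H*W*C 4D features into (D*H*W) x C matrices, apply
    attention, then reverse the reshape. Q/K/V roles are assumptions."""
    D, H, W, C = shallow.shape
    q = shallow.reshape(-1, C)                       # (D*H*W, C)
    k = deep.reshape(-1, C)
    v = deep.reshape(-1, C)
    scores = q @ k.T / C                             # scaled by feature count
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                # row-wise softmax
    return (w @ v).reshape(D, H, W, C)               # reverse reshape -> 4D
```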
2.3.4. Model architecture of STA-4DCNN
The whole framework of STA-4DCNN:
(1)Spatial attention 4D CNN (SA-4DCNN)
①The original channel number is 1, and it becomes 2 after two 4D convolutional layers
②The red arrows are maxpooling layers
③Because the raw output would be 4D like the input, they adopt a 3D CNN to reduce the dimension
(2)Temporal guided attention network (T-GANet)
①Firstly they combine the spatial pattern and the 4D fMRI input IN:
shown by the top orange line, where the input of size S*S*S*C is reshaped to P*C (P = S*S*S)
②Then the two features are combined:
The formulas in (2) feel really confusing to me; the figure is already clear, but I can't quite match the formulas to it
2.3.5. Model setting and training scheme
①They designed two separate loss functions for SA-4DCNN and T-GANet
②Loss function in SA-4DCNN:
where the first term is the characterized spatial pattern (then dude, why don't you just use … ?) and the second denotes the training label
③Loss function in T-GANet:
④Training curve:
⑤Data split: 160 subjects for training and 40 for testing
⑥Parameter: the same as their previous paper (Yan et al., 2021)
⑦Learning rate: 0.0001 in the first and 0.0005 in the second training stage
⑧Epoch: 150 in spatial and 20 in temporal
⑨Optimizer: Adam in both stages
⑩Convolutional kernels: 3*3*3*3 for 4D and 3*3*3 for 3D convolutions
⑪Activation: each convolutional layer is followed by a batch normalization (BN) and a rectified linear unit (ReLU)
⑫When the parameter of the Fast Down-sampling Block is set to 12, they achieve the highest accuracy
2.3.6. Model evaluation and validation
①Brain network: default mode network (DMN)
②Evaluate spatio-temporal patterns through overlapping parts:
where blue lines represent STA-4DCNN, red is ST-CNN and green is SR.
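The comparison can be sketched with two simple similarity measures; the exact metrics (IoU of thresholded maps for space, Pearson correlation for time) and the 0.5 threshold are my assumptions of what "overlapping parts" means here:

```python
import numpy as np

def spatial_similarity(map_a, map_b, thr=0.5):
    """Overlap (intersection over union) of thresholded spatial maps."""
    a, b = map_a > thr, map_b > thr
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def temporal_similarity(ts_a, ts_b):
    """Pearson correlation between two temporal patterns."""
    return np.corrcoef(ts_a, ts_b)[0, 1]
```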
2.4. Results
2.4.1. Spatio-temporal pattern characterization of DMN in emotion T-fMRI
①Dataset: 40 subjects of the emotion t-fMRI
②Averaged spatial pattern similarity values of DMN (this low?):
③Averaged temporal pattern similarity values of DMN:
2.4.2. Generalizability of spatio-temporal pattern characterization of DMN in other six T-fMRI and One Rs-fMRI
①Dataset: 200 subjects in the other six t-fMRI and one rs-fMRI datasets
②Characterized spatio-temporal patterns of DMN in three example subjects of rs-fMRI sample:
③Characterized spatio-temporal patterns of DMN in one example subject of the other six t-fMRI samples:
2.4.3. Effectiveness of spatio-temporal pattern characterization of other FBNs
①Comparison on three other FBNs:
Obviously, the similarity is higher and there are fewer scattered noise points
②Averaged spatial pattern similarity values table for other 3 FBNs:
2.4.4. Ablation study
①Reduce one layer in 4D U-Net
②Drop 4D convolution layers after 4D convolution
③Remove attention copy operation
2.4.5. Characterization of abnormal spatio-temporal patterns of FBNs in ASD patients
①Averaged spatial and temporal pattern similarity values of ASD and TD in the DMN and auditory networks:
②Characterized spatio-temporal patterns of DMN with STA-4DCNN in five example ASD and TD subjects:
③Characterized spatio-temporal patterns of the auditory network with STA-4DCNN in five example ASD and TD subjects:
2.5. Conclusion
Their trained results are highly similar to the true labels. Moreover, the model can be generalized to other situations.
3. Background knowledge
3.1. Boltzmann machine
Too complicated, so no further explanation here: 机器学习笔记之深度玻尔兹曼机(一)玻尔兹曼机系列整体介绍_静静的喝酒的博客-CSDN博客
3.2. Deconvolution/Transposed convolution/Fractionally-strided convolution
【深度学习反卷积】反卷积详解, 反卷积公式推导和在Tensorflow上的应用 - 知乎 (zhihu.com)
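A minimal 1D sketch of transposed ("fractionally strided") convolution: each input element scatters a scaled copy of the kernel into the output, `stride` positions apart:

```python
import numpy as np

def transposed_conv1d(x, kernel, stride=2):
    """1D transposed convolution by scattering: the reverse of a strided
    convolution, used here to up-sample x by roughly `stride`."""
    kernel = np.asarray(kernel, dtype=float)
    out = np.zeros(stride * (len(x) - 1) + len(kernel))
    for i, v in enumerate(x):
        out[i * stride : i * stride + len(kernel)] += v * kernel
    return out
```

For example, `transposed_conv1d([1, 2], [1, 1, 1], stride=2)` scatters two overlapping kernel copies into a length-5 output.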
4. Reference List
Xi, J. et al. (2022) 'Characterizing functional brain networks via Spatio-Temporal Attention 4D Convolutional Neural Networks (STA-4DCNNs)', Neural Networks, vol. 158, pp. 00-110.
Yan, J. et al. (2021) 'A Guided Attention 4D Convolutional Neural Network for Modeling Spatio-Temporal Patterns of Functional Brain Networks', The 4th Chinese Conference on Pattern Recognition and Computer Vision (PRCV).