[Paper Close Reading] Multi-Scale FC-Based Multi-Order GCN: A Novel Model for Predicting Individual Behavior From fMRI

⑤5 tasks chosen: one motor-related test (Endurance), one executive-function-related test (Cognitive Flexibility), one memory-related test (Episodic Memory), one language-related test (Story Difficulty Level), and a comprehensive cognitive test (Fluid Intelligence)

⑥Measurement: NIH Cognition Battery toolbox

⑦Score adjustment: all scores are adjusted with the NIH National Norms toolbox so that they lie on a standardized scale

2.3.2. Imaging Preprocessing

①Minimal preprocessing pipeline: HCP fMRIVolume

②Preprocessing steps: 1) gradient distortion correction, 2) head motion correction, 3) EPI distortion correction, 4) registration to the Montreal Neurological Institute (MNI) space, 5) intensity normalization to a global mean, and 6) masking out non-brain voxels

③Artifact removal: ICA-based FIX (FMRIB's ICA-based Xnoiseifier)

2.4. Multi-Scale Brain-Behavior Relationship

①Atlas: Schaefer 100, 500 and 1k

②FCN construction: Pearson correlation

③Connections: all negative connections are set to 0, and only the top 5% strongest connections are kept

④System-level analysis: 7 functional subsystems, namely the visual network (VIS), somatosensory-motor network (SM), attention network (ATT), salience network (SAL), limbic system (LIM), frontoparietal network (FP), and default mode network (DMN), each split into left and right hemispheres, giving 14 regions overall.

⑤Pearson correlations are calculated between the networks, and FDR correction is applied

⑥CS matrix: the diagonal of the matrix represents the CS between systems and behaviors, while the off-diagonal values represent the FC between systems and the CS between behaviors
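The FCN construction and thresholding steps (② and ③ above) could look roughly like the NumPy sketch below. The function name `build_sparse_fc`, the input layout (time points by regions), and the choice to take the top 5% over all region pairs are assumptions made for illustration, not details confirmed by the paper.

```python
import numpy as np

def build_sparse_fc(timeseries, keep_fraction=0.05):
    """Hypothetical helper. timeseries: array of shape (n_timepoints, n_regions)."""
    fc = np.corrcoef(timeseries, rowvar=False)   # Pearson correlation between regions
    fc = (fc + fc.T) / 2.0                       # enforce exact symmetry
    np.fill_diagonal(fc, 0.0)                    # drop self-connections
    fc = np.where(fc > 0, fc, 0.0)               # negative connections are set to 0
    # Assumption: "top 5%" is counted over all unique region pairs.
    weights = fc[np.triu_indices_from(fc, k=1)]
    n_keep = max(1, int(round(keep_fraction * weights.size)))
    threshold = np.sort(weights)[-n_keep]
    return np.where((fc >= threshold) & (fc > 0), fc, 0.0)

rng = np.random.default_rng(0)
fc_sparse = build_sparse_fc(rng.standard_normal((200, 20)))
print(np.count_nonzero(np.triu(fc_sparse, k=1)), "edges kept")
```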

2.5. Multi-Scale FC Based Multi-Order Graph Convolutional Network

①Overall framework:

2.5.1. Multi-Scale Functional Connectivity Estimation

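①Sparsify: only the 5 strongest edges of each node are kept, to ensure the connectivity of the graph (Why do they say "keep only the top 5% of edges" in one place and "keep the five strongest edges per node" in another? And aren't there only 14 nodes in total...?)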

②Scales of one graph: $G=\{G^{1},G^{2},\ldots,G^{M}\}$
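The per-node sparsification in ① can be sketched as follows. This is only one plausible reading: the name `keep_top_k_per_node` is hypothetical, and keeping an edge when either endpoint selects it (so the graph stays symmetric) is an assumption.

```python
import numpy as np

def keep_top_k_per_node(fc, k=5):
    """Hypothetical helper: keep the k strongest edges of every node."""
    n = fc.shape[0]
    w = np.array(fc, dtype=float)
    np.fill_diagonal(w, -np.inf)                 # never select a self-loop
    top_idx = np.argsort(w, axis=1)[:, -k:]      # k largest entries per row
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), top_idx.ravel()] = True
    mask = mask | mask.T                         # assumption: keep edge if either node chose it
    return np.where(mask, fc, 0.0)

rng = np.random.default_rng(1)
a = rng.random((14, 14))
adj = keep_top_k_per_node((a + a.T) / 2, k=5)
print(np.count_nonzero(adj, axis=1))             # every node keeps at least 5 edges
```

Because of the symmetrization, some nodes end up with more than five neighbors; a stricter variant would use `mask & mask.T`, at the cost of possibly disconnecting nodes.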

2.5.2. Multi-Order Graph Convolutional Network

①Graph for the $n$ -th subject: $G_{n}=\{G_{n}^{1},G_{n}^{2}, \ldots, G_{n}^{M}\}$

②Graph at the $m$ -th scale: $G_n^m=(V_n^m,A_n^m,X_n^m)$

（1）Multi-Order Graph Convolution Layer

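①Multi-order aggregation (only one convolution is shown here, for one scale (atlas) of one subject; the three colors indicate different neighbor orders. The authors define the $k$ -th-order adjacency matrix as the original adjacency matrix raised to the $k$ -th power; at order 0 it is the identity matrix):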

②Node feature $(X_n^m)^i\in\mathbb{R}^{|V_n^m|\times d_i}$

③Graph convolution operator:

$(X_n^m)^{(i+1)}=\parallel _{0\leq k\leq K}\sigma((\widetilde{A}_n^m)^k(X_n^m)^iW_k^i)$

where $\sigma$ is defined as ReLU... (a very distinctive message-passing scheme...)

④ $\widetilde{A}_n^m=(\widetilde{D}_n^m)^{-\frac12}(A_n^m+I)(\widetilde{D}_n^m)^{-\frac12}$
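A minimal NumPy sketch of the operator in ③ and the normalization in ④, for one subject and one scale. It assumes that $\parallel$ means column-wise concatenation over the orders $k=0,\ldots,K$ and that the adjacency matrix is non-negative; the function names are hypothetical.

```python
import numpy as np

def normalize_adjacency(adj):
    """D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def multi_order_conv(adj_norm, x, weights):
    """weights: list of K+1 matrices W_k, each of shape (d_in, d_out_k)."""
    outputs = []
    propagated = x                                # order 0: identity times X
    for k, w_k in enumerate(weights):
        if k > 0:
            propagated = adj_norm @ propagated    # now equals (A~)^k X
        outputs.append(np.maximum(propagated @ w_k, 0.0))   # ReLU
    return np.concatenate(outputs, axis=1)        # assumption: || is concatenation

rng = np.random.default_rng(2)
a = rng.random((6, 6))
adj = (a + a.T) / 2
np.fill_diagonal(adj, 0.0)
x = rng.standard_normal((6, 4))
ws = [rng.standard_normal((4, 3)) for _ in range(3)]   # K = 2
out = multi_order_conv(normalize_adjacency(adj), x, ws)
print(out.shape)
```

Reusing `propagated` avoids forming the matrix power explicitly, so each extra order costs one additional matrix product.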

（2）Pooling Layer

①The final feature vector can be calculated by:

$f_n^m=p(X_n^m)=\frac1{|V_n^m|}\sum_{v=1}^{|V_n^m|}(x_v)_n^m$

（3）Inter-Scale Contrast Constraint

①To enhance the similarity between different scales:

$L_{inter}=\sum_{m=1}^{M-1}\max\left(Dist\left(f_n^m,f_n^{m+1}\right)-Dist\left(f_n^m,f_s^m\right)+\delta,0\right)$

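where the features of the same subject at adjacent scales, $(f_n^m,f_n^{m+1})$, form the positive pair and the features of another subject $s$ at the same scale, $(f_n^m,f_s^m)$, form the negative pair, $Dist$ denotes the Euclidean distance between two vectors, and $\delta$ denotes the margin parameter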
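To make the hinge structure of $L_{inter}$ concrete, here is a small sketch for a single subject $n$ and a single comparison subject $s$. How $s$ is sampled is not stated in these notes, so passing it in explicitly is an assumption; `inter_scale_loss` is a hypothetical name.

```python
import numpy as np

def inter_scale_loss(f_n, f_s, margin=1.0):
    """f_n, f_s: lists of M pooled feature vectors (one per scale)."""
    loss = 0.0
    for m in range(len(f_n) - 1):
        d_pos = np.linalg.norm(f_n[m] - f_n[m + 1])   # same subject, adjacent scales
        d_neg = np.linalg.norm(f_n[m] - f_s[m])       # other subject, same scale
        loss += max(d_pos - d_neg + margin, 0.0)
    return loss

f_n = [np.array([0.0, 0.0]), np.array([3.0, 4.0]), np.array([3.0, 4.0])]
f_s = [np.array([1.0, 0.0]), np.array([0.0, 0.0]), np.array([0.0, 0.0])]
print(inter_scale_loss(f_n, f_s))   # 5.0: first term is 5 - 1 + 1, second is clipped to 0
```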

2.5.3. Adaptive Feature Fusion

①Total feature of subject $n$ :

$f_n= \{f_n^1,f_n^2,...,f_n^M \}$

②Mean pooling to obtain graph representation:

$\theta_n^m=h_{\mathrm{Avg}}\left(f_n^m\right)$

where $h_{\mathrm{Avg}}$ denotes global average pooling

③Contribution weight (attention?):

$\varphi_n=g\left(\theta_n\right)=\delta\left(Q\sigma\left(W\theta_n\right)\right)$

where $W$ and $Q$ are trainable parameters and $\delta$ here is the Sigmoid function (not the margin parameter of $L_{inter}$)

④Joint/final features for one person:

$z_n=\begin{Bmatrix}\varphi_n^1f_n^1,\varphi_n^2f_n^2,\ldots,\varphi_n^Mf_n^M\end{Bmatrix}$
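Steps ② to ④ resemble a squeeze-and-excitation style gating over the $M$ scales. The sketch below assumes that $\sigma$ is ReLU, that all scale features share one dimension $d$, and that $z_n$ is the concatenation of the reweighted features; the shapes of `w` and `q` are illustrative choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_fusion(features, w, q):
    """features: (M, d) array; w: (h, M); q: (M, h). Shapes are assumptions."""
    theta = features.mean(axis=1)                    # global average pooling per scale
    phi = sigmoid(q @ np.maximum(w @ theta, 0.0))    # one weight in (0, 1) per scale
    z = (phi[:, None] * features).reshape(-1)        # reweight, then concatenate
    return z, phi

rng = np.random.default_rng(3)
feats = rng.standard_normal((3, 12))                 # M = 3 scales, 12 features each
z, phi = adaptive_fusion(feats, rng.standard_normal((2, 3)), rng.standard_normal((3, 2)))
print(z.shape, phi)
```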

2.5.4. Behavior Score Estimation

①Behavior score:

$\hat{Y}=z_nU$

where $U$ denotes trainable parameter

②Loss:

$L_{\text{total}}=\alpha L_{\text{inter}}+L_{\text{MAE}}$

$L_{\mathrm{MAE}}=\frac{1}{N}\sum_{i=1}^{N}E\left(y_i,\hat{y}_i\right)$

where $E$ denotes absolute error
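The score estimation and the combined loss can be sketched in a few lines. This treats $U$ as a plain weight vector without a bias term, which is an assumption, and takes the inter-scale term as a precomputed number.

```python
import numpy as np

def mae_loss(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def total_loss(z, u, y_true, l_inter, alpha=1.0):
    """z: (N, D) fused features; u: (D,) weights; l_inter: precomputed scalar."""
    y_pred = z @ u                                   # linear readout
    return alpha * l_inter + mae_loss(y_true, y_pred), y_pred

z = np.array([[1.0, 2.0], [3.0, 4.0]])
u = np.array([1.0, 0.5])
loss, y_pred = total_loss(z, u, np.array([1.0, 7.0]), l_inter=2.0, alpha=0.5)
print(y_pred, loss)   # predictions [2. 5.], loss 0.5 * 2 + 1.5 = 2.5
```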

2.6. Implementation

2.6.1. Model Settings and Evaluation Metric

①Filters in the 2 GCN layers: 96 and 12 (Does a GCN even have filters? What is this supposed to be? The hidden-layer dimensions?)

②12-channel pooling layer and a 1-channel fully connected layer

③Optimizer: Adam

④Learning rate: 0.005

⑤Regularization: L2 with coefficient 0.0005

⑥Cross validation: 5 fold

⑦Batch size: 16

⑧Number of training iterations: 70

⑨Evaluation: average over 5 repetitions of 5-fold cross-validation
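The evaluation protocol in ⑥ and ⑨ (5-fold cross-validation repeated 5 times, giving 25 train/test splits to average over) can be generated like this. The reshuffling per repetition and the seed handling are assumptions; the notes do not say how the folds were drawn.

```python
import numpy as np

def repeated_kfold_indices(n_samples, n_folds=5, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = np.random.default_rng(seed)
    for rep in range(n_repeats):
        folds = np.array_split(rng.permutation(n_samples), n_folds)  # reshuffle each repeat
        for i in range(n_folds):
            train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
            yield rep, i, train_idx, folds[i]

splits = list(repeated_kfold_indices(23))
print(len(splits))   # 25 splits: 5 repeats x 5 folds
```

The reported metric would then be the mean of the per-split scores.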

2.6.2. Compared Methods

①Hyperparameter setting in each compared method:

（1）Kernel Regression Method

（2）FNN

（3）BrainNetCNN

（4）GCNN

（5）GAT

（6）SAGPool

（7）Meta-RegGNN

（8）BC-GCN-SE

2.7. Results and Discussion

2.7.1. Comparison of FC-Behavior Relationship Between Different Scales

①Correlation difference between scales:

so they reckon the brain has a hierarchical structure

2.7.2. Parameter Analysis

①Grid search on hyperparameter:

where $K=(K_{\mathrm{FCN-100}} , K_{\mathrm{FCN-500}} , K_{\mathrm{FCN-1000}})$ denotes the order $K$ used at each scale

②Fixing $\alpha$ to 1 and further testing the combinations of $K$ :

2.7.3. Comparison With Other Methods

①Comparison table:

2.7.4. Ablation Study

①Module ablation:

where baseline 1 denotes Single-Scale FCs + Multi-Order Graph Convolution, 2 denotes Multi-Scale FCs + 1-Order Graph Convolution + Inter-Scale Contrast Constraint, and 3 represents Multi-Scale FCs + Multi-Order Graph Convolution

2.7.5. Importance of Functional Connectivity

①Applying occlusion importance (OI) to the networks (the difference between the result obtained with one network's features masked and the original result):
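A rough sketch of the occlusion idea: zero out every connection that touches one functional system, rerun the model, and record how much the output changes. The function `occlusion_importance`, the `predict` callable, and the use of an absolute difference are all illustrative assumptions; the paper's exact OI definition may differ.

```python
import numpy as np

def occlusion_importance(fc, labels, predict):
    """fc: (n, n) matrix; labels: (n,) system id per node; predict: fc -> float."""
    baseline = predict(fc)
    importance = {}
    for system in np.unique(labels):
        occluded = fc.copy()
        nodes = labels == system
        occluded[nodes, :] = 0.0                 # mask all edges of this system
        occluded[:, nodes] = 0.0
        importance[int(system)] = abs(baseline - predict(occluded))
    return importance

# Toy stand-in for a trained model: the "prediction" is just the sum of all edges.
fc = np.ones((4, 4))
labels = np.array([0, 1, 1, 1])
print(occlusion_importance(fc, labels, lambda m: float(m.sum())))   # {0: 7.0, 1: 15.0}
```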

2.8. Conclusion

3. Supplementary Knowledge

3.1. FDR correction

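（1）Definition: FDR correction, i.e., False Discovery Rate correction, is a statistical correction method commonly used in multiple hypothesis testing. It aims to control the rate of false-positive discoveries.

（2）Method: the FDR is the expected proportion of incorrectly rejected hypotheses among all rejected null hypotheses. FDR correction lowers this proportion by adjusting the significance threshold.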
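As a concrete instance, the Benjamini-Hochberg procedure is the most common FDR correction. Whether the paper used exactly this variant is an assumption; the sketch only shows how the threshold adapts to the rank of each p-value.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level q."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    thresholds = q * np.arange(1, n + 1) / n     # rank-dependent cutoffs
    below = p[order] <= thresholds
    reject = np.zeros(n, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])         # largest rank meeting its cutoff
        reject[order[:k + 1]] = True             # reject everything up to that rank
    return reject

p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_vals))   # only the first two hypotheses are rejected
```

Note that five of these p-values are below 0.05, yet only two survive the correction.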

4. Reference

Wen, X. et al. (2024) 'Multi-Scale FC-Based Multi-Order GCN: A Novel Model for Predicting Individual Behavior From fMRI', IEEE Transactions on Neural Systems and Rehabilitation Engineering, 32: 548-558. doi: 10.1109/TNSRE.2024.3357059