[Paper Close Reading] Community-Aware Transformer for Autism Prediction in fMRI Connectome

①Treating every ROI equally overlooks the community structure among them, i.e. the functional networks the ROIs belong to. The authors therefore propose the Com-BrainTF model, which learns both local and global representations

②The parameters are shared across communities, but each community gets its own specific token

2.2. Introduction

①ASD patients show abnormalities in the default mode network (DMN) and are affected by significant changes in the dorsal attention network (DAN) and DMN

②Com-BrainTF is a hierarchical transformer: a local transformer learns community-specific embeddings and a global transformer aggregates whole-brain information

③Sharing the local transformer parameters can avoid over-parameterization

2.3. Method

2.3.1. Overview

（1）Problem Definition

①They use Pearson correlation coefficients to obtain the functional connectivity (FC) matrices

②The $N$ ROIs are then divided into $K$ communities $\{X_{1},X_{2},\ldots,X_{K}\},X_k\in\mathbb{R}^{N_k\times N}$

③The learned community embeddings $H=[H_{1},\ldots,H_{K}],H_k\in\mathbb{R}^{N_k\times N}$ are mapped to $Z_{L}\in\mathbb{R}^{N\times N}$

④A pooling layer followed by MLPs then predicts the labels
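The first two steps above can be sketched in a few lines of NumPy. This is only an illustration on random data: the function names `functional_connectivity` and `split_into_communities`, the toy sizes and the round-robin community labels are all assumptions, not taken from the paper's code.

```python
import numpy as np

def functional_connectivity(ts):
    """Pearson correlation between ROI time series; ts has shape (N, T)."""
    return np.corrcoef(ts)

def split_into_communities(fc, labels):
    """Group the rows of an (N, N) FC matrix by community label.

    Returns a dict {label: X_k} where X_k has shape (N_k, N).
    """
    labels = np.asarray(labels)
    return {int(k): fc[labels == k] for k in np.unique(labels)}

# Toy example: 12 ROIs, 50 time points, 3 communities (all hypothetical).
rng = np.random.default_rng(0)
N, T, K = 12, 50, 3
ts = rng.standard_normal((N, T))
labels = np.arange(N) % K          # made-up ROI-to-community assignment
fc = functional_connectivity(ts)
communities = split_into_communities(fc, labels)
```

Each `X_k` keeps all $N$ columns, so a community's nodes still carry their connectivity to every ROI in the brain, matching $X_k\in\mathbb{R}^{N_k\times N}$.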

（2）Overview of our Pipeline

①They provide a local transformer, a global transformer and a pooling layer in their local-global transformer architecture

②The overall framework

2.3.2. Local-global transformer encoder

①With the input FC, the learned node feature matrix $H_i$ can be calculated by $H_i=(\|_{m=1}^Mh^m)W_O$

②In transformer encoder module,

$h^m=\text{softmax}\bigg(\frac{Q^m(K^m)^T}{\sqrt{d_k^m}}\bigg)V^m$

where $Q^{m}=W_{Q}X_{i}^{\prime},K^{m}=W_{K}X_{i}^{\prime},V^{m}=W_{V}X_{i}^{\prime},X_{i}^{\prime}=[p_{i},X_{i}]$ ,

$M$ is the number of heads
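A minimal NumPy sketch of the multi-head attention above may help make the shapes concrete. It assumes row-major inputs, so the projections are applied as $X_i'W$ rather than $WX_i'$, and the weights are random stand-ins. The function name `multi_head_attention` and all sizes are hypothetical, and layer normalization and the feed-forward sublayer of a full transformer block are left out.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x_prime, w_q, w_k, w_v, w_o):
    """x_prime: (S, N) with S = N_k + 1 (token + nodes).

    w_q, w_k, w_v: (M, N, d_k) per-head projections; w_o: (M * d_k, N).
    """
    heads = []
    for wq, wk, wv in zip(w_q, w_k, w_v):
        q, k, v = x_prime @ wq, x_prime @ wk, x_prime @ wv
        attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))   # (S, S), rows sum to 1
        heads.append(attn @ v)                           # h^m, shape (S, d_k)
    return np.concatenate(heads, axis=-1) @ w_o          # (||_m h^m) W_O

rng = np.random.default_rng(0)
N, N_k, M, d_k = 16, 5, 4, 4                 # hypothetical sizes
p_i = rng.standard_normal((1, N))            # community token
X_i = rng.standard_normal((N_k, N))          # community node features
x_prime = np.concatenate([p_i, X_i], axis=0) # X_i' = [p_i, X_i]
w_q, w_k, w_v = (rng.standard_normal((M, N, d_k)) * 0.1 for _ in range(3))
w_o = rng.standard_normal((M * d_k, N)) * 0.1
out = multi_head_attention(x_prime, w_q, w_k, w_v, w_o)
p_i_new, H_i = out[:1], out[1:]              # updated token and node features
```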

（1）Local Transformer

①They apply the same local transformer to every community input, but use a unique learnable token per community, $\{p_1,p_2,\ldots,p_K\},p_i\in\mathbb{R}^{1\times N}$ :

$p_i^{'},H_i=\text{LocalTransformer}([p_i,X_i]),\quad i\in\{1,2,\ldots,K\}$

（2）Global Transformer

①The global operation is:

$p_{global}=\text{MLP}(\text{Concat}(p_1^{'},p_2^{'},\ldots,p_K^{'}))$

$H_{global}=\text{Concat}(H_1,H_2,\ldots,H_K)$

$p^{'},Z_L=\text{GlobalTransformer}([p_{global},H_{global}])$
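The local and global steps can be chained as in the sketch below. It is a simplified illustration, not the authors' implementation: `attention_layer` is a single-head stand-in for a full transformer block, the MLP is reduced to one linear map with a tanh, and all weights and sizes are random, hypothetical values.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_layer(x, w_q, w_k, w_v):
    """Single-head self-attention, standing in for a transformer block."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    return softmax(q @ k.T / np.sqrt(k.shape[-1])) @ v

rng = np.random.default_rng(0)
N = 12
sizes = [3, 4, 5]                                   # N_k per community, sums to N
K = len(sizes)
X = [rng.standard_normal((n_k, N)) for n_k in sizes]
prompts = [rng.standard_normal((1, N)) for _ in range(K)]        # one token per community

local_w = [rng.standard_normal((N, N)) * 0.1 for _ in range(3)]  # shared by all communities
global_w = [rng.standard_normal((N, N)) * 0.1 for _ in range(3)]
w_mlp = rng.standard_normal((K * N, N)) * 0.1                    # linear stand-in for the MLP

# Local step: same weights, different token for each community.
p_new, H = [], []
for p_i, X_i in zip(prompts, X):
    out = attention_layer(np.concatenate([p_i, X_i], axis=0), *local_w)
    p_new.append(out[:1])
    H.append(out[1:])

# Global step: fuse the tokens, stack the node features, attend over all ROIs.
p_global = np.tanh(np.concatenate(p_new, axis=1) @ w_mlp)        # (1, N)
H_global = np.concatenate(H, axis=0)                             # (N, N)
out = attention_layer(np.concatenate([p_global, H_global], axis=0), *global_w)
p_final, Z_L = out[:1], out[1:]
```

Because `local_w` is reused in every loop iteration, the only community-specific parameters are the tokens in `prompts`, which is the parameter-sharing idea described above.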

2.3.3. Graph Readout Layer

①They aggregate node embedding by OCRead.

②The graph-level embedding $Z_G$ is calculated by $Z_{G}=A^{\top}Z_{L}$ , where $A\in\mathbb{R}^{K\times N}$ is a learnable assignment matrix computed by the OCRead layer

③Afterwards, $Z_G$ is flattened and passed to an MLP for the final prediction

④Loss: CrossEntropy (CE) loss
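A rough sketch of the readout and loss is given below, under several loudly stated assumptions. The soft assignment is computed as a softmax over inner products with random cluster centers (OCRead's own center initialization is not reproduced), and it is stored with one row per node, shape $(N, C)$ with a hypothetical cluster count $C$, so that the transpose product with $Z_L$ is defined. The classifier is a single linear layer standing in for the MLP.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def readout(z_l, centers):
    """Soft-assign N nodes to C cluster centers, then pool the node embeddings."""
    assign = softmax(z_l @ centers.T, axis=-1)   # (N, C), each row sums to 1
    return assign.T @ z_l                        # (C, N) graph-level embedding

def cross_entropy(logits, label):
    """Cross-entropy of one sample from raw logits (log-softmax, then pick the label)."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

rng = np.random.default_rng(0)
N, C = 12, 4                                    # hypothetical sizes
Z_L = rng.standard_normal((N, N))               # node embeddings from the encoder
centers = rng.standard_normal((C, N))           # random stand-in for cluster centers
Z_G = readout(Z_L, centers)
w = rng.standard_normal((C * N, 2)) * 0.1       # linear stand-in for the MLP, 2 classes
logits = Z_G.reshape(-1) @ w                    # flatten Z_G, then classify
loss = cross_entropy(logits, label=1)
```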

2.4. Experiments

2.4.1. Datasets and Experimental Settings

（1）ABIDE

①Preprocessing: Configurable Pipeline for the Analysis of Connectomes (CPAC), band-pass filtering (0.01 - 0.1 Hz), no global signal regression

②Sites: 17

③Atlas: Craddock 200

④Communities: 8, cerebellum and subcortical structures (CS & SB), visual network (V), somatomotor network (SMN), DAN, ventral attention network (VAN), limbic network (L), frontoparietal network (FPN) and DMN

⑤Samples: 1009 (this seems to be the entirety of one particular set, apparently without any screening) with 51.14% ASD

⑥FC: Pearson correlation

⑦Sampling strategy: stratified

（2）Experimental Settings

①Number of heads: equal to the number of communities

②Optimizer: Adam

③Learning rate: 1e-4

④Weight decay: 1e-4

⑤Train: validation: test = 0.7: 0.1: 0.2

⑥Batch size: 64

⑦Metrics: AUROC, accuracy, sensitivity, specificity

⑧Epoch: 50
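The stratified 0.7 : 0.1 : 0.2 split and two of the listed metrics can be sketched as follows. This is an illustrative NumPy version, not the authors' data pipeline. The helper names `stratified_split` and `sensitivity_specificity` are hypothetical, and the label vector is synthetic, built only to match the stated sample count and ASD proportion (516 of 1009 is about 51.14%).

```python
import numpy as np

def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=0):
    """Return train/val/test index arrays that preserve per-class proportions."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    parts = [[], [], []]
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        n_train = int(round(fractions[0] * len(idx)))
        n_val = int(round(fractions[1] * len(idx)))
        parts[0].append(idx[:n_train])
        parts[1].append(idx[n_train:n_train + n_val])
        parts[2].append(idx[n_train + n_val:])      # remainder goes to test
    return [np.concatenate(p) for p in parts]

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

# Synthetic labels: 1 = ASD, 0 = control.
labels = np.array([1] * 516 + [0] * 493)
train_idx, val_idx, test_idx = stratified_split(labels)
sens, spec = sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0])
```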

2.4.2. Quantitative and Qualitative Results

（1）Comparison with Baselines (Quantitative results)

①Transformer based model: BNT

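②Fixed brain network: BrainNetCNN (What does "fixed" mean? Is it the same as static, i.e. not a time series? But aren't there plenty of those? There is a "learnable" one just below, yet surely this one can learn too? Aren't all convolutions learned? What does it mean?)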

③Learnable brain network: FBNETGEN

④Results:

（2）Interpretability of Com-BrainTF (Qualitative results)

①The averaged attention matrices of Com-BrainTF and BNT:

②Chord plot of communities with top 1% attention scores:

③ROI importance map, visualized with per-ROI standardized attention scores derived from the averaged prompt vectors:

④Difference between important DMN and SMN in prompt vectors:
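The two post-processing steps mentioned in this list, keeping the top 1% of attention scores and standardizing per-ROI scores, can be approximated as below. This is a guess at the general recipe on random data, not the authors' visualization code. The helper names and the choice of z-scoring as the standardization are assumptions.

```python
import numpy as np

def top_fraction_mask(attn, fraction=0.01):
    """Boolean mask of the entries at or above the (1 - fraction) quantile."""
    threshold = np.quantile(attn, 1.0 - fraction)
    return attn >= threshold

def standardize(scores):
    """Z-score a vector of per-ROI scores (zero mean, unit standard deviation)."""
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean()) / scores.std()

rng = np.random.default_rng(0)
attn = rng.random((200, 200))            # stand-in for an averaged attention matrix
mask = top_fraction_mask(attn)           # strongest 1% of ROI-to-ROI scores
prompt_scores = rng.random(200)          # stand-in for an averaged prompt vector
roi_importance = standardize(prompt_scores)
```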

2.4.3. Ablation studies

（1）Input: node features vs. class tokens of local transformers

①They compared using only the prompt tokens from the local Transformer against combining the prompt tokens with the updated node features

②Comparison table:

（2）Output: Cross Entropy loss on the learned node features vs. prompt token

①They compared node features and prompt tokens as output

②Comparison table:

2.5. Conclusion

Com-BrainTF outperforms SOTA methods and is able to detect salient functional networks associated with ASD

2.6. Supplementary Materials

2.6.1. Variations on the Number of Prompts

Comparison/Ablation of number of input prompt tokens:

2.6.2. Attention Scores of ASD vs. HC in Comparison between Com-BrainTF (ours) and BNT (baseline)

Comparison of Com-BrainTF’s prompt vector with BNT’s attention vector:

2.6.3. Decoded Functional Group Differences of ASD vs. HC

①Biomarkers in DMN, calculated by normalized attention scores:

②Biomarkers in SMN, calculated by normalized attention scores:

3. Background Knowledge

3.1. Prompt

Further reading: 通俗易懂地理解Prompt (An Easy-to-Understand Explanation of Prompt) - Zhihu (zhihu.com)

3.2. Neurosynth

Official website: Neurosynth

4. Reference List

Bannadabhavi A. et al. (2023) 'Community-Aware Transformer for Autism Prediction in fMRI Connectome', 26th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2023), doi: https://doi.org/10.48550/arXiv.2307.10181