[Paper close reading] DeepFMRI: End-to-end deep learning for functional connectivity and classification of ADHD usin

②Electroencephalography (EEG), Magnetoencephalography (MEG), functional Magnetic Resonance Imaging (fMRI) and Positron Emission Tomography (PET) are all methods for measuring brain activity

③ADHD is a lifelong disorder (sigh, why is this yet another thing that cannot be cured?)

epilepsy n. a neurological disorder marked by recurrent seizures

2.3. Related work

2.3.1. Correlation methods

①Introduces several atlases

②The problems with atlas-based correlation are: a) the connections are overly dense (fully connected), b) the community membership of each ROI is not reflected

2.3.2. Dimensionality reduction methods

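①Briefly introduces dimensionality reduction approaches such as Independent Component Analysis (ICA) and minimum redundancy maximum relevance (MRMR) based methods

②ICA results are sometimes hard to interpret (why is this hard to understand? maybe because I never studied it closely and only looked at PCA), ICA assumes independence in time and space, and it is limited by the choice of threshold (thresholds are annoying everywhere; the choice feels far too arbitrary)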

2.3.3. Graph based methods

Only one graph method is listed (fair enough... this was only 2020, when GNNs were mostly still used in the molecular domain)

2.3.4. Clustering based methods

①Clustering based methods are more complex than correlation methods in that the network obtained by clustering is sparse (sparse means harder??? surely not)

affinity n. a close relationship; a natural liking; a similarity

2.3.5. Deep learning based methods

①DL combines feature extraction and classification

②Introduces some DL methods for fMRI classification

2.4. Data and preprocessing

①Dataset: NeuroBureau ADHD-200 competition (ADHD-200 Sample)

②Modalities: MRI, rs-fMRI, age, sex and IQ

③Screening: subjects whose time series are shorter than 172 time points are excluded

④Sites: 3 of the 8 competition sites are used: NeuroImage (NI), New York University Medical Center (NYU), and Peking University (Peking)

⑤Time-series signal length: 172, because this is the mode of the time-series lengths in the dataset. All series longer than 172 are truncated to 172

⑥Sample:

⑦Parameters in different sites:

⑧Preprocessing with the AFNI and FSL tools: removal of the first four time points, slice-time correction, motion correction (first image taken as the reference), registration to Montreal Neurological Institute (MNI) space at 4×4×4 voxel resolution, band-pass filtering (0.009 Hz < $f$ < 0.08 Hz) and smoothing with a 6 mm FWHM Gaussian filter. The brain is then segmented into 90 regions using the well-established AAL template
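The screening and truncation in items ③ and ⑤ can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name, the toy subject IDs, and the choice to keep the first 172 points (rather than the last) are hypothetical and are not stated in the notes.

```python
from collections import Counter

def screen_and_truncate(series_by_subject, target_len=None):
    """Drop series shorter than target_len and cut longer ones down to it."""
    lengths = [len(s) for s in series_by_subject.values()]
    if target_len is None:
        # Use the most common length (the mode), as described in item 5.
        target_len = Counter(lengths).most_common(1)[0][0]
    kept = {}
    for subject, series in series_by_subject.items():
        if len(series) < target_len:
            continue  # screening: too short, excluded
        # Assumption: keep the leading target_len points.
        kept[subject] = series[:target_len]
    return target_len, kept

# Toy data: two series of length 172, one longer, one shorter.
toy = {
    "s1": list(range(172)),
    "s2": list(range(172)),
    "s3": list(range(231)),
    "s4": list(range(120)),
}
target_len, kept = screen_and_truncate(toy)
```

On the toy data, the shorter series is dropped and the longer one is cut, so every remaining series has the same length.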

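consortium n. an association of organizations (e.g. companies or banks) formed to carry out a joint project

truncate vt. to shorten by cutting off the start or the end; adj. truncated, cut short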

2.5. Methods

2.5.1. End-to-end model

（1）The feature extractor network

①They adopt parametric ReLU $f(x)=\begin{cases}x,&x>0\\ax,&x\le0\end{cases}$ , where $a$ denotes a non-negative scalar

②The overall framework (why not use boxes of different colors? it is definitely a bit tiring to look at):

where all Conv kernel sizes are 3 and all strides are 1;

the temporal pooling has length 2 and stride 1;

each region $n$ is represented by a vector in $\mathbb{R}^{32}$ in the feature extractor network
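As a quick check of the parametric ReLU in item ①, here is a minimal pure-Python sketch. The slope value `a=0.25` is a hypothetical placeholder chosen for illustration; in a real network the slope would be learned during training, and the notes do not give its value.

```python
def prelu(x, a=0.25):
    """Parametric ReLU: identity for x > 0, slope a for x <= 0."""
    return x if x > 0 else a * x

values = [-2.0, -0.5, 0.0, 1.5]
activated = [prelu(v) for v in values]
```

Negative inputs are scaled by `a` instead of being zeroed out as in plain ReLU, so some gradient still flows for them.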

（2）The functional connectivity network

①Each similarity measure network computes the correlation between two ROIs, so with $n_f=90$ ROIs there are $n_{s}=n_{f}(n_{f}-1)/2=45\times 89=4005$ networks

②The output of each similarity measure network is a 2-dimensional vector

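③The mapping operator $M(i)=w_1v_1^i+w_2v_2^i$ where $v_1^i$ and $v_2^i$ are the scalar outputs of the $i$ -th similarity measure network, $w_1$ and $w_2$ are hyper-parameters (the paper calls them weights, but later it says $w_1+w_2=1$ and, to reduce the number of parameters, further sets $w_1=1,w_2=0$ , which feels very much like a hyper-parameter... but what I really want to ask is whether setting one of them straight to 0 is a good idea...?)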

④Weight initialization: the weights of a pre-trained FCNet are used (where did this network suddenly fly in from? what does FCNet have to do with their network?)

⑤Learning rate: $10^{-5}$
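The pair count in item ① and the mapping operator in item ③ can be checked with a short sketch. The function `toy_similarity` is a hypothetical stand-in for a similarity measure network (the real one is a learned network), and all names here are illustrative only.

```python
from itertools import combinations

N_ROI = 90  # n_f: number of AAL regions

def mapping(v1, v2, w1=1.0, w2=0.0):
    """M(i) = w1 * v1 + w2 * v2, defaulting to w1 = 1, w2 = 0."""
    return w1 * v1 + w2 * v2

def toy_similarity(i, j):
    # Hypothetical stand-in for the two scalar outputs of one
    # similarity measure network; not the learned network itself.
    v1 = 1.0 / (1 + abs(i - j))
    return v1, 1.0 - v1

# One similarity measure network per unordered ROI pair.
roi_pairs = list(combinations(range(N_ROI), 2))
fc_vector = [mapping(*toy_similarity(i, j)) for i, j in roi_pairs]
```

With $w_1=1,w_2=0$ the mapping simply passes the first output through, which is why the second output has no effect on the resulting FC vector in this setting.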

（3）Classification network

①Nodes in fully connected layers are 100, 50, 50, 2

②Weight initialization: random

③Learning rate: $10^{-4}$

2.6. Experimental settings and results

2.6.1. Experimental settings

①Although the ADHD-200 dataset contains three ADHD subtypes (ADHD combined, ADHD hyperactive-impulsive and ADHD inattentive), the authors merge them into a single ADHD class.

②Optimizer: Adam

③Epochs: 50

④Loss function: cross-entropy $L=-\frac{1}{n}\sum_{i=1}^{n}[y_i\log(\widehat{y_i})+(1-y_i)\log(1-\widehat{y_i})]$ ,

where $n$ represents the number of training samples, $y_i$ denotes the true label (1 denotes ADHD and 0 denotes HC), and $\widehat{y_i}$ denotes the model's prediction for sample $i$ (the predicted probability of ADHD)
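The cross-entropy loss above can be written out directly. This is a minimal sketch: the `eps` clamp is an added numerical guard (an assumption, not part of the formula), and the labels and probabilities are made-up toy values.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy over n samples (1 = ADHD, 0 = HC)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clamp to avoid log(0); this guard is not in the formula itself.
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

labels = [1, 0, 1, 0]          # toy ground-truth labels
probs = [0.9, 0.2, 0.6, 0.4]   # toy predicted probabilities of ADHD
loss = binary_cross_entropy(labels, probs)
```

Confident correct predictions contribute little to the loss, while predictions near 0.5 contribute about $\log 2$ each.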

2.6.2. Comparison methods

（1）End-to-end model without functional connectivity

①They provide an ablation study that includes a model without FC:

（2）FCNet

①It is the first classification model applied to ADHD

②Based on a CNN, FCNet extracts FC from time-series signals and selects features with Elastic net. Finally, an SVM is applied for classification

（3）Correlation method

①They first apply correlation analysis to the fMRI signals to get FC, and then extract features with Elastic net. Finally, an SVM is used for classification

（4）Clustering method

①They apply the Synthetic Minority Over-sampling TEchnique (SMOTE) to address dataset imbalance, and then extract features with Elastic net. Finally, an SVM is used for classification

2.6.3. Feature importance of functional connectivity

①They rank the importance of each ROI in ADHD classification

②There is a linear score model:

$S_c(M)=w_cM+b_c$

where $S_c(M)$ denotes the class score function, $c$ represents the class, $M$ is the output of one layer, $b_c$ denotes the bias of the whole model and $w_c$ is the weight

③⭐However, a deep network is highly non-linear, so this linear score function cannot be applied to it directly. Hence, $S_c$ is approximated around a given point $M_0$ by a first-order Taylor expansion:

$S_{c}(M)\approx\left.\frac{\partial S_{c}}{\partial M}\right|_{M_{0}}M+b$

and the $\frac{\partial S_{c}}{\partial M}|_{M_{0}}$ is calculated by back-propagation

④The importance score of ROI $i$ in the $d$ -th layer:

$f_c^d(i)=\sum_{l=L-1}^d\sum_kw_c^{(l,l+1)}f_c^{(l+1)}(k)$

where $L$ denotes the total number of layers, $k$ denotes the number of ROIs and $f^L_c$ denotes the output of the classification network

⑤Feature importance map for class $c$ : (huh??? aren't there only two classes? what is this?)

$I_c(x)=f_c^M(x)$
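The first-order Taylor idea in item ③ can be illustrated on a tiny toy network. Everything below is hypothetical: the weights, the layer sizes, and the use of central finite differences in place of back-propagation are assumptions made only so that the sketch runs without a deep learning library. It is not the authors' implementation.

```python
import math

# Hypothetical toy classifier: 4 inputs -> 3 tanh units -> 2 class scores.
W1 = [[0.5, -1.0, 0.0, 0.2],
      [0.1, 0.3, -0.7, 0.0],
      [-0.4, 0.0, 0.9, 0.6]]
W2 = [[1.0, -0.5, 0.3],
      [-0.8, 0.6, 0.2]]

def class_score(m, c):
    """Non-linear class score S_c(M) of the toy network."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, m))) for row in W1]
    return sum(w * h for w, h in zip(W2[c], hidden))

def score_gradient(m0, c, step=1e-5):
    """dS_c/dM at M0, via central differences (stand-in for backprop)."""
    grads = []
    for i in range(len(m0)):
        up, down = list(m0), list(m0)
        up[i] += step
        down[i] -= step
        grads.append((class_score(up, c) - class_score(down, c)) / (2 * step))
    return grads

m0 = [0.2, -0.1, 0.4, 0.3]           # toy input point M0
grad_adhd = score_gradient(m0, c=1)  # class 1 = ADHD
# Rank inputs by the magnitude of their local influence on the score.
ranking = sorted(range(len(m0)), key=lambda i: abs(grad_adhd[i]), reverse=True)
```

Near $M_0$ the score behaves like the linear model $w_cM+b$ with $w_c$ equal to this gradient, which is what lets the gradient magnitude serve as a per-input importance score.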

reminiscent adj. tending to remind one of something; recalling the past; n. a person who recalls past events

2.6.4. Results

Classification results table:

（1）Comparison with other methods

①Comparison table:

②The data heterogeneity across the three sites is very high (the differences below are not the only ones):

NI	Keep eyes closed, no visual stimulus
Peking	Eyes open or closed while staying still. A black screen with a white fixation cross was displayed during the scan
NYU	Keep eyes closed, think of nothing systematically and do not fall asleep. A black screen was shown to them (why show a black screen when their eyes are closed?)

③Accordingly, they tried training the model on the mixed-site dataset and then evaluating it on each of the 3 sites separately:

the results look rather poor, especially on Peking...

2.6.5. Performance comparison

The authors note that a significance test could help the analysis, but the number of samples is too small for one

（1）Comparison methods

①Model without FC:

②Model without classification network: Elastic net→SVM

③Clustering + classification network: replace FC network by clustering

④Correlation + classification network: replace FC network by correlation method

（2）Comparison results

Comparison table:

2.7. Discussion

2.7.1. Analysis of learned feature importance of functional connectivity

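① $I_c\in \mathbb{R}^{1\times 4005}$ looks like the flattened upper triangle of the FC matrix. The next sentence in the paper says, roughly, "where each value corresponds to the importance of the corresponding functional connectivity value in determining a particular class". But all the ROIs of one subject belong to a single class, right? Oh, it seems the important ROIs for HC and for ADHD are all computed in one go?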

②Because it has the largest number of samples and the highest accuracy, the NYU dataset is chosen to visualize the ROI scores:

⭐Is taking the class into account the innovation here?

where the black boxes highlight some differences...

(I am curious why the vertical axis runs from large to small)

③The top 100... correlation...

Seriously, could you draw this properly? Learn from Com-BrainTF; I cannot make any sense of this figure (see: [Paper close reading] Community-Aware Transformer for Autism Prediction in fMRI Connectome, CSDN blog)

④The top 50 feature map on HC:

⑤The top 50 feature map on ADHD:

⑥Matching the top 100 important ROIs in HC against the top 500 important ROIs in ADHD:

they find an overlap of less than 10%

⑦Matching the top 100 important ROIs in ADHD against the top 500 important ROIs in HC:

they again find an overlap of less than 10%

⑧The top 100 features of 2 classes in inter-lobe and intra-lobe:

most of the important ROIs are in the frontal lobe, which is related to cognitive functions such as attention, executive function (including planning, selection, sequential organization and self-monitoring of actions), affect and mood, memory, self-awareness and personality
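The matching in items ⑥ and ⑦, together with reading an entry of the 4005-dimensional importance vector back as an ROI pair (item ①), can be sketched as follows. The importance scores are synthetic random numbers, ranking by raw score (rather than absolute value) is an assumption, and the function names are hypothetical.

```python
import random
from itertools import combinations

N_ROI = 90
# Position in the flattened upper triangle -> (roi_i, roi_j).
PAIRS = list(combinations(range(N_ROI), 2))

def top_k(scores, k):
    """Indices of the k largest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def match_rate(scores_a, scores_b, k_a=100, k_b=500):
    """Fraction of the top k_a features of A found in the top k_b of B."""
    top_b = set(top_k(scores_b, k_b))
    return sum(1 for i in top_k(scores_a, k_a) if i in top_b) / k_a

rng = random.Random(0)
imp_hc = [rng.random() for _ in PAIRS]    # synthetic HC importance
imp_adhd = [rng.random() for _ in PAIRS]  # synthetic ADHD importance

rate = match_rate(imp_hc, imp_adhd)
best_pair = PAIRS[top_k(imp_hc, 1)[0]]    # ROI pair of the top HC feature
```

For two unrelated random score vectors the expected rate is about 500/4005 ≈ 12.5%, which gives a rough reference point when reading the matching figures.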

2.8. Conclusions

The conclusion is a rather wordy recap of the model, the small-sample-size limitation, the heterogeneous data, etc.

3. Supplementary knowledge

4. Reference List

Riaz A. et al. (2020) 'DeepFMRI: End-to-end deep learning for functional connectivity and classification of ADHD using fMRI', Journal of Neuroscience Methods, 335: 1.