[Paper Close Reading] DeepFMRI: End-to-end deep learning for functional connectivity and classification of ADHD using fMRI

Paper link: DeepFMRI: End-to-end deep learning for functional connectivity and classification of ADHD using fMRI - ScienceDirect

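The English is typed entirely by hand! It is a summary and paraphrase of the original paper. Unavoidable spelling and grammar mistakes may appear; if you spot any, corrections in the comments are welcome! This post leans toward personal notes, so read with caution!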


1. TL;DR

1.1. Thoughts

1.2. Paper summary figure

2. Paragraph-by-paragraph close reading

2.1. Abstract

2.1.1. Background

2.1.2. New method

2.1.3. Results

2.1.4. Comparison with existing methods

2.1.5. Conclusions

2.2. Introduction

2.3. Related work

2.3.1. Correlation methods

2.3.2. Dimensionality reduction methods

2.3.3. Graph based methods

2.3.4. Clustering based methods

2.3.5. Deep learning based methods

2.4. Data and preprocessing

2.5. Methods

2.5.1. End-to-end model

2.6. Experimental settings and results

2.6.1. Experimental settings

2.6.2. Comparison methods

2.6.3. Feature importance of functional connectivity

2.6.4. Results

2.6.5. Performance comparison

2.7. Discussion

2.7.1. Analysis of learned feature importance of functional connectivity

2.8. Conclusions

3. Supplementary knowledge

3.1. Minimum redundancy maximum relevance (MRMR)

3.2. Elastic net

3.3. Synthetic Minority Over sampling TEchnique (SMOTE)

4. Reference List

(1)Your abstract really does not need to be this over-specific...








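(9)The comparison methods in ⑥ and ⑦ of 2.7.1. feel both novel and primitive; perhaps the authors settled for them because a significance comparison was not possible?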



2.1. Abstract

2.1.1. Background

        Because the mechanism of Attention Deficit Hyperactivity Disorder (ADHD) is unknown, diagnosis mainly relies on behaviour

2.1.2. New method

        They propose DeepFMRI, an end-to-end model that performs feature extraction, functional connectivity (FC) construction and classification

2.1.3. Results

        The model achieves 73.1% accuracy (ACC), 91.6% specificity (SPE) and 65.5% sensitivity (SEN) on ADHD-200

2.1.4. Comparison with existing methods

        The authors present the model as unique in handling feature extraction, FC construction and classification end-to-end

2.1.5. Conclusions

        DeepFMRI achieves the best performance among the compared methods

2.2. Introduction

        ①Connectivity between brain regions may play a major role in disease

        ②Electroencephalography (EEG), Magnetoencephalography (MEG), functional Magnetic Resonance Imaging (fMRI) and Positron Emission Tomography (PET) are all methods for measuring brain activity

        ③ADHD is a lifelong disease (hmm, so this one cannot be cured either?)

epilepsy  n. a neurological disorder characterized by recurrent seizures

2.3. Related work

2.3.1. Correlation methods

        ①Several atlases are introduced

        ②The problems with atlases are: a) the connections are overly dense (fully connected), b) the community membership of each ROI is not reflected

2.3.2. Dimensionality reduction methods

        ①Briefly introduces dimensionality reduction approaches such as Independent Component Analysis (ICA) and minimum redundancy maximum relevance (MRMR) based methods

        ②ICA is sometimes hard to interpret (why is it hard to understand? maybe because I never studied it closely and only looked at PCA), assumes independence in time and space, and is limited by the choice of threshold (thresholds are annoying everywhere; the choice feels too blind)

2.3.3. Graph based methods

        Only one graph method is listed (fair enough... this was only 2020, when GNNs were mostly still used in the molecular domain)

2.3.4. Clustering based methods

        ①Clustering based methods are more complex than correlation in that the network obtained by clustering is sparse (sparse means harder??? surely not)

affinity  n. close relationship; liking; similarity

2.3.5. Deep learning based methods

        ①DL combines feature extraction and classification

        ②Some DL methods for fMRI classification are introduced

2.4. Data and preprocessing

        ①Dataset: NeuroBureau ADHD-200 competition (ADHD-200 Sample)

        ②Modalities: MRI, rs-fMRI, age, sex and IQ

        ③Screening: subjects whose time series are shorter than 172 time points are excluded

        ④Sites: 3 of the 8 competition sites are chosen: NeuroImage (NI), New York University Medical Center (NYU), and Peking University (Peking)

        ⑤Time-series signal length: 172, because the mode of the time-series length in the dataset is 172. All series longer than 172 are truncated to that length


        ⑦Parameters in different sites:

        ⑧Preprocessing with the tools AFNI and FSL: removal of the first four time points, slice time correction, motion correction (first image taken as the reference), registration to the Montreal Neurological Institute (MNI) space at 4×4×4 voxel resolution, band-pass filtering (0.009 Hz < f < 0.08 Hz) and smoothing with a 6 mm FWHM Gaussian filter. The brain is segmented into 90 regions using the well-established AAL template
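
The screening and truncation steps in ③ and ⑤ can be sketched in a few lines of plain Python. This is only an illustration: the function name `screen_and_truncate`, the subject IDs and the toy data layout (a dict mapping a subject ID to a list of per-ROI time series) are assumptions for the example, and keeping the leading 172 points is also an assumption, since the notes do not say which end is cut.

```python
TARGET_LEN = 172  # series length stated in the notes

def screen_and_truncate(subjects, target_len=TARGET_LEN):
    """Drop subjects with short series and cut longer ones to target_len.

    subjects: dict mapping a (hypothetical) subject ID to a list of ROI
    time series, each a list of floats of equal length.
    """
    kept = {}
    for subject_id, rois in subjects.items():
        n_timepoints = len(rois[0])
        if n_timepoints < target_len:
            continue  # screened out: fewer than target_len time points
        # assumption: keep the first target_len points of every ROI series
        kept[subject_id] = [series[:target_len] for series in rois]
    return kept

# toy data: 2 ROIs per subject, with different scan lengths
toy = {
    "sub-A": [[0.0] * 150, [0.0] * 150],  # too short, removed
    "sub-B": [[0.0] * 172, [0.0] * 172],  # kept unchanged
    "sub-C": [[0.0] * 231, [0.0] * 231],  # truncated to 172
}
clean = screen_and_truncate(toy)
print(sorted(clean), [len(rois[0]) for rois in clean.values()])
```

After this step every remaining subject has the same series length, which is what allows the series to be stacked into fixed-size network inputs.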

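consortium  n. alliance; an association of companies or banks formed to carry out a joint project

truncate  vt. to cut off; to shorten (especially by removing the head or the end)  adj. shortened; abridged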

2.5. Methods

2.5.1. End-to-end model

(1)The feature extractor network

        ①They adopt parametric ReLU f(x)=\begin{cases}x,&x>0\\ax,&x\le 0\end{cases}, where a denotes a non-negative scalar

        ②The overall framework (why not use differently colored boxes? d, e and f are still a bit tiring to look at):

where all Conv kernels have size 3 and stride 1;

temporal pooling uses length 2 and stride 1;

each region n is represented by a vector in \mathbb{R}^{32} in the feature extractor network
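
To make the building blocks above concrete, here is a minimal pure-Python sketch of parametric ReLU, a 1-D convolution with kernel size 3 and stride 1, and temporal pooling with length 2 and stride 1. Several things are assumptions for illustration only: the slope `a=0.25`, the kernel values, the use of max pooling, and the absence of padding. The notes do not specify any of these, and this is not the authors' network.

```python
def prelu(x, a=0.25):
    """Parametric ReLU: x for x > 0, a * x otherwise (a is assumed here)."""
    return x if x > 0 else a * x

def conv1d(signal, kernel, stride=1):
    """1-D convolution without padding (cross-correlation form)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]

def max_pool1d(signal, length=2, stride=1):
    """Temporal max pooling over windows of the given length (assumed max)."""
    return [
        max(signal[i:i + length])
        for i in range(0, len(signal) - length + 1, stride)
    ]

# toy ROI time series with 172 points
series = [((-1) ** t) * (t % 5) for t in range(172)]
kernel = [0.5, -1.0, 0.5]  # arbitrary illustrative kernel of size 3

hidden = [prelu(v) for v in conv1d(series, kernel)]
pooled = max_pool1d(hidden)
print(len(series), len(hidden), len(pooled))  # 172 170 169
```

Under these assumptions each convolution shortens the series by 2 and each pooling step by 1, which shows how a stack of such layers gradually compresses a 172-point series.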

(2)The functional connectivity network

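        ①Each similarity measure network computes the correlation between two ROIs, so for n_f=90 ROIs there are n_s=n_f(n_f-1)/2=90\times 89/2=4005 networks

        ②The output of each similarity measure network is a 2-dimensional vector

        ③The mapping operator M(i)=w_1v_1^i+w_2v_2^i, where v_1^i and v_2^i are the scalar outputs of the i-th similarity measure network and w_1 and w_2 are hyper-parameters (the paper calls them weights, but later it states w_1+w_2=1 and, to reduce parameters, further sets w_1=1, w_2=0, which feels very much like hyper-parameters... what I really want to ask is whether setting one of them directly to 0 is a good idea...?)

        ④Weights initialization: the pre-trained FCNet weights are adopted (where did this network suddenly come from? what is the relationship between FCNet and their network?)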

        ⑤Learning rate: 10^{-5}
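
The pair count in ① and the mapping operator in ③ can be checked with a short sketch. The 2-d outputs in `outputs` are made-up numbers standing in for similarity measure network outputs, and the function name `mapping` is chosen only for this example.

```python
from itertools import combinations

N_ROIS = 90  # AAL regions

# one similarity measure network per unordered ROI pair
roi_pairs = list(combinations(range(N_ROIS), 2))
print(len(roi_pairs))  # 4005

def mapping(v1, v2, w1=1.0, w2=0.0):
    """M(i) = w1 * v1 + w2 * v2 for one similarity measure network."""
    return w1 * v1 + w2 * v2

# hypothetical 2-d outputs of three similarity measure networks
outputs = [(0.8, 0.2), (0.1, 0.9), (0.5, 0.5)]
fc_values = [mapping(v1, v2) for v1, v2 in outputs]
print(fc_values)  # [0.8, 0.1, 0.5]
```

With w_1=1 and w_2=0 the mapping simply selects the first output, so the second output only matters if the weights are changed.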

(3)Classification network

        ①Nodes in fully connected layers are 100, 50, 50, 2

        ②Weights initialization: random

        ③Learning rate: 10^{-4}

2.6. Experimental settings and results

2.6.1. Experimental settings

        ①Although the ADHD-200 dataset contains three ADHD subtypes (ADHD combined, ADHD hyperactive-impulsive and ADHD inattentive), the authors merge them into a single ADHD class.

        ②Optimizer: Adam

        ③Epoch: 50

        ④Loss function: cross-entropy L=-\frac{1}{n}\sum_{i=1}^{n}[y_i\log(\hat{y}_i)+(1-y_i)\log(1-\hat{y}_i)],

where n represents the number of training samples, y_i denotes the true label (1 denotes ADHD and 0 denotes HC), and \hat{y}_i denotes the prediction for sample i (the predicted probability of ADHD)
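
The cross-entropy loss above can be written out directly. This is a minimal sketch that treats the prediction as the predicted probability of the ADHD class; the clipping constant `eps` and the toy predictions are assumptions added for the example.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; y_true in {0, 1}, y_prob in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

labels = [1, 0, 1, 0]             # 1 = ADHD, 0 = HC
confident = [0.9, 0.1, 0.8, 0.2]  # toy predictions close to the labels
uncertain = [0.5, 0.5, 0.5, 0.5]  # toy predictions at chance level

print(binary_cross_entropy(labels, confident))
print(binary_cross_entropy(labels, uncertain))
```

Predictions at chance level give a loss of log 2, and the loss shrinks as the predicted probabilities move toward the true labels.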

2.6.2. Comparison methods

(1)End-to-end model without functional connectivity

        ①They provide an ablation study that includes a model without FC:


(2)FCNet

        ①The authors describe it as the first classification model applied to ADHD

        ②Based on CNN, FCNet extracts FC from time-series signals and selects features with Elastic net. Finally, an SVM is applied for classification

(3)Correlation method

        ①They first apply correlation analysis to the fMRI signals to get FC, and then select features with Elastic net. Finally, an SVM is used for classification
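
The first step of the correlation baseline, computing FC as pairwise correlation between ROI time series, can be sketched as below. Pearson correlation is assumed here because the notes only say "correlation analysis"; the toy series and the function names are made up, and the Elastic net and SVM stages are not shown.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def fc_vector(roi_series):
    """Upper-triangular FC values, one per unordered ROI pair."""
    return [
        pearson(roi_series[i], roi_series[j])
        for i, j in combinations(range(len(roi_series)), 2)
    ]

# toy example with 3 ROIs and 6 time points
rois = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],  # perfectly correlated with ROI 0
    [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],    # perfectly anti-correlated with ROI 0
]
print([round(v, 3) for v in fc_vector(rois)])  # [1.0, -1.0, -1.0]
```

For 90 ROIs the same function would return 4005 values, one per ROI pair, which is the feature vector a downstream selector and classifier would consume.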

(4)Clustering method

        ①The Synthetic Minority Over sampling TEchnique (SMOTE) is applied to address dataset imbalance, and features are then selected with Elastic net. Finally, an SVM is used for classification
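
The core idea of SMOTE is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below shows only that interpolation step in plain Python; the function name `smote_sample`, the value `k=2` and the toy points are assumptions for illustration, not the configuration used in the paper.

```python
import random

def smote_sample(minority, k=2, rng=random):
    """Create one synthetic sample by interpolating between a random
    minority sample and one of its k nearest minority neighbours."""
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    # sort the remaining minority points by squared Euclidean distance
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]

rng = random.Random(0)  # fixed seed so the toy run is repeatable
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(5)]
for point in synthetic:
    print([round(v, 3) for v in point])
```

Because each synthetic point lies on the segment between two existing minority samples, the minority class is enlarged without simply duplicating samples.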

2.6.3. Feature importance of functional connectivity

        ①They rank the importance of each ROI for ADHD classification

        ②There is a linear score model:

S_c(M)=w_c^{T}M+b_c

where S_c(M) denotes the class score function, c represents the class, M is the output of one layer, b_c denotes the bias of the whole model and w_c is the weight

        ③⭐However, a DL network is highly non-linear, so a linear function cannot be used directly. Hence, S_c is approximated by a first-order Taylor expansion around M_0:

S_c(M)\approx\left.\frac{\partial S_c}{\partial M}\right|_{M_0}M+b

and \left.\frac{\partial S_c}{\partial M}\right|_{M_0} is calculated by back-propagation

        ④The importance score of ROI i in the d-th layer:


where L denotes the total number of layers, k denotes the number of ROIs and f^L_c denotes the output of the classification network

        ⑤Feature importance map for class c: (huh??? aren't there only two classes? what is this?)


reminiscent  adj. nostalgic; reminding one of (someone or something); recalling the past  n. one who reminisces
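
The first-order Taylor idea in ③ of this subsection can be illustrated numerically. In the sketch below the class score is a made-up non-linear function of three inputs, and the gradient at M_0 is estimated by finite differences as a stand-in for back-propagation; both are assumptions for illustration and not the authors' network or code. Features are then ranked by gradient magnitude.

```python
import math

def class_score(m):
    """Hypothetical non-linear class score S_c(M) for a 3-feature input."""
    return math.tanh(2.0 * m[0] - 0.5 * m[1] + 0.1 * m[2])

def gradient_at(score_fn, m0, h=1e-6):
    """Central finite-difference estimate of dS_c/dM at M_0."""
    grad = []
    for i in range(len(m0)):
        plus = list(m0)
        minus = list(m0)
        plus[i] += h
        minus[i] -= h
        grad.append((score_fn(plus) - score_fn(minus)) / (2 * h))
    return grad

m0 = [0.1, 0.2, 0.3]
grad = gradient_at(class_score, m0)

def linearised(m):
    """First-order Taylor approximation of the score around M_0."""
    return class_score(m0) + sum(g * (a - b) for g, a, b in zip(grad, m, m0))

nearby = [0.11, 0.19, 0.31]
print(class_score(nearby), linearised(nearby))

# rank features by the magnitude of their gradient
ranking = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
print(ranking)  # [0, 1, 2]
```

The linearised score here equals the form in ③ once the constant terms S_c(M_0) and the gradient times M_0 are absorbed into b; close to M_0 it tracks the true score well, which is what justifies reading the gradient as an importance weight.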

2.6.4. Results

        Classification results table:

(1)Comparison with other methods

        ①Comparison table:

        ②The data heterogeneity across the three sites is too high (and not only in these respects):

NI: Keep eyes closed, no visual stimulus
Peking: Eyes open or closed and stay still. A black screen with a white fixation cross was displayed during the scan
NYU: Keep eyes closed, think of nothing systematically and do not fall asleep. A black screen is shown to the subjects (why show a black screen if the eyes are closed?)

        ③Accordingly, they train the model on the mixed dataset from all three sites and evaluate classification on each of the 3 sites separately:

The results look rather poor, especially on Peking...
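
Per-site evaluation of a model trained on the mixed data comes down to grouping predictions by site and computing accuracy, sensitivity and specificity for each group. The sketch below uses made-up labels purely to show the computation; none of the numbers are results from the paper, and the function names are chosen for this example.

```python
from collections import defaultdict

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity with ADHD (1) as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "ACC": (tp + tn) / len(y_true),
        "SEN": tp / (tp + fn),  # true positive rate
        "SPE": tn / (tn + fp),  # true negative rate
    }

def per_site_metrics(records):
    """records: list of (site, true_label, predicted_label) tuples."""
    by_site = defaultdict(lambda: ([], []))
    for site, truth, pred in records:
        by_site[site][0].append(truth)
        by_site[site][1].append(pred)
    return {s: classification_metrics(t, p) for s, (t, p) in by_site.items()}

# made-up predictions, for illustration only
records = [
    ("NYU", 1, 1), ("NYU", 1, 0), ("NYU", 0, 0), ("NYU", 0, 0),
    ("Peking", 1, 0), ("Peking", 1, 1), ("Peking", 0, 1), ("Peking", 0, 0),
    ("NI", 1, 1), ("NI", 0, 0),
]
for site, metrics in per_site_metrics(records).items():
    print(site, metrics)
```

Reporting sensitivity and specificity alongside accuracy matters here because a model can reach decent accuracy on an imbalanced site while missing most ADHD subjects.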

2.6.5. Performance comparison

        The authors reckon that a significance test would help the analysis, but the number of samples is too small

(1)Comparison methods

        ①Model without FC:

        ②Model without classification network: Elastic net→SVM

        ③Clustering + classification network: replace FC network by clustering

        ④Correlation + classification network: replace FC network by correlation method

(2)Comparison results

        Comparison table:

2.7. Discussion

2.7.1. Analysis of learned feature importance of functional connectivity

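        ①I_c\in \mathbb{R}^{1\times 4005} looks like the upper triangular part of the FC matrix. The next sentence in the paper says that each value corresponds to the importance of the respective functional connectivity value in determining a particular class. But all ROIs of one subject belong to a single class, right? Oh, it seems the important ROIs for HC and ADHD are computed together in one pass?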

        ②Because it has the largest number of samples and the highest accuracy, they choose the NYU dataset to visualize ROI scores:

⭐it is an innovation that takes class into consideration?

where the black boxes highlight some differences...


        ③The top 100... correlation...

Come on, can you not draw this properly? Learn from Com-BrainTF; I cannot make sense of this figure at all (from: [Paper Close Reading] Community-Aware Transformer for Autism Prediction in fMRI Connectome - CSDN blog)

        ④The top 50 feature map on HC:

        ⑤The top 50 feature map on ADHD:

        ⑥Matching the top 100 important ROIs in HC against the top 500 important ROIs in ADHD:

and they find that the matching degree is less than 10%

        ⑦Matching the top 100 important ROIs in ADHD against the top 500 important ROIs in HC:

and they find that the matching degree is less than 10%

        ⑧The top 100 features of 2 classes in inter-lobe and intra-lobe:

most of the important ROIs are in the frontal lobe, which is related to cognitive functioning such as attention, the executive function that includes planning, selection, sequential organization and self-monitoring of actions, affect and mood, memory, self-awareness and personality
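
The matching in ⑥ and ⑦ amounts to asking what fraction of the top features of one class also appears among a larger top set of the other class. A minimal sketch follows; the importance scores are made up and much shorter than the 4005-dimensional maps in the paper, and the function names are assumptions for this example.

```python
def top_indices(scores, k):
    """Indices of the k largest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])

def matching_degree(scores_a, scores_b, top_a, top_b):
    """Fraction of the top_a features of A found among the top_b of B."""
    a = top_indices(scores_a, top_a)
    b = top_indices(scores_b, top_b)
    return len(a & b) / top_a

# toy importance scores over 10 features (made up for illustration)
hc_scores = [0.9, 0.8, 0.7, 0.1, 0.1, 0.2, 0.3, 0.0, 0.0, 0.0]
adhd_scores = [0.0, 0.1, 0.9, 0.8, 0.7, 0.6, 0.0, 0.2, 0.3, 0.0]

print(matching_degree(hc_scores, adhd_scores, top_a=3, top_b=5))
print(matching_degree(adhd_scores, hc_scores, top_a=3, top_b=5))
```

In the setting of the notes the call would use `top_a=100` and `top_b=500`; a low value in both directions means the two classes rely on largely different connections.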

2.8. Conclusions

        A somewhat redundant recap of the model, the small sample size limitation, the heterogeneous data, etc.

3. Supplementary knowledge

3.1. Minimum redundancy maximum relevance (MRMR)


Reference reading 2: A comprehensive summary of feature selection methods - Zhihu (zhihu.com)

3.2. Elastic net

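Reference reading 1: Machine learning algorithm series (6) - Elastic Net Regression Algorithm_elasticnet regression - CSDN blog

Reference reading 2: Understanding regularization in one article: LASSO regression, Ridge regression and ElasticNet regression - Zhihu (zhihu.com)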

3.3. Synthetic Minority Over sampling TEchnique (SMOTE)

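Reference reading 1: The SMOTE method for imbalanced classification in machine learning - Zhihu (zhihu.com)

Reference reading 2: Synthetic minority oversampling technique for multiclass imbalance problems - ScienceDirect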

4. Reference List

Riaz A. et al. (2020) 'DeepFMRI: End-to-end deep learning for functional connectivity and classification of ADHD using fMRI', Journal of Neuroscience Methods, 335: 1.

