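The English here is typed entirely by hand! It summarizes and paraphrases the original paper. Spelling and grammar mistakes are hard to avoid; if you spot any, corrections in the comments are welcome! This post leans toward personal notes, so read with caution!
Contents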
2.1.4. Comparison with existing methods
2.3.2. Dimensionality reduction methods
2.3.4. Clustering based methods
2.3.5. Deep learning based methods
2.6. Experimental settings and results
2.6.3. Feature importance of functional connectivity
2.7.1. Analysis of learned feature importance of functional connectivity
3.1. Minimum redundancy maximum relevance (MRMR)
3.3. Synthetic Minority Over sampling TEchnique (SMOTE)
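1. TL;DR
1.1. Takeaways
(1)The abstract really does not need to be this over-specific...
(2)The introductory material feels long for a paper of this length. The paper is also old enough that its layout is a bit odd: the learning rates are given with the model and then repeated in the experiments, which is unnecessary. Papers today usually describe the model in the model section and leave the parameters to the experimental settings. Overall it is a mess
(3)What a traditional model!! It looks a lot like the 3D-CNN I read about
(4)⭐Hahaha, falling back on a Taylor approximation when the model is not linear is quite something 🐂
(5)The paper keeps emphasizing the number of parameters and computes it in great detail, but when performance is not high and no split-second decision is needed, does the parameter count really matter that much?
(6)Doing it per class is fairly novel? But... actually it is just okay?
(7)The visualization of important ROIs is also done very poorly
(8)What is brain-map visualization actually good for? The main problem is that the maps are really hard to compare. Let me go back over some other papers. I mean this one:
(9)The comparison method in ⑥ and ⑦ of 2.7.1. is somehow both novel and primitive. Perhaps a significance test could not be used, so they had to settle for this?
(10)You keep saying the data is scarce, so why not use cross-validation??? K-fold should already have existed back then
(11)A paper from a CAS Zone 4 / JCR Q3 journal~ skip it if that is not your thing~~ Knowing it is Zone 4, I can accept that it is written in such a cluttered way
1.2. Paper summary figure
2. Section-by-section close reading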
2.1. Abstract
2.1.1. Background
Because the mechanism of Attention Deficit Hyperactivity Disorder (ADHD) is unknown, diagnosis relies mainly on behaviour
2.1.2. New method
They propose DeepFMRI, an end-to-end model that performs feature extraction, functional connectivity (FC) construction and classification
2.1.3. Results
They achieve 73.1% ACC, 91.6% SPE and 65.5% SEN on ADHD-200
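As a reminder of how the three reported metrics relate to a confusion matrix, here is a minimal sketch. It assumes ADHD is the positive class and HC the negative class (consistent with the label convention in 2.6.1, but still an assumption about how the paper computes them). The counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, specificity and sensitivity from confusion-matrix counts.

    Assumption: ADHD is the positive class, HC is the negative class.
    """
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),  # fraction of all subjects classified correctly
        "SPE": tn / (tn + fp),                   # fraction of HC subjects recognised as HC
        "SEN": tp / (tp + fn),                   # fraction of ADHD subjects recognised as ADHD
    }


# Hypothetical counts for illustration only (NOT the paper's results).
metrics = classification_metrics(tp=20, tn=45, fp=5, fn=10)
print(metrics)
```

A high SPE combined with a much lower SEN, as reported above, means the model is far better at recognising HC than at detecting ADHD.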
2.1.4. Comparison with existing methods
This model is unique
2.1.5. Conclusions
DeepFMRI achieves the best performance
2.2. Introduction
①Connectivity between brain regions may play a large role in disease
②Electroencephalography (EEG), Magnetoencephalography (MEG), functional Magnetic Resonance Imaging (fMRI) and Positron Emission Tomography (PET) are all brain imaging and recording methods
③ADHD is a lifelong disease (hmm, why is this yet another one that cannot be cured?)
epilepsy n. a neurological disorder characterized by recurrent seizures
2.3. Related work
2.3.1. Correlation methods
①Introducing several atlases
②The problems with atlases are: a) the connections are overly dense (fully connected), b) which community an ROI belongs to is not reflected
2.3.2. Dimensionality reduction methods
①Briefly introduces dimensionality reduction approaches such as Independent Component Analysis (ICA) and minimum redundancy maximum relevance (MRMR) based methods
②ICA is sometimes hard to interpret (why is it hard to interpret? Maybe because I never studied it closely and have only looked at PCA), is independent in time and space and is limited by the choice of threshold (thresholds are annoying everywhere; it feels too much like blind guessing)
2.3.3. Graph based methods
Only one graph method is listed (indeed... this was only 2020, when GNNs were still mostly used in the molecular domain)
2.3.4. Clustering based methods
①Clustering based methods are more complex than correlation methods, in that the network obtained by clustering is sparse (sparse means harder??? surely not)
affinity n. close relationship; liking; similarity
2.3.5. Deep learning based methods
①DL combines extraction and classification
②Introduces some DL methods for fMRI classification
2.4. Data and preprocessing
①Dataset: NeuroBureau ADHD-200 competition (ADHD-200 Sample)
②Modalities: MRI, rs-fMRI, age, sex and IQ
③Screening: subjects whose time series are shorter than 172 time points are excluded
④Sites: 3 of the 8 competition sites are chosen: NeuroImage (NI), New York University Medical Center (NYU), and Peking University (Peking)
⑤Time-series signal length: 172, because the mode of the time-series lengths in the dataset is 172. All series longer than 172 are then truncated
⑥Sample:
⑦Parameters in different sites:
⑧Preprocessing with the tools AFNI and FSL: removal of the first four time points, slice time correction, motion correction (first image taken as the reference), registration at 4×4×4 voxel resolution in Montreal Neurological Institute (MNI) space, filtering (bandpass filter, 0.009 Hz < f < 0.08 Hz) and smoothing using a 6 mm FWHM Gaussian filter. The brain is segmented into 90 regions using the well established AAL template
consortium n. an alliance; a group of companies or organizations cooperating on a project
truncate vt. to cut short; to shorten by cutting off (especially the beginning or the end) adj. shortened; abridged
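The screening and truncation steps (③ and ⑤ above) can be sketched as below. This is only an illustration under assumptions: the notes do not say which end of a long series is cut, so the sketch keeps the first 172 points, and the subject IDs and toy series are hypothetical.

```python
from statistics import mode

TARGET_LEN = 172  # series length stated in the notes


def screen_and_truncate(series_by_subject: dict, target_len: int = TARGET_LEN) -> dict:
    """Drop subjects with too-short series and cut the rest to target_len.

    Assumption: truncation keeps the FIRST target_len time points.
    """
    kept = {}
    for subject_id, series in series_by_subject.items():
        if len(series) < target_len:
            continue  # screened out: fewer than target_len time points
        kept[subject_id] = series[:target_len]
    return kept


# Hypothetical toy data: subject id -> one ROI time series.
toy = {
    "sub-01": [0.0] * 172,
    "sub-02": [0.0] * 231,
    "sub-03": [0.0] * 120,
    "sub-04": [0.0] * 172,
}

# The most frequent length motivates the choice of 172 in the notes.
most_common_len = mode([len(s) for s in toy.values()])
kept = screen_and_truncate(toy)
print(most_common_len, sorted(kept))
```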
2.5. Methods
2.5.1. End-to-end model
(1)The feature extractor network
①They adopt parametric ReLU, $f(x) = \max(0, x) + a \min(0, x)$, where $a$ denotes a non-negative scalar
②The overall framework (why not use boxes in different colors? Honestly it is a bit tiring to look at):
where all Conv kernel sizes are 3 and all strides are 1;
the temporal pooling has length 2 and stride 1;
each region is a vector in the feature extractor network
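To make the building blocks of the feature extractor concrete, here is a minimal pure-Python sketch of a 1D convolution (kernel size 3, stride 1), a parametric ReLU, and a temporal pooling step (length 2, stride 1) applied to one ROI time series. Several things are assumptions: the pooling is taken to be max pooling, the order conv, activation, pooling is assumed, biases are omitted, and the kernel weights, the slope `a` and the toy signal are hypothetical.

```python
def prelu(x: float, a: float = 0.25) -> float:
    """Parametric ReLU: identity for positive inputs, slope a otherwise."""
    return x if x > 0 else a * x


def conv1d(signal, kernel, stride=1):
    """Valid (no padding) 1D cross-correlation, as used in deep learning frameworks."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def max_pool1d(signal, length=2, stride=1):
    """Temporal pooling; max pooling is an ASSUMPTION, the notes only say 'pooling'."""
    return [
        max(signal[i:i + length])
        for i in range(0, len(signal) - length + 1, stride)
    ]


roi_signal = [0.5, -1.0, 2.0, 0.0, -0.5, 1.5]  # hypothetical ROI time series
kernel = [1.0, 0.0, -1.0]                      # hypothetical kernel of size 3

conv_out = conv1d(roi_signal, kernel)          # length T - 2
act_out = [prelu(v) for v in conv_out]
pooled = max_pool1d(act_out)                   # length shrinks by 1 again
print(conv_out, act_out, pooled)
```

With these settings each conv layer shortens the series by 2 points and each pooling step by 1, so a 172-point series shrinks only slowly through the network.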
(2)The functional connectivity network
①Each similarity measure network computes the correlation between two ROIs, so there are $N(N-1)/2$ networks for $N$ ROIs (4005 for the 90 AAL regions)
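②The output of this module is a vector with 2 dimensions
③The mapping operator where and are scalar outputs in the -th similarity measure network, and are hyper-parameters (the paper calls them weights, but it later says they are further fixed in order to reduce the number of parameters, which makes them feel very much like hyper-parameters... What I really want to ask, though, is whether setting one directly to 0 is a good idea...?)
④Weights initialization: adopts the weights of a pre-trained FCNet (where did this network suddenly fly in from? What does FCNet have to do with their network?)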
⑤Learning rate:
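Since each similarity measure network handles one pair of ROIs, the number of networks grows quadratically with the number of regions. The sketch below enumerates the pairs for the 90 AAL regions, assuming one network per unordered pair (no self-pairs, no duplicates); `pair_index` is a hypothetical helper that maps a pair to its position in the flattened upper triangle.

```python
from itertools import combinations

N_ROIS = 90  # AAL template, as stated in section 2.4

# One entry per unordered ROI pair (i < j), in row-major upper-triangular order.
roi_pairs = list(combinations(range(N_ROIS), 2))
num_networks = len(roi_pairs)


def pair_index(i: int, j: int, n: int = N_ROIS) -> int:
    """Position of pair (i, j), with i < j, in the flattened upper triangle."""
    return i * n - i * (i + 1) // 2 + (j - i - 1)


print(num_networks)        # N * (N - 1) / 2
print(pair_index(0, 1), pair_index(88, 89))
```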
(3)Classification network
①Nodes in fully connected layers are 100, 50, 50, 2
②Weights initialization: randomly
③Learning rate:
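Because the paper puts so much weight on parameter counts, here is a rough sketch of how the count for the fully connected layers (100, 50, 50, 2) could be computed. The input dimension is an assumption: the sketch uses 4005, one scalar per ROI pair for 90 regions, which the notes do not confirm. Every layer is assumed to have a bias vector.

```python
def dense_param_count(input_dim: int, layer_sizes) -> int:
    """Total weights + biases of a stack of fully connected layers."""
    total = 0
    fan_in = input_dim
    for fan_out in layer_sizes:
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
        fan_in = fan_out
    return total


LAYER_SIZES = [100, 50, 50, 2]  # node counts from the notes
ASSUMED_INPUT_DIM = 4005        # ASSUMPTION: one FC value per ROI pair (90 ROIs)

n_params = dense_param_count(ASSUMED_INPUT_DIM, LAYER_SIZES)
print(n_params)
```

Under this assumption almost all parameters sit in the first layer, so the size of the FC vector dominates the count.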
2.6. Experimental settings and results
2.6.1. Experimental settings
①Even though the ADHD-200 dataset contains three ADHD subtypes (ADHD combined, ADHD hyperactive-impulsive and ADHD inattentive), the authors merge them into a single ADHD class.
②Optimizer: Adam
③Epoch: 50
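④Loss function: cross-entropy, $L = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right]$,
where $N$ represents the number of training samples, $y_i$ denotes the true label (1 denotes ADHD and 0 denotes HC), $\hat{y}_i$ denotes the predicted label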
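A minimal sketch of a binary cross-entropy loss for this setting (label 1 for ADHD, 0 for HC). It assumes the standard form averaged over the training samples, which may differ in normalization from the paper's exact equation; the function name, the clipping constant `eps` and the toy labels and probabilities are hypothetical.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps: float = 1e-12) -> float:
    """Mean binary cross-entropy; y_pred holds predicted probabilities of ADHD."""
    n = len(y_true)
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / n


labels = [1, 0, 1, 0]         # hypothetical: 1 = ADHD, 0 = HC
probs = [0.9, 0.2, 0.6, 0.4]  # hypothetical predicted P(ADHD)
loss = binary_cross_entropy(labels, probs)
print(loss)
```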
2.6.2. Comparison methods
(1)End-to-end model without functional connectivity
①They provide an ablation study with a model without FC:
(2)FCNet
①It is the first classification model applied in ADHD
②Based on a CNN, FCNet extracts FC from the time-series signals and extracts features with Elastic net. Finally, an SVM is applied for classification
(3)Correlation method
①They first apply correlation analysis to the fMRI signals to obtain FC, and then extract features with Elastic net. Finally, an SVM is used for classification
(4)Clustering method
①Synthetic Minority Over sampling TEchnique (SMOTE) is applied to address the dataset imbalance, and features are then extracted with Elastic net. Finally, an SVM is used for classification
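For readers unfamiliar with SMOTE, the core idea is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified illustration of that idea only, not the implementation used in the compared method; the function name, `k`, the seed and the toy points are hypothetical.

```python
import random


def smote_sample(minority, k: int = 2, n_new: int = 3, seed: int = 0):
    """Generate n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base (squared Euclidean distance).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment between base and neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D minority-class samples.
minority_class = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_sample(minority_class)
print(new_points)
```

Every synthetic point lies on a segment between two real minority samples, so the new points stay inside the region already covered by the minority class.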
2.6.3. Feature importance of functional connectivity
①They rank the importance of each ROI in ADHD classification
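②There is a linear score model: $S_c(x) = w_c^{T} x + b_c$,
where $S_c$ denotes the class score function, $c$ represents the class, $x$ is the output of one layer, $b_c$ denotes the bias of the whole model and $w_c$ is the weight
③⭐However, a linear function cannot be used directly in a DL network. Hence, the score is approximated around a given point $x_0$ by a first-order Taylor expansion: $S_c(x) \approx w^{T} x + b$, with $w = \left.\frac{\partial S_c}{\partial x}\right|_{x_0}$,
and $w$ is calculated by back-propagation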
④The importance score of ROI in the -th layer:
where denotes the total number of layers, denotes the number of ROIs and denotes the output of the classification network
⑤Feature importance map for class : (huh??? Aren't there only two classes? What is this?)
reminiscent adj. nostalgic; reminding one of (a person or thing); recalling the past n. one who reminisces
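The first-order Taylor idea in ③ can be illustrated numerically: the gradient of a nonlinear class score at a given input acts as the weight vector of a local linear model, and the gradient magnitudes serve as importance scores. This sketch makes several assumptions: `class_score` is a hypothetical toy function, the gradient is estimated with finite differences rather than back-propagation, and using the absolute gradient as importance is a common convention, not necessarily the paper's exact definition.

```python
import math


def class_score(x):
    """Hypothetical nonlinear class score S_c(x): a tiny one-unit network."""
    hidden = math.tanh(0.5 * x[0] - 1.0 * x[1] + 0.25 * x[2])
    return 2.0 * hidden + 0.1


def gradient(f, x0, h: float = 1e-6):
    """Central finite-difference gradient (stand-in for back-propagation)."""
    grad = []
    for i in range(len(x0)):
        up, down = list(x0), list(x0)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


x0 = [0.2, -0.1, 0.4]            # hypothetical input (e.g. three FC values)
w = gradient(class_score, x0)    # w = dS_c/dx evaluated at x0
b = class_score(x0) - sum(wi * xi for wi, xi in zip(w, x0))
importance = [abs(wi) for wi in w]  # ASSUMED importance: gradient magnitude


def linearised(x):
    """Local linear model w^T x + b that approximates S_c near x0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


print(w, importance)
```

Near `x0` the linear model tracks the true score closely, which is what justifies reading the gradient as a per-feature weight.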
2.6.4. Results
Classification results table:
(1)Comparison with other methods
①Comparison table:
②The data heterogeneity across the three sites is very high (and not only in these respects):
NI | Keep eyes closed, no visual stimulus |
Peking | Eyes open or closed and stay still. A black screen with a white fixation cross was displayed during the scan |
NYU | Keep eyes closed, think of nothing systematically and not fall asleep. Show a black screen to them (why show a black screen when the eyes are closed?) |
③Accordingly, they train the model on the mixed-site dataset and test it on each of the 3 sites separately:
the results look rather poor, especially on Peking...
2.6.5. Performance comparison
The authors reckon that a significance test might help the analysis, but the number of samples is too small
(1)Comparison methods
①Model without FC:
②Model without classification network: Elastic net → SVM
③Clustering + classification network: replace FC network by clustering
④Correlation + classification network: replace FC network by correlation method
(2)Comparison results
Comparison table:
2.7. Discussion
2.7.1. Analysis of learned feature importance of functional connectivity
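①The feature importance map seems like the upper triangular matrix of FC. The sentence that follows translates as "where each value corresponds to the importance of the respective functional connectivity value in determining a particular class", but don't all ROIs of one subject belong to a single class? Oh, it seems the important ROIs for HC and ADHD are computed together in one pass?
②Because NYU has the largest number of samples and the highest accuracy, they choose the NYU dataset to visualize ROI scores:
⭐is it an innovation to take the class into consideration?
where the black boxes highlight some differences...
(I am curious why the vertical axis runs from large to small)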
③The top 100... correlation...
Come on, can you not draw this properly? Go learn from Com-BrainTF; I cannot make any sense of this one (from: [Paper close reading] Community-Aware Transformer for Autism Prediction in fMRI Connectome, CSDN blog)
④The top 50 feature map on HC:
⑤The top 50 feature map on ADHD:
⑥Matching the top 100 important ROIs in HC to the top 500 important ROIs in ADHD:
they find that the matching degree is below 10%
⑦Matching the top 100 important ROIs in ADHD to the top 500 important ROIs in HC:
they find that the matching degree is below 10%
⑧The top 100 features of 2 classes in inter-lobe and intra-lobe:
most of the important ROIs are in the frontal lobe, which is related to cognitive functions such as attention, executive function (including planning, selection, sequential organization and self-monitoring of actions), affect and mood, memory, self-awareness and personality
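The top-k matching in ⑥ and ⑦ can be sketched as a simple overlap computation. The definition used here, the fraction of one class's top-k features that also appear in the other class's top-k list, is an assumption about what "matching degree" means; the function names and the toy scores are hypothetical (the paper uses top 100 against top 500).

```python
def top_k_indices(scores, k: int):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def matching_degree(scores_a, scores_b, k_a: int, k_b: int) -> float:
    """Fraction of the top-k_a features of A that are also in the top-k_b of B."""
    top_a = top_k_indices(scores_a, k_a)
    top_b = set(top_k_indices(scores_b, k_b))
    return sum(1 for i in top_a if i in top_b) / k_a


# Hypothetical importance scores for six features in each class.
hc_scores = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3]
adhd_scores = [0.1, 0.9, 0.2, 0.8, 0.75, 0.3]

degree = matching_degree(hc_scores, adhd_scores, k_a=3, k_b=4)
print(degree)
```

A low matching degree in both directions indicates that the two classes rely on largely different connections, which is the point the authors make without a significance test.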
2.8. Conclusions
The conclusion redundantly repeats the summary of the model, the small sample size limitation, the heterogeneous data... etc...
3. Supplementary knowledge
3.1. Minimum redundancy maximum relevance (MRMR)
Further reading 1: Maximum relevance minimum redundancy (mRMR), CSDN blog
Further reading 2: A comprehensive summary of feature selection methods, Zhihu (zhihu.com)
3.2. Elastic net
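Further reading 1: Machine learning algorithm series (6): Elastic Net Regression Algorithm, CSDN blog
Further reading 2: Understanding regularization in one article: LASSO regression, Ridge regression and ElasticNet regression, Zhihu (zhihu.com)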
3.3. Synthetic Minority Over sampling TEchnique (SMOTE)
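Further reading 1: The SMOTE method for imbalanced classification in machine learning, Zhihu (zhihu.com)
Further reading 2: Synthetic minority oversampling technique for multi-class imbalance problems, ScienceDirect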
4. Reference List
Riaz A. et al. (2020) 'DeepFMRI: End-to-end deep learning for functional connectivity and classification of ADHD using fMRI', Journal of Neuroscience Methods, 335: 1.