[Paper Notes] Few-shot domain-adaptive anomaly detection for cross-site brain images

Paper link: Few-shot domain-adaptive anomaly detection for cross-site brain images | IEEE Journals & Magazine | IEEE Xplore

The English text is hand-typed, summarizing and paraphrasing the original paper. Unavoidable spelling and grammar mistakes may appear; if you spot any, corrections in the comments are welcome! This article leans toward personal notes, so read with caution!

Contents

1. TL;DR

1.1. Takeaways

1.2. Paper summary figure

2. Section-by-section notes

2.1. Abstract

2.2. Introduction

2.3. Related work

2.3.1. Classification of mental disorders

2.3.2. Few-shot learning for anomaly detection

2.3.3. Cross-domain few-shot learning

2.4. Materials

2.4.1. Demographic, clinical and imaging information of data

2.4.2. Preprocessing

2.4.3. Functional connectivity measures

2.5. Proposed algorithm

2.5.1. Problem definition

2.5.2. Deep semi-supervised anomaly detection (DSAD)

2.5.3. Residual correction block (RCB)

2.5.4. Conditional adversarial domain adaptation revisited

2.5.5. Overall formulation of the FAAD algorithm

2.6. Experiment

2.6.1. Baseline method

2.6.2. Implementation details

2.6.3. Results and analysis

2.7. Discussion

2.8. Conclusion

3. Background knowledge

3.1. Hypersphere

3.2. Meta-learning

3.3. Manifold

3.4. Canonical Correlation Analysis (CCA)

4. Reference List


1. TL;DR

1.1. Takeaways

(1) This Introduction suddenly started to shine amid my dull days of reading repetitive papers. Is this the charm of TPAMI?

(2) Actually I now suspect that brain-image classification keeps underperforming partly because the subjects may also have other illnesses... (Oh my, in Section 3.1 of the paper (not my 3.1; mine is 2.4.1) the authors actually state: "Patients had no history of neurological disorders, severe medical diseases, substance abuse, or electroconvulsive therapy. All healthy controls were unrelated to the SCZ or MDD patients and were also assessed according to DSM-IV criteria. None of them had acute physical illness, substance abuse or dependence, a history of head injury resulting in loss of consciousness, or severe mental or neurological disorders." I do not know whether other papers report this; even if they do, it is most likely not in the main text.)

(3) Writing only author names in Related Works is really... hard to rate. Why not give the model names?

(4) The paper also explains why fMRI is used instead of sMRI: "the pathological changes caused by mental disorders are usually functional rather than structural, especially in the early stage."

(5) The paper explains why ROI-based FC is used instead of voxel-wise FC: "at the voxel level, FC was not adopted because of its ultra-high dimensionality (billion-scale) and low signal-to-noise ratio (SNR)."

(6) I finally understand what a label space is: it is like how the indicators measured at different hospitals are actually different

(7) My own discussion: it suddenly occurs to me that ROIs should perhaps be small for attention-based models and large for ordinary ROI analyses

1.2. Paper summary figure

2. Section-by-section notes

2.1. Abstract

        ①To solve the problem that fMRI data come from different sites, the authors proposed few-shot domain-adaptive anomaly detection (FAAD)

        ②They first adopt domain adaptation, which reduces the differences between sites, and then combine the features of different sites

        ③The database is the Human Connectome Project (HCP)

2.2. Introduction

        ①It is hard to obtain a sufficient number of correctly labeled samples

        ②⭐Applying unsupervised methods brings an overfitting risk, because the dimensionality of functional connectivity is very high, the number of samples is limited, and the differences between samples are significant

        ③⭐In reality, the number of healthy people far exceeds the number of Alzheimer's patients. If the training data follows this ratio (AD to HC), the accuracy of binary classification may decrease

        ④⭐Accordingly, they take a large number of healthy samples as the pre-training set, then apply anomaly detection across multiple sites

        ⑤The authors raise the issue of label spaces here: the label space of the purely healthy source domain and that of the target domain, which contains both healthy and unhealthy subjects, may differ, so traditional domain adaptation cannot be applied directly. They argue that "general and conditional domain adaptation need to be applied, which aligns the feature distributions of the two domains while preserving the discriminative ability of the trained model"

        ⑥The schematic of their FAAD:

        ⑦Their contributions: a) they are the first to adopt anomaly detection for psychiatric disorder classification; b) with one class in the source dataset and two classes (only one new class) in the target dataset, they alleviate the distribution difference between the two domains; c) they align the general feature distribution and the conditional distribution between the source and target datasets at the same time

interrater  adj. between raters: the consistency or reliability across different raters

delineate  v. to describe or explain in detail; to mark out (a boundary)

schematic  adj. diagrammatic; simplified  n. a schematic diagram

authenticity  n. genuineness, reliability

2.3. Related work

2.3.1. Classification of mental disorders

        ①Shen et al. classified schizophrenia (SCZ) and HC by locally linear embedding and C-means clustering

        ②Zeng et al. classified depression and HC by whole brain FC and SVM

        ③What is more, Zeng et al. then classified SCZ and HC by a discriminant autoencoder network with sparsity constraint (DANS), combining data from different sites

        ④Sui et al. predicted the cognitive domain score of SCZ by extracting features from multimodal MRI images

        ⑤Li et al. classified posttraumatic stress disorder (PTSD) and HC by dynamic FC

        ⑥Gopinath et al. predicted the stage of AD by a new learnable graph pooling method

        ⑦Lian et al. extracted the multi-scale features of AD by hierarchical fully convolutional network (H-FCN)

        ⑧Mourao-Miranda et al. classified patients by anomaly detection with SVM, but with only 38 samples

morphometry  n. morphometrics; the quantitative measurement of form

2.3.2. Few-shot learning for anomaly detection

        ①Anomaly detection, also called outlier detection or novelty detection, tries to enclose all the training samples (normal samples) in as tight a hypersphere as possible. All samples that fall outside the hypersphere are abnormal

        ②A small number of labeled anomalies helps delineate the hypersphere better

        ③Lu et al. proposed a few-shot scene-adaptive outlier detection method

        ④Ding et al. put forward graph deviation networks (GDN) and new cross-network meta-learning algorithm

        ⑤Koizumi et al. proposed a few-shot method to train cascaded specific anomaly detector

        ⑥It is hard to use meta-learning here because the domain is single (diversity is needed) and, in meta-learning, unseen labels can only be used for fine-tuning

a.k.a.  abbr. also known as (esp. used to introduce a nickname or alias)

2.3.3. Cross-domain few-shot learning

        ①Most cross-domain methods focus on the condition that the label spaces of the source domain and the target domain are the same

        ②Guan et al. proposed triplet autoencoder (TriAE) model

        ③Zhao et al. put forward the domain-adversarial prototypical network (DAPN) model with meta-learning and N-way k-shot classification. N-way k-shot means N classes in the support set with k samples in each class; there is also a query set containing the N classes to measure performance. Because N classes are required, disease classification cannot apply this method

2.4. Materials

        ①The overall pipeline:

(A)Get time series \overset{Pearson\, \, correlation}{\rightarrow} FC \overset{vectorize}{\rightarrow} input vector

(B)Pretraining: input vector (dimension N=\frac{n(n-1)}{2}, where n is the number of ROIs) \overset{three-layer\, \, autoencoder}{\rightarrow} output vector through the reconstruction loss L_{reconstruction} (I am not sure exactly how this is used)

(C)Apply three-repeat, three-trial validation with a random seed in each repeat to randomize the sample order. Randomly select a few normal and abnormal samples from each trial as labeled data; the remainder is regarded as the test set

(D)Retain the encoder from (B) and compensate for the differences between the domains through the residual correction block and conditional adversarial domain adaptation, with the total loss

L_{total}=L_{ad}+L_{da}\left ( \beta \right )

where L_{ad} denotes the loss of anomaly detection and L_{da} denotes the loss of domain adaptation. 

        ②Finally, they measure performance by the AUC on the unlabeled target domain

2.4.1. Demographic, clinical and imaging information of data

        ①Sites: 7

(1)Source domain

        ①dataset: The Human Connectome Project (HCP) dataset (HCP S1200)

        ②Samples: 1053 HC with 483 males and 570 females

        ③Parameters of scanning: spatial resolution = 2×2×2 mm³, repetition time (TR) = 720 ms, echo time (TE) = 33.1 ms, field of view (FOV) = 208×180 mm², slices = 72, flip angle (FA) = 52°, TRs = 1200

(2)Target domain

        ①Datasets: the AMU, FMMU#1, FMMU#2, PUTH, UCLA and COBRE datasets (selection criteria: a) rs-fMRI; b) the same scanner is kept within one site; c) sample size > 100 for a site containing HC and SCZ, and > 150 for a site containing SCZ and MDD)

2.4.2. Preprocessing

        ①Software: SPM8

        ②Magnetic saturation: the first five frames of the scanned data are discarded

        ③Slice timing

        ④Motion correction: excluding scans with excessive head motion during acquisition (> 2.5 mm translation and/or 2.5° rotation)

        ⑤Normalization with an EPI template in the Montreal Neurological Institute (MNI) atlas space (3-mm isotropic voxels)

        ⑥Spatial smoothing with a 6-mm full-width half-maximum Gaussian kernel

        ⑦Linear detrending and bandpass temporal filtering (0.01–0.08 Hz)

        ⑧Regression of nuisance variables, including the six parameters obtained by rigid body head motion correction, ventricular and white matter signals, and their first temporal derivatives, quadratic terms, and squares of derivatives

2.4.3. Functional connectivity measures

        ①AAL atlas lacks information of functional organization

        ②The 17-network parcellation possesses a high SNR but does not contain some subcortical regions, such as the thalamus and amygdala, which are regarded as essential regions for memory, emotional control and various cognitive functions

        ③Thus, they use the BA512 atlas built with eigen clustering (EIC), an unsupervised method

        ④Apply the Pearson correlation coefficient to the time series under each atlas, then transform the values toward a normal distribution with the Fisher r-to-z transformation
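As a concrete sketch of this step (the frame count and the 90-ROI atlas size below are illustrative assumptions, not values from the paper):

```python
import numpy as np

def fc_features(ts, eps=1e-7):
    """Vectorized functional-connectivity feature from ROI time series.

    ts: array of shape (T, n) -- T time points, n ROIs.
    Returns the upper-triangular Pearson correlations after the Fisher
    r-to-z transform, a vector of length n*(n-1)/2.
    """
    r = np.corrcoef(ts.T)                      # (n, n) Pearson correlation matrix
    iu = np.triu_indices_from(r, k=1)          # upper triangle, no diagonal
    r_vec = np.clip(r[iu], -1 + eps, 1 - eps)  # keep atanh finite at r = +-1
    return np.arctanh(r_vec)                   # Fisher r-to-z

rng = np.random.default_rng(0)
ts = rng.standard_normal((120, 90))            # e.g. 120 frames, 90 ROIs
z = fc_features(ts)
print(z.shape)                                 # (4005,) = 90*89/2
```

The vector length n(n-1)/2 matches the input dimension N given in the pipeline above.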

        ⑤Three atlases:

striatum  n. corpus striatum, a subcortical structure of the forebrain    thalamus  n. the relay center of the diencephalon    amygdala  n. the almond-shaped nucleus of the brain; tonsil; bitter almond

2.5. Proposed algorithm

2.5.1. Problem definition

        ①\mathcal{D}_{s}=\{(x_{si},y_{si})\}_{i=1}^{n_{s}}=\{\mathbf{X}_{s},y_{s}\} is the source domain, the HCP dataset, where y_{si}=+1

        ②\mathcal{D}_{t} is the target domain, the AMU, FMMU#1, FMMU#2, PUTH, UCLA and COBRE datasets

        ③\mathcal{D}_{l}=\{(x_{li},y_{li})\}_{i=1}^{n_{l}}=\{\mathbf{X}_{l},y_{l}\} is the labeled target, where y_{li}=+1 for HC, y_{li}=-1 for patients

        ④\mathcal{D}_{u}=\{(x_{ui})\}_{i=1}^{n_{u}}=\{\mathbf{X}_{u}\} is the unlabeled target

        ⑤Notation:

\mathcal{X}_{s}: the feature space of the source domain \mathcal{D}_{s}
\mathcal{X}_{t}: the feature space of the target domain \mathcal{D}_{t}
\mathcal{Y}_{s}: the label space of the source domain \mathcal{D}_{s}, with \mathcal{Y}_{s}\subset \mathcal{Y}_{t}; its class number C_s=1
\mathcal{Y}_{t}: the label space of the target domain \mathcal{D}_{t}; its class number C_t=2

        ⑥D\left ( \mathcal{X}_{s} \right )=D\left ( \mathcal{X}_{t} \right ) means they have the same dimension

        ⑦⭐The feature distributions of the source and target domains differ, namely P_{s}(X_{s})\neq P_{t}(X_{t}) (I am not sure whether this feature distribution means a) the same indicators but with unevenly distributed values, or b) the same number of indicators but different indicators)

        ⑧They aim to alleviate the distribution discrepancy between \mathcal{D}_{s} and \mathcal{D}_{l} and apply anomaly detection in \mathcal{D}_{u}

2.5.2. Deep semi-supervised anomaly detection (DSAD)

        ①In L layers deep support vector data description (deep SVDD):

\begin{aligned}\min_{\mathcal{W}}\frac{1}{n}\sum_{i=1}^{n}||\phi(x_{i};\mathcal{W})-c||^{2}+\frac{\lambda}{2}\sum_{l=1}^{L}||\mathbf{W}^{l}||_{F}^{2}\end{aligned}

where \mathcal{X}\subset\mathbb{R}^{D} denotes the input space and \mathcal{Z}\subset\mathbb{R}^d denotes the output space;

\mathcal{W}=\{\mathbf{W}^{1},...,\mathbf{W}^{L}\}, x_{1},...,x_{n}\in\mathcal{X}; c denotes the center of the hypersphere;

This objective minimizes the volume of the hypersphere enclosing all the HC samples;

The first term encloses the HC samples and the second term is a standard weight-decay regularizer with hyperparameter \lambda > 0

        ②Because only HC samples are available for training and the mutual information \mathcal{I}(\mathcal{X},\mathcal{Z}) should be maximized, the network is initialized from an autoencoder trained with the reconstruction loss

        ③The center c is the mean of the features of all encoded samples:

c=\frac{1}{n}\sum_{i=1}^{n}\phi(x_{si};\mathcal{W}_{0})

        ④The anomaly score after training can be:

s(x)=\|\phi(x;\mathcal{W})-c\|^2
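A minimal NumPy sketch of the center and score computation (the feature dimension and values below are made up for illustration; in the paper the features are the encoder outputs φ(x; W)):

```python
import numpy as np

def anomaly_score(features, c):
    """Deep-SVDD-style score: squared distance to the hypersphere center c.

    features: (B, d) encoded samples phi(x; W); c: (d,) center, taken as
    the mean of the encoded training (HC) features.
    """
    return np.sum((features - c) ** 2, axis=1)

rng = np.random.default_rng(1)
train_feats = 0.1 * rng.standard_normal((100, 32))  # tight "normal" cluster
c = train_feats.mean(axis=0)                        # center = mean embedding
outlier = c + 5.0                                   # a point far from c
scores = anomaly_score(np.vstack([train_feats, outlier[None]]), c)
print(scores[-1] > scores[:-1].max())               # outlier gets the top score
```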

        ⑤Using only HC may cause "hypersphere collapse": the radius of the hypersphere shrinks to 0, eliminating the representation capability of the network. It can be mitigated by a few labeled abnormal samples

        ⑥For labeled samples of the two classes, there are:

\begin{aligned}&(x_{t1},y_{t1}),...,(x_{tm},y_{tm}),\\&(x_{t(m+1)},y_{t(m+1)}),...,(x_{t(2m)},y_{t(2m)})\in\mathcal{X}_t\times\mathcal{Y}_t\end{aligned}

        ⑦After adding the labeled samples, the network could be changed to:

\begin{aligned} \min_{\mathcal{W}}& \frac{1}{n}\sum_{i=1}^{n}(||\phi(x_{si};\mathcal{W})-c||^{2})^{y_{si}} \\ &+\frac{1}{2m}\sum_{j=1}^{2m}(||\phi(x_{tj};\mathcal{W})-c||^{2})^{y_{tj}}+\frac{\lambda}{2}\sum_{l=1}^{L}||\mathbf{W}^{l}||_{F}^{2} \end{aligned}

i.e., the labeled abnormal samples are mapped away from the center by this penalization
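The pull/push effect of the label exponent can be sketched in plain NumPy (the weight-decay term is omitted and the balance weight η is my addition):

```python
import numpy as np

def dsad_loss(phi_s, phi_t, y_t, c, eta=1.0):
    """Semi-supervised Deep SVDD loss (regularizer omitted for brevity).

    phi_s: (n, d) encoded source samples, all normal (label +1).
    phi_t: (2m, d) encoded labeled target samples, y_t in {+1, -1}.
    Raising the squared distance to the power of the label pulls normal
    samples toward c and penalizes anomalies near c (via 1/distance).
    """
    d_s = np.sum((phi_s - c) ** 2, axis=1)
    d_t = np.sum((phi_t - c) ** 2, axis=1)
    return d_s.mean() + eta * np.mean(d_t ** y_t)

c = np.zeros(4)
phi_s = np.full((3, 4), 0.1)
y_t = np.array([1, -1])
near = np.vstack([np.full(4, 0.1), np.full(4, 1.0)])   # anomaly close to c
far = np.vstack([np.full(4, 0.1), np.full(4, 10.0)])   # anomaly far from c
print(dsad_loss(phi_s, far, y_t, c) < dsad_loss(phi_s, near, y_t, c))  # True
```

Moving the anomaly away from the center lowers the loss, which is exactly the "mapped away by penalization" behavior described above.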

        ⑧The centers of source domain and target domain are shared

2.5.3. Residual correction block (RCB)

        ①Distribution alignment by increasing discrepancy loss may not completely eliminate the domain discrepancies

        ②Li et al. put forward the RCB, a two-layer fully connected neural network, for the setting where \mathcal{Y}_{t}\subset\mathcal{Y}_{s}

        ③\phi_{s}(x_{s}) and \phi_{t}(x_{t}) are the task-specific features of source data x_s and target data x_t

        ④"The source data x_s only needs to go through the original network, while the target data x_t needs to pass the RCB afterward." Hence \phi_{s}(x_{s})=\phi(x_{s}) (I am not sure what this means)

        ⑤The correction feature learned by the RCB is denoted as \Delta\phi_{s}(x_{t})

        ⑥The integrated target feature: \phi_{t}(x_{t})=\phi_{s}(x_{t})+\Delta\phi_{s}(x_{t})
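A toy version of this residual correction (the layer sizes and ReLU activation are my assumptions; the paper only specifies a two-layer fully connected block):

```python
import numpy as np

class ResidualCorrectionBlock:
    """Additive correction for target features:
    phi_t(x) = phi_s(x) + delta(phi_s(x)).

    Source features bypass the block; only target features are corrected.
    """
    def __init__(self, d, hidden, rng):
        self.W1 = 0.01 * rng.standard_normal((d, hidden))
        self.W2 = 0.01 * rng.standard_normal((hidden, d))

    def delta(self, f):
        return np.maximum(f @ self.W1, 0.0) @ self.W2  # two-layer MLP, ReLU

    def forward(self, f, is_target):
        return f + self.delta(f) if is_target else f

rng = np.random.default_rng(2)
rcb = ResidualCorrectionBlock(d=16, hidden=8, rng=rng)
f = rng.standard_normal((5, 16))
print(np.allclose(rcb.forward(f, is_target=False), f))  # source unchanged: True
```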

        ⑦They further update the objective, i.e., the loss of DSAD:

\begin{aligned} L_{ad}=& \begin{aligned}\frac{1}{n}\sum_{i=1}^{n}(||\phi_{s}(x_{si};\mathcal{W})-c||^{2})^{y_{si}}\end{aligned} \\ &+\frac1{2m}\sum_{j=1}^{2m}(||\phi_{t}(x_{tj};\mathcal{W})-c||^{2})^{y_{tj}}+\frac\lambda2\sum_{l=1}^{L}||\mathbf{W}^{l}||_{F}^{2} \end{aligned}

2.5.4. Conditional adversarial domain adaptation revisited

        ①CDAN was designed for traditional domain adaptation, in which the source and target domains share the same label space

        ②The domain confusion error:

\begin{aligned}L_{dc}&=-\frac{1}{n}\sum_{i=1}^{n}\log[D(\phi_s(x_{si}),g(x_{si}))]\\&-\frac{1}{2m}\sum_{j=1}^{2m}\log[1-D(\phi_t(x_{tj}),g(x_{tj}))]\end{aligned}

        ③They apply:

\begin{aligned}&\{g(x_1),g(x_2),...,g(x_B)\}\\&=\text{softmax}(\{-s(x_1),-s(x_2),...,-s(x_B)\})\end{aligned}

where s\left ( x_i \right ) denotes the distance between x_i and c

        ④The adversarial objectives are:

\begin{aligned}&\min_\phi L_{ad}(\phi)-\beta L_{dc}(D,g)\\&\min_DL_{dc}(D,g)\end{aligned}

        ⑤The domain discriminator D(\phi,g)=D(\phi\otimes g)
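The ⊗ here is the per-sample outer product of features and predictions, flattened before being fed to D. A sketch (batch size, dimensions, and the two-class g below are illustrative; in the paper g comes from the anomaly scores via the softmax above):

```python
import numpy as np

def multilinear_map(features, preds):
    """CDAN conditioning: per-sample outer product phi (x) g, flattened.

    features: (B, d); preds: (B, C) class probabilities. Returns (B, d*C),
    the input of the domain discriminator D.
    """
    return np.einsum('bd,bc->bdc', features, preds).reshape(len(features), -1)

phi = np.array([[1.0, 2.0],
                [3.0, 4.0]])          # d = 2 features
g = np.array([[0.5, 0.5],
              [1.0, 0.0]])            # C = 2 soft predictions
print(multilinear_map(phi, g))        # row 0: [0.5, 0.5, 1.0, 1.0]
```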

        ⑥Then, the CDAN can be: 

\begin{aligned} &\min_{\phi}L_{ad}(\phi)+\beta(\frac{1}{n}\sum_{i=1}^{n}w(g(x_{si}))\log[D(\phi_{s}(x_{si})\otimes g(x_{si}))] \\ &+\frac{1}{2m}\sum_{j=1}^{2m}w(g(x_{tj}))\log[1-D(\phi_{t}(x_{tj})\otimes g(x_{tj}))]) \\ &\max_{D}\frac{1}{n}\sum_{i=1}^{n}w(g(x_{si}))\log[D(\phi_{s}(x_{si})\otimes g(x_{si}))] \\ &+\frac{1}{2m}\sum_{j=1}^{2m}w(g(x_{tj}))\log[1-D(\phi_{t}(x_{tj})\otimes g(x_{tj}))]. \end{aligned}

where the entropy criterion w(g)=1+e^{-g}

2.5.5. Overall formulation of the FAAD algorithm

        ①The Few-shot domain-Adaptive Anomaly Detection (FAAD) combines DSAD and RCB:

\begin{aligned} \min_{\phi}& \frac{1}{n}\sum_{i=1}^{n}(||\phi_{s}(x_{si};\mathcal{W})-c||^{2})^{y_{si}} \\ &+\frac{1}{2m}\sum_{j=1}^{2m}(||\phi_{t}(x_{tj};\mathcal{W})-c||^{2})^{y_{tj}}+\frac{\lambda}{2}\sum_{l=1}^{L}||\mathbf{W}^{l}||_{F}^{2} \end{aligned}

        ②FAAD+CDANE:

\begin{aligned} &\min_{\phi} \begin{aligned}\frac{1}{n}\sum_{i=1}^n(||\phi_s(x_{si};\mathcal{W})-c||^2)^{y_{si}}\end{aligned} \\ &+\frac1{2m}\sum_{j=1}^{2m}(||\phi_{t}(x_{tj};\mathcal{W})-c||^{2})^{y_{tj}}+\frac\lambda2\sum_{l=1}^{L}||\mathbf{W}^{l}||_{F}^{2} \\ &+\beta(\frac1n\sum_{i=1}^nw(g(x_{si}))\log[D(\phi_s(x_{si})\otimes g(x_{si}))] \\ &+\frac1{2m}\sum_{j=1}^{2m}w(g(x_{tj}))\log[1-D(\phi_{t}(x_{tj})\otimes g(x_{tj}))]) \\ &\max_{D} \begin{aligned}\frac{1}{n}\sum_{i=1}^nw(g(x_{si}))\log[D(\phi_s(x_{si})\otimes g(x_{si}))]\end{aligned} \\ &+\frac1{2m}\sum_{j=1}^{2m}w(g(x_{tj}))\log[1-D(\phi_{t}(x_{tj})\otimes g(x_{tj}))], \end{aligned}

        ③The pseudo code of FAAD+CDANE:

2.6. Experiment

        ①They compared their model with a) machine learning (SVM) and deep learning (FNN) baselines, b) the original anomaly detection method DSAD, and c) domain adaptation models

        ②They evaluate their model's ability to detect a specific disease and to differentiate across multiple disease domains

2.6.1. Baseline method

        ①They apply 95% PCA-SVM (PCA retaining 95% of the variance, followed by an SVM) because the feature dimension far exceeds the number of samples (is the feature dimension that n(n-1)/2 value?)
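The 95% PCA step can be sketched with plain NumPy (a stand-in for the paper's implementation; the SVM on top is omitted, and the toy data below are my own):

```python
import numpy as np

def pca_95(X, var_ratio=0.95):
    """Project X onto the fewest principal components whose cumulative
    explained variance reaches var_ratio (the 95% criterion above)."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    k = int(np.searchsorted(np.cumsum(explained), var_ratio)) + 1
    return Xc @ Vt[:k].T, k

rng = np.random.default_rng(3)
latent = rng.standard_normal((50, 3))          # 3 true directions
X = latent @ rng.standard_normal((3, 100)) \
    + 1e-3 * rng.standard_normal((50, 100))    # 100-dim observations
Z, k = pca_95(X)
print(k <= 3)                                  # only a few components survive
```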

        ②They construct BC-DNN, an FNN combined with a fully connected layer and a softmax layer, and pre-train it to obtain BC-DNN-p

        ③They go on to introduce other models... (omitted here)

2.6.2. Implementation details

(1)Network and training setup

        ①Shot: 10-shot and 20-shot applied

        ②Measurement: AUC

        ③FNN: input dimensions of layer 1,2,3 are the original dimension of vector, 128, 32 respectively; learning rate=0.001; optimizer: Adam

        ④FAAD and FAAD+CDANE: learning rate of the RCB = 1/10 of the base learning rate; 12 epochs for pretraining and 16 for FAAD; learning rate divided by 10 at the fourth and eighth epochs; batch size = 4; \lambda =0.0001 and \beta =0.1 (ramped from 0 to 0.1 by the coefficient \begin{aligned}(1-\exp(-\delta p))/(1+\exp(-\delta p))\end{aligned}, where \delta =10 and p goes from 0 to 1) (I cannot quite understand this); dropout ratio = 0.2 (a paragraph that explodes if you stare at it too long)
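If I read the ramp-up correctly, β is scaled by that coefficient as training progresses; a sketch (the function name is mine):

```python
import numpy as np

def beta_schedule(p, beta_max=0.1, delta=10.0):
    """Adversarial weight ramp: 0 at p = 0, approaching beta_max as p -> 1,
    following (1 - exp(-delta * p)) / (1 + exp(-delta * p))."""
    return beta_max * (1.0 - np.exp(-delta * p)) / (1.0 + np.exp(-delta * p))

for p in (0.0, 0.1, 0.5, 1.0):
    print(round(float(beta_schedule(p)), 4))
# starts at 0.0 and saturates near beta_max = 0.1
```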

        ⑤DSAD-DANN: \beta =1

(2)Data augmentation

        ①Why do they say here that the feature dimension is smaller than the sample size!?

        ②⭐They assume the label of a partial fMRI scan is the same as that of the full scan

        ③⭐"During training, each time course is randomly cropped (the crop should start from the first frame of the scan and be longer than half the original length) and then used to compute whole-brain FC. During testing, the augmentation is discarded." (So this counts as augmentation... maybe I just never learned data augmentation)
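The cropping rule as I understand it (start at the first frame, keep more than half the scan) can be sketched as:

```python
import numpy as np

def random_prefix_crop(ts, rng):
    """Training-time augmentation: keep a prefix of the time course whose
    length is drawn from (T/2, T]; the crop keeps the full scan's label,
    and whole-brain FC is then computed on the crop. Disabled at test time.
    """
    T = len(ts)
    L = int(rng.integers(T // 2 + 1, T + 1))  # strictly more than half
    return ts[:L]

rng = np.random.default_rng(4)
ts = np.arange(100 * 3).reshape(100, 3)       # toy scan: 100 frames, 3 ROIs
crop = random_prefix_crop(ts, rng)
print(50 < len(crop) <= 100)                  # True
```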

2.6.3. Results and analysis

        They compare the mean AUC over 9 trials

(1)FAAD for one mental disorder (SCZ only)

        ①AMU

        ②FMMU#1

        ③FMMU#2

        ④PUTH

        ⑤UCLA

        ⑥COBRE

        ⑦After this they spend considerable space on discussion, but it is all based on the experimental results, so it is not particularly meaningful for me without those results; I only read it through without taking notes

        ⑧Mean values and standard deviation of AUCs(%):

(2)FAAD for two mental disorders (SCZ & MDD)

        ①AMU

        ②FMMU#1

(3)Discriminative FC and brain regions

        ①They combine all the FC vectors in each test set and apply canonical correlation analysis (CCA) on them, obtain the mean weight of each FC in each test set, and select the top 10%
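The top-10% selection on the averaged CCA weights might look like this (the CCA fit itself is omitted; `weights` stands for the mean weight per FC edge and the toy values are mine):

```python
import numpy as np

def top_fraction(weights, frac=0.10):
    """Indices of the FC edges whose |weight| falls in the top `frac`."""
    k = max(1, int(round(frac * len(weights))))
    return np.argsort(np.abs(weights))[::-1][:k]

w = np.zeros(20)
w[3], w[7] = 5.0, -9.0                   # two clearly dominant edges
print(sorted(top_fraction(w).tolist()))  # [3, 7]
```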

        ②SCZ visualization:

        ③SCZ or MDD:

(4)Empirical analysis of parameters

        ①A grid search over \beta =\left \{ 0,\, 0.05,\, 0.1,\, 0.15,\, 0.2,\, 0.25 \right \} shows that FAAD+CDANE is not sensitive to \beta

        ②Table of the tuning:

(5)Distribution of anomaly scores

        ①Anomaly scores in FMMU#1 with AAL:

(6)Brain parcellation and model performance

        ①Comparison of datasets and atlases:

2.7. Discussion

        ①This model can also be generalized to other networks

        ②⭐"The definition of the graph and the Laplacian representation of the graph are not always satisfactory" hahaha, hilarious. But the average accuracy here is not that high either; the best reaches about 80, yet on average it feels like 60-70 to me. Still, for 2021 that is quite good

        ③Most of the samples in HCP are young people, which might influence the results

        ④⭐They did not consider the different pre-processing pipelines of different sites

2.8. Conclusion

        I will not bother to conclude; it is what it is

3. 知识补充

3.1. Hypersphere

Further reading: 超球面_百度百科 (baidu.com)

3.2. Meta-learning

Further reading: 一文入门元学习(Meta-Learning)(附代码) - 知乎 (zhihu.com)

3.3. Manifold

Further reading 1: 几何学中最伟大的发明之一——流形,其背后的几何直觉与数学方法 (baidu.com)

Further reading 2: 流形_百度百科 (baidu.com)

3.4. Canonical Correlation Analysis (CCA)

Further reading: Canonical Correlation Analysis - 知乎 (zhihu.com)

4. Reference List

Su, J. et al. (2021) 'Few-shot domain-adaptive anomaly detection for cross-site brain images', IEEE Transactions on Pattern Analysis and Machine Intelligence. doi: 10.1109/TPAMI.2021.3125686
