Paper reading (69): Prediction of anti-cancer drug response by kernelized multi-task learning

Paper title: Prediction of anti-cancer drug response by kernelized multi-task learning

Scholar citations: 14

Pages: 8

Published: 2016.10

Journal: Artificial Intelligence in Medicine

Author: Mehmet Tan

Abstract:

Motivation
Chemotherapy or targeted therapy are two of the main treatment options for many types of cancer. Due to the heterogeneous nature of cancer, the success of the therapeutic agents differs among patients. In this sense, determination of chemotherapeutic response of the malign cells is essential for establishing a personalized treatment protocol and designing new drugs. With the recent technological advances in producing large amounts of pharmacogenomic data, in silico methods have become important tools to achieve this aim.
Objective
Data produced by using cancer cell lines provide a test bed for machine learning algorithms that try to predict the response of cancer cells to different agents. The potential use of these algorithms in drug discovery/repositioning and personalized treatments motivated us in this study to work on predicting drug response by exploiting the recent pharmacogenomic databases. We aim to improve the prediction of drug response of cancer cell lines.
Methods
We propose to use a method that employs multi-task learning to improve learning by transfer, and kernels to extract non-linear relationships to predict drug response.
Results
The method outperforms three state-of-the-art algorithms on three anti-cancer drug screen datasets. We achieved a mean squared error of 3.305 and 0.501 on two different large scale screen data sets. On a recent challenge dataset, we obtained an error of 0.556. We report the methodological comparison results as well as the performance of the proposed algorithm on each single drug.
Conclusion
The results show that the proposed method is a strong candidate to predict drug response of cancer cell lines in silico for pre-clinical studies. The source code of the algorithm and data used can be obtained from http://mtan.etu.edu.tr/Supplementary/kMTrace/.

Highlights:

  • We proposed to use kernelized multitask learning for anticancer drug activity prediction.
  • The proposed method was found to outperform the previous methods in terms of cytotoxicity prediction on three different data sets.
  • The method not only performs better but also requires few parameters.
  • New drugs predicted by the method to be active against certain cell lines were listed.

Paper outline:

1. Introduction

2. Notation and background

3. Methods

3.1 Gene selection

4.  Experimental results

4.1 Materials and baseline methods

        4.1.1 GDSC dataset

        4.1.2 CCLE dataset

        4.1.3 NCI-DREAM challenge dataset

        4.1.4 Other methods for comparison

4.2 Setting

4.3 Results and discussion

5.  Conclusion

Excerpts from the paper:

1. Biological Problem: What biological problems have been solved in this paper?

  • Prediction of anti-cancer drug response

2. Main discoveries: What are the main discoveries in this paper?

  • The method outperforms three state-of-the-art algorithms on three anti-cancer drug screen datasets.
  • We achieved a mean squared error of 3.305 and 0.501 on two different large scale screen data sets.
  • On a recent challenge dataset, we obtained an error of 0.556.
  • We report the methodological comparison results as well as the performance of the proposed algorithm on each single drug.

3. ML(Machine Learning) Methods: What are the ML methods applied in this paper?

  • GDSC dataset: 609 cell lines and 1100 genes.

  • CCLE dataset: 376 cell lines and 1272 genes.

  • NCI-DREAM challenge dataset: 46 cell lines and 31 compounds

  • kernelized multi-task learning
  • kMTrace
  • We propose to use a method that employs multi-task learning to improve learning by transfer, and kernels to extract non-linear relationships to predict drug response.
  • we propose to use a regularization based multi-task learning method for drug response prediction from gene expression data. 
  • We, therefore, assume that the response models of these drugs which can be represented by suitable regularization within a multi-task learning framework, can also be similar.
  • Novelty: a regularization based multi-task learning method has not been applied to this problem before under the assumptions mentioned above.
  • One similar method exists, but it requires a large number of parameters: The method proposed in [10] can be considered the most similar to our work. They also use kernels for dimension reduction and non-linear relationships, and exploit multi-task learning. However, they consider multi-task learning to find a common subspace for tasks and then derive model parameters for each task separately. Also, a large number of parameters that have to be determined by the user (or selected by cross validation) makes the algorithm hard to apply.
  • We proposed to use trace-norm regularized kernel multi-task learning to predict drug activity on cancer cell lines in this paper.
  • The method has two steps: While the first step performs dimension reduction and introduces non-linearity by a non-linear kernel,
  • the second step competitively learns a model to predict drug response by transferring knowledge between tasks.
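The two steps above can be illustrated with a small NumPy sketch. This is not the author's kMTrace code (that is available from the URL in the abstract); it is a minimal illustration under several assumptions: an RBF kernel between cell lines stands in for the non-linear kernel, the kernel matrix rows are used as features, the objective is 0.5·||KW − Y||_F² + λ·||W||_* with no missing responses, and the solver is plain proximal gradient with singular value thresholding. The function names, `gamma`, `lam`, and the synthetic data are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def svt(W, tau):
    """Singular value thresholding: proximal operator of tau * trace norm."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def fit_trace_norm_mtl(K, Y, lam, n_iter=500):
    """Proximal gradient on 0.5 * ||K W - Y||_F^2 + lam * ||W||_* (assumed objective)."""
    # Step size 1/L, where L = sigma_max(K)^2 is the Lipschitz constant of the gradient.
    step = 1.0 / np.linalg.norm(K, 2) ** 2
    W = np.zeros((K.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = K.T @ (K @ W - Y)                 # gradient of the squared loss
        W = svt(W - step * grad, step * lam)     # shrink singular values (task coupling)
    return W

# Synthetic stand-in data: 40 cell lines, 100 genes, 5 drugs (tasks).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 100))
Y = rng.normal(size=(40, 5))

# Step 1: the kernel replaces 100 gene features with 40 similarity features.
K = rbf_kernel(X, X, gamma=1.0 / X.shape[1])
# Step 2: all drugs are fitted jointly; the trace norm encourages a low-rank W.
W = fit_trace_norm_mtl(K, Y, lam=1.0)
```

Under these assumptions, responses for unseen cell lines `X_new` would be predicted as `rbf_kernel(X_new, X, gamma) @ W`. The trace norm is what couples the tasks: shrinking the singular values of `W` pushes the per-drug models toward a shared low-dimensional structure, which matches the notes' assumption that similar drugs have similar response models.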

4. ML Advantages: Why are these ML methods better than the traditional methods in these biological problems?

  • We aim to satisfy this by using a nonlinear kernel which also reduces the dimension of gene expression data.
  • First, kMTrace requires a small number of parameters whereas KBMTL requires a large number of parameters that are hard to optimize. This significantly affects the impact of a machine learning model, especially in the biological and medical domain, where the users may not be machine learning experts.
  • Second, kMTrace is non-linear, which is in line with the results of the NCI-DREAM drug sensitivity prediction challenge [9], where most of the top methods were non-linear. kMTrace therefore overcomes the linearity problem, as also observed in the results.
  • Finally, kMTrace is a multi-task learning method whereas Stream is a single task algorithm that considers each task separately. 
  • kMTrace requires few parameters, is non-linear and considers all tasks together. 

5. Biological Significance: What is the biological significance of these ML methods’ results?

  • We evaluated the method on three different datasets. Two of them are from a number of drugs screened against cancer cell lines and the other one is a DREAM challenge dataset.
  • As a preprocessing step, we performed gene selection based on the MalaCards database and reduced the number of genes used in the experiments.
  • The proposed method, kMTrace, outperformed all the comparative methods which are composed of a well known multi-task learning algorithm, a single task learning algorithm and a multi-task learning method to predict drug response.
  • The reported results are in terms of average MSE for all tasks.
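As a rough illustration of the evaluation metric mentioned above, the sketch below computes an MSE per task (drug) and then averages over tasks. The notes do not say how missing responses were handled; skipping unobserved (NaN) entries is an assumption made here, and `average_task_mse` is a hypothetical helper name.

```python
import numpy as np

def average_task_mse(Y_true, Y_pred):
    """Mean over tasks (columns) of the per-task MSE, ignoring NaN entries in Y_true."""
    Y_true = np.asarray(Y_true, dtype=float)
    Y_pred = np.asarray(Y_pred, dtype=float)
    observed = ~np.isnan(Y_true)                       # mask of measured responses
    sq_err = np.where(observed, (Y_true - Y_pred) ** 2, 0.0)
    per_task = sq_err.sum(axis=0) / observed.sum(axis=0)
    return per_task.mean()

# Two cell lines, two drugs; the first cell line has no measurement for drug 2.
Y_true = np.array([[1.0, np.nan],
                   [3.0, 4.0]])
Y_pred = np.array([[0.0, 0.0],
                   [3.0, 2.0]])
score = average_task_mse(Y_true, Y_pred)   # drug 1: 0.5, drug 2: 4.0, average: 2.25
```

Averaging per-task errors gives every drug equal weight regardless of how many cell lines it was screened on. A task with no observed responses would produce NaN in this sketch and would need to be dropped beforehand.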

6. Prospect: What are the potential applications of these machine learning methods in biological science?

  • The results show that the proposed method is a strong candidate to predict drug response of cancer cell lines in silico for pre-clinical studies.
  • There are several extensions that we plan to do for this work.
  • First one is to impose a method that can perform a kind of task selection to be able to discriminate the outlier tasks. This way, more related tasks can be used and the performance can be improved.
  • Second, in vitro validations of the (drug, cell line) pairs that are determined to be worth further investigations will be performed.
  • Finally, we are considering working on a method to induce synergies between drugs based on multi-task learning methods. As most chemotherapy regimens are combination therapies, this can lead to new protocols for some types of cancer.

7. My Questions (Optional)
