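Topic 9. Clonal Evolution with TimeScape

With PyClone and CITUP we obtained three files: cellfreq.txt, tree.txt, and sample_id. Below we visualize them with TimeScape. No specific genes or mutation sites appear in the plot, but all of them can be traced back.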

  • Installation

Installation is simple: BiocManager installs the package without trouble.

# try http:// if https:// URLs are not supported
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("timescape")
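  • Obtaining the input data

TimeScape takes as input the clonal phylogeny and the clonal prevalence of each clone in each sample. Collecting the main commands from the earlier steps yields the three input files, cellfreq.txt, tree.txt, and sample_id, as follows: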

#### Run PyClone to get tables/loci.tsv, then derive three files: freq.txt, cluster.txt, and sample_id
PyClone run_analysis_pipeline --in_files SRR385938.tsv SRR385939.tsv SRR385940.tsv SRR385941.tsv --working_dir pyclone_analysis
cat ./pyclone_analysis/tables/loci.tsv | cut -f 6 | sed '1d' | paste - - - -  > ./pyclone_analysis/freq.txt
cat ./pyclone_analysis/tables/loci.tsv | cut -f 3 | sed '1d' | paste - - - - |cut -f 1 > ./pyclone_analysis/cluster.txt
cat ./pyclone_analysis/tables/loci.tsv | cut -f 2 | sed '1d' | head -4 > ./pyclone_analysis/sample_id
#### Run CITUP to get results.h5
run_citup_qip.py ./pyclone_analysis/freq.txt ./pyclone_analysis/cluster.txt ./pyclone_analysis/results.h5
#### Read results.h5 with ReadH5.py to get two files: cellfreq.txt and tree.txt
python ReadH5.py ./pyclone_analysis/results.h5 | sed 's/^ \[//;s/\[//g;s/\]//g' | tr ' ' '\t'| grep '\.' > ./pyclone_analysis/cellfreq.txt
python ReadH5.py ./pyclone_analysis/results.h5 | sed 's/^ \[//;s/\[//g;s/\]//g' | tr ' ' '\t'| grep -v '\.' > ./pyclone_analysis/tree.txt
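
The `paste - - - -` steps above only line up when every mutation occupies exactly four consecutive rows of loci.tsv, one per sample. As a hedged alternative, here is a minimal Python sketch that groups rows by mutation id instead. It reads the same column positions as the `cut` calls (2, 3, and 6); the function name `pivot_loci`, the header names, and the demo table are made up for illustration and are not part of PyClone.

```python
import csv
import io


def pivot_loci(tsv_text):
    """Group loci.tsv rows by mutation id.

    Returns (sample_ids, freq_rows, cluster_ids), the in-memory
    counterparts of sample_id, freq.txt and cluster.txt.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader)  # drop the header line, like sed '1d'
    samples = []   # sample ids in first-seen order
    freqs = {}     # mutation id -> {sample id: value from column 6}
    clusters = {}  # mutation id -> cluster id from column 3
    for row in reader:
        mutation, sample, cluster, value = row[0], row[1], row[2], row[5]
        if sample not in samples:
            samples.append(sample)
        freqs.setdefault(mutation, {})[sample] = value
        clusters.setdefault(mutation, cluster)
    # A mutation missing from any sample raises KeyError here instead of
    # silently shifting columns the way a fixed-width paste would.
    freq_rows = [[freqs[m][s] for s in samples] for m in freqs]
    return samples, freq_rows, list(clusters.values())


# Made-up table with two mutations and two samples; header names are placeholders.
demo = "\n".join("\t".join(fields) for fields in [
    ["mutation_id", "sample_id", "cluster_id", "c4", "c5", "c6"],
    ["m1", "S1", "0", "0.9", "0.1", "0.45"],
    ["m1", "S2", "0", "0.8", "0.1", "0.40"],
    ["m2", "S1", "1", "0.3", "0.1", "0.15"],
    ["m2", "S2", "1", "0.5", "0.1", "0.25"],
])
samples, freq_rows, cluster_ids = pivot_loci(demo)
```

Each entry of `freq_rows` corresponds to one line of freq.txt, and `cluster_ids` to cluster.txt. Because rows are keyed by mutation id, the result does not depend on the number of samples or on row order within a mutation.
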
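  • Running the package

The package ships with several examples, but they are run through example(), and calling that directly will probably leave most readers confused. It is simpler to read the examples themselves: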

example("timescape") ### generates several visualizations that open in your browser

browseVignettes("timescape")

#or:

?timescape

Here is one of the examples:

 # EXAMPLE 1 - Acute myeloid leukemia patient, Ding et al., 2012
  # genotype tree edges
tree_edges <- read.csv(system.file("extdata", "AML_tree_edges.csv", package = "timescape"))
  tree_edges 
    source target
1      1      2
2      1      3
3      3      4
4      4      5
  # clonal prevalences
clonal_prev <- read.csv(system.file("extdata", "AML_clonal_prev.csv", package = "timescape"))
  clonal_prev 
   timepoint clone_id clonal_prev
1 Diagnosis        1      0.1274
2 Diagnosis        2      0.5312
3 Diagnosis        3      0.2904
4 Diagnosis        4      0.0510
5   Relapse        5      1.0000
  # targeted mutations
mutations <- read.csv(system.file("extdata", "AML_mutations.csv", package = "timescape"))
 head(mutations)
   Tier chrom    coord clone_id timepoint    VAF
1    3     1  2554021        1 Diagnosis 0.4383
2    3     1 11965332        1 Diagnosis 0.4123
3    3     1 18952534        1 Diagnosis 0.4891
4    3     1 20382629        1 Diagnosis 0.4754
5    3     1 28395117        5 Diagnosis 0.0004
6    3     1 30729775        1 Diagnosis 0.4812
  # perturbations
perturbations <- data.frame( pert_name = c("Chemotherapy"), 
                               prev_tp = c("Diagnosis"))
perturbations 
  pert_name   prev_tp
1 Chemotherapy Diagnosis                               
  # run timescape
timescape(clonal_prev = clonal_prev, tree_edges = tree_edges, perturbations = perturbations, height=260)

[Figure: TimeScape view of the AML example]

  • Parameters

TimeScape takes both required and optional parameters.

Required parameters:

  • clonal_prev is a data frame of the clonal prevalence of each clone at each time point, with columns:
  1. character() timepoint - timepoint

  2. character() clone_id - clone id

  3. numeric() clonal_prev - clonal prevalence.

  • tree_edges is a data frame describing the edges of a rooted clonal phylogeny, with columns:
  1. character() source - source node id

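  2. character() target - target node id.

Optional parameters:

  • mutations is a data frame of the mutations that originate in each clone. If this parameter is supplied, a mutation table appears at the bottom of the view. Its columns are: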

  1. character() chrom - chromosome number

  2. numeric() coord - coordinate of mutation on chromosome

  3. character() clone_id - clone id

  4. character() timepoint - time point

  5. numeric() VAF - variant allele frequency of the mutation in the corresponding timepoint.
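
Before handing the two required tables to TimeScape, it can help to confirm that they agree with each other. The sketch below is a minimal set of sanity checks of my own, not TimeScape's validation logic: it assumes the tree has at least one edge, and it treats prevalence as a fraction in [0, 1]. The function name `check_inputs` is hypothetical; the data are the AML example values shown earlier.

```python
def check_inputs(clonal_prev, tree_edges):
    """Return a list of human-readable problems; an empty list means none found.

    clonal_prev: iterable of (timepoint, clone_id, prevalence) rows.
    tree_edges:  iterable of (source, target) rows.
    """
    problems = []
    sources = {str(s) for s, _ in tree_edges}
    targets = [str(t) for _, t in tree_edges]
    nodes = sources | set(targets)
    # A rooted tree has exactly one node that is never a target.
    roots = sources - set(targets)
    if len(roots) != 1:
        problems.append("expected exactly one root, found %d" % len(roots))
    # In a tree every non-root node has a single parent.
    if len(targets) != len(set(targets)):
        problems.append("some clone has more than one parent")
    for timepoint, clone_id, prev in clonal_prev:
        if str(clone_id) not in nodes:
            problems.append("clone %s is not in the tree" % clone_id)
        if not 0.0 <= float(prev) <= 1.0:
            problems.append(
                "prevalence %s at %s is outside [0, 1]" % (prev, timepoint)
            )
    return problems


# Values copied from the AML example tables.
aml_edges = [(1, 2), (1, 3), (3, 4), (4, 5)]
aml_prev = [
    ("Diagnosis", 1, 0.1274),
    ("Diagnosis", 2, 0.5312),
    ("Diagnosis", 3, 0.2904),
    ("Diagnosis", 4, 0.0510),
    ("Relapse", 5, 1.0000),
]
aml_problems = check_inputs(aml_prev, aml_edges)
```

Clone ids are compared as strings because the parameter description lists clone_id, source, and target as character columns.
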

  • Testing with real data

Using the cellfreq.txt, tree.txt, and sample_id files we generated as input, we obtain the analysis result as follows:

library(timescape)
options(stringsAsFactors = FALSE)
#example("timescape")
#browseVignettes("timescape")
library(plotly)
library(htmlwidgets)  # saveWidget()
library(webshot)
library(tidyr)        # gather()

# genotype tree edges: one edge per row, columns are source and target
tree_edges = read.table("tree.txt")
colnames(tree_edges) = c("source", "target")
# clonal prevalences: one row per sample, one column per clone
cellfreq = read.table("cellfreq.txt")
# name the clone columns 0..n-1 (assumed to match the node ids in tree.txt)
colnames(cellfreq) = 0:(ncol(cellfreq) - 1)
sample_id = read.table("sample_id")
cellfreq$timepoint = sample_id[, 1]
# reshape from wide to long: timepoint, clone_id, clonal_prev
clonal_prev = gather(cellfreq, key = "clone_id", value = "clonal_prev", -timepoint)
clonal_prev = clonal_prev[order(clonal_prev$timepoint), ]
clonal_prev
# targeted mutations (optional, not used here)
# mutations <- read.csv(system.file("extdata", "AML_mutations.csv", package = "timescape"))
p = timescape(clonal_prev = clonal_prev, tree_edges = tree_edges, height = 260)
saveWidget(p, "test.html")
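
The key step in the script above is the wide-to-long reshape: a samples-by-clones matrix becomes one row per (timepoint, clone) pair. For readers who want to see that transformation spelled out, here is a minimal Python sketch of the same idea. The function name `wide_to_long` and the numbers are made up for illustration; it mirrors the gather-then-order logic but is not what tidyr runs internally.

```python
def wide_to_long(cellfreq, timepoints):
    """Reshape a samples x clones matrix into (timepoint, clone_id, prevalence) rows."""
    if len(cellfreq) != len(timepoints):
        raise ValueError("need one timepoint label per row of cellfreq")
    n_clones = len(cellfreq[0])
    rows = []
    # Walk the matrix column by column, so each clone contributes
    # one row per sample before the next clone starts.
    for clone in range(n_clones):
        for timepoint, freqs in zip(timepoints, cellfreq):
            # Clone ids are 0..n-1 and stored as strings.
            rows.append((timepoint, str(clone), freqs[clone]))
    # Stable sort: rows for the same timepoint stay in clone order.
    rows.sort(key=lambda row: row[0])
    return rows


# Hypothetical prevalences: two samples (rows) and three clones (columns).
demo_cellfreq = [
    [0.7, 0.3, 0.0],
    [0.2, 0.3, 0.5],
]
demo_long = wide_to_long(demo_cellfreq, ["T2", "T1"])
```

Sorting by the timepoint label is plain string comparison here, so labels such as sample accessions should be chosen so that their sorted order matches the intended time order.
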

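  • Output files

The package produces an .html file that opens directly in a browser, from which the view can be saved as SVG or PNG. I usually choose SVG, which is convenient for later editing in Adobe Illustrator (AI). The test result is shown below:

[Figure: TimeScape view of the test data]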

References:

Ding, Li, et al. “Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing.” Nature 481.7382 (2012): 506-510.

Ha, Gavin, et al. “TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data.” Genome research 24.11 (2014): 1881-1893.

Malikic, Salem, et al. “Clonality inference in multiple tumor samples using phylogeny.” Bioinformatics 31.9 (2015): 1349-1356.

McPherson, Andrew, et al. “Divergent modes of clonal spread and intraperitoneal mixing in high-grade serous ovarian cancer.” Nature genetics (2016).
