Circos Study Guide (Part 31)

名本无名

Article link: https://blog.csdn.net/dxs18459111694/article/details/139937411

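Tips (Part 6)

20. Cell Cycle, Part II

We pick up where we left off with the cell-cycle figure. In the Part I example, each phase of the cell cycle was drawn as a separate axis. Here, the whole cycle is drawn on a single axis.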

# cycle.txt
chr - cycle cycle 0 100 greys-6-seq-5

20.1 Using Cropped Regions

We define each phase as a cropped region of the cycle axis.

karyotype   = cycle.txt
chromosomes = cycle[g1]:0-45;cycle[s]:45-80;cycle[g2]:80-95;cycle[m]:95-100

Then set the break parameter in the <spacing> block to control the gap between the cropped regions.

<ideogram>
<spacing>
default = 0.005r
break   = 1r
</spacing>
</ideogram>
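
To make the layout of the cropped regions easier to check, the following is a minimal Python sketch that parses the value of the chromosomes parameter shown above. The helper name parse_crops is hypothetical and is not part of Circos; it only mirrors the axis[tag]:start-end syntax used in this example.

```python
import re

def parse_crops(spec):
    """Parse 'axis[tag]:start-end;...' into a list of (tag, start, end) tuples."""
    crops = []
    for part in spec.split(";"):
        match = re.fullmatch(r"(\w+)\[(\w+)\]:(\d+)-(\d+)", part.strip())
        if match is None:
            raise ValueError(f"unrecognized crop: {part!r}")
        _axis, tag, start, end = match.groups()
        crops.append((tag, int(start), int(end)))
    return crops

# The string below is copied from the 'chromosomes' parameter above.
spec = "cycle[g1]:0-45;cycle[s]:45-80;cycle[g2]:80-95;cycle[m]:95-100"
phases = parse_crops(spec)

for tag, start, end in phases:
    # The cycle axis runs from 0 to 100, so each width is a share of the axis.
    print(f"{tag}: {end - start} of 100 axis units")
```

The four regions tile the axis without gaps or overlaps, which is why the only visible separation between phases comes from the break spacing.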

20.2 Coloring the Phases

The colors are defined the same way as in Part I.

palette  = greys-6-seq
<phases>
g1 = 3
s  = 4
g2 = 5
m  = 6
</phases>
# g1, s, g2, m are tags defined in 'chromosomes' above
chromosomes_color = g1=conf(palette)-conf(phases,g1),
                    s=conf(palette)-conf(phases,s),
                    g2=conf(palette)-conf(phases,g2),
                    m=conf(palette)-conf(phases,m)
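
As an illustration of what the conf() expressions above expand to, here is a small Python sketch that builds the same color names by string concatenation. It only imitates the substitution and does not call Circos; the function name phase_colors is hypothetical.

```python
# Values copied from the configuration above.
palette = "greys-6-seq"
phases = {"g1": 3, "s": 4, "g2": 5, "m": 6}

def phase_colors(palette, phases):
    """Join the palette name and the per-phase index, as
    conf(palette)-conf(phases,TAG) does in the configuration."""
    return {tag: f"{palette}-{index}" for tag, index in phases.items()}

colors = phase_colors(palette, phases)
for tag, color in colors.items():
    print(f"{tag}={color}")
```

Each phase therefore takes a progressively higher index in the 6-step grey palette.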

The remaining tick parameters are also carried over from Part I.

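21. Nature Cover Image

Next we show how to generate the cover image automatically.

21.1 Image Elements

The image contains 23 segments, representing human chromosomes 1-22 and the X chromosome.

The chromosome lengths shown in the figure do not exactly match those of the hg19 assembly; here we use the assembly lengths.

A soft color scheme is used, cycling through orange, green, blue, and purple. We use this scheme to redefine the default colors.

The data are shown in 6 concentric tracks whose spacing decreases slightly toward the center. Each track highlights fixed-size regions, and each region is filled with one of the chromosome colors.

21.2 Color Scheme

We extracted the following colors from the cover image and use the * suffix to override the existing definitions.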

# circos.conf
<<include etc/colors_fonts_patterns.conf>>
<colors>
chr1*  = 163,132,130
chr2*  = 188,162,118
chr3*  = 216,196,96
chr4*  = 233,212,56
chr5*  = 229,229,50
chr6*  = 212,222,56
chr7*  = 195,215,57
chr8*  = 177,209,58
chr9*  = 160,204,61
chr10* = 139,198,61
chr11* = 128,193,95
chr12* = 115,186,126
chr13* = 102,183,152
chr14* = 91,178,176
chr15* = 61,174,199
chr16* = 36,170,224
chr17* = 75,129,194
chr18* = 85,111,180
chr19* = 92,92,168
chr20* = 98,70,156
chr21* = 101,45,145
chr22* = 121,74,141
chrx*  = 140,104,137
</colors>

Then set the image background in the <image> block.

<image>
<<include etc/image.conf>>
background* = black
</image>

21.3 Track Positions

Every track uses the same data source, but dynamic rules that randomly change the data make each track look different.

# variables used in each plot.conf block

plot_width   = 80 
plot_padding = 25 
num_plots    = 6  

<plots>
type             = highlight
file             = bins.txt
stroke_thickness = 0
<<include plot.conf>>
<<include plot.conf>>
<<include plot.conf>>
<<include plot.conf>>
<<include plot.conf>>
<<include plot.conf>>
</plots>

The plot.conf file is defined as follows:

<plot>
r1   = dims(ideogram,radius_inner)
         - conf(plot_padding)*eval(remap(counter(plot),0,conf(num_plots),1,0.9))
         - eval((conf(plot_width)+conf(plot_padding))*counter(plot)*eval(remap(counter(plot),0,conf(num_plots),1,0.9)))
r0   = conf(.,r1)
         - conf(plot_width)*eval(remap(counter(plot),0,conf(num_plots),1,0.9))
post_increment_counter = plot:1
<<include rules.conf>>
</plot>

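The inner and outer radii of each track (r0, r1) are computed from the plot_padding and plot_width parameters.

The value of counter(plot) increases by 1 with each track that is drawn.

dims(ideogram,radius_inner) returns the inner radius of the ideogram.

remap(VAR,MIN,MAX,TARGETMIN,TARGETMAX) linearly maps the value of VAR from [MIN,MAX] onto [TARGETMIN,TARGETMAX].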
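
To see how these expressions shrink the tracks toward the center, here is a rough Python sketch of the r0/r1 arithmetic from plot.conf. It makes several assumptions: remap is taken to be a plain linear interpolation with no clamping, the counter is assumed to start at 0, and the ideogram inner radius of 1000 pixels is a made-up value for illustration only.

```python
def remap(value, lo, hi, target_lo, target_hi):
    """Linearly map value from [lo, hi] onto [target_lo, target_hi] (no clamping assumed)."""
    return target_lo + (value - lo) * (target_hi - target_lo) / (hi - lo)

def track_radii(counter, inner_radius, plot_width=80, plot_padding=25, num_plots=6):
    """Return (r0, r1) for the track with the given counter value,
    following the expressions in plot.conf."""
    scale = remap(counter, 0, num_plots, 1, 0.9)
    r1 = (inner_radius
          - plot_padding * scale
          - (plot_width + plot_padding) * counter * scale)
    r0 = r1 - plot_width * scale
    return r0, r1

# Hypothetical ideogram inner radius, in pixels.
INNER_RADIUS = 1000

for counter in range(6):
    r0, r1 = track_radii(counter, INNER_RADIUS)
    print(f"track {counter}: r0={r0:.2f} r1={r1:.2f} width={r1 - r0:.2f}")
```

Because the scale factor falls from 1 toward 0.9 as the counter grows, both the track width and the padding become slightly smaller for inner tracks.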

21.4 Track Data

Every track uses the same data file, which divides the genome into 7.5 Mb regions.

hs1 0 7499999
hs1 7500000 14999999
hs1 15000000 22499999
hs1 22500000 29999999
...
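
A file like this could be produced with a short script. The sketch below is one possible way to do it in Python; the function name make_bins and the chromosome length are hypothetical, and truncating the last bin at the chromosome end is an assumption, since the source does not show how the final bin is handled.

```python
def make_bins(chrom, length, bin_size=7_500_000):
    """Split [0, length) into consecutive bins of bin_size bases.
    The last bin is truncated at the chromosome end (an assumption)."""
    bins = []
    for start in range(0, length, bin_size):
        end = min(start + bin_size, length) - 1
        bins.append((chrom, start, end))
    return bins

# Hypothetical chromosome length, used only for illustration.
for chrom, start, end in make_bins("hs1", 31_000_000):
    print(chrom, start, end)
```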

Then, inside the plot block, rules change the colors dynamically.

# rules.conf
<rules>

<rule>

# The first condition tests that bins are further than 5 Mb from the
# start and end of each ideogram.  This ensures that the color
# for the first/last bin will be the same as the ideogram.

condition  = var(start) >= 5e6 && var(end) < chrlen(var(chr))-5e6

# The probability that the second condition is true is proportional to
# the track counter. Bins in inner tracks are more likely to trigger
# this rule.  Here, rand() is a uniformly distributed random number in
# the range [0,1).

condition  = rand() < remap(counter(plot),0,conf(num_plots)-1,1/conf(num_plots),1) 

# If this rule is true, the color of the bin is changed to that of a
# random ideogram.

fill_color = eval("chr" . (sort {rand() <=> rand()} (1..22,"x"))[0])

</rule>

<rule>
condition  = 1
fill_color = eval("chr" . lc substr(var(chr),2))
</rule>
</rules>
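
The logic of these two rules can be paraphrased in Python as follows. This is only a sketch of the decision process, not how Circos evaluates rules internally: remap is assumed to be linear, the track index is assumed to run from 0 to num_plots-1, and the chromosome length used in the example is hypothetical.

```python
import random

# Color names defined in the <colors> block: chr1 ... chr22, chrx.
CHROM_COLORS = [f"chr{n}" for n in range(1, 23)] + ["chrx"]

def remap(value, lo, hi, target_lo, target_hi):
    """Linear interpolation from [lo, hi] onto [target_lo, target_hi]."""
    return target_lo + (value - lo) * (target_hi - target_lo) / (hi - lo)

def bin_color(chrom, start, end, chrom_length, track, num_plots=6, rng=random):
    """Pick the fill color of one bin on one track."""
    own_color = "chr" + chrom[2:].lower()            # hs1 -> chr1, hsX -> chrx
    # First condition: bin lies more than 5 Mb from both ideogram ends.
    interior = start >= 5e6 and end < chrom_length - 5e6
    # Second condition: true with a probability that grows with the track index.
    threshold = remap(track, 0, num_plots - 1, 1 / num_plots, 1)
    if interior and rng.random() < threshold:
        return rng.choice(CHROM_COLORS)               # color of a random ideogram
    return own_color                                  # fallback rule (condition = 1)

# Hypothetical 100 Mb chromosome: count recolored candidates per track.
rng = random.Random(0)
for track in range(6):
    colors = [bin_color("hs1", s, s + 7_499_999, 100_000_000, track, rng=rng)
              for s in range(0, 100_000_000, 7_500_000)]
    print(track, sum(c != "chr1" for c in colors))
```

On the outermost track roughly one interior bin in six is a candidate for recoloring, while on the innermost track every interior bin is, which produces the increasingly scrambled look toward the center.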

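22. Not Just Genomes

Circos is not limited to genomic regions; it can draw any kind of axis.

Here, the axis segments correspond to the total number of words spoken by U.S. presidential candidates during debates.

We define these segments in the karyotype file. For example, suppose Obama spoke 2,000 words, Richardson spoke 1,000 words, and so on.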

# karyotype.txt
chr - obama obama 0 2000 dem
chr - richardson richardson 0 1000 dem
chr - clinton clinton 0 1500 dem
chr - mccain mccain 0 1000 rep
chr - romney romney 0 1750 rep
chr - huckabee huckabee 0 1250 rep

The last field sets the color according to whether the candidate is a Democrat or a Republican, using the classic blue/red scheme.

<<include etc/colors_fonts_patterns.conf>>

# append to the colors block
<colors>
rep = 211,121,111
dem = 85,143,190
</colors>

22.1 Segment Slices

Each segment is divided into slices, and each slice represents the number of words spoken in a particular debate.

# slices.txt
obama 0 300     # Obama's 1st debate words
obama 301 750   #         2nd
obama 751 950   #         3rd
obama 951 1250  #         4th
obama 1251 1500 #         5th
obama 1501 2000 #         6th
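
Slice coordinates like these can be derived from per-debate word counts. The Python sketch below shows one way to do that; the helper name make_slices is hypothetical, and the word counts are inferred from the slice boundaries listed above rather than taken from real debate data.

```python
def make_slices(name, word_counts):
    """Turn per-debate word counts into (name, start, end) slices.
    The first slice starts at 0; each later slice starts one past
    the previous end, matching the layout of slices.txt above."""
    slices = []
    for count in word_counts:
        start = 0 if not slices else slices[-1][2] + 1
        end = count if not slices else slices[-1][2] + count
        slices.append((name, start, end))
    return slices

# Word counts inferred from the slice boundaries in slices.txt.
obama_counts = [300, 450, 200, 300, 250, 500]
obama_slices = make_slices("obama", obama_counts)

for name, start, end in obama_slices:
    print(name, start, end)
```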

These slices are drawn on top of the ideogram as unfilled highlights with a thick white outline.

<plot>
file  = slices.txt
type  = highlight
r0    = dims(ideogram,radius_inner)
r1    = dims(ideogram,radius_outer)
fill_color       = undef
stroke_color     = white
stroke_thickness = 5
</plot>

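22.2 Naming Names

When one candidate mentions another candidate by name during a debate, we draw a link.

The link starts at the slice for the debate in which the name was mentioned and ends at the center of the mentioned candidate's segment.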

# links.txt
# Obama mentions Clinton in his 1st debate
obama 150 150 clinton 750 750
# McCain mentions Clinton in his 3rd debate
mccain 875 875 clinton 750 750
# Huckabee mentions Clinton in his 2nd debate
huckabee 525 525 clinton 750 750
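
The coordinates in links.txt can be computed from the slices and the karyotype. The sketch below assumes that a link starts at the midpoint of the speaker's debate slice, which matches the Obama line above (slice 0-300 gives 150) but is not stated explicitly in the source; the function name link_line is hypothetical.

```python
def link_line(speaker_slice, target_segment):
    """Build one links.txt line from a (name, start, end) debate slice
    to the center of a (name, start, end) target segment."""
    speaker, s_start, s_end = speaker_slice
    target, t_start, t_end = target_segment
    source_pos = (s_start + s_end) // 2   # assumed: midpoint of the debate slice
    target_pos = (t_start + t_end) // 2   # center of the mentioned candidate's segment
    return f"{speaker} {source_pos} {source_pos} {target} {target_pos} {target_pos}"

# Obama's 1st debate slice and Clinton's segment, both taken from the files above.
line = link_line(("obama", 0, 300), ("clinton", 0, 1500))
print(line)
```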

By default, the link color is set to rep, the Republican red.

<link>
file      = links.txt
radius    = dims(ideogram,radius_inner)
bezier_radius = 0r
thickness = 5
color     = rep 
...
</link>

If the candidate making the mention is a Democrat, a rule changes the link color to dem.

<rules>
<rule>
# set dem color if start is on a democrat
condition = var(chr1) =~ /obama|richardson|clinton/
color     = dem
</rule>
</rules>
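
In plain Python terms, the combination of the default color and this rule behaves roughly like the function below. This is a sketch of the logic only; the function name link_color is hypothetical.

```python
import re

# Same pattern as the rule's condition on var(chr1).
DEMOCRATS = re.compile(r"obama|richardson|clinton")

def link_color(chr1, default="rep"):
    """Return 'dem' when the link starts on a Democrat's segment,
    otherwise keep the default link color."""
    return "dem" if DEMOCRATS.search(chr1) else default

for speaker in ["obama", "mccain", "huckabee"]:
    print(speaker, link_color(speaker))
```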

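22.3 Focusing on a Candidate

To show only the links that originate from a given candidate, use the from() function, which tests whether the link starts on the named segment.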

<rule>
# only links from obama are shown (all others are hidden by setting show=no)
# the condition test is equivalent to
#   var(chr1) ne "obama"
condition = ! from(obama)  
show      = no
</rule>

Alternatively, to test which segment the link ends on, use the to() function.

<rule>
# only links to mccain are shown (all others are hidden by setting show=no)
# the condition test is equivalent to
#   var(chr2) ne "mccain"
condition = ! to(mccain)
show      = no
</rule>

Or use fromto() to test both ends of the link.

<rule>
# only links from obama to mccain are shown (all others are hidden by setting show=no)
# the condition test is equivalent to
#   var(chr1) ne "obama" || var(chr2) ne "mccain"
condition = ! fromto(obama,mccain)
show      = no
</rule>
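
The effect of these three rules can be imitated with simple filters. The Python sketch below uses the three links from links.txt and hypothetical helper names (is_from, is_to, is_fromto) that mirror the equivalences given in the rule comments; it does not reproduce how Circos itself evaluates rules.

```python
# (start segment, end segment) pairs taken from links.txt above.
links = [
    ("obama", "clinton"),
    ("mccain", "clinton"),
    ("huckabee", "clinton"),
]

def is_from(link, name):
    """Equivalent to var(chr1) eq name."""
    return link[0] == name

def is_to(link, name):
    """Equivalent to var(chr2) eq name."""
    return link[1] == name

def is_fromto(link, start, end):
    """Equivalent to var(chr1) eq start && var(chr2) eq end."""
    return is_from(link, start) and is_to(link, end)

# A rule with 'condition = ! from(obama)' and 'show = no' keeps only these links:
from_obama = [link for link in links if is_from(link, "obama")]
to_mccain = [link for link in links if is_to(link, "mccain")]
obama_to_clinton = [link for link in links if is_fromto(link, "obama", "clinton")]

print(from_obama)
print(to_mccain)
print(obama_to_clinton)
```

With the sample links, filtering on links to mccain leaves nothing visible, because every link in links.txt ends on clinton.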