alakazam包测试_Topology analysis of lineage trees_

1.设置当前工作目录

getwd()
setwd(“alakazam”)

2.导入R包

library(alakazam)

3.测试

3.1 Example data

#@ A small set of annotated example trees are included in the alakazam package. The trees are igraph objects with the following tree annotations (graph attributes):

• clone: An identifier for the clonal group. These entries correspond to the CLONE column in

the ExampleDb data.frame from which the trees were generated.

• v_gene: IGHV gene name.

• j_gene: IGHJ gene name.

• junc_len: Length of the junction region (nucleotides).

#@ And the following node annotations (vertex attributes):

• SAMPLE: Time point in relation to influenza vaccination.

• ISOTYPE: The isotype(s) assigned to the sequence. Multiple isotypes are delimited by comma,

and reflect identical V(D)J sequences observed with more than one isotype.

• DUPCOUNT: The copy number (duplicate count), which indicates the total number of reads with

the same V(D)J sequence.

library(igraph)
library(dplyr)

Load example trees

data(ExampleTrees)

Select one tree for example purposes

graph <- ExampleTrees[[24]]

And add some annotation complexity to the tree

V(graph)$SAMPLE[c(2, 7)] <- “-1h”

V(graph)$ISOTYPE[c(2, 7)] <- “IgM”

Make a list of example trees excluding multi-isotype trees

graph_list <- ExampleTrees[sapply(ExampleTrees, function(x) !any(grepl(",", V(x)$ISOTYPE)))]

3.2 Plotting annotations on a tree

#@ There are many options for configuring how an igrpah object is plotted which are helpful for visualing annotation topologies. Below is an extensive example of how to plot a tree by configuring the colors, labels, shapes and sizes of different visual elements according to annotations embedded in the graph.

Set node colors

V(graph) c o l o r [ V ( g r a p h ) color[V(graph) color[V(graph)SAMPLE == “-1h”] <- “seagreen”
V(graph) c o l o r [ V ( g r a p h ) color[V(graph) color[V(graph)SAMPLE == “+7d”] <- “steelblue”
V(graph) c o l o r [ V ( g r a p h ) color[V(graph) color[V(graph)name == “Germline”] <- “black”
V(graph) c o l o r [ g r e p l ( " I n f e r r e d " , V ( g r a p h ) color[grepl("Inferred", V(graph) color[grepl("Inferred",V(graph)name)] <- “white”

Set node labels

V(graph) l a b e l < − p a s t e ( V ( g r a p h ) label <- paste(V(graph) label<paste(V(graph)SAMPLE, V(graph) I S O T Y P E , s e p = " , " ) V ( g r a p h ) ISOTYPE, sep=", ") V(graph) ISOTYPE,sep=",")V(graph)label[V(graph) n a m e = = " G e r m l i n e " ] < − " " V ( g r a p h ) name == "Germline"] <- "" V(graph) name=="Germline"]<""V(graph)label[grepl(“Inferred”, V(graph)$name)] <- “”

Set node shapes

V(graph) s h a p e < − " c r e c t a n g l e " V ( g r a p h ) shape <- "crectangle" V(graph) shape<"crectangle"V(graph)shape[V(graph) n a m e = = " G e r m l i n e " ] < − " c i r c l e " V ( g r a p h ) name == "Germline"] <- "circle" V(graph) name=="Germline"]<"circle"V(graph)shape[grepl(“Inferred”, V(graph)$name)] <- “circle”

Set node sizes

V(graph) s i z e < − 60 V ( g r a p h ) size <- 60 V(graph) size<60V(graph)size[V(graph) n a m e = = " G e r m l i n e " ] < − 30 V ( g r a p h ) name == "Germline"] <- 30 V(graph) name=="Germline"]<30V(graph)size[grepl(“Inferred”, V(graph)$name)] <- 15

Remove large default margins

par(mar=c(0, 0, 0, 0) + 0.05)

Plot the example tree

plot(graph, layout=layout_as_tree, vertex.frame.color=“grey”,
vertex.label.color=“black”, edge.label.color=“black”,
edge.arrow.mode=0)

Add legend

legend(“topleft”, c(“Germline”, “Inferred”, “-1h”, “+7d”),
fill=c(“black”, “white”, “seagreen”, “steelblue”), cex=0.75)

3.3 Summarizing node properties

#@ Various annotation dependent node statistics can be calculated using the summarizeSubstrees and getPathLengths functions. getPathLengths calculates distances from the root (germline) to child nodes, whereas summarizeSubtrees calculates paths and subtree statistics from child nodes.

3.4 Calculating distance from the germline

#@ To determine the shortest path from the germline sequence to any node, we use getPathLengths, which returns the distance both as the number of “hops” (STEPS) and the number of mutational events (DISTANCE).

Consider all nodes

getPathLengths(graph, root=“Germline”)

NAME STEPS DISTANCE

1 Inferred1 1 20

2 GN5SHBT04CW57C 2 26

3 Inferred2 3 28

4 GN5SHBT08I7RKL 4 29

5 GN5SHBT04CAVIG 5 30

6 Germline 0 0

7 GN5SHBT01D6X0W 2 22

8 GN5SHBT06H7TQD 6 31

9 GN5SHBT05HEG2J 4 33

#@ Note, the STEPS counted in the above example include traversal of inferred intermediates. If you want to exclude such nodes, and consider only nodes associated with observed sequences, you can specify an annotation field and value that will be excluded from the number of steps. In the example

below we are excluding NA values in the ISOTYPE annotation (field=“ISOTYPE”, exclude=NA).

Exclude nodes without an isotype annotation from step count

getPathLengths(graph, root=“Germline”, field=“ISOTYPE”, exclude=NA)

NAME STEPS DISTANCE

1 Inferred1 0 20

2 GN5SHBT04CW57C 1 26

3 Inferred2 1 28

4 GN5SHBT08I7RKL 2 29

5 GN5SHBT04CAVIG 3 30

6 Germline 0 0

7 GN5SHBT01D6X0W 1 22

8 GN5SHBT06H7TQD 4 31

9 GN5SHBT05HEG2J 2 33

#@ Note, STEPS has changed with respect to the previous example, but DISTANCE remains the same.

3.5 Calculating subtree properties

The summarizeSubtrees function returns a table of each node with the following properties for

each node:

• NAME: The node identifier.

• PARENT: The identifier of the node’s parent.

• OUTDEGREE: The number of edges leading from the node.

• SIZE: The total number of nodes within the subtree rooted at the node.

• DEPTH: The depth of the subtree that is rooted at the node.

• PATHLENGTH: The maximum path length beneath the node.

• OUTDEGREE_NORM: The OUTDEGREE normalized by the total number of edges.

• SIZE_NORM: The SIZE normalized by the total tree size.

• DEPTH_NORM: The DEPTH normalized by the total tree depth.

• PATHLENGTH_NORM: The PATHLENGTH normalized by the longest path.

The fields=c(“SAMPLE”, “ISOTYPE”) argument in the example below simply defines which annotations we wish to retain in the output. This argument has no effect on the results, in constast to the behavior of getPathLengths

Summarize tree

df <- summarizeSubtrees(graph, fields=c(“SAMPLE”, “ISOTYPE”), root=“Germline”)
print(df[1:4])

NAME SAMPLE ISOTYPE PARENT

1 Inferred1 Germline

2 GN5SHBT04CW57C -1h IgM Inferred1

3 Inferred2 GN5SHBT04CW57C

4 GN5SHBT08I7RKL +7d IgG Inferred2

5 GN5SHBT04CAVIG +7d IgG GN5SHBT08I7RKL

6 Germline

7 GN5SHBT01D6X0W -1h IgM Inferred1

8 GN5SHBT06H7TQD +7d IgG GN5SHBT04CAVIG

9 GN5SHBT05HEG2J +7d IgG Inferred2

print(df[c(1, 5:8)])

NAME OUTDEGREE SIZE DEPTH PATHLENGTH

1 Inferred1 2 8 6 13

2 GN5SHBT04CW57C 1 6 5 7

3 Inferred2 2 5 4 5

4 GN5SHBT08I7RKL 1 3 3 2

5 GN5SHBT04CAVIG 1 2 2 1

6 Germline 1 9 7 33

7 GN5SHBT01D6X0W 0 1 1 0

8 GN5SHBT06H7TQD 0 1 1 0

9 GN5SHBT05HEG2J 0 1 1 0

print(df[c(1, 9:12)])

NAME OUTDEGREE_NORM SIZE_NORM DEPTH_NORM

1 Inferred1 0.250 0.8888889 0.8571429

2 GN5SHBT04CW57C 0.125 0.6666667 0.7142857

3 Inferred2 0.250 0.5555556 0.5714286

4 GN5SHBT08I7RKL 0.125 0.3333333 0.4285714

5 GN5SHBT04CAVIG 0.125 0.2222222 0.2857143

6 Germline 0.125 1.0000000 1.0000000

7 GN5SHBT01D6X0W 0.000 0.1111111 0.1428571

8 GN5SHBT06H7TQD 0.000 0.1111111 0.1428571

9 GN5SHBT05HEG2J 0.000 0.1111111 0.1428571

PATHLENGTH_NORM

1 0.39393939

2 0.21212121

3 0.15151515

4 0.06060606

5 0.03030303

6 1.00000000

7 0.00000000

8 0.00000000

9 0.00000000

#@ Distributions of normalized subtree statistics for a population of trees can be plotted using the plotSubtrees function. In the example below, we have specified silent=TRUE which causes plotSubtrees to return the ggplot object without rendering the plot. The ggplot object are then plotting using the gridPlot function which places each individual plot in a separate panel of the same figure

Set sample colors

sample_colors <- c("-1h"=“seagreen”, “+7d”=“steelblue”)

Box plots of node outdegree by sample

p1 <- plotSubtrees(graph_list, “SAMPLE”, “outdegree”, colors=sample_colors,
main_title=“Node outdegree”, legend_title=“Time”,
style=“box”, silent=TRUE)

Box plots of subtree size by sample

p2 <- plotSubtrees(graph_list, “SAMPLE”, “size”, colors=sample_colors,
main_title=“Subtree size”, legend_title=“Time”,
style=“box”, silent=TRUE)

Violin plots of subtree path length by isotype

p3 <- plotSubtrees(graph_list, “ISOTYPE”, “pathlength”, colors=IG_COLORS,
main_title=“Subtree path length”, legend_title=“Isotype”,
style=“violin”, silent=TRUE)

Violin plots of subtree depth by isotype

p4 <- plotSubtrees(graph_list, “ISOTYPE”, “depth”, colors=IG_COLORS,
main_title=“Subtree depth”, legend_title=“Isotype”,
style=“violin”, silent=TRUE)

Plot in a 2x2 grid

gridPlot(p1, p2, p3, p4, ncol=2)

3.6 Counting and testing node annotation relationships

#@ Given a set of annotated trees, you can determine the abundance of specific parent-child relationships within individual trees using the tableEdges function and the signficance of these relationships in population of trees using the testEdges function. Annotation relationships over edges can be calculated as direct or indirect relationships, where a direct relationship is a parent-child pair and an indirect relationship is a decent relationship that travels through another node (or nodes) first.

3.7 Tabulating edges for a single tree

#@ Tabulating all directparent-child annotation relationships in the tree by isotype annotation can be performed like so:

Count direct edges between isotypes

tableEdges(graph, “ISOTYPE”)

A tibble: 5 x 3

Groups: PARENT [3]

PARENT CHILD COUNT

1 IgG IgG 2

2 IgM NA 1

3 NA IgG 2

4 NA IgM 2

5 NA NA 1

#@ The above output is cluttered with the NA annotations from the germline and inferred nodes. We can perform the same direct tabulation, but exclude any nodes annotated with either Germline or NA for isotype using the exclude argument:

Direct edges excluding germline and inferred nodes

tableEdges(graph, “ISOTYPE”, exclude=c(“Germline”, NA))

A tibble: 1 x 3

Groups: PARENT [1]

PARENT CHILD COUNT

1 IgG IgG 2

#@ As there are inferred nodes in the tree, we might want to consider indirect parent-child relationships that traverse through inferred nodes. This is accomplished using the same arguments as above, but with the addition of the indirect=TRUE argument which will skip over the excluded nodes when tabulating annotation pairs:

Count indirect edges walking through germline and inferred nodes

tableEdges(graph, “ISOTYPE”, indirect=TRUE, exclude=c(“Germline”, NA))

A tibble: 2 x 3

Groups: PARENT [2]

PARENT CHILD COUNT

1 IgG IgG 2

2 IgM IgG 2

3.8 Significance testing of edges in a population of trees

#@ Given a population of trees, as a list of annotated igraph objects, you can determine if there is enrichment for specific annotation pairs using the testEdges function. This has the same options as tableEdges, except that the values c(“Germline”, NA) are excluded by default. testEdges performs a permutation test to generated a null distribution, excluding permutation of of any annotations specified to the exclude argument (these annotation remain fix in the tree). P-values output by testEdges are one-sided tests that the annotation pair is observed more often than expected.

Test isotype relationships

edge_test <- testEdges(graph_list, “ISOTYPE”, nperm=20)

Print p-value table

print(edge_test)

PARENT CHILD COUNT EXPECTED PVALUE

1 IgA IgA 36 34.800000 0.0500000

2 IgA IgG 2 4.187500 0.6875000

3 IgG IgA 1 2.444444 0.8888889

4 IgG IgG 99 97.650000 0.3000000

Plot null distributions for each annotation pair

plotEdgeTest(edge_test, color=“steelblue”, main_title=“Isotype Edges”, style=“hist”)

3.9 Counting and testing MRCA annotations

#@ The most recent common ancestor (MRCA) of an Ig lineage we define herein as the most ancestral observed, or inferred, sequences in the lineage tree. Meaning, the node that is most proximal, by some measure, to the germline (root) node. The getMRCA and testMRCA functions provide extraction and significance testing of MRCA sequences by annotation value, respectively.

3.10 Extracting MRCAs from a tree

#@ Extracting the MRCA from a tree is accomplished using the getMRCA function. The germline distance criteria are as described above for getPathLengths and can be either node hops or mutational events, with or without exclusion of nodes with specific annotations. To simply extract the annotations for the node(s) immediately below the germline, you can use the path=steps argument without any node exclusion:

Use unweighted path length and do not exclude any nodes

mrca_df <- getMRCA(graph, path=“steps”, root=“Germline”)

Print subset of the annotation data.frame

print(mrca_df[c(“NAME”, “SAMPLE”, “ISOTYPE”, “STEPS”, “DISTANCE”)])

NAME SAMPLE ISOTYPE STEPS DISTANCE

Inferred1 Inferred1 1 20

#@ To use mutational distance and consider only observed (ie, non-germline and non-inferred) nodes, we specify the exclusion field (field=“ISOTYPE”) and exclusion value within that field (exclude=NA):

Exclude nodes without an isotype annotation and use weighted path length

mrca_df <- getMRCA(graph, path=“distance”, root=“Germline”, field=“ISOTYPE”, exclude=NA)

Print excluding sequence, label, color, shape and size annotations

print(mrca_df[c(“NAME”, “SAMPLE”, “ISOTYPE”, “STEPS”, “DISTANCE”)])

NAME SAMPLE ISOTYPE STEPS DISTANCE

GN5SHBT01D6X0W GN5SHBT01D6X0W -1h IgM 1 22

3.11 Significance testing of MRCA annotations

#@ Similar to testEdges, the function testMRCA will perform a permutation test to determine the significance of an annotation appearing at the MRCA over a population of trees. P-values output by testMRCA are one-sided tests that the annotation is observed more often than expected in the MRCA position.

Test isotype MRCA annotations

mrca_test <- testMRCA(graph_list, “ISOTYPE”, nperm=20)

Print p-value table

print(mrca_test)

ANNOTATION COUNT EXPECTED PVALUE

1 IgA 12 11.15 0.00

2 IgG 31 31.85 0.85

Plot null distributions for each annotation

plotMRCATest(mrca_test, color=“steelblue”, main_title=“Isotype MRCA”, style=“hist”)

4.结束

sessionInfo()

R version 3.6.2 (2019-12-12)

Platform: x86_64-w64-mingw32/x64 (64-bit)

Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:

[1] LC_COLLATE=Chinese (Simplified)_China.936

[2] LC_CTYPE=Chinese (Simplified)_China.936

[3] LC_MONETARY=Chinese (Simplified)_China.936

[4] LC_NUMERIC=C

[5] LC_TIME=Chinese (Simplified)_China.936

attached base packages:

[1] stats graphics grDevices utils datasets methods

[7] base

other attached packages:

[1] igraph_1.2.4.2 scales_1.1.0 dplyr_0.8.3 alakazam_0.3.0

[5] ggplot2_3.2.1

loaded via a namespace (and not attached):

[1] Seurat_3.1.2 TH.data_1.0-10

[3] Rtsne_0.15 colorspace_1.4-1

[5] seqinr_3.6-1 pryr_0.1.4

[7] ggridges_0.5.2 rstudioapi_0.10

[9] farver_2.0.3 leiden_0.3.2

[11] listenv_0.8.0 npsurv_0.4-0

[13] ggrepel_0.8.1 fansi_0.4.1

[15] mvtnorm_1.0-12 codetools_0.2-16

[17] splines_3.6.2 R.methodsS3_1.7.1

[19] mnormt_1.5-5 lsei_1.2-0

[21] TFisher_0.2.0 ade4_1.7-13

[23] zeallot_0.1.0 jsonlite_1.6

[25] packrat_0.5.0 ica_1.0-2

[27] cluster_2.1.0 png_0.1-7

[29] R.oo_1.23.0 uwot_0.1.5

[31] sctransform_0.2.1 readr_1.3.1

[33] compiler_3.6.2 httr_1.4.1

[35] backports_1.1.5 assertthat_0.2.1

[37] Matrix_1.2-18 lazyeval_0.2.2

[39] cli_2.0.1 prettyunits_1.1.0

[41] htmltools_0.4.0 tools_3.6.2

[43] rsvd_1.0.2 gtable_0.3.0

[45] glue_1.3.1 RANN_2.6.1

[47] reshape2_1.4.3 Rcpp_1.0.3

[49] Biobase_2.46.0 vctrs_0.2.1

[51] multtest_2.42.0 gdata_2.18.0

[53] ape_5.3 nlme_3.1-142

[55] gbRd_0.4-11 lmtest_0.9-37

[57] stringr_1.4.0 globals_0.12.5

[59] lifecycle_0.1.0 irlba_2.3.3

[61] gtools_3.8.1 future_1.16.0

[63] MASS_7.3-51.4 zoo_1.8-7

[65] hms_0.5.3 parallel_3.6.2

[67] sandwich_2.5-1 RColorBrewer_1.1-2

[69] reticulate_1.14 pbapply_1.4-2

[71] gridExtra_2.3 stringi_1.4.3

[73] mutoss_0.1-12 plotrix_3.7-7

[75] caTools_1.17.1.4 BiocGenerics_0.32.0

[77] bibtex_0.4.2.2 Rdpack_0.11-1

[79] SDMTools_1.1-221.2 rlang_0.4.2

[81] pkgconfig_2.0.3 bitops_1.0-6

[83] lattice_0.20-38 ROCR_1.0-7

[85] purrr_0.3.3 labeling_0.3

[87] htmlwidgets_1.5.1 cowplot_1.0.0

[89] tidyselect_0.2.5 RcppAnnoy_0.0.14

[91] plyr_1.8.5 magrittr_1.5

[93] R6_2.4.1 gplots_3.0.1.2

[95] multcomp_1.4-12 pillar_1.4.3

[97] withr_2.1.2 sn_1.5-4

[99] fitdistrplus_1.0-14 survival_3.1-8

[101] tibble_2.1.3 future.apply_1.4.0

[103] tsne_0.1-3 crayon_1.3.4

[105] utf8_1.1.4 KernSmooth_2.23-16

[107] plotly_4.9.1 progress_1.2.2

[109] grid_3.6.2 data.table_1.12.8

[111] metap_1.2 digest_0.6.23

[113] tidyr_1.0.0 numDeriv_2016.8-1.1

[115] R.utils_2.9.2 RcppParallel_4.4.4

[117] stats4_3.6.2 munsell_0.5.0

[119] viridisLite_0.3.0

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
uf_brep_ask_topology是一种在CAD软件中询问拓扑信息的函数。拓扑信息指的是物体的结构和组成方式,括边界、面、点以及它们之间的关系。通过调用uf_brep_ask_topology函数,我们可以获取物体的拓扑信息,以便在处理CAD模型时进行相应的操作。 该函数的输入参数通常为一个虚拟实体对象,它代表了一个物体或几何体。函数将返回一个表示该物体拓扑信息的数据结构,可以是边界、面或点的列表。在获取拓扑信息之前,我们可能需要进行一些准备工作,例如通过uf_brep_create_entity函数创建一个虚拟实体对象。 通过调用uf_brep_ask_topology函数,我们可以了解物体的边界,即物体的外形轮廓。边界由一系列有序的边组成,每条边连接两个顶点。我们可以获取边的起点和终点的坐标,并计算边的长度和方向。另外,边界还可以通过将边连接起来形成闭合的曲线或多边形。 我们还可以获取物体的面信息,即物体的表面。面可以是平面、圆柱面、球面等等,通过面我们可以计算物体的表面积和法向量。 此外,我们还可以获取物体的顶点信息,即物体的角点或交叉点。顶点是物体的最小单元,通过顶点我们可以计算物体的体积以及顶点之间的距离。 总之,通过uf_brep_ask_topology函数,我们可以获取物体的边界、面和顶点等拓扑信息,进而进行CAD模型的分析、修改和操作。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值