circos - visualizing the genome, among other things

visual guide to Circos

A visual guide to Circos (Circos - an information aesthetic for comparative genomics) presents some of the capabilities of Circos and illustrates its application in the field of comparative genomics and genome visualization.

 

Download: medium bitmap (7Mb) | huge bitmap (46Mb) | PDF (40Mb) | Illustrator (20Mb) (PDF and Illustrator files are very complex)

Circos - an information aesthetic for comparative genomics - presented at Genome Informatics 2008, Hinxton, UK

what is Circos?


Video | An animated view of the similarity between human (upper half) and dog (lower half) genomes.

Figure | An image created with Circos.

Circos is designed for visualizing genomic data such as alignments, conservation, and generalized 2D data, such as line, scatter, heatmap and histogram plots. Circos is very flexible: you can use it to visualize any kind of data, not just genomics. Circos has been used to visualize customer flow in the auto industry, volume of courier shipments, database schemas, and presidential debates.

The creation of Circos was motivated by a need to visualize intra- and inter-chromosomal relationships within one or more genomes, or between any two or more sets of objects with a corresponding distance scale. Circos is similar to chromowheel and, to a lesser extent, genopix.

Circos uses a circular composition of ideograms because some data, such as combinations of intra- and inter-chromosomal relationships (alignments, duplications, assembly paired-ends, etc.), are very difficult to organize when the underlying ideograms (or contigs) are arranged as lines. In a linear layout it is often impossible to keep the relationship lines from crossing other structures, and this reduces the effectiveness of the graphic.

Specific features are included to help with viewing data on the genome. The genome is a large structure with localized regions of interest, frequently separated by large oceans of uninteresting sequence. To help visualize data in this context, Circos can create images with variable axis scaling, so that genomic regions can be magnified locally without cropping the rest. Scale smoothing ensures that the magnification level changes smoothly. In combination with axis breaks and custom ideogram order, the final image can be easily tuned to offer the clearest illustration of your data.

All aspects of the output image are tunable, making Circos a flexible and extensible tool for the generation of publication-quality, circularly composited renditions of genomic data and related annotations.

Circos is written in Perl and produces bitmap (PNG) and vector (SVG) images using plain text configuration and input files.

how does it work?

Circos is driven by an Apache-like text configuration file and accepts data from flat files. There is currently no graphical user interface for Circos and no plan to create one.

It is easy to plot, format and layer your data with Circos. A large variety of plot and feature parameters are customizable, helping you make the image that best communicates your data. You supply your data to Circos as flat files (e.g. GFF format), tell Circos what you want plotted using the configuration file, and then create the image.

Figure | Great for posters too.

only for genomic data?

Circos can be applied to draw any kind of data, not just from the field of genomics. Since I work in genomics, I've been using Circos to draw the kind of data I work with. Circos is ideally suited when your data represents relationships between positions on one or more scales.

You can turn tabular data into Circos images using the online version of Circos. Transform boring tables into informative and visually compelling datagraphics.

Visualization of a table with circos.

Large tables can be visualized - below is an example of a 54x14 table.

Visualization of a large table with circos.
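
One way to picture how a table becomes a circular image is to treat every row and column label as a segment on the circle and every cell as a ribbon joining a row segment to a column segment, with ribbon width proportional to the cell value. The Python sketch below illustrates that idea only. It is not the code used by the online version of Circos, and the function name `table_to_ribbons`, its return layout, and the requirement that row and column labels be distinct are assumptions made for this example.

```python
def table_to_ribbons(row_labels, col_labels, table):
    """Turn a table of non-negative numbers into segments and ribbons.

    Hypothetical helper for illustration; labels are assumed to be unique
    across rows and columns.

    Returns (segments, ribbons):
      segments: dict mapping label -> segment length (sum of its cells)
      ribbons:  list of (row, row_start, row_end, col, col_start, col_end)
    """
    segments = {}
    for i, r in enumerate(row_labels):
        segments[r] = sum(table[i])                    # row total
    for j, c in enumerate(col_labels):
        segments[c] = sum(row[j] for row in table)     # column total

    # Next free position on each segment; ribbons are packed side by side.
    cursor = {label: 0 for label in segments}
    ribbons = []
    for i, r in enumerate(row_labels):
        for j, c in enumerate(col_labels):
            v = table[i][j]
            if v <= 0:
                continue  # empty cells produce no ribbon
            ribbons.append((r, cursor[r], cursor[r] + v,
                            c, cursor[c], cursor[c] + v))
            cursor[r] += v
            cursor[c] += v
    return segments, ribbons


segments, ribbons = table_to_ribbons(["A", "B"], ["X", "Y"], [[3, 1], [2, 4]])
```

Each ribbon occupies the same width on both of its segments, so a segment's length equals the total of the row or column it represents.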

I've applied circular compositing to represent database structure with Schemaball.

plot types

Support exists for a variety of plot types, such as paired-location, scatter, line, histogram, heat map, tile, glyph and text plots. Plots may be combined in a single track and multiple tracks are supported. Colours and positions of individual elements can be tuned to suit your application.

Figure | Some examples of Circos plots. (A) glyph (B) highlight with depth control (C) scatter (D) paired-location (E) ribbon (F) histogram (G) tile (H) highlight with auto depth (I) text with auto arrange (J) heat map (K) high-density text (L) high-density glyph (M) multi-type composite (N) variable scale control (O) fine geometry control (P) flexible text and element placement (Q) transparent ribbons (R) stacked histogram (S) connectors (T) tick rings.

Rules can be written to adjust the formatting of plot elements based on position, value and existing formatting. For example, a rule can change characteristics such as color, text size or position of a data point depending on its initial data values.
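
The idea of rule-based formatting can be sketched in a few lines of Python. This is a conceptual illustration, not Circos's rule engine or its configuration syntax: the names `apply_rules` and `rules`, the dictionary keys, and the choice to stop at the first matching rule are all assumptions made for the example.

```python
def apply_rules(point, rules):
    """Return a copy of a data point with formatting from the first matching rule.

    point: dict with position, value and formatting keys (hypothetical layout)
    rules: list of (condition, formatting) pairs, tested in order
    """
    formatted = dict(point)
    for condition, formatting in rules:
        if condition(point):
            formatted.update(formatting)  # override the default formatting
            break                         # assumption: first match wins
    return formatted


# Hypothetical rules: one depends on the value, one on the position.
rules = [
    (lambda p: p["value"] > 0.8, {"color": "red", "glyph_size": 12}),
    (lambda p: p["chr"] == "hs17", {"color": "orange"}),
]

point = {"chr": "hs17", "start": 100, "end": 200, "value": 0.9, "color": "grey"}
styled = apply_rules(point, rules)
```

Here the point matches both rules, but only the first is applied, so it is drawn in red with a larger glyph.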

global and local zooming

Circos is unique in its support for both global and local axis scale deformation. This is illustrated in the set of figures below, where the magnification of ideograms and of regions within ideograms can be independently adjusted to accentuate or attenuate the visual impact of information.

Figure | You can draw ideograms with no scaling effects (left), with a global scale change applied to one or more ideograms (middle), and additionally add any number of local scale adjustments to enlarge/compress individual regions of ideograms (right). When applying local scale changes, the magnification can be smoothly varied across the zoom region.
Figure | The purpose of scale stretching is to expand regions which contain interesting data patterns. As one region is stretched, others are contracted to keep the entire data domain in view. In this figure, the locations of genes (green), disease genes (orange) and cancer genes (red) are plotted on chr17, with the region in the vicinity of 35 Mb repeatedly expanded. Genes are drawn using highlights (http://mkweb.bcgsc.ca/circos/?tutorials&id=3) with radial position representing the number of exons in the gene.
Figure | Scale stretching is very visually appealing when combined with images that depict spatial relationships. Shown here is the similarity of human chromosome 1 (hg17) to the entire genome of the mouse (mm5). Lines represent alignment chains between human and mouse regions and are color coded by the identity of the mouse chromosome on which they impinge. Regions of human chromosome 1 and mouse chromosome 5 are expanded to show details in the alignments.
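
The effect of local scale stretching, where one region expands and the rest contract so that the whole ideogram stays in view, can be sketched as a piecewise-linear mapping from position to a fraction of the ideogram's drawn extent. The Python below is a simplified illustration under stated assumptions, not Circos's implementation: the function names are hypothetical, zoom regions are assumed to be non-overlapping and to lie inside the ideogram, and scale smoothing is not modeled.

```python
def build_pieces(length, zooms):
    """Split [0, length] into consecutive (start, end, scale) pieces.

    zooms: list of (start, end, scale); regions outside any zoom get scale 1.
    Assumes zooms do not overlap and lie within [0, length].
    """
    pieces = []
    cursor = 0
    for start, end, scale in sorted(zooms):
        if start > cursor:
            pieces.append((cursor, start, 1.0))  # unzoomed gap before the zoom
        pieces.append((start, end, float(scale)))
        cursor = end
    if cursor < length:
        pieces.append((cursor, length, 1.0))     # unzoomed tail
    return pieces


def drawn_fraction(pos, length, zooms):
    """Fraction of the ideogram's drawn extent that lies before `pos`."""
    pieces = build_pieces(length, zooms)
    # Each piece is drawn with a weight of (its size) x (its scale).
    total = sum((end - start) * scale for start, end, scale in pieces)
    covered = 0.0
    for start, end, scale in pieces:
        if pos >= end:
            covered += (end - start) * scale
        elif pos > start:
            covered += (pos - start) * scale
    return covered / total


# A 100-unit ideogram with the region 40-60 magnified 4x.
zooms = [(40, 60, 4)]
left = drawn_fraction(40, 100, zooms)
right = drawn_fraction(60, 100, zooms)
```

With these numbers the magnified region grows from 20% to 50% of the drawn ideogram, while each flanking region shrinks from 40% to 25%, so nothing is cropped.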

using Circos

How do you know whether Circos can be useful to you? First, take a look at some screenshots. These will give you an idea of the types of data visualizations that Circos can create.

Figure | Circos, shamelessly promoted (PDF white, black, or archetype zoo)

I've designed Circos to be simple to use, with the goal of producing high-quality genome diagrams suitable for publication. To keep Circos flexible, the configuration file that describes the generation of the image contains many settings - be sure to read the tutorials to familiarize yourself with these features.

To use Circos, you need to have Perl installed, along with a few CPAN modules. It's likely that you already meet all the requirements if you are working on a UNIX system.

You will also need a definition of the genome karyotypes, such as the content of the cytoBandIdeo table (UCSC genome browser). You can download the karyotype from the table browser or directly for human, mouse, rat, or other species. The karyotype files are used to let Circos know the size and features of the chromosomes for the purpose of drawing the ideograms.
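
If you start from rows of the cytoBandIdeo table, a short script can turn them into karyotype lines. The Python sketch below is one hedged way to do it. It assumes each row has the fields chromosome, start, end, band name and stain, and that the karyotype file uses `chr - ID LABEL START END COLOR` lines for chromosomes and `band ID NAME NAME START END COLOR` lines for bands. The function name, the `hs` species prefix and the per-chromosome color names are assumptions for illustration, so check the output against the karyotype files shipped with Circos before relying on it.

```python
def cytoband_to_karyotype(rows, prefix="hs"):
    """Convert cytoBandIdeo-style rows into karyotype lines.

    rows:   iterable of (chrom, start, end, band_name, stain) tuples,
            e.g. ("chr1", 0, 2300000, "p36.33", "gneg")
    prefix: species prefix for chromosome IDs (assumed "hs" for human)
    """
    bands = {}  # chromosome label -> list of bands, in input order
    for chrom, start, end, name, stain in rows:
        label = chrom[3:] if chrom.startswith("chr") else chrom
        bands.setdefault(label, []).append((int(start), int(end), name, stain))

    lines = []
    # Chromosome definitions: the size is taken from the last band's end.
    for label, blist in bands.items():
        size = max(end for _, end, _, _ in blist)
        lines.append(f"chr - {prefix}{label} {label} 0 {size} chr{label}")
    # Band definitions, colored by cytogenetic stain.
    for label, blist in bands.items():
        for start, end, name, stain in blist:
            lines.append(
                f"band {prefix}{label} {name} {name} {start} {end} {stain}"
            )
    return lines


rows = [
    ("chr1", 0, 2300000, "p36.33", "gneg"),
    ("chr1", 2300000, 5400000, "p36.32", "gpos25"),
]
print("\n".join(cytoband_to_karyotype(rows)))
```

The chromosome lines come first so that every band refers to an ideogram that has already been defined.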

Once you've decided which species (one or more) and chromosomes (all, some, with optional spans) to use, you can layer 2D and position-paired data in concentric "tracks".

future of Circos

I work on Circos in a passive-aggressive manner - sometimes passive sometimes aggressive. I welcome your comments - please contact Martin Krzywinski if you would like to report a bug, request a feature or share the ways in which you are using, or hope to use, Circos.

There is a development road map for Circos. With one eye on the future, I am also keeping track of what is happening now with Circos.

license

Circos is free software, licensed under the GPL.
