visual guide to Circos
A visual guide to Circos (Circos - an information aesthetic for comparative genomics) presents some of the capabilities of Circos and illustrates its application in the field of comparative genomics and genome visualization.
Download: medium bitmap (7Mb) | huge bitmap (46Mb) | PDF (40Mb) | Illustrator (20Mb) (PDF and Illustrator files are very complex)
what is Circos?
Video | An animated view of the similarity between human (upper half) and dog (lower half) genomes.
Circos is designed for visualizing genomic data such as alignments, conservation, and generalized 2D data, such as line, scatter, heatmap and histogram plots. Circos is very flexible: you can use it to visualize any kind of data, not just genomics. Circos has been used to visualize customer flow in the auto industry, volume of courier shipments, database schemas, and presidential debates.
The creation of Circos was motivated by a need to visualize intra- and inter-chromosomal relationships within one or more genomes, or between any two or more sets of objects with a corresponding distance scale. Circos is similar to chromowheel and, to a lesser extent, genopix.
Circos uses a circular composition of ideograms because some data, like combinations of intra- and inter-chromosomal relationships (alignments, duplications, assembly paired-ends, etc.), are very difficult to organize when the underlying ideograms (or contigs) are arranged as lines. In many cases, it is impossible to keep the relationship lines from crossing other structures, and this reduces the effectiveness of the graphic.
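To make the idea of a circular composition concrete, here is a minimal Python sketch of how ideograms might be laid end to end around a circle, and how a position on an ideogram maps to an angle. It is an illustration only, not how Circos itself is implemented; the function names, the `gap_deg` parameter and the example ideogram sizes are all assumptions made for this example.

```python
def ideogram_angles(sizes, gap_deg=2.0):
    """Return {name: (start_deg, end_deg)} for ideograms laid out in order.

    sizes:   dict of ideogram name -> length (e.g. in bases), in drawing order
    gap_deg: hypothetical fixed gap, in degrees, after each ideogram
    """
    total = sum(sizes.values())
    usable = 360.0 - gap_deg * len(sizes)  # degrees left once gaps are removed
    angles, cursor = {}, 0.0
    for name, length in sizes.items():
        span = usable * length / total  # angular span proportional to length
        angles[name] = (cursor, cursor + span)
        cursor += span + gap_deg
    return angles


def position_angle(angles, name, pos, length):
    """Angle (degrees) of position `pos` on an ideogram of the given length."""
    start, end = angles[name]
    return start + (end - start) * pos / length


# Made-up ideogram sizes, no gaps, so the arithmetic is easy to follow.
sizes = {"chrA": 100, "chrB": 50, "chrC": 50}
angles = ideogram_angles(sizes, gap_deg=0.0)
link_end = position_angle(angles, "chrB", 25, sizes["chrB"])
```

Once every position has an angle, a relationship between two positions becomes a curve drawn through the interior of the circle, away from the ideograms and the data tracks.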
Specific features are included to help view data on the genome. The genome is a large structure with localized regions of interest, frequently separated by large oceans of uninteresting sequence. To help visualize data in this context, Circos can create images with variable axis scaling, permitting local magnification of genomic regions to be controlled without cropping. Scale smoothing ensures that the magnification level changes smoothly. In combination with axis breaks and custom ideogram order, the final image can be easily tuned to offer the clearest illustration of your data.
All aspects of the output image are tunable, making Circos a flexible and extensible tool for the generation of publication-quality, circularly composited renditions of genomic data and related annotations.
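The effect of local magnification can be sketched in a few lines of Python. The function below is a hypothetical illustration, not Circos code: it assumes non-overlapping zoom regions given as `(start, end, factor)` tuples, draws everything else at a factor of 1, and deliberately leaves out scale smoothing, so the magnification changes abruptly at region boundaries.

```python
def scaled_position(pos, length, zooms):
    """Map pos in [0, length] to a fraction [0, 1] of the drawn ideogram.

    zooms: list of (start, end, factor) non-overlapping regions; sequence
           outside these regions is drawn at factor 1.
    """
    def drawn(upto):
        # Drawn length of the interval [0, upto] after magnification.
        total = upto
        for start, end, factor in zooms:
            overlap = max(0, min(upto, end) - start)
            total += overlap * (factor - 1)
        return total

    return drawn(pos) / drawn(length)


# A 100-unit ideogram with the region 40-60 magnified 5x (made-up numbers).
zooms = [(40, 60, 5)]
region_start = scaled_position(40, 100, zooms)
region_end = scaled_position(60, 100, zooms)
region_share = region_end - region_start
```

In this example the magnified region covers 20% of the sequence but takes up more than half of the drawn ideogram, and nothing is cropped: the rest of the ideogram is simply compressed.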
Circos is written in Perl and produces bitmap (PNG) and vector (SVG) images using plain text configuration and input files.
how does it work?
Circos is driven by an Apache-like, plain-text configuration file and accepts data from flat files. There is currently no graphical user interface for Circos and no plan to create one.
It is easy to plot, format and layer your data with Circos. A large variety of plot and feature parameters are customizable, helping you make the image that best communicates your data. You supply your data to Circos as flat files (e.g. GFF format), tell Circos what you want plotted using the configuration file, and then create the image.
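As a rough illustration of what a flat-file input looks like from a program's point of view, the Python sketch below parses whitespace-delimited lines of the form chromosome, start, end, value. This column layout is an assumption made for the example, as are the `Point` type, the `parse_data` function and the sample values; consult the tutorials for the exact format each plot type expects.

```python
from typing import NamedTuple


class Point(NamedTuple):
    chrom: str
    start: int
    end: int
    value: float


def parse_data(text):
    """Parse lines of 'chrom start end value'; skip blanks and # comments."""
    points = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Any columns after the fourth are ignored in this sketch.
        chrom, start, end, value = line.split()[:4]
        points.append(Point(chrom, int(start), int(end), float(value)))
    return points


# Made-up sample data in the assumed layout.
sample = """
# chrom start end value
hs1 0 1000000 0.25
hs1 1000000 2000000 0.75
hs2 0 1000000 0.50
"""
points = parse_data(sample)
```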
only for genomic data?
Circos can draw any kind of data, not just data from the field of genomics. Since I work in genomics, I've been using Circos to draw the kind of data I work with. Circos is ideally suited to data that represent relationships between positions on one or more scales.
You can turn tabular data into Circos images using the online version of Circos. Transform boring tables into informative and visually compelling datagraphics.
Large tables can be visualized - below is an example of a 54x14 table.
I've applied circular compositing to represent database structure with Schemaball .
plot types
Support exists for a variety of plot types, such as paired-location, scatter, line, histogram, heat map, tile, glyph and text plots. Plots may be combined in a single track and multiple tracks are supported. Colours and positions of individual elements can be tuned to suit your application.
Rules can be written to adjust the formatting of plot elements based on their position, value and current formatting. This lets you control characteristics of each data point (such as color, text size, position, etc.) with rules that depend on its initial data values.
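The idea behind rules can be sketched in Python as a list of condition and formatting pairs that are tested against each data point. This is only a conceptual illustration: Circos rules are written in the configuration file, not in Python, and the `apply_rules` function, the dictionary keys and the first-match-wins behaviour shown here are assumptions of this sketch.

```python
def apply_rules(point, rules, default):
    """Return formatting for one data point.

    rules:   list of (condition, changes) pairs, tested in order; in this
             sketch the first rule whose condition is true wins.
    default: formatting used when no rule matches.
    """
    fmt = dict(default)
    for condition, changes in rules:
        if condition(point):
            fmt.update(changes)
            break
    return fmt


default = {"color": "grey", "size": 4}
rules = [
    # Highlight high values in red and draw them larger.
    (lambda p: p["value"] > 0.7, {"color": "red", "size": 8}),
    # Colour everything on one (made-up) chromosome blue.
    (lambda p: p["chrom"] == "hs2", {"color": "blue"}),
]

high = apply_rules({"chrom": "hs2", "start": 0, "value": 0.9}, rules, default)
on_hs2 = apply_rules({"chrom": "hs2", "start": 0, "value": 0.1}, rules, default)
plain = apply_rules({"chrom": "hs1", "start": 0, "value": 0.1}, rules, default)
```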
global and local zooming
Circos is unique in its support for both global and local axis scale deformation. This is illustrated in the set of figures below, where the magnification of ideograms and of regions of ideograms can be independently adjusted to accentuate or attenuate the visual impact of information.
using Circos
How do you know whether Circos can be useful to you? First, take a look at some screenshots. These will give you an idea of the types of data visualizations that Circos can create.
I've designed Circos to be simple to use, with the goal of producing high-quality genome diagrams suitable for publication. To keep Circos flexible, the configuration file that describes the generation of the image contains many settings, so be sure to read the tutorials to familiarize yourself with these features.
To use Circos, you need to have Perl installed, along with a few CPAN modules. It's likely that you already meet all the requirements if you are working on a UNIX system.
You will also need a definition of the genome karyotypes, such as the content of the cytoBandIdeo table (UCSC genome browser). You can download the karyotype from the table browser, or directly for human, mouse, rat, or other species. The karyotype files tell Circos the size and features of the chromosomes so that it can draw the ideograms.
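If you start from cytoBandIdeo rows rather than a ready-made karyotype file, the conversion might look something like the Python sketch below. It assumes rows of chromosome, start, end, band name and stain, and it writes `chr` and `band` lines in the layout I understand karyotype files to use; treat that layout, the `hs` prefix, the use of the chromosome name as the ideogram colour and the function name as assumptions, and compare the output with a karyotype file shipped with Circos before relying on it.

```python
def cytoband_to_karyotype(rows, prefix="hs"):
    """Turn (chrom, start, end, band_name, stain) rows into karyotype lines.

    prefix: hypothetical species prefix used to build ideogram ids.
    """
    chroms = {}  # ideogram id -> [label, length, colour], in first-seen order
    bands = []
    for chrom, start, end, name, stain in rows:
        label = chrom[3:] if chrom.startswith("chr") else chrom
        cid = prefix + label
        entry = chroms.setdefault(cid, [label, 0, chrom])
        entry[1] = max(entry[1], end)  # chromosome length = largest band end
        bands.append(f"band {cid} {name} {name} {start} {end} {stain}")
    chr_lines = [
        f"chr - {cid} {label} 0 {length} {colour}"
        for cid, (label, length, colour) in chroms.items()
    ]
    return chr_lines + bands


# Two illustrative rows in the assumed cytoBandIdeo column order.
rows = [
    ("chr1", 0, 2300000, "p36.33", "gneg"),
    ("chr1", 2300000, 5400000, "p36.32", "gpos25"),
]
karyotype = cytoband_to_karyotype(rows)
```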
Once you've decided which species (one or more) and chromosomes (all, some, with optional spans) to use, you can layer 2D and position-paired data in concentric "tracks".
future of Circos
I work on Circos in a passive-aggressive manner: sometimes passive, sometimes aggressive. I welcome your comments. Please contact Martin Krzywinski if you would like to report a bug, request a feature or share the ways in which you are using, or hope to use, Circos.
There is a development road map for Circos. With one eye on the future, I am also keeping track of what is happening now with Circos.
license
Circos is free software, licensed under the GPL.