ETE: a python Environment for Tree Exploration
Jaime Huerta-Cepas (corresponding author),1 Joaquín Dopazo,2 and Toni Gabaldón (corresponding author)1
Abstract
Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale.
Results
Here we present the Environment for Tree Exploration (ETE), a Python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. The ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, supports the extended newick format, provides an integrated node annotation system and allows trees to be linked to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations.
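To make the ideas of newick parsing, tree partitions and node annotation concrete, the following is a minimal sketch in plain Python. It is not ETE's API: the names `Node`, `parse_newick`, `features` and `leaf_names` are hypothetical and chosen only for illustration. It assumes a basic newick string (names and branch lengths only) and does not handle the extended newick format, quoted labels or comments.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Node:
    """Hypothetical tree node: name, branch length, children, annotations."""
    name: str = ""
    dist: float = 0.0
    children: List["Node"] = field(default_factory=list)
    features: Dict[str, object] = field(default_factory=dict)

    def traverse(self) -> Iterator["Node"]:
        """Yield this node and all of its descendants in pre-order."""
        yield self
        for child in self.children:
            yield from child.traverse()

    def leaf_names(self) -> List[str]:
        """Return the names of all leaves below (or at) this node."""
        return [n.name for n in self.traverse() if not n.children]


def parse_newick(text: str) -> Node:
    """Parse a basic newick string such as '((A:1,B:2)AB:0.5,C:3);'."""
    text = text.strip().rstrip(";")
    pos = 0

    def read_label(node: Node) -> None:
        # Read the optional 'name' and optional ':length' that follow a node.
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        node.name = text[start:pos].strip()
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            node.dist = float(text[start:pos])

    def read_node() -> Node:
        # A node is either a leaf label or a parenthesized list of children.
        nonlocal pos
        node = Node()
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume '('
            node.children.append(read_node())
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ','
                node.children.append(read_node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ')'
        read_label(node)
        return node

    root = read_node()
    if pos != len(text):
        raise ValueError(f"unexpected trailing text at position {pos}")
    return root


tree = parse_newick("((A:1,B:2)AB:0.5,C:3);")
partition = tree.children[0]           # a subtree handled as a tree of its own
partition.features["support"] = 0.95   # attach an arbitrary annotation
print(partition.leaf_names())
print(tree.leaf_names())
```

Because every node is itself a complete tree object in this sketch, any partition can be traversed, annotated or analyzed independently of the rest of the tree, which is the property the paragraph above describes.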
Conclusions
ETE provides a complete set of methods to manipulate tree data structures, extending the functionality currently available in other, more general-purpose bioinformatics toolkits. ETE is free software and can be downloaded from http://ete.cgenomics.org.
Background
Trees are commonly used to represent the results of many bioinformatics analyses. In particular, this type of acyclic graph is ideal for describing the hierarchical relationships among a variety of biological entities. Some common examples are the evolutionary analysis of molecular sequences or the clustering of genes and proteins according to their properties. Besides the