UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location.
Entrez Gene is a searchable database of genes, from RefSeq genomes, and defined by sequence and/or located in the NCBI Map Viewer.
Entrez Gene (www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene) is NCBI's database for gene-specific information.
It does not include all known or predicted genes; instead, Entrez Gene focuses on genomes that have been completely sequenced, that have an active research community to contribute gene-specific information, or that are scheduled for intense sequence analysis. The content of Entrez Gene represents the result of curation and automated integration of data from NCBI's Reference Sequence project (RefSeq), from collaborating model organism databases, and from many other databases available from NCBI.
Entrez Gene is a step forward from NCBI's LocusLink, with both a major increase in taxonomic scope and improved access through the many tools associated with NCBI Entrez.
The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products, for major research organisms. RefSeq records are derived from primary GenBank submissions; varying levels of validation, additional annotation, and manual curation are applied to the RefSeq record.
For example, UniGene has clustered transcript information for some species that Entrez Gene does not, and Entrez Gene has records that are not cross-referenced in UniGene. Entrez Gene is solely responsible for providing the unique GeneID that is used to identify information for genes and other types of loci.
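Because the GeneID is the stable key for a gene record, a lookup can be expressed as a single URL. The sketch below is only an illustration and is not part of the description above: it assumes NCBI's E-utilities `esummary` endpoint as the access route, and the helper name `gene_summary_url` and the sample GeneID 672 are chosen here for the example. It builds the request URL without contacting the server.

```python
from urllib.parse import urlencode

# Assumption: NCBI E-utilities base URL; the text above does not name an access route.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_summary_url(gene_id: int) -> str:
    """Build an esummary URL for one Entrez Gene record (hypothetical helper)."""
    if gene_id <= 0:
        raise ValueError("GeneID must be a positive integer")
    # db=gene selects the Entrez Gene database; id carries the GeneID.
    query = urlencode({"db": "gene", "id": gene_id, "retmode": "json"})
    return f"{EUTILS_BASE}/esummary.fcgi?{query}"


if __name__ == "__main__":
    # 672 is used purely as a sample GeneID.
    print(gene_summary_url(672))
```

Fetching the resulting URL (for instance with `urllib.request.urlopen`) would return a summary of the record; that step is left out so the sketch runs offline.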