Orthology Resources

Many groups have developed methods and databases to map orthologs and homologs. Some of the strongest are listed here, with some of their individual strengths and characteristics.

NCBI's homolog database. Well integrated with Entrez Gene and other NCBI resources, and summarizes domain structures and literature links for families. Coverage is limited (several vertebrates and a few major model organisms), and the clustering frequently splits close homologs (e.g., vertebrate paralogs end up in separate groups, or worm and fly orthologs are clustered with different vertebrate paralogs).

Provides massive trees for all animal genes and for a few non-animal model organisms. Very thorough, but navigating the sheer scope of the data can be difficult.

Probably the most popular tool for multi-genome clustering. This site includes a very large-scale database of over 150 genomes and some nice search tools, such as phylogenetic profiles (find genes present in some organisms and absent in others). The final data are a little difficult to browse, as they are mostly just linked to Ensembl IDs.
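The phylogenetic-profile search mentioned above can be illustrated with a minimal sketch. This is not the site's own implementation; the function name `match_profile`, the group IDs, and the species sets are all hypothetical toy data, assuming only that each ortholog group is known to be present in some set of species.

```python
def match_profile(presence, required, excluded):
    """Return ortholog groups found in every `required` species
    and in none of the `excluded` species.

    presence: dict mapping a group ID to the set of species containing it
    (hypothetical structure, chosen for illustration).
    """
    required, excluded = set(required), set(excluded)
    return sorted(
        group
        for group, species in presence.items()
        if required <= species and not (excluded & species)
    )


# Toy data: made-up group IDs and species memberships.
presence = {
    "OG1": {"human", "mouse", "fly", "worm"},
    "OG2": {"human", "mouse"},
    "OG3": {"fly", "worm"},
    "OG4": {"human", "mouse", "fly"},
}

# Groups present in human and mouse but absent from worm.
hits = match_profile(presence, required={"human", "mouse"}, excluded={"worm"})
print(hits)
```

Species named in neither set (here, fly) are left unconstrained, which is why a group found in fly can still match.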

Part of the Ensembl database. It has an elaborate analysis mechanism, though the interface is difficult (gene-specific information is accessed through the regular Ensembl interface), and coverage seems relatively low.

Another major database and method, InParanoid focuses on pairwise similarity between genomes, though it was later extended (MultiParanoid) to multiple genomes. Lower coverage than OrthoMCL, but finds a number of orthologous sets missed by other methods.
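As a rough illustration of what pairwise comparison between two genomes involves, the sketch below finds reciprocal best hits, a common starting point for pairwise ortholog detection. It is a simplification and not InParanoid's actual algorithm, which does more than this (for example, grouping in-paralogs around the seed pairs). The function names, gene IDs, and scores are hypothetical; the scores stand in for similarity scores such as BLAST bit scores.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring target gene.

    scores: dict mapping (query, target) to a similarity score.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where each gene is the other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy scores between genome A (a1..a3) and genome B (b1, b2); values are made up.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 150, ("a2", "b2"): 90, ("a3", "b1"): 120}
b_vs_a = {("b1", "a1"): 198, ("b1", "a3"): 120, ("b2", "a1"): 150, ("b2", "a2"): 95}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy data a2's best hit is b2, but b2's best hit is a1, so that pair is not reported. Stringent one-to-one criteria of this kind are one reason a pairwise method can trade coverage for precision.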

An integration of seven other orthology databases.

Many other orthology databases exist that we haven't yet had a chance to explore. These include P-POD (Princeton Protein Orthology Database), KEGG Orthology, eggNOG, POGS (plant-specific), OrthoDB, Roundup, the Gene-Oriented Orthology Database, and many more.