Taxonomies

What Are The Taxonomies Utilized by Greengenes?

Pace, Hugenholtz, Ludwig, NCBI, G2_Chip, RDP

You may have learned the standard Kingdom, Phylum, Class, Order, Family, Genus and Species classification system. Each of the six taxonomies offered through Greengenes uses its own variation on this theme. Behind each taxonomy is a team of researchers attempting to manage the tens of thousands of bacterial and archaeal genes by creating a hierarchical classification system.

Grouping microorganisms according to patterns in their DNA is influenced by many factors. One is the scope of genes to consider: with so many microorganisms to classify, and multiple distinguishing genes for any given microbe, the task can be daunting. Others are the quality of the gene alignments and the types of positions (conserved or hypervariable) within the genes that are allowed into the comparison. Within a given taxonomy, the full length of the gene or genes used to classify a bacterium or archaeon is not weighted equally. When a microorganism is assigned to a taxonomic group, differences within a hypervariable section are more likely to be disregarded, or given lower weight, than differences within conserved portions of a gene.

Some taxonomies are more heavily influenced by historical groupings based on phenotype. Biologists disagree about which method is superior, hence the existence of multiple taxonomies. Further confusion is caused by the (poor?) names biologists sometimes choose for novel candidate phyla (proposed when someone discovers a totally new type of bacteria), or by the complete absence of names for candidate phyla.
For example, Pace and Hugenholtz have independently named over a dozen phylum-level lineages, many of which are the same lineages under different names, while RDP has not named any of them. This is a consequence of the huge number of environmental sequences in the public databases and the frequent redundant naming of environmental lineages in the scientific journals. We hope that making multiple taxonomic classifications available through Greengenes will promote user awareness of these disagreements and will soon aid in standardizing classification, particularly the classification of environmental lineages.
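The earlier point about unequal weighting of conserved and hypervariable positions can be illustrated with a small sketch. This is not how any of the six taxonomies actually computes similarity; the function names, the toy sequences, the choice of hypervariable columns and the weight values are all assumptions made for illustration.

```python
def make_weights(length, hypervariable_columns, hv_weight):
    """Weight 1.0 for conserved columns, hv_weight for hypervariable ones."""
    return [hv_weight if i in hypervariable_columns else 1.0
            for i in range(length)]


def weighted_distance(seq_a, seq_b, weights):
    """Weighted fraction of aligned, non-gap columns where the bases differ."""
    if not (len(seq_a) == len(seq_b) == len(weights)):
        raise ValueError("aligned sequences and weights must have equal length")
    total = 0.0
    mismatched = 0.0
    for base_a, base_b, weight in zip(seq_a, seq_b, weights):
        if base_a == "-" or base_b == "-":
            continue  # skip alignment gaps
        total += weight
        if base_a != base_b:
            mismatched += weight
    return mismatched / total if total else 0.0


# Hypothetical 10-column alignment; columns 4-6 are treated as hypervariable.
seq_a = "ACGTTTGACC"
seq_b = "ACGTACGACC"
hypervariable = {4, 5, 6}

equal = weighted_distance(seq_a, seq_b, make_weights(10, hypervariable, 1.0))
down = weighted_distance(seq_a, seq_b, make_weights(10, hypervariable, 0.25))
masked = weighted_distance(seq_a, seq_b, make_weights(10, hypervariable, 0.0))
print(equal, down, masked)
```

In this toy case both differences fall inside the hypervariable columns, so the computed distance shrinks as those columns are down-weighted and drops to zero when they are masked out entirely. Two curators using different weights could therefore reach different conclusions about how closely the same pair of sequences is related.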
When reading a classification report from Greengenes, you may find that one taxonomy classifies a given sequence with labels for its sub-phyla, sub-orders and sub-families, on down to species, while another taxonomy classifies it without those 'sub' categories and may terminate in "unclassified". This simply means that the second taxonomy has yet to firmly establish the identity of this particular strain of bacteria. You will also see Operational Taxonomic Units (OTU/otu) used as the final, narrowest grouping in a hierarchy instead of "strain".
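To make the comparison between two such classifications concrete, here is a minimal sketch that finds the ranks on which two lineages agree before one of them diverges or ends in "unclassified". The semicolon-separated lineage strings, the function names and the example labels are assumptions for illustration; they are not the actual Greengenes report format or real report output.

```python
def parse_lineage(text, sep=";"):
    """Split a lineage string into a list of rank labels, broadest first."""
    return [label.strip() for label in text.split(sep) if label.strip()]


def common_prefix(lineage_a, lineage_b):
    """Return the leading labels shared by both lineages.

    Stops at the first disagreement or at the first "unclassified" label.
    """
    shared = []
    for label_a, label_b in zip(lineage_a, lineage_b):
        if "unclassified" in (label_a.lower(), label_b.lower()):
            break
        if label_a != label_b:
            break
        shared.append(label_a)
    return shared


# Hypothetical classifications of one sequence under two taxonomies.
taxonomy_one = ("Bacteria; Proteobacteria; Gammaproteobacteria; "
                "Enterobacteriales; Enterobacteriaceae; Escherichia")
taxonomy_two = "Bacteria; Proteobacteria; Gammaproteobacteria; Unclassified"

agreed = common_prefix(parse_lineage(taxonomy_one), parse_lineage(taxonomy_two))
print(agreed)
```

A positional comparison like this assumes both lineages list the same rank levels in the same order. Where one taxonomy inserts extra 'sub' categories, the labels would first need to be mapped to named ranks before they could be compared.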

    To see a Venn diagram comparing the six taxonomies, click here. 
