What Are The Taxonomies Utilized by Greengenes?
Pace, Hugenholtz, Ludwig, NCBI, G2_Chip, RDP
You may have learned the standard Kingdom, Phylum, Class, Order,
Family, Genus and Species classification system. Each of the six
taxonomies offered through Greengenes uses its own variation on this
theme. Behind each taxonomy is a team of researchers attempting to
manage the tens of thousands of bacterial and archaeal genes by
organizing them into a hierarchical classification system.
Grouping sets of microorganisms according to patterns in their DNA is
influenced by many different factors. One is the scope of genes to
consider: with so many different microorganisms to be classified, and
multiple distinguishing genes available for any given microbe, this can
be a daunting task. Others are the quality of the gene alignments and
the types of positions (conserved or hypervariable) within the genes
that are allowed into the comparison. Within a given taxonomy, the full
length of the gene or genes used to classify a bacterium or archaeon is
not used with equal weighting. Differences within a hypervariable
section are more likely to be disregarded, or given lower weight, than
differences within conserved portions of a gene when assigning a
microorganism to a taxonomic grouping. Some taxonomies are also more
heavily influenced by historical groupings based on phenotype.
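The idea of weighting conserved and hypervariable positions differently can be sketched in a few lines of Python. This is only an illustration and not the procedure used by any of the six taxonomies: the function name weighted_distance, the toy sequences, the choice of which columns count as hypervariable, and the 0.25 weight are all assumptions made for the example.

```python
def weighted_distance(seq_a, seq_b, weights):
    """Weighted fraction of mismatching columns between two aligned sequences."""
    if not (len(seq_a) == len(seq_b) == len(weights)):
        raise ValueError("aligned sequences and weights must have equal length")
    total = 0.0       # sum of weights over compared columns
    mismatched = 0.0  # sum of weights over columns that differ
    for a, b, w in zip(seq_a, seq_b, weights):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has an alignment gap
        total += w
        if a != b:
            mismatched += w
    return mismatched / total if total else 0.0


# Hypothetical 10-column alignment: columns 0-5 are treated as conserved,
# columns 6-9 as hypervariable (an assumption for this example only).
seq_a = "ACGTACGTAC"
seq_b = "ACGTACTTGA"

uniform = [1.0] * 10                    # every column counts equally
down_weighted = [1.0] * 6 + [0.25] * 4  # hypervariable columns count less
masked = [1.0] * 6 + [0.0] * 4          # hypervariable columns disregarded

d_uniform = weighted_distance(seq_a, seq_b, uniform)
d_down = weighted_distance(seq_a, seq_b, down_weighted)
d_masked = weighted_distance(seq_a, seq_b, masked)
print(d_uniform, d_down, d_masked)
```

Because all three differences in this toy alignment fall in the columns marked hypervariable, the distance shrinks as those columns are down-weighted and reaches zero when they are masked entirely. Two schemes that differ only in their weights can therefore place the same pair of sequences at different distances.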
Biologists disagree about which method is superior, hence the existence
of multiple taxonomies. Further confusion is caused by the (poor?)
choices biologists may make when naming novel candidate phyla (when
someone discovers a totally new type of bacteria), or by the complete
absence of names for candidate phyla.
For example, Pace and Hugenholtz have independently named over a dozen
phylum-level lineages, many of which are the same lineages, while RDP
has not named any of them. This is a consequence of the huge number of
environmental sequences in the public databases and the frequent
redundant naming of environmental lineages in scientific journals. We
hope that making multiple taxonomic classifications available through
Greengenes will promote user awareness of these disagreements and will
soon aid in standardizing classification, particularly of environmental
lineages.
When reading a classification report from Greengenes, you may find that
one taxonomy classifies a given sequence with labels for its sub-phylum,
sub-order and sub-family and on down to species, while another taxonomy
classifies it without those 'sub' categories and may terminate in
"unclassified". This simply means that the latter taxonomy has yet to
firmly establish the identity of this particular strain of bacteria. You
will also see Operational Taxonomic Units (OTU/otu) used as the final,
narrowest grouping in a hierarchy instead of "strain".
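As a rough illustration of comparing how far different taxonomies resolve the same sequence, the Python sketch below counts the labels that precede the first "unclassified" entry in each lineage. The semicolon-separated lineage format, the taxonomy names and every label in the report dictionary are placeholders invented for this example; they are not taken from an actual Greengenes report.

```python
def parse_lineage(lineage):
    """Split a semicolon-separated lineage string into cleaned labels."""
    return [label.strip() for label in lineage.split(";") if label.strip()]


def resolved_labels(lineage):
    """Return the labels up to, but not including, the first 'unclassified'."""
    labels = []
    for label in parse_lineage(lineage):
        if label.lower() == "unclassified":
            break
        labels.append(label)
    return labels


# Hypothetical classifications of one sequence under two taxonomies.
# All names below are placeholders, not real report output.
report = {
    "taxonomy_A": ("Bacteria; Phylum_X; Subphylum_X1; Class_Y; Order_Z; "
                   "Suborder_Z1; Family_W; otu_123"),
    "taxonomy_B": "Bacteria; Phylum_X; Class_Y; Unclassified",
}

# Number of resolved labels per taxonomy, and the taxonomy with the most.
depths = {name: len(resolved_labels(lineage)) for name, lineage in report.items()}
most_resolved = max(depths, key=depths.get)

for name, depth in depths.items():
    print(name, depth, resolved_labels(report[name])[-1])
print("most resolved:", most_resolved)
```

A shorter lineage here does not mean the classification is wrong, only that the taxonomy in question stops at a broader grouping for this sequence.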
To see a Venn diagram comparing the six taxonomies, click here.