How to Read Your Classification Spreadsheet:
The spreadsheet shows where each of a user’s sequences fits in the taxonomic
hierarchy of each of six classification systems. It also reports the details of how
Greengenes decides where each sequence fits in each taxonomy. In short, Greengenes
finds the closest matching sequence in each of the curated collections and transfers
the taxonomy of that reference sequence to the unknown sequence. The closer the
relationship between a user sequence and its reference in a taxonomy, the more
trustworthy the taxonomic placement.
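
The nearest-match idea can be illustrated with a short Python sketch. It is not the Greengenes implementation. It scores similarity as the percentage of shared 7-mers, which is one of the measures reported in the spreadsheet, and it assumes the percentage is taken relative to the smaller of the two 7-mer sets. The function names and the (id, sequence, taxonomy) record layout are hypothetical.

```python
def kmers(seq, k=7):
    """Return the set of unique k-mers in a sequence (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_percent(query, reference, k=7):
    """Percent of unique k-mers shared by two sequences.

    Assumption for this sketch: the denominator is the smaller of the
    two k-mer sets. The real tool may normalize differently.
    """
    q, r = kmers(query, k), kmers(reference, k)
    if not q or not r:
        return 0.0
    return 100.0 * len(q & r) / min(len(q), len(r))


def transfer_taxonomy(query, collection, k=7):
    """Find the best-scoring reference and copy its taxonomy to the query.

    `collection` is a hypothetical list of (ref_id, sequence, taxonomy)
    tuples standing in for one curated collection.
    """
    best = max(collection, key=lambda rec: shared_kmer_percent(query, rec[1], k))
    ref_id, ref_seq, taxonomy = best
    return ref_id, shared_kmer_percent(query, ref_seq, k), taxonomy
```

A higher score for the chosen reference corresponds to a closer relationship, and therefore to a more trustworthy placement.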

The first column in the spreadsheet is the identity of the user sequence.
The remaining thirty columns of the report are grouped by taxonomy. For each
user sequence, there are five columns of data per taxonomy, and these same
five columns repeat for each of the six taxonomies. A description of each of
the five columns follows:
Column 1 - prokMSA_id - The Greengenes numerical identifier of the reference
sequence: the nearest neighbor in Greengenes that has been classified and is
not itself a chimera.
Column 2 - SimRank id - The percentage of 7-mers shared between the user
sequence and the reference.
Column 3 - DNAML id - (Here, ML stands for "maximum likelihood".) The identity
between the user sequence and the reference, considering only conserved bases.
The closer the value is to 1, the better the match.
Column 4 - DNAML columns - A count of how many bases were compared between the
user sequence and the reference. This value should agree closely with the number
reported in column "H" of the post-NAST report. If the number here is much lower
than the one in column "H", place little confidence in this particular taxonomic
placement.
Column 5 - Classification of the reference organism - The taxonomic assignment
transferred to the sequence you submitted. For some taxonomies it will follow
the standard "Kingdom, Phylum, Class, Order, Family, Genus, Species" ranks.
Other taxonomies may use subphyla or subfamilies, or may stop at the family or
genus level. You will most likely not receive a typical genus and species
classification. More likely the sequence will be labeled as an OTU
(operational taxonomic unit).
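
The layout described above (one identifier column followed by six groups of five columns) can be unpacked with a short Python sketch. It assumes the row has already been read into a list of 31 text fields and that the numeric fields are plain numbers. The names Placement, split_row and low_coverage, the post_nast_bases parameter, and the 0.9 cutoff are all hypothetical choices for illustration. This page does not define a threshold for "much lower".

```python
from typing import List, NamedTuple, Tuple

NUM_TAXONOMIES = 6        # six classification systems
FIELDS_PER_TAXONOMY = 5   # five columns repeated for each taxonomy


class Placement(NamedTuple):
    prokmsa_id: str        # Column 1: identifier of the reference sequence
    simrank_id: float      # Column 2: percent of shared 7-mers
    dnaml_id: float        # Column 3: identity over conserved bases (0 to 1)
    dnaml_columns: int     # Column 4: number of bases compared
    classification: str    # Column 5: taxonomy of the reference


def split_row(row: List[str]) -> Tuple[str, List[Placement]]:
    """Split one spreadsheet row into its sequence id and six placements."""
    expected = 1 + NUM_TAXONOMIES * FIELDS_PER_TAXONOMY  # 31 fields
    if len(row) != expected:
        raise ValueError(f"expected {expected} fields, got {len(row)}")
    sequence_id, rest = row[0], row[1:]
    placements = []
    for start in range(0, len(rest), FIELDS_PER_TAXONOMY):
        ref_id, simrank, dnaml, columns, taxonomy = rest[start:start + FIELDS_PER_TAXONOMY]
        placements.append(
            Placement(ref_id, float(simrank), float(dnaml), int(columns), taxonomy)
        )
    return sequence_id, placements


def low_coverage(placement: Placement, post_nast_bases: int,
                 min_fraction: float = 0.9) -> bool:
    """Flag a placement whose compared-base count falls well below the
    post-NAST count (column "H"). The 0.9 cutoff is an assumption."""
    return placement.dnaml_columns < min_fraction * post_nast_bases
```

A placement flagged by low_coverage is one where the count in Column 4 is much smaller than the post-NAST value, so its classification deserves less confidence.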