
Classification Excel Report
How to Read Your Classification Spreadsheet:

The spreadsheet shows where each of a user’s sequences fits in the taxonomic hierarchy of each of six classification systems. It also gives the details of how Greengenes decided where each sequence fits in each taxonomy. In short, Greengenes finds the closest matching sequence in each of the curated collections and transfers the taxonomy of that reference sequence to the unknown sequence. The closer the match between a user sequence and its reference, the more trustworthy the taxonomic placement.
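The nearest-neighbor idea above can be sketched in a few lines of Python. This is only an illustration, not the Greengenes implementation: the similarity measure used here (the percentage of distinct 7-mers shared, in the spirit of the SimRank id column described below) is an assumption, and the function names, reference ids, sequences, and taxonomy strings are all hypothetical.

```python
def kmers(seq, k=7):
    """Return the set of distinct k-mers in a sequence (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def percent_shared_kmers(query, reference, k=7):
    """Percent of the query's distinct k-mers that also occur in the reference.

    Assumption: this is a toy stand-in for a real similarity score.
    """
    query_kmers = kmers(query, k)
    if not query_kmers:
        return 0.0  # query shorter than k: nothing to compare
    shared = query_kmers & kmers(reference, k)
    return 100.0 * len(shared) / len(query_kmers)


def transfer_taxonomy(query, references):
    """Pick the most similar reference and copy its taxonomy to the query.

    `references` maps a reference id to a (sequence, taxonomy) pair.
    Returns (reference id, similarity percent, taxonomy).
    """
    best_id = max(
        references,
        key=lambda rid: percent_shared_kmers(query, references[rid][0]),
    )
    sequence, taxonomy = references[best_id]
    return best_id, percent_shared_kmers(query, sequence), taxonomy


# Hypothetical toy data, far shorter than real 16S sequences.
references = {
    "ref_A": ("ACGTACGTAGCTAGCTAGGATCC", "Bacteria; Proteobacteria"),
    "ref_B": ("TTTTGGGGCCCCAAAATTTTGGGG", "Bacteria; Firmicutes"),
}
query = "ACGTACGTAGCTAGCTAGGATCC"
best_id, similarity, taxonomy = transfer_taxonomy(query, references)
```

The lower the similarity to the best reference, the less weight the transferred taxonomy deserves, which is why the report lists match statistics next to each classification.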

Classification Spreadsheet

The first column in the spreadsheet identifies the user sequence. The remaining thirty columns of the report are grouped by taxonomy: each user sequence has five columns of data per taxonomy, and the same five columns repeat for each of the six taxonomies. A description of each of the five columns follows:

Column 1 - prokMSA_id – the Greengenes numerical identifier of the reference sequence: the nearest neighbor in Greengenes that has been classified and is not itself a chimera.

Column 2 - SimRank id – the percentage of 7-mers (subsequences seven bases long) shared between the cloned sequence and the reference.

Column 3 - DNAML id – (here, ML is "maximum likelihood") the identity between the user sequence and the reference, considering only conserved bases. The closer the value is to 1, the better the match.

Column 4 - DNAML columns – a count of how many bases were compared between the cloned sequence and the reference. This value should closely track the number reported in column "H" of the post-NAST report. If the number here is much lower than the value in column "H", do not place much confidence in this particular taxonomic placement.

Column 5 - Classification of the reference organism – the taxonomic assignment given to the sequence you submitted. Some taxonomies follow the standard "Kingdom, Phylum, Class, Order, Family, Genus, Species" ranks. Others may use subphyla or subfamilies, or may stop at the Family or Genus level. You will most likely not receive a typical genus and species classification; more likely the sequence will be labeled as an OTU (operational taxonomic unit).
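For readers who export the spreadsheet and want to process it programmatically, the layout described above (one identifier column followed by six groups of five columns) can be unpacked as in the following sketch. The field names, the example row, and the 0.8 threshold are assumptions made for illustration; the report itself only says that a DNAML columns count "much lower" than column "H" of the post-NAST report should reduce confidence.

```python
# Short labels for the five repeating columns (names chosen for this sketch).
FIELDS = ("prokMSA_id", "simrank_id", "dnaml_id", "dnaml_columns", "classification")
N_TAXONOMIES = 6


def split_row(row):
    """Split one 31-cell row into the sequence id and six 5-field records."""
    expected = 1 + len(FIELDS) * N_TAXONOMIES
    if len(row) != expected:
        raise ValueError(f"expected {expected} cells, got {len(row)}")
    sequence_id, rest = row[0], row[1:]
    records = []
    for i in range(N_TAXONOMIES):
        chunk = rest[i * len(FIELDS):(i + 1) * len(FIELDS)]
        records.append(dict(zip(FIELDS, chunk)))
    return sequence_id, records


def low_confidence(record, post_nast_bases, min_fraction=0.8):
    """Flag a placement whose DNAML columns count is well below the post-NAST count.

    `min_fraction` is a hypothetical cutoff, not a value given by Greengenes.
    """
    return int(record["dnaml_columns"]) < min_fraction * post_nast_bases


# Hypothetical row: the same made-up values repeated for all six taxonomies.
row = ["clone_01"] + ["12345", "95.2", "0.98", "1250", "Bacteria; Proteobacteria"] * 6
sequence_id, records = split_row(row)
flagged = [low_confidence(record, post_nast_bases=2000) for record in records]
```

Cells are kept as strings, as they would arrive from a CSV export; convert the numeric fields only where a comparison needs them.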
