Lab Overview

High Throughput Bacterial DNA Analysis

    Often, it is desirable to classify the microbial community within a given soil, water, aerosol or clinical sample.  A researcher may be conducting a longitudinal study, for example, to determine the effects of an environmental toxin on that microbial community.  A clinician may be interested in seeing which bacteria are present in a patient's throat, lungs or GI tract.  While the majority of microbes cannot be cultured in the laboratory and classified with older, more traditional methods, their DNA can be extracted readily from the environmental sample.  From this DNA, the composition of the entire microbial community can be discerned.  The work currently being performed in Gary Andersen’s lab revolves around the isolation and identification of the 16S rRNA gene.  The reason for isolating the 16S rRNA gene is twofold.

    First, the 16S rRNA gene, which codes for part of the ribosome, is distinct from its eukaryotic equivalent, the 18S rRNA gene, so it quickly and clearly separates bacterial and archaeal DNA from eukaryotic DNA.  When only the 16S rRNA gene is amplified with specific primers, the eukaryotic components fade into the background and the researcher can easily clone only the prokaryotic sequences.  Second, the 16S rRNA gene sequence can be species specific.  This means that the microbiologist or clinical researcher can not only distinguish bacteria from plants, animals, fungi and protists, but can also determine the diversity within the bacterial community itself.

    Within a single gram of soil there can be upwards of 10^9 organisms representing 10^3 to 10^6 distinct taxa.  The lab’s purpose is to create and support technologies that perform such high-throughput analysis with the 16S rRNA gene, classifying large numbers of sequences from a microbial environment.  One major contribution the lab has made toward this goal is Greengenes.  Greengenes is a web-based series of steps that allows the user to input a set of prokaryotic DNA sequences and receive an output file identifying each of the sequences.  The user can work through the various steps to trim out low-quality data, align the sequences to account for mutations and other differences, and have their sequences compared to near neighbors or type strains (landmark species).  Another useful feature of Greengenes is that it can remove any chimeric DNA created during the PCR process, thereby increasing the accuracy and usefulness of the sequenced samples.  (Chimeras, an artifact of the PCR process, are DNA sequences made from two or more parent sequences.)  On the output side, the user can receive an evaluation of their samples based upon six internationally recognized taxonomies.  The user also has the option to receive output in a form ready for input into a tree-building program, to show how close the DNA sequences of given samples are.  This is one area where near-neighbor analysis can be useful.
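To make the idea of a chimera more concrete, the following Python sketch flags a sequence whose left and right halves are most similar to different reference sequences.  It is a toy illustration only, not the method Greengenes uses: the function names, the fixed midpoint split, and the requirement that all sequences are already aligned to the same length are assumptions made for this example.

```python
def identity(a, b):
    """Fraction of positions that match between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def best_match(fragment, references, start, end):
    """Name of the reference whose slice [start:end] is most similar to the fragment."""
    return max(references, key=lambda name: identity(fragment, references[name][start:end]))


def looks_chimeric(query, references):
    """Compare each half of the query against the references (hypothetical helper).

    Returns (flag, left_parent, right_parent).  The flag is True when the two
    halves have different best-matching references, which is what a sequence
    built from two parent sequences would tend to show.
    """
    mid = len(query) // 2
    left_parent = best_match(query[:mid], references, 0, mid)
    right_parent = best_match(query[mid:], references, mid, len(query))
    return left_parent != right_parent, left_parent, right_parent


# Made-up reference sequences, for illustration only.
references = {
    "ref_A": "AAAAAAAAAACCCCCCCCCC",
    "ref_B": "GGGGGGGGGGTTTTTTTTTT",
}

# Left half copied from ref_A, right half copied from ref_B.
chimera = "AAAAAAAAAATTTTTTTTTT"
# ref_A with a single point mutation at the last position.
mutant = "AAAAAAAAAACCCCCCCCCG"

print(looks_chimeric(chimera, references))
print(looks_chimeric(mutant, references))
```

Real chimera checks have to handle unknown breakpoints, closely related parents, and sequencing noise, so a split at a fixed midpoint is only a starting point for understanding the concept.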

Another major effort in the lab to assist in high-throughput analysis is developing microarray technology as a means of analyzing DNA and identifying bacteria quickly and accurately.  The usefulness of microarrays is twofold.  First, the microarray allows researchers to process a greatly increased number of biological samples, on the order of hundreds.  Standard technology requires the researcher to extract, isolate, and clone each sequence of interest before it can be sequenced.  Sequencing is a serial process that can be time-consuming for most labs, often taking 5 to 10 days.  Once the data come back from the sequencing lab, they still need to be analyzed and compared to known taxonomic groups to identify each sequence individually.  The microarray allows samples to be identified directly within the lab.

The second, related benefit of the microarray is that it allows researchers to get a more complete record of the biological identity and diversity of the sample being analyzed.  Rather than the researcher randomly choosing gene clones to be sequenced (often capturing only those sequences occurring with the greatest frequency within a sample), the microarray can assay an entire environmental sample.  For a given environmental or medical sample, at least 40,000 16S rRNA genes would have to be sequenced in order to get a reproducible record of the biological diversity contained within that sample.  Often, many of the underrepresented bacteria within a sample are overlooked in conventional sequencing.  Microarrays have the capacity to eliminate this oversight by simultaneously assaying all (~10^12) genes recovered from a sample.
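A short calculation shows why underrepresented bacteria are easy to miss when only a limited number of clones is sequenced.  The Python sketch below assumes a deliberately simple model in which each sequenced clone is an independent random draw from the sample, with no PCR or cloning bias.  The function names and the example abundance are hypothetical, and the model is not the basis for the 40,000 figure quoted above.

```python
import math


def detection_probability(relative_abundance, n_sequenced):
    """Chance that a taxon appears at least once among n randomly drawn clones.

    Assumes independent draws: P = 1 - (1 - p)^n.
    """
    return 1.0 - (1.0 - relative_abundance) ** n_sequenced


def clones_needed(relative_abundance, target_probability=0.95):
    """Smallest number of clones giving at least the target detection probability."""
    return math.ceil(
        math.log(1.0 - target_probability) / math.log(1.0 - relative_abundance)
    )


# Hypothetical taxon making up 0.1% of the 16S rRNA genes in a sample.
rare = 0.001

for n in (100, 1000, 10000):
    print(n, round(detection_probability(rare, n), 3))

print("clones for 95% detection:", clones_needed(rare))
```

Under these assumptions, a library of about one hundred clones will usually miss a taxon at 0.1% abundance altogether, and detecting it reliably takes thousands of clones.  Recording its abundance reproducibly requires more still.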
