7mer - a term derived from the word oligomer. You're probably familiar with the term polymer, a molecule composed of many repeating subunits. "Poly" means many. "Oligo" means few. Here, an oligomer refers to a short nucleotide sequence made up of only a few bases. A 7mer is a stretch of seven consecutive bases, such as ACTTGAT. You may see 10mer, 15mer, etc. The number as a prefix simply gives the length of the sequence in bases.
ARB – ARB is a UNIX-based software package that can build a local database of nucleic acid sequences, analyze those sequences, and construct trees from them.
BLAST – The Basic Local Alignment Search Tool is an algorithm that finds the best local alignment between two sequences and then compares them for similarities and differences.
Conserved – The conserved regions of a gene are those parts that, over time, have accumulated very few mutations/changes in the nucleotide sequence.
Core Set – The Core Set is a group of about 10,000 sequences within the complete Greengenes database. It is a set of sequences representative of most of the major prokaryotic taxa.
eukaryote - an organism whose DNA is enclosed within a nuclear membrane and whose cytoplasm contains organelles.
FASTA – A FASTA file is a text file in which each sequence is written as a unique identifier line followed by the actual nucleotide bases of that sequence. It is the file format either returned from the sequencing lab or created by calling the bases from a chromatogram.
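The layout described above can be illustrated with a minimal Python sketch. It assumes the common convention that each identifier line begins with ">" and that the bases may be wrapped over several lines; the function name `parse_fasta` and the example records are hypothetical, not part of Greengenes.

```python
def parse_fasta(text):
    """Return a list of (identifier, sequence) pairs from FASTA-formatted text."""
    records = []
    header = None   # identifier of the record currently being read
    chunks = []     # sequence lines collected for that record
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new identifier line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical two-record example, not real data.
example = """>seq1 hypothetical example record
ACTTGATC
GGCATT
>seq2
TTGACA
"""
records = parse_fasta(example)
```

Each record comes back as an identifier paired with its bases joined into a single string, so wrapped sequence lines are handled transparently.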
GenBank – GenBank is the National Institutes of Health's (NIH) nucleotide sequence database. It is a repository of all publicly available DNA sequences.
Hypervariable – The hypervariable regions of a gene are those portions that, over time, have been more tolerant of mutations, and have therefore accumulated more changes within the nucleotide sequence.
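One rough way to see the difference between conserved and hypervariable regions is to count how many distinct bases occur at each column of a set of aligned sequences. The sketch below is only an illustration: the function name `column_variability` and the toy alignment are hypothetical, and real analyses typically use more refined measures.

```python
def column_variability(aligned):
    """For each alignment column, count the distinct bases observed (gaps ignored)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    counts = []
    for i in range(length):
        # Collect the set of bases seen in column i, skipping gap characters.
        bases = {seq[i] for seq in aligned if seq[i] != "-"}
        counts.append(len(bases))
    return counts


# Toy alignment (made-up data): only the fourth column varies.
alignment = ["ACGTAC", "ACGAAC", "ACGCAC", "ACGGAC"]
variability = column_variability(alignment)
```

Columns with a count of 1 behave like conserved positions in this toy example, while a column where all four bases appear behaves like a hypervariable one.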
JGI – The Joint Genome Institute is a Department of Energy (DOE) lab, funded through the Office of Biological and Environmental Research in DOE's Office of Science. It is a collaborative effort to sequence genomes, both for public and DOE projects.
NAST – Nearest Alignment Space Termination is the Greengenes algorithm that matches submitted sequences against the Greengenes database to look for similarities, then aligns the submitted sequences based on those similarities.
NCBI – The National Center for Biotechnology Information is part of the National Library of Medicine at the National Institutes of Health. It maintains databases (one of which is GenBank), develops software, and conducts bioinformatics research.
OTU - An Operational Taxonomic Unit is just that: a defined level which taxonomists use to discuss or compare organisms. In the context of the taxonomies used by Greengenes, an OTU refers to the terminal level at which that taxonomy classifies the sequences. While it might be all the way down to the specific strain for one taxonomy, it might only be to sub-order for another.
phenotype - the observable, physical characteristics of an organism. When talking about microorganisms, it harkens back to classifying them using characteristics such as Gram-positive or Gram-negative staining, shape of the cell (bacillus or coccus), etc.
prokaryote - an organism lacking a defined nucleus and other organelles.
prokMSA – an old name for Greengenes.
putative chimera – these are sequences that Greengenes/Bellerophon deems most likely to be chimeras because it has found strong similarities between portions of the sequence and other submitted sequences.
Simrank – Simrank is an algorithm which uses oligomers in common to compare two nucleic acid sequences.
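The idea of comparing sequences by their shared oligomers can be sketched in a few lines of Python. This is not the actual Simrank implementation: it assumes the oligomers are all overlapping subsequences of a fixed length k (k = 7 is an assumption here), and that the score is the fraction of the query's unique oligomers also found in the target. The names `kmers` and `shared_kmer_fraction` are hypothetical.

```python
def kmers(seq, k=7):
    """Return the set of all overlapping length-k subsequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(query, target, k=7):
    """Fraction of the query's unique k-mers that also occur in the target."""
    query_kmers = kmers(query.upper(), k)
    if not query_kmers:
        return 0.0  # query shorter than k: nothing to compare
    target_kmers = kmers(target.upper(), k)
    return len(query_kmers & target_kmers) / len(query_kmers)


# Made-up sequences: they share their first two 7-base windows only.
score = shared_kmer_fraction("ACGTACGTAC", "ACGTACGTTT")
```

Because only set membership is checked, this kind of comparison needs no alignment, which is what makes oligomer-based scoring useful for quickly ranking candidate matches.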