Removing Chimeras (continued)
Removing Chimeric Sequences From Your Report:
Step 2: Choosing Parameters by Which Greengenes & Bellerophon Will Operate
All of these parameters have defaults, and the beginning user should use them to run a chimera check. The following explanation of the parameters by which Greengenes instructs Bellerophon to operate is meant only to help you understand what Bellerophon will be doing.
A) Your sequences will be compared to one another to look for similarities. Greengenes checks for high similarity between sequences, which may indicate that one of your sequences is chimeric and derived from another submitted sequence, called a parent sequence. "Parent sequences" are just that: the sequences from which part of a chimeric sequence has been derived.
B) Greengenes has a 'Core Set' of sequences (about 10,000 out of the 100,000 in the library) which are thought to be largely representative of most taxonomic groups.
C) If your sequence matches a known sequence in the Core Set by 97% or more, it will be considered non-chimeric and will not be submitted to Bellerophon for checking.
D) Greengenes compares your sequence, base by base, with its Core Set. When Greengenes is able to align 1250 bases within a conserved region of the gene, it assumes that it is not looking at a novel or chimeric sequence. Such sequences are therefore skipped rather than submitted to Bellerophon.
E) Once Greengenes has removed from the chimera check every sequence that aligned for 1250 bases at 97% or more, it compares the remaining sequences to the Core Set and to each other to look for parents. The default "7" here means that, for each window of columns compared, it considers the top seven parental options.
F) If a potential chimera is composed of a parent that only loosely matches the candidate (default: <90%), Greengenes will call it a sub-threshold chimera and will not submit it to Bellerophon for further chimera checking.
G) The "Divergence Ratio Threshold" is a mathematical calculation with which the beginning user need not be concerned.
H) Enter the email address to which you would like your non-chimeric FASTA file sent. You will also be sent a report summarizing the Bellerophon/Greengenes results for your submitted sequences.
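
For readers who find rules easier to follow as code, the default thresholds from items C through F can be sketched roughly as below. This is only an illustration of the decision rules as described in this tutorial, not the actual Greengenes or Bellerophon implementation. All names (`CoreSetHit`, `skip_chimera_check`, `top_parents`, `is_sub_threshold`) are hypothetical, and treating the 1250-base figure as a minimum is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List

# Default values quoted in the tutorial; the constant names are hypothetical.
CORE_IDENTITY_CUTOFF = 0.97    # items C and E: match to the Core Set of 97% or more
MIN_ALIGNED_BASES = 1250       # items D and E: bases aligned in a conserved region
TOP_PARENTS_PER_WINDOW = 7     # item E: parental options considered per window
PARENT_IDENTITY_CUTOFF = 0.90  # item F: parent-to-candidate similarity


@dataclass
class CoreSetHit:
    """Best match of one submitted sequence against the Core Set (assumed shape)."""
    identity: float      # fraction of identical bases, 0.0 to 1.0
    aligned_bases: int   # number of bases aligned


def skip_chimera_check(best_hit: CoreSetHit) -> bool:
    """Item E: a sequence aligned for 1250 bases at 97% or more is not checked."""
    return (best_hit.identity >= CORE_IDENTITY_CUTOFF
            and best_hit.aligned_bases >= MIN_ALIGNED_BASES)


def top_parents(window_scores: Dict[str, float],
                n: int = TOP_PARENTS_PER_WINDOW) -> List[str]:
    """Item E: keep the n best-scoring candidate parents for one window."""
    ranked = sorted(window_scores, key=window_scores.get, reverse=True)
    return ranked[:n]


def is_sub_threshold(parent_identity: float) -> bool:
    """Item F: a parent matching the candidate below 90% is sub-threshold."""
    return parent_identity < PARENT_IDENTITY_CUTOFF
```

Under these assumptions, `skip_chimera_check(CoreSetHit(0.98, 1300))` would mark a sequence as skipped, while a sequence whose best parent has `is_sub_threshold(0.85)` would be reported as a sub-threshold chimera and not passed on to Bellerophon.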
To see an example of a Bellerophon/Greengenes summary report along with complete descriptions of each of its columns, click here.
To proceed with classifying your clean sequences, click here.