Align a batch of sequences against the Greengenes 16S rRNA gene database using NAST
Use this tool to align your 16S rRNA gene sequences with NAST, to find their near-neighbors, or both.
Each query sequence in your uploaded file is searched for 16S rRNA gene
sequences and aligned against a Core Set of alignment templates. You can even upload a whole genome in FASTA format, and NAST will
find, extract, and align the 16S rRNA genes for you.
Additionally, Simrank (an N-mer comparison tool)
is used to find the nearest non-chimeric neighbors and the nearest isolates for each of your
sequences from the entire Greengenes database.
The query sequences can then be
returned by email, either aligned or unaligned, with or without the near-neighbor sequences included in
supplementary files. Please do not upload files containing more than 500 sequences.
If you have a larger project, break it up into several files, or contact tdesantis@lbl.gov to
make alternate arrangements. This version of NAST aligns at a rate of ~10 sequences per minute.
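For larger projects, the splitting step can be done locally before upload. The following is a minimal sketch, not part of the Greengenes tool: the function names (read_fasta, split_batches, to_fasta, estimated_minutes) are hypothetical, and the run-time estimate simply applies the approximate rate of 10 sequences per minute quoted above.

```python
from typing import List, Tuple

MAX_SEQS_PER_FILE = 500   # upload limit stated on this page
SEQS_PER_MINUTE = 10      # approximate alignment rate stated on this page

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (header, sequence) pairs."""
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_batches(records: List[Record],
                  size: int = MAX_SEQS_PER_FILE) -> List[List[Record]]:
    """Split records into consecutive batches of at most `size` sequences."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def to_fasta(records: List[Record]) -> str:
    """Serialize records back to FASTA text, one sequence per line."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


def estimated_minutes(n_sequences: int) -> float:
    """Rough run time, assuming the ~10 sequences per minute quoted above."""
    return n_sequences / SEQS_PER_MINUTE
```

Each batch returned by split_batches can be written to its own file with to_fasta and uploaded as a separate job. At the quoted rate, a full 500-sequence file would take roughly 50 minutes.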
Server status:
0 NAST alignment job(s) currently underway.
Notes on contents of returned table:
A table will accompany your NAST-aligned sequences if selected above. The columns of this table are as follows:
- candidate sequence ID: The name of your sequence.
- candidate nucleotide count: The number of bases in your sequence.
- errors: A description of an error encountered when NASTing a particular sequence.
- template ID: The prokMSA_id (a.k.a. gg_id) of the sequence used as the alignment template.
- BLAST percent identity to template: Percentage calculated along the HSP (high-scoring segment pair) only.
- longest insertion relative to template: Largest local misalignment produced in order to preserve the global alignment.
- candidate span aligned: If only a sub-section of the candidate sequence could be aligned to the template, the span positions are shown here.
- candidate nucleotide count post-NAST: Should always be 7682; producing a fixed-length alignment is the point of NAST.
- unaligned length: Count of bases that made it into the aligned sequence.
- count of single nucleotide 7mers or longer Nmers: A helpful alert to unusual homopolymer runs.
- non-ACGT nucleotide count
- non-ACGT nucleotide percent
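
Several of the columns above can be recomputed from a returned aligned sequence as a sanity check. The sketch below is an illustration under stated assumptions, not the code NAST itself uses: the function name sequence_stats and the dictionary keys are hypothetical, gap positions are assumed to be written as '-' or '.', and homopolymers are assumed to be counted as separate runs of 7 or more identical A, C, G, or T bases after gaps are removed.

```python
import re

ALIGNED_LENGTH = 7682  # expected post-NAST length stated in the column notes

# A run of one nucleotide repeated 7 or more times (assumed definition).
HOMOPOLYMER = re.compile(r"([ACGT])\1{6,}")


def sequence_stats(aligned: str) -> dict:
    """Recompute a few table columns from one aligned sequence."""
    # Keep letters only; assumes gaps are written as '-' or '.'.
    bases = "".join(c for c in aligned.upper() if c.isalpha())
    non_acgt = sum(1 for c in bases if c not in "ACGT")
    percent = 100.0 * non_acgt / len(bases) if bases else 0.0
    return {
        "post_nast_length": len(aligned),
        "has_expected_length": len(aligned) == ALIGNED_LENGTH,
        "unaligned_length": len(bases),
        "homopolymer_runs": sum(1 for _ in HOMOPOLYMER.finditer(bases)),
        "non_acgt_count": non_acgt,
        "non_acgt_percent": percent,
    }
```

Calling sequence_stats on each returned sequence and comparing has_expected_length against the table gives a quick check that the alignment file was not truncated in transit.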