Align a batch of sequences against the 16S Greengenes rRNA gene database using NAST
Use this tool to align your set of 16S rRNA gene sequences with NAST, to find near-neighbors, or both. Each query sequence in your uploaded file will be searched for 16S rRNA gene sequences and aligned according to a Core Set of alignment templates. You can even upload a whole genome in FASTA format, and NAST will find, extract, and align the 16S rRNA genes for you. Additionally, Simrank (an N-mer comparison tool) will be used to find the nearest neighbors (non-chimeric) as well as the nearest isolates for each of your sequences from the entire Greengenes database. The query sequences can then be returned by email either aligned or unaligned, with or without the near-neighbor sequences included in supplementary files. Please refrain from uploading files containing more than 500 sequences. If you have a larger project, try breaking it up into several files, or contact tdesantis@lbl.gov to make alternate arrangements. This version of NAST aligns at a rate of ~10 sequences per minute.
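If your project exceeds the 500-sequence limit, the file can be split locally before upload. The following is a minimal Python sketch, not part of Greengenes; the function names are hypothetical, and it assumes a standard FASTA file in which every record starts with a ">" header line. It also gives a rough run-time estimate from the ~10 sequences per minute rate quoted above.

```python
from typing import List

MAX_PER_FILE = 500        # upload limit stated on this page
RATE_PER_MINUTE = 10.0    # approximate NAST rate stated on this page


def read_fasta_records(text: str) -> List[str]:
    """Return each FASTA record (header plus sequence lines) as one string."""
    records: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.startswith(">"):
            if current:
                records.append("\n".join(current) + "\n")
            current = [line]
        elif current and line.strip():
            # Sequence line belonging to the record opened by the last header.
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")
    return records


def split_records(records: List[str], chunk_size: int = MAX_PER_FILE) -> List[List[str]]:
    """Group records into consecutive chunks of at most chunk_size records."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def estimated_minutes(n_sequences: int, rate: float = RATE_PER_MINUTE) -> float:
    """Rough alignment time in minutes, ignoring queueing on the server."""
    return n_sequences / rate
```

Each chunk can then be written to its own file (each record already ends with a newline) and uploaded as a separate job.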
Server status:
1 NAST alignment job(s) currently underway.
Fasta file to upload:

Please avoid any odd characters (parentheses, for example) in your file name. Be sure there is a return (newline) after the final nucleotide in your file.
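Both requirements can be checked before uploading. The sketch below is an illustration only: the set of "safe" file-name characters is an assumption (this page names parentheses as one example of an odd character but gives no complete list), and the function names are hypothetical.

```python
import re

# Assumption: letters, digits, dot, underscore, and hyphen are treated as safe.
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_safe_filename(name: str) -> bool:
    """True if the whole file name consists only of the assumed safe characters."""
    return SAFE_NAME.fullmatch(name) is not None


def ensure_trailing_newline(text: str) -> str:
    """Append a newline after the final nucleotide if the text lacks one."""
    return text if text.endswith("\n") else text + "\n"
```

For example, `is_safe_filename("my(seqs).fasta")` is false because of the parentheses, while `is_safe_filename("xyz.fasta")` is true.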

Batch size for NAST:
Tells NAST how many records to align at once. Bigger batches (max. 100) help the whole job to complete sooner but smaller batches (min. 1) give more frequent feedback to the screen while the job is running. This parameter has no effect on alignment quality.
Significant match requirements:
  • Minimum length:
    Uploaded sequences that do not align to a "template" sequence over at least this many bases will not be included in the output.
  • Minimum percent identity:
    Uploaded sequences that do not share at least this similarity to a "template" sequence will not be included in the output.
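The two requirements act together as a filter: a sequence is kept only if it meets both. The Python sketch below illustrates that logic only. The `TemplateHit` record and its field names are hypothetical, the thresholds in the usage note are example values rather than the tool's defaults, and how NAST itself computes aligned length and identity is not described on this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TemplateHit:
    """Hypothetical summary of one query's best match to a template."""
    query_id: str
    aligned_length: int       # bases aligned to the template
    percent_identity: float   # similarity to the template, 0-100


def passes(hit: TemplateHit, min_length: int, min_identity: float) -> bool:
    """A hit is significant only if it meets BOTH requirements."""
    return hit.aligned_length >= min_length and hit.percent_identity >= min_identity


def significant_hits(hits: List[TemplateHit], min_length: int,
                     min_identity: float) -> List[TemplateHit]:
    """Keep the hits that would be included in the output."""
    return [h for h in hits if passes(h, min_length, min_identity)]
```

With example thresholds of 400 bases and 75 percent identity, a query that aligns over 1,300 bases at 70 percent identity would be excluded, as would one that aligns over 300 bases at 99 percent.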
Files you desire:
  • Tab-delimited text file summarizing the alignment fate of each sequence. It is given an "xls" extension ("xyz_NAST.xls", for example) so that it opens conveniently in spreadsheet applications, but it is in plain text format.
  • Sequence file of uploaded sequences successfully aligned; "xyz_NAST.fasta", for example.
  • Sequence file of uploaded sequences that could not be aligned according to user requirements; "xyz_NASTnot.fasta", for example.
  • Sequence file containing nearest-neighbor non-chimeric sequences for each sequence in upload (redundant neighbors removed). File will be named "xyz_nn_NAST.fasta", for example.
  • Sequence file containing near-neighbor non-chimeric sequences from named isolates (redundant neighbors removed). File will be named "xyz_nni_NAST.fasta", for example.
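Because the summary file is plain tab-delimited text despite its "xls" extension, it can also be read with a script rather than a spreadsheet. A minimal Python sketch follows; the function name is hypothetical, and since the column layout is not documented on this page, the rows are returned as plain lists without assuming any column names.

```python
import csv
from typing import List


def read_summary(path: str) -> List[List[str]]:
    """Read a tab-delimited summary file into a list of rows."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [row for row in csv.reader(handle, delimiter="\t")]
```

Calling `read_summary("xyz_NAST.xls")` would return one list of fields per line of the file.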
Formatting options:
  • Remove common alignment gap characters (returned sequences will contain an equal number of characters)
  • Remove all alignment gap characters (returned sequences will be unequal in length)
  • Do not remove alignment gap characters (returned sequences will be 7,682 characters)
is my preferred file format.
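The difference between the first two gap options can be illustrated with a short Python sketch. It is not the Greengenes implementation: the function names are hypothetical, and it assumes that "-" and "." are the gap characters. A "common" gap is taken to mean an alignment column that holds a gap in every returned sequence.

```python
from typing import Dict

GAP_CHARS = frozenset("-.")  # assumption: both characters denote alignment gaps


def remove_all_gaps(alignment: Dict[str, str]) -> Dict[str, str]:
    """Strip every gap character; sequences may end up with unequal lengths."""
    return {name: "".join(c for c in seq if c not in GAP_CHARS)
            for name, seq in alignment.items()}


def remove_common_gaps(alignment: Dict[str, str]) -> Dict[str, str]:
    """Drop only the columns that are gaps in every sequence; lengths stay equal."""
    if not alignment:
        return {}
    rows = list(alignment.values())
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("aligned sequences must all have the same length")
    # Keep a column if at least one sequence has a non-gap character there.
    keep = [i for i in range(width) if any(row[i] not in GAP_CHARS for row in rows)]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}
```

For the two aligned rows "A--C-G" and "AT-C--", removing common gaps gives "A-CG" and "ATC-" (still aligned, four columns each), while removing all gaps gives "ACG" and "ATC".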
Delivery:
The files requested above will be compressed into one email attachment. Greengenes uses the tgz format, which in our tests could be opened on Mac OS X, Windows XP, and UNIX-like platforms without special software.
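A tgz file is a gzip-compressed tar archive, so it can also be inspected from a script. The sketch below uses Python's standard `tarfile` module; the function names and the archive file name in the usage note are hypothetical.

```python
import tarfile
from typing import List


def list_archive(path: str) -> List[str]:
    """Return the names of the files stored in a .tgz archive."""
    with tarfile.open(path, "r:gz") as archive:
        return archive.getnames()


def read_member_text(path: str, member: str) -> str:
    """Read one file from the archive as text without unpacking the rest."""
    with tarfile.open(path, "r:gz") as archive:
        handle = archive.extractfile(member)
        if handle is None:
            raise ValueError(f"{member} is not a regular file in the archive")
        return handle.read().decode("utf-8", errors="replace")
```

For instance, `list_archive("results.tgz")` would show which of the requested files were delivered, and `read_member_text` can then pull out a single FASTA file by name.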
Email address to send results (required):
 
  • Last Database Update: June 26, 2009 5:39PM
  • 397006 aligned 16S rDNA records >1250nt