Contig Finder: Identify Contigs with Your Genes


Step 1: Upload Your Genome File

Drag & Drop Your Genome File Here in FASTA/FASTQ Format
(Max File Size is 250MB)

Step 2: Provide Query Sequences in FASTA Format

Advanced Settings for Gene Matches:
  • BLAST E-value Threshold: [0.0001-10]
  • BLAST Alignment Type: Gapped | Ungapped
  • Minimum Query Coverage Cutoff: [1-100] % 


    Contig Finder Genome File Requirements

    Gene File Requirements

    Minimum Query Coverage Cutoff

    Only query sequences whose percentage of base pairs or residues falling within significant BLAST hits is at or above this value will appear in your results. Working from the full-length gene query sequence, we tile the hits, or 'HSPs' (High-scoring Segment Pairs), that map to the same strand/direction as the HSP with the highest bitscore. A simplified example is shown below:

    Original Query Gene: 1 ACCACCTTGAACAATCC 17
    Genome Contig Sequence: 1 AACACCTCTCTCTTAAACTTT 21

    BLAST HIT 1:
    Query 1 ACCACCT 7
            | |||||
    Sbjct 1 AACACCT 7

    BLAST HIT 2:
    Query 6  CTTGAACAAT 15
             ||| |||  |
    Sbjct 12 CTTAAACTTT 21


    Now we map the significant hits back to the original query:

    Original: ACCACCTTGAACAATCC
        Hit1: A-CACCT
        Hit2:      CTT-AAC--T
    Combined: ACCACCTTGAACAAT--
    Coverage: 15/17 (88.24%)


    Note that mismatched or gapped positions within BLAST hits (shown as '-' above) are ignored when calculating the final coverage score: every query position spanned by a hit counts as covered. If the 'Minimum Query Coverage Cutoff' were set to 88%, this gene would map; if it were set to 89%, it would not. This feature helps keep queries that map to a genome with only a small fragment from cluttering up your results. Setting the value to '1' will show any query with at least one significant hit in your results.
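
The coverage calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Contig Finder's actual code: the `Hsp` record, its field names, and the `query_coverage` function are hypothetical, query coordinates are taken to be 1-based and inclusive as in the example, and the bitscores are invented purely so that one hit ranks highest.

```python
from typing import List, NamedTuple


class Hsp(NamedTuple):
    """One significant BLAST hit (hypothetical record, not the tool's own type)."""
    query_start: int  # 1-based, inclusive
    query_end: int    # 1-based, inclusive
    strand: str       # '+' or '-'
    bitscore: float


def query_coverage(hsps: List[Hsp], query_length: int) -> float:
    """Percent of query positions spanned by HSPs on the best hit's strand."""
    if not hsps or query_length <= 0:
        return 0.0
    # The HSP with the highest bitscore decides which strand is tiled.
    best = max(hsps, key=lambda h: h.bitscore)
    covered = [False] * query_length
    for h in hsps:
        if h.strand != best.strand:
            continue  # hits on the other strand are not tiled
        lo, hi = sorted((h.query_start, h.query_end))
        # Every position spanned by the hit counts, including mismatches.
        for pos in range(max(lo, 1), min(hi, query_length) + 1):
            covered[pos - 1] = True
    return 100.0 * sum(covered) / query_length


# The two hits from the worked example; bitscores are made up.
hits = [Hsp(1, 7, '+', 9.0), Hsp(6, 15, '+', 8.0)]
coverage = query_coverage(hits, 17)
print(f"{coverage:.2f}%")  # prints 88.24%
```

Marking positions in a boolean list means the overlap between the two hits (query positions 6 and 7) is counted only once, which is why the result is 15/17 rather than 17/17.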

    Circular Genome Mode

    This mode can be useful when dealing with circular genomes, such as those from bacteria or mitochondria. In this mode, Genome1 acts as the reference for the order in which genes appear. All other genomes are then rotated to maximize the number of genes that match this order. The mode requires each genome to consist of a single circular contig and will be disabled if you upload a genome with more than one contig.