SBSA Tool now supports 19307 SARS genomes (Coronavirus)


Upload Variant File  (Example)
The Variant.csv file is limited to 500,000 lines. Please ensure that the chromosome names in the variant file match the chromosome names in the chromosome list.
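The chromosome-name check described above can be run locally before uploading. The following is a minimal sketch only: the variant file layout is not specified on this page, so the `chrom` column name, the example rows, and the helper `unmatched_chromosomes` are all hypothetical.

```python
import csv
import io


def unmatched_chromosomes(variant_csv_text, chromosome_list, chrom_column="chrom"):
    """Return variant-file chromosome names that are absent from the chromosome list.

    Assumption: the variant CSV has a header row with a chromosome column
    named `chrom_column`; adjust it to the actual layout of your file.
    """
    known = set(chromosome_list)
    reader = csv.DictReader(io.StringIO(variant_csv_text))
    seen = {row[chrom_column] for row in reader}
    return sorted(seen - known)


# Hypothetical example: one row uses "1" while the chromosome list uses "chr1".
example_csv = "chrom,pos,ref,alt\nchr1,100,A,G\n1,200,C,T\n"
print(unmatched_chromosomes(example_csv, ["chr1", "chr2"]))
```

Any name the function returns would need to be renamed in the variant file to match the chromosome list.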

 

Preprocessed Inputs  Select a Genome  (genome and chromosome list)
Please ensure that the chromosome names in the variant file match the chromosome names in the chromosome list.
  • ICGC: Download
  • TCGA: Download
  • GTEx: Download
  • RNA Editing: Download
  •  Upload Genome File
    A new species can be supported by uploading its genome FASTA file. Because of the upload size limit, each FASTA upload is limited to 100MB (gz format) and to a single chromosome. Species with multiple chromosomes therefore require multiple uploads.
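For a genome stored as one multi-record FASTA file, the per-chromosome uploads could be prepared along these lines. This is a sketch under assumptions: the helper names are hypothetical, the record name is taken as the first token after `>`, and the size check only approximates whatever limit the server actually enforces.

```python
import gzip


def split_fasta_by_chromosome(fasta_text):
    """Split multi-record FASTA text into one FASTA string per record.

    Assumption: every header line has a non-empty name, taken as the
    first whitespace-delimited token after '>'.
    """
    records = {}
    name = None
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = [line]
        elif name is not None and line.strip():
            records[name].append(line)
    return {n: "\n".join(lines) + "\n" for n, lines in records.items()}


def gzip_size_bytes(text):
    """Size of the text after gzip compression, for comparison with a size limit."""
    return len(gzip.compress(text.encode("ascii")))


# Tiny made-up genome with two chromosomes.
genome = ">chr1 example\nACGTACGT\nACGT\n>chr2\nGGGTTT\n"
parts = split_fasta_by_chromosome(genome)
for chrom, fasta in parts.items():
    print(chrom, gzip_size_bytes(fasta))
```

Each value in `parts` can then be written to its own `.gz` file and uploaded separately.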

     Input your own binding motif/sequence  Check against existing database 
    Function Input file type Input target file Preprocessed Motif and Interval
    Motif sequence  Single Binding Sequence 
    A single DNA nucleotide sequence can be entered in the text box on the right. Mutations from step 1 are checked against the reference FASTA to determine whether this sequence is produced by the mutations.
    Enter a single binding sequence
    (max length = 25)

    OR
    The file size is limited to 20 MB.
     RNA Binding Protein motifs
     Transcription Factor motifs
     miRNA seed (miRBase)
     miRNA-mRNA 3'UTR (starBase) 
     Multiple Binding Sequences (FASTA) 
    As with a single sequence, multiple sequences can be supplied by uploading a FASTA format file using the Upload File button on the right. Mutations from step 1 are checked against the reference FASTA to determine whether any of the sequences in the uploaded FASTA file is produced by the mutations.
     Binding Motif (MEME format) 
    Upload a motif file in MEME format using the Upload File button on the right. Mutations from step 1 are checked against the reference FASTA to determine whether any of the sequences in the uploaded MEME motif file is produced by the mutations.
    Example 
    Genomic interval  Genomic interval file 
    A genomic interval file can be uploaded using the Upload File button on the right. It is similar to BED format but has five columns: chromosome, start, end, strand, and ID. Mutations from step 1 are checked against the reference FASTA to determine whether any of the sequences between Start and End is produced by the mutations. For example, a miRNA seed region file can be used as input to determine whether any miRNA seed sequence has been produced by mutations.
    Example
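The five-column interval layout described above can be validated before upload with a small parser. This is only a sketch: the delimiter, the absence of a header row, and the example rows are assumptions, and `parse_interval_file` is a hypothetical helper rather than part of SBSA.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chromosome: str
    start: int
    end: int
    strand: str
    interval_id: str


def parse_interval_file(text):
    """Parse five-column interval lines: chromosome, start, end, strand, ID.

    Assumptions: whitespace-delimited columns, no header row, and blank
    lines or lines starting with '#' are skipped.
    """
    intervals = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 5:
            raise ValueError(f"line {line_number}: expected 5 columns, got {len(fields)}")
        chromosome, start, end, strand, interval_id = fields
        intervals.append(Interval(chromosome, int(start), int(end), strand, interval_id))
    return intervals


# Made-up rows for illustration only.
example_intervals = "chr1\t100\t107\t+\tseed_1\nchr2\t500\t507\t-\tseed_2\n"
for interval in parse_interval_file(example_intervals):
    print(interval)
```

A line with a missing column raises `ValueError` with its line number, which makes malformed rows easy to locate.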
    All somatic motifs  Identify all somatic motifs 
    Identify all possible somatic motifs based on the input somatic mutations. The sequence length can be adjusted with the slider below.

      5
    Flanking length on either side of the mutation. For a single-nucleotide mutation, choosing 5 generates a motif or sequence of length 11 (5 flanking bases on each side plus the mutated base).
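The window arithmetic behind the flanking length can be sketched as follows. The helper names, the 0-based coordinates, and the toy reference sequence are assumptions made for illustration; they do not describe how SBSA itself stores positions.

```python
def flanking_window(reference, position, flank):
    """Return the reference window around a 0-based position and the
    offset of that position inside the window.

    The window spans `flank` bases on each side plus the mutated base,
    so its length is 2 * flank + 1 unless it is clipped at a sequence end.
    """
    start = max(0, position - flank)
    end = min(len(reference), position + flank + 1)
    return reference[start:end], position - start


def apply_snv(window, offset, alt):
    """Substitute a single base at `offset` in the window."""
    return window[:offset] + alt + window[offset + 1:]


reference = "ACGTACGTACGTACGTACGT"
ref_window, offset = flanking_window(reference, position=10, flank=5)
mut_window = apply_snv(ref_window, offset, "T")
print(len(ref_window), ref_window, mut_window)
```

With a flanking length of 5 the window holds 11 bases, matching the length stated above.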


    Annotation Algorithm
     
    FIMO (PMID:21330290) is an algorithm that determines the binding potential between a sequence and a motif. The Exact Match algorithm requires a match with no mismatches.


     FIMO     Exact Match
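To illustrate what an exact match means here, the sketch below reports every position where a binding sequence occurs with no mismatches. It is an assumption-laden illustration, not SBSA's implementation: it scans the forward strand only, and `exact_match_positions` is a hypothetical helper. FIMO scoring is not reproduced.

```python
def exact_match_positions(sequence, motif):
    """Return 0-based start positions where `motif` occurs in `sequence`
    with no mismatches (overlapping hits included, case-insensitive)."""
    sequence, motif = sequence.upper(), motif.upper()
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start)
        # Advance by one base so overlapping occurrences are also found.
        start = sequence.find(motif, start + 1)
    return hits


print(exact_match_positions("ACGTTGTTGTA", "TTGT"))
```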


    MEME Output
     
    Whether to output the results in MEME format.


     Yes     No


    Somatic Motif Type
     
    Whether to output Gain motifs or Loss motifs.


     Gain      Loss
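One way to read the Gain and Loss options is sketched below for a single mutation and an exact-match check. The definitions used (gain: the motif appears only after the mutation; loss: the motif appears only before it) are an interpretation of this page, and `classify_somatic_motif` and the example windows are hypothetical.

```python
def classify_somatic_motif(ref_window, mut_window, motif):
    """Label a motif as 'gain', 'loss', or None for one mutation.

    gain: the motif is absent from the reference window but present
          in the mutated window.
    loss: the motif is present in the reference window but absent
          from the mutated window.
    None: the mutation does not change whether the motif is present.
    """
    in_ref = motif in ref_window
    in_mut = motif in mut_window
    if in_mut and not in_ref:
        return "gain"
    if in_ref and not in_mut:
        return "loss"
    return None


# Toy reference window and the same window carrying a G>T substitution.
ref_example = "CGTACGTACGT"
mut_example = "CGTACTTACGT"
for motif in ("CTTA", "ACGTA", "CGTA"):
    print(motif, classify_somatic_motif(ref_example, mut_example, motif))
```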


    Personalized Genome
     
     This option should be selected only when the input mutation file is derived from a single subject. If Yes is selected, SBSA will consider all possible combinations of somatic sequences within the targeted short genomic interval.


     Yes     No
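The idea of considering all combinations of mutations within a short interval can be sketched as follows. How SBSA enumerates combinations is not described on this page, so this is one possible reading: each single-nucleotide mutation is either applied or left as reference, and `combined_sequences` is a hypothetical helper.

```python
from itertools import product


def combined_sequences(reference, snvs):
    """Enumerate the sequences produced by every subset of the given SNVs.

    `snvs` is a list of (0-based offset, alternate base) pairs that fall
    inside `reference`; each SNV is either applied or left as reference.
    The unmodified reference is included as the empty subset.
    """
    results = set()
    for choices in product([False, True], repeat=len(snvs)):
        bases = list(reference)
        for applied, (offset, alt) in zip(choices, snvs):
            if applied:
                bases[offset] = alt
        results.add("".join(bases))
    return sorted(results)


print(combined_sequences("ACGTA", [(1, "T"), (3, "G")]))
```

The number of candidate sequences doubles with each additional mutation in the interval, which is one reason to keep the interval short.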


    Additional Annotation
     
    Whether to annotate the genomic location of each mutation, SNP, or RNA editing site. This is currently available only for Human GRCh37 and GRCh38.


     Yes     No


    Email
     
    If the input is large, the runtime can be several minutes; instead of waiting, you can have the results emailed. An institution's security protocol may sometimes garble the link in the email. If that happens, please try a different email address.


    Please do not submit repeatedly