Upload Variant File  (Example)
Variant file (*.csv, *.vcf, *.csv.gz, *.vcf.gz, *.csv.zip, *.vcf.zip) is limited to 500,000 lines. Use controlled chr names (see list on the right).


Preprocessed Variant Files  Select a Genome  (genome and chromosome list)
Please ensure that the chromosome names in the variant file match the chromosome names in the chromosome list.
  •  Upload Genome File
    Select a common organism above. For a rare or custom genome, upload a new reference genome: one file per chromosome (*.fa.gz), limited to 100 MB per file.

     Check against custom binding motif/sequence  Check against existing database 
     Single binding sequence 
    A single DNA nucleotide sequence can be input into the text box on the right.
    Single sequence (max length = 25)
     RNA Binding Protein motifs
     Transcription Factor motifs
     miRNA seed (miRBase v22.1)
     miRNA-mRNA 3'UTR (starBase v2.0) 
     Multiple binding sequences (FASTA) 
    Upload a FASTA file to check against multiple sequences simultaneously.
    Upload needed file
    The file (*.csv) size is limited to 20 MB.

     Binding motif PWM (MEME format) 
    Upload a motif file in MEME format using the Upload File button on the right. One file can contain multiple PWMs.
     Genomic interval file 
    A genomic interval file can be uploaded using the Upload File button on the right. It is similar to BED format but has five columns: chromosome, start, end, strand, and ID. Try the human miRNA seed region file (Example).
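The five-column layout described above could be read with a short Python sketch like the one below. This is only an illustration: the tab delimiter, the integer coordinates, the `Interval` and `parse_intervals` names, and the two example rows are all assumptions, not something the tool documents here.

```python
import csv
import io
from typing import NamedTuple


class Interval(NamedTuple):
    """One row of the five-column interval file (hypothetical container)."""
    chrom: str
    start: int
    end: int
    strand: str
    name: str


def parse_intervals(handle):
    """Yield Interval records from a five-column file.

    Assumes tab-delimited text; blank lines and lines starting with '#'
    are skipped. Any other column count is reported as an error.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        if len(row) != 5:
            raise ValueError(f"expected 5 columns, got {len(row)}: {row}")
        chrom, start, end, strand, name = row
        yield Interval(chrom, int(start), int(end), strand, name)


# Made-up rows for illustration only.
example = "chr1\t100\t107\t+\tseed_A\nchr2\t500\t507\t-\tseed_B\n"
intervals = list(parse_intervals(io.StringIO(example)))
```

Validating the column count before upload is a cheap way to catch a standard BED file (which places the name before the strand and may carry a score column) being supplied by mistake.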
    All somatic
     Identify all somatic motifs 
    Identify all possible somatic motifs based on input somatic mutations. Sequence length can be adjusted by the slide bar below.

    Flanking length on either side of the mutation. For a single-nucleotide mutation, choosing 5 generates a motif or sequence of length 11 (5 flanking bases + the mutated base + 5 flanking bases).
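The length arithmetic above (2 x flank + 1 for a single-nucleotide mutation) can be illustrated with a minimal Python sketch. The function name `flank_window`, the 0-based position convention, and the toy reference string are assumptions made for this example; the tool's own coordinate handling is not described here.

```python
def flank_window(sequence, pos, ref_len=1, flank=5):
    """Return the bases spanning `flank` positions on each side of a variant.

    `pos` is the 0-based index of the first reference base affected
    (an assumed convention). The window is clipped at the sequence ends.
    """
    start = max(0, pos - flank)
    end = min(len(sequence), pos + ref_len + flank)
    return sequence[start:end]


# Toy reference sequence, for illustration only.
reference = "ACGTACGTACGTACGTACGT"
window = flank_window(reference, pos=10, flank=5)
print(len(window))  # 11 = 5 + 1 + 5
```

Near a chromosome end the clipped window is shorter than 2 x flank + 1, which a caller may want to treat as a special case.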

    Sequence matching
    FIMO (PMID:21330290) and Exact match are alternative methods to assess how well a sequence fits a PWM motif. FIMO applies only to binding targets given as PWMs (it is turned off for the built-in TF/RBP libraries because of the computational burden). Exact match applies to both PWM and binding sequence targets.

    For certain analysis scenarios, output a secondary file in MEME format.
    For sequence/PWM binding targets, the Somatic Motif effect is designated as Gain or Loss. Users can choose to fetch results of only one effect type. In other cases, the effect is generic (i.e., affected or disrupted) and cannot be classified as a Gain or a Loss.
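One plausible reading of Gain versus Loss for a single binding sequence is sketched below in Python. This is an assumption for illustration, not the tool's documented algorithm: a target present only in the mutated window is called a Gain, and a target present only in the reference window is called a Loss. The function name `classify_effect` is hypothetical, the sequences are made up, and only the forward strand is checked.

```python
def classify_effect(ref_window, alt_window, target):
    """Classify a variant's effect on an exact-match binding sequence.

    Returns "Gain" if `target` occurs only in the mutated window,
    "Loss" if it occurs only in the reference window, and None when
    the variant does not change whether the target is present.
    Forward strand only (a simplifying assumption).
    """
    in_ref = target in ref_window
    in_alt = target in alt_window
    if in_alt and not in_ref:
        return "Gain"
    if in_ref and not in_alt:
        return "Loss"
    return None


# Made-up windows: the centre base changes from C to G.
ref_window = "AATGACTCATT"
alt_window = "AATGAGTCATT"
effect = classify_effect(ref_window, alt_window, "TGACTCA")
print(effect)  # Loss
```

Filtering results to a single effect type then reduces to keeping the records whose label equals the requested value.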

    The Personalized Genome approach accommodates adjacent joint variants (see Manuscript). Consider this option only when variants are summarized from a single subject (rather than from a cohort).

    Sequence- or conservation-based algorithms such as SIFT, CADD, and MetaSVM can be invoked to predict the degree of functional impact of the input variants.

    Tissue Specific Annotation 
    Contextual expression reports the expression level of the fetched regulator TF or RBP, expressed as a rank of the regulator in the GTEx transcriptomes. Fine-mapping methods estimate the posterior probability of eQTLs, resulting in the CAVIAR, CaVEMaN, and DAP-G datasets. Contextual expression and Fine-mapping methods both originate from GTEx datasets, and they must be set along with a tissue designation.

    Very large inputs entail prolonged runtime. Instead of waiting, users can leave an email address to receive a notification when the job is done. Try a different email address if yours is garbled by your institution's security protocol.

     Exact Match




    Tissue list:
    Contextual expression Fine-mapping methods
    Single sequence target
    Binding target is set as a simple single sequence.

    JASPAR TF motifs
    Binding targets are set as a built-in TF motif library (JASPAR).

    TF motifs & contextual expression
    Built-in JASPAR motifs are used and TF contextual expression is requested.

    miRBase seed regions
    SBSA examines interval overlap when binding targets are supplied as a set of genomic intervals.
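An interval-overlap check of this kind can be sketched in Python as follows. The half-open, 0-based coordinate convention (borrowed from BED), the function name `overlapping_ids`, and the example intervals are assumptions for illustration; SBSA's actual convention is not stated here.

```python
def overlapping_ids(chrom, pos, intervals):
    """Return the IDs of intervals that contain a variant position.

    `intervals` is an iterable of (chrom, start, end, strand, id) tuples.
    Coordinates are treated as 0-based and half-open, i.e. start <= pos < end
    (an assumed convention).
    """
    return [
        name
        for (c, start, end, _strand, name) in intervals
        if c == chrom and start <= pos < end
    ]


# Made-up seed-region intervals, for illustration only.
seed_regions = [
    ("chr1", 100, 107, "+", "seed_A"),
    ("chr1", 105, 112, "-", "seed_B"),
    ("chr2", 500, 507, "-", "seed_C"),
]
hits = overlapping_ids("chr1", 106, seed_regions)
print(hits)  # ['seed_A', 'seed_B']
```

A linear scan is adequate for a small interval set; for many variants against many intervals, sorting by start position or using an interval tree would scale better.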

