
DiAlign TF: Multiple alignment plus TF sites



Introduction

DiAlign TF displays transcription factor (TF) binding site matches within a multiple alignment. It can display all TF binding site matches, the matches common to all or a subset of the input sequences, or the common matches that are located in aligned regions. The TF binding sites are visualized in the alignment as colored boxes.

The input sequences are aligned with the multiple alignment program DiAlign. TF binding site matches are identified by MatInspector.


Input

General: Sequence Formats
Accepted DNA sequence formats The following formats for DNA sequences are accepted: The sequence should contain only IUPAC characters; any other characters will be skipped!
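The character filtering mentioned above can be pictured with a minimal Python sketch. It is not the server's implementation: the function name `keep_iupac` is hypothetical, and the sketch assumes that "IUPAC characters" means the standard nucleotide codes in either case and that gap symbols count as "other characters".

```python
# Standard IUPAC nucleotide codes: bases plus ambiguity symbols.
# Assumption: the server accepts this set in upper and lower case.
IUPAC_DNA = set("ACGTURYSWKMBDHVN")


def keep_iupac(raw: str) -> str:
    """Return raw with every non-IUPAC character dropped, preserving case."""
    return "".join(ch for ch in raw if ch.upper() in IUPAC_DNA)


# Whitespace, digits, gap symbols and other letters are skipped.
print(keep_iupac("ACGT 123 nnRY\n-x*"))
```

Running a pasted sequence through such a filter before upload shows which characters would be lost.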
Sequence Input
Choose from your previously uploaded sequences Select a sequence file from the list of your personal sequence files which were saved in the result management in prior analyses (via "add sequences", see below).
Quick Upload new Paste your sequence(s) in the form field in one of the accepted formats (see above). Note that sequences pasted in the "quick upload" field are not saved for future use.
Add sequences

Sequences or sequence files uploaded here are automatically saved in the result management for later use:

Enter the formatted DNA sequence(s) Enter your correctly formatted sequence(s) directly into the form, e.g. with copy and paste (see above for accepted formats).
or upload a file containing sequence(s) (max. 100 MB) If your browser supports this option, a sequence file can be uploaded.
If you use this option, the file should contain the sequence(s) in one of the formats listed above.
Please note that the size of uploaded files is limited to 100 MB. If you want to analyze larger sequences, please contact support@genomatix.de. For whole chromosomes you can use the accession number option below (e.g. 'NC_000001' for human chromosome 1).
Accession number(s) If you are interested in one or several specific sequences from a database section, you can supply a list of accession numbers. To enter more than one accession number, separate the accession numbers by commas or spaces.

On the Genomatix server accession numbers from the following databases can be entered:

  • GenBank (sections Bacteria, Invertebrates, Other Mammalian, Other Vertebrates, Plants, Primates, Rodents, Viruses, ESTs) (e.g. 'M65229')
  • Eukaryotic Promoter Database (EPD) (e.g. 'EP30014')
  • NCBI Reference Sequences (mRNA sequences) (e.g. 'NM_000402')
  • Genomatix Promoter Database (e.g. 'GXP_107276')
  • dbSNP (e.g. 'rs1234')
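As an illustration of the accession number field, the following Python sketch splits a list that is separated by commas or spaces into individual accession numbers. The function name `split_accessions` is hypothetical, and the sketch makes no claim about how the Genomatix server parses the field.

```python
import re


def split_accessions(text: str) -> list[str]:
    """Split a field of accession numbers on commas and/or whitespace."""
    return [token for token in re.split(r"[,\s]+", text.strip()) if token]


# Example accession numbers taken from the list above.
print(split_accessions("M65229, NM_000402 GXP_107276,rs1234"))
```

Empty entries, for example after a trailing comma, are dropped rather than passed on.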

DiAlign TF Parameters

Alignment Parameters
Threshold T

DiAlign uses diagonals to construct an alignment. The threshold T influences the set of used diagonals: with T > 0, a diagonal is considered for alignment only if its weight exceeds this threshold. Regions of lower similarity are not aligned.

DiAlign usually produces reasonable alignments without a threshold, i.e. with T = 0.
Increasing the threshold reduces the computing time of DiAlign, but also influences the alignment quality. If T is too large, even significant similarities are ignored. We recommend using a threshold value between 0 and 1 (the maximum allowed value for T is 5).
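The effect of the threshold can be sketched in Python as a filter over candidate diagonals. This is only an illustration of the rule described above, not DiAlign's code: the `Diagonal` record, its fields, and the function `select_diagonals` are hypothetical, and the weights in the example are made up.

```python
from typing import NamedTuple

MAX_T = 5.0  # maximum allowed value for T, as stated above


class Diagonal(NamedTuple):
    """Hypothetical record for a gap-free local alignment of two sequences."""
    start_a: int
    start_b: int
    length: int
    weight: float


def select_diagonals(diagonals, threshold=0.0):
    """Keep the diagonals that may be used for the alignment."""
    if not 0.0 <= threshold <= MAX_T:
        raise ValueError(f"T must be between 0 and {MAX_T}")
    if threshold == 0:
        # T = 0 means no threshold: every diagonal is a candidate.
        return list(diagonals)
    # With T > 0, only diagonals whose weight exceeds T are considered.
    return [d for d in diagonals if d.weight > threshold]


candidates = [Diagonal(0, 0, 10, 0.4), Diagonal(5, 7, 8, 1.2)]
print(len(select_diagonals(candidates, 0.0)))
print(len(select_diagonals(candidates, 0.5)))
```

A smaller candidate set is what shortens the computing time, and it is also why regions of lower similarity stay unaligned when T is raised.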

These parameters are hidden by default. You can use the reveal box next to the section header to reveal them!
Output Parameters
Display of alignment '*' signs below alignment

'*' characters are used in the DiAlign output to create a pseudo-graphical representation indicating

  • the relative degree of local similarity among the input sequences (diagonal similarity),
  • nucleic/amino acid similarity at each position of the alignment
  • positions where all nucleic/amino acids are identical, or
  • variable positions.

In the first two cases, the user can specify the maximum number of '*' characters per column in the program output, thus changing the resolution of the graphics. In the other two cases, a single '*' sign denotes an identical or a variable position, respectively.

The latter two options are especially suited for very similar sequences where one is interested only in the mismatches within an alignment.
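The two single-'*' display modes can be illustrated with a short Python sketch that marks each alignment column as identical or variable. The function `mark_columns` is hypothetical; treating a column that contains a gap as variable and comparing letters case-insensitively are assumptions, not documented behavior.

```python
def mark_columns(rows, mode="identical"):
    """Return one marker line with '*' under identical or variable columns."""
    if mode not in ("identical", "variable"):
        raise ValueError("mode must be 'identical' or 'variable'")
    markers = []
    for column in zip(*rows):
        # Assumption: a gap makes the column variable; case is ignored.
        identical = "-" not in column and len({c.upper() for c in column}) == 1
        flagged = identical if mode == "identical" else not identical
        markers.append("*" if flagged else " ")
    return "".join(markers)


# First ten columns of the mouse, rat and human rows from the example output.
rows = ["CCgACCAAGA", "CCAACCAAGA", "GTAACAAAGA"]
print(repr(mark_columns(rows, "identical")))
print(repr(mark_columns(rows, "variable")))
```

For very similar sequences the "variable" line is mostly blank, so the few mismatches stand out.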

These parameters are hidden by default. You can use the reveal box next to the section header to reveal them!
Number of nucleic/amino acids per line

The default number of nucleic/amino acids per line in the alignment output is 50. It can be set to 0 (= unlimited) so that the complete alignment is shown in one line.
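How the line length setting affects the output can be sketched as follows. The function `wrap_alignment_row` is a hypothetical helper that wraps one aligned row; it only mirrors the rule stated above (default 50, 0 meaning unlimited).

```python
def wrap_alignment_row(row: str, per_line: int = 50) -> list[str]:
    """Split one aligned row into output lines of per_line characters."""
    if per_line < 0:
        raise ValueError("per_line must be 0 (unlimited) or positive")
    if per_line == 0:
        # 0 = unlimited: the complete row is shown in one line.
        return [row]
    return [row[i:i + per_line] for i in range(0, len(row), per_line)]


row = "A" * 120
print([len(line) for line in wrap_alignment_row(row)])
print([len(line) for line in wrap_alignment_row(row, 0)])
```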

These parameters are hidden by default. You can use the reveal box next to the section header to reveal them!
TF search parameters
MatInspector parameters The parameters here correspond to the following MatInspector parameters:
Display of TFs

By default, TF binding sites are displayed if they are common to at least 85 % of the input sequences and are located at the same position within the alignment.

It is also possible to display all TF binding sites identified by MatInspector or all TF binding sites common to a user-definable number of sequences.
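The default display rule can be sketched in Python as a filter over the binding site matches. This is an illustration under stated assumptions, not the tool's implementation: the function `common_sites` and the input tuples are hypothetical, "same position" is taken to mean the same alignment start position, and the positions in the example are made up.

```python
from collections import defaultdict


def common_sites(matches, n_sequences, min_fraction=0.85):
    """Return (family, alignment position) pairs shared by enough sequences.

    matches: iterable of (sequence name, TF family, alignment position).
    """
    sequences_by_site = defaultdict(set)
    for name, family, position in matches:
        sequences_by_site[(family, position)].add(name)
    return sorted(
        site
        for site, names in sequences_by_site.items()
        if len(names) / n_sequences >= min_fraction
    )


species = ["mouse", "rat", "human", "pig", "bovine", "chicken", "duck"]
# Hypothetical matches: V$NFAT in six of seven sequences, V$RORA in three.
matches = [(s, "V$NFAT", 70) for s in species[:6]]
matches += [(s, "V$RORA", 55) for s in species[:3]]
print(common_sites(matches, n_sequences=len(species)))
```

With seven input sequences, a site has to occur in at least six of them to reach 85 %. Setting `min_fraction` to 0 corresponds to displaying all matches.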

Email address Here you can choose between two methods for receiving the results:
  • Show result directly in browser window
    With this option, the URL of the result is shown directly in your browser window.

    Warning: Please use this option only for analyses which can be performed in a short time.
    If the analysis takes longer than the timeout of the webserver, the connection will be terminated and you will receive an error message (e.g. "The document contained no data."). In this case, the results will not be available. Please restart the analysis using the option below "Send the URL of the result to".

  • Send the URL of the result via email
    With this option, an email with the URL of the results will be sent to the user-provided email address when the analysis is finished.

The results will be available for a limited time on our server. For details on how long your results will be kept, please see the result email. After that period they will be deleted unless protected in the project management!


Output Examples

Example of DiAlign TF output:

V$RORA  V$HMTB  V$NFAT  V$ZFHX

alignment position  51........ 61........ 71........ 81........ 91........
mouse           45  CCgACCAAGA GGGATTTCAC CTAAATCCAT TCAGTCAGTG TATGGGGGTT
rat             45  CCAACCAAGA GGGATTTCAC CTAAATCCAT TCAGTCAGTG TATGGGGGTT
human           45  GTAACAAAGA GGGATTTCAC CTACATCCAT TCAGTCAGTC TTTGGGGGTT
pig             45  GTAACCAAGA GGGATTTCAC CTACATCCAT TCAGTCAGTT TATGGGGGTT
bovine          43  GTAACCAAGA GGGATTTCAC CTAAATCCAT TCAGTCAGTT TATGGGGGTT
chicken         41  CTTACACCAG TGAATGTAGG TAAAATCCCT CTCGCag--- ----------
duck            34  CTTTCCCCAA TGAATGTAGG TAAAATCCCT CTTGC----- --------TT

[Below this block: columns of '*' characters indicating the degree of similarity at each alignment position]

alignment position  101....... 111....... 121....... 131....... 141.......
mouse           95  TAAAGAAATT CCAGAGAGTC ATCAGAAGAG GAAAAACAAA GGTAATGCTT
rat             95  TAAAGAAATT CCAGAGAGTC ATCAGAAGAG GAAAAACAAA GGTAATGCTT
human           95  TAAAGAAATT CCAAAGAGTC ATCAGAAGAG GAAAAATGAA GGTAATGTTT
pig             95  TAAAGAAATT CCAAAGAGTC ATCAGAAGAG GAAAAATAAA GGTAATGCTT
bovine          93  TAAAGAAATT CCAAAGAGTC ATCAGAAGAG GAAAAATAAA GGTAATGCTT
chicken         78  TAAAAAAATT CCAAAGTGTC ATC-GGGGAG GAAAAACAAA AGTAATGATT
duck            71  TAAAAAAATT CCAAAGTGTC ATCAGGGGAG GAAAAACAAA AGTAATGATT

[Below this block: columns of '*' characters indicating the degree of similarity at each alignment position]

alignment position  151....... 161....... 171....... 181....... 191.......
mouse          145  TCTGCCACAC AGGTAGACTC --TTTGAAAA TATGTGTAAT ATGTAAAACA
rat            145  TCTGCTACAC AGGTAGACTC --T--GAAAA TATGTGTAAT ATGTAAAACA
human          145  TTT--CAGAC AGGTAAAGTC --TTTGAAAA TATGTGTAAT ATGTAAAACA
pig            145  TTTGCCACAC AGGTAGAATC TTT--GAAAA TATGTGTAAT ATGTAAAACA
bovine         143  TTTGCCACAC AGGTAGAATC TTT--GAAAG TATGTGTAAT ATGTAAAACA
chicken        127  CTTGCCATAC AGGTAGAGCA --CATGAAAA AATGTGTAAT A---AAAACC
duck           121  CTTGCCATAC AGGTAAAGCA --TATGAAAA AATGTGTAAT A---AAAcct

[Below this block: columns of '*' characters indicating the degree of similarity at each alignment position]

Explanations:


Extract alignment region

User-defined regions of the alignment (e.g. conserved regions of the alignment without TF binding site matches) can be extracted to a sequence file. This sequence file can then be used as input for MatDefine to define a new matrix for a putative TF binding site.

Positions in alignment Enter the start and end position of the alignment region that should be extracted.
Output file Enter the name of the sequence file. The sequences will be saved in your personal sequence directory.
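What such an extraction amounts to can be sketched in Python. The functions `extract_region` and `to_fasta` are hypothetical; the sketch assumes 1-based, inclusive alignment positions, removes gap characters from the extracted pieces, and writes FASTA, none of which is confirmed for the actual tool.

```python
def extract_region(alignment, start, end, strip_gaps=True):
    """Cut alignment columns start..end (1-based, inclusive) from each row.

    alignment: dict mapping sequence name to its aligned row.
    """
    if start < 1 or end < start:
        raise ValueError("need 1 <= start <= end")
    region = {}
    for name, row in alignment.items():
        piece = row[start - 1:end]
        if strip_gaps:
            # Assumption: gaps are dropped before saving the sequences.
            piece = piece.replace("-", "")
        region[name] = piece
    return region


def to_fasta(sequences):
    """Format a name -> sequence mapping as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in sequences.items())


# Last ten columns of the fourth group in the first example block.
alignment = {"chicken": "CTCGCag---", "duck": "CTTGC-----"}
print(to_fasta(extract_region(alignment, 4, 8)), end="")
```

A file produced this way holds one unaligned sequence per input row.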

References

DiAlign TF is described in