Display of transcription factor binding sites common to a set of sequences. The sequences are scanned for matches to TF binding sites by MatInspector. The common sites are displayed graphically and in table form. The combined length of all input sequences must not exceed 1 million base pairs.
For the search for common TF sites, the same sequence input options as in MatInspector are available, except for the database search option.
| Library selection | |
|---|---|
| Libraries | Available libraries are the MatInspector weight matrix library of transcription factor binding sites and a plant IUPAC library based on PLACE. |
| Matrix Search Parameters | |
| Matrix / IUPAC parameters | Depending on the type of the selected library, further search parameters such as matrix group or matrix filters can be entered. The parameters correspond to the MatInspector matrix parameters or IUPAC parameters. |
| Matrix filters | Matrices used for the analysis can be filtered by the tissues they are associated with: select one or more tissues (e.g. blood cells or liver) from the list. The tissue associations can be viewed by using the link "Show all tissue associations". Tissues are assigned to matrix families, not individual matrices. The tissue associations of matrix families are determined by automatic evaluation of all PubMed abstracts (co-citations of transcription factors and tissues) and subsequent manual curation. Note: Tissue filtering is currently only available for vertebrate matrices. |
| TF sites common to | The minimum number of sequences in the input set that must contain a TF site for it to count as common. The default is the absolute number of sequences that corresponds to at least 85% of the input sequences. |
| Output format | By default, the output consists of a graphical and a tabular display of the common TF sites. With this parameter you can suppress the graphics, which can be useful e.g. if your computer has limited working memory. If there are more than 50 input sequences, only the table is displayed, because the graphics would take too long to load and use too many resources on your computer. You can NOT circumvent this behavior by setting this parameter. |

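
The two rules described above (the 85% default for "TF sites common to" and the 50-sequence limit for graphics) can be sketched in Python as follows. This is only an illustration of the documented behavior, not the tool's implementation: the function names and the label "graphics and table" are hypothetical, and rounding up to the next whole sequence is an assumption about how "at least 85%" is turned into an absolute number.

```python
MAX_SEQUENCES_FOR_GRAPHICS = 50  # documented limit: above this, only the table is shown


def default_min_sequences(n_sequences: int, percent: int = 85) -> int:
    """Smallest whole number of sequences that is at least `percent` % of the input.

    Assumption: the percentage is rounded up to the next whole sequence.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    return (n_sequences * percent + 99) // 100


def effective_output(n_sequences: int, requested: str = "graphics and table") -> str:
    """Return the output mode that would actually be used (labels are illustrative)."""
    if n_sequences > MAX_SEQUENCES_FOR_GRAPHICS:
        return "table only"  # the parameter cannot override this
    return requested
```

Under these assumptions, an input of 20 sequences would give a default of 17, and an input of 51 sequences would always produce the table only.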
Transcription factor binding sites common to the set of input sequences are displayed as an interactive graphic that includes a match summary table.
NOTE: If the input consists of more than 50 sequences, only the table is displayed. You can also omit the graphical output by choosing the "table only" option of the output format parameter.
The interactive graphic is implemented as a Java applet that runs within your web browser and requires a Java plug-in (see also our technical FAQs).
The Java graphic consists of four parts:
The black and gray patterned line represents the sequence. Each scale line corresponds to 50 basepairs.
Each sequence is preceded by a colored box containing information on the sequence, in particular (top to bottom):
Additionally, the line is flanked by red numbers indicating the currently visible portion of the sequence, i.e. the index of the first and last base pair respectively. Note that indices are absolute sequence positions, i.e. the positions range from '1' to the length of the sequence.
Each matrix match is represented by a half-round symbol. Matches found on the positive strand are drawn above the sequence line, matches found on the negative strand below it. There is one color for each matrix (or IUPAC) family, i.e. matches caused by matrices of the same family are painted in the same color. Note that matches caused by user-defined matrices are represented by a different symbol (a square instead of the half-round shape).
Left-clicking on a matrix match symbol will show a small display window containing information on the specified match: the name of the matrix, the matrix family to which it belongs, the position of the matrix match (relative to the sequence), the matrix threshold, and the nucleotide sequence which was matched. The annotation window is also a hyperlink. Following the link will open a new browser window showing further information on the matrix family.
Moving the mouse pointer out of (or clicking on) a highlighted matrix match symbol will close the information display. Additionally, the symbol of the matrix match is put into the background. This is helpful as the matrix matches often overlap each other and can even be completely covered.
The arrow symbol on the sequence stands for a transcription start site (TSS) or putative TSS. Please note that a sequence can have several TSSs or none.
The features of this panel are described here.
Next to the buttons for the basic features of the Java graphics toolbar, you can find a button for displaying the match summary table.
The remaining buttons are for filtering the matrix/IUPAC matches: you can filter the matches by threshold, occurrence and family.
Each matrix is provided with a search threshold, which is one of the adjustable Matrix Parameters. When a matrix matches a sequence, the level of similarity is expressed as a number called the "matrix similarity" or "matrix threshold". This number can be larger than or equal to, but never smaller than, the search threshold for the matrix.
You can filter the matches by their matrix similarity relative to the search threshold by changing the value of the text field. Simply left-click the up or down arrow on the right-hand side of the text field to increase or decrease the value by 0.01. The value "+/- 0" (which is the initial value) stands for the search threshold. The displayed value is a cutoff, which means that all matches with a matrix similarity of at least this value are displayed. The possible values for this text field range from "+/- 0" to "+0.05".
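As a rough sketch of the threshold filter described above, the following Python function checks whether a match would remain visible for a given offset. The function and parameter names are hypothetical, and rounding the cutoff to three decimals is an assumption made here only to keep floating-point comparisons predictable.

```python
MAX_OFFSET = 0.05  # documented upper end of the text field ("+0.05")


def passes_threshold_filter(matrix_sim: float, search_threshold: float,
                            offset: float = 0.0) -> bool:
    """True if the match similarity reaches the search threshold plus the offset."""
    if not 0.0 <= offset <= MAX_OFFSET:
        raise ValueError("offset must lie between 0 and 0.05")
    # Round to three decimals so that e.g. 0.89 + 0.01 compares as exactly 0.9.
    cutoff = round(search_threshold + offset, 3)
    return matrix_sim >= cutoff
```

For example, a match with matrix similarity 0.918 against a search threshold of 0.89 would stay visible at an offset of 0.02 (cutoff 0.91) but be hidden at 0.03 (cutoff 0.92).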
Whenever there is more than one sequence, you can also filter by the occurrence of families. This means that only those matches are displayed whose matrices belong to a family found on at least the displayed number of sequences. You can change the number by left-clicking on the arrows. The single arrows increase or decrease the number by 1, the double arrows by 5. The largest possible value is the total number of sequences. The smallest possible value is determined by the pre-filtering set with the "TF sites common to" parameter and lies between 1 and the total number of sequences (in the latter case, no filtering by occurrence is possible). The initial value is the smallest possible number.
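The occurrence filter can be pictured as counting, for each family, the number of different sequences on which it was found. The following Python sketch illustrates this idea only; the function name and the example data are hypothetical.

```python
from collections import defaultdict


def families_on_at_least(matches, min_sequences):
    """Return the families found on at least `min_sequences` different sequences.

    `matches` is an iterable of (sequence_name, family) pairs.
    """
    sequences_per_family = defaultdict(set)
    for sequence_name, family in matches:
        sequences_per_family[family].add(sequence_name)
    return {family for family, seqs in sequences_per_family.items()
            if len(seqs) >= min_sequences}


# Hypothetical input: family F1 occurs on three sequences, F2 on two.
example = [("seqA", "F1"), ("seqA", "F1"), ("seqB", "F1"), ("seqC", "F1"),
           ("seqA", "F2"), ("seqC", "F2")]
print(sorted(families_on_at_least(example, 2)))  # ['F1', 'F2']
print(sorted(families_on_at_least(example, 3)))  # ['F1']
```

Note that several matches of one family on the same sequence count only once towards the occurrence.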
The "Select all" and "Deselect all" buttons can be used to check or uncheck all checkboxes for the matrix/IUPAC families at once.
Using the checkboxes, it is possible to filter by family. Only matches whose matrix belongs to a selected family are visible. Each checkbox has a border of the same color as the corresponding matrix/IUPAC match symbols.
If a checkbox and the associated family name appear transparent, this indicates that the corresponding matches are currently not visible because they do not satisfy the current conditions regarding occurrence and/or threshold.
To facilitate the selection, you can use the "Select all" and "Deselect all" buttons from the toolbar.
Note that the visibility of a matrix match is controlled by the conjunctive combination of the filters. Thus, for a matrix match symbol to be displayed, the corresponding matrix family's checkbox must be "checked", there must be family matches on at least the specified number of sequences and the match itself must have a similarity of at least the chosen threshold.
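The conjunctive combination of the three filters can be sketched in Python as a single condition joined by `and`. The `Match` class and all names are hypothetical. The sketch also assumes that occurrence is counted over all matches, independently of the threshold filter, which the text above does not specify.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    sequence_name: str
    family: str
    matrix_sim: float
    search_threshold: float


def visible_matches(matches, checked_families, min_sequences, offset=0.0):
    """Return the matches that satisfy all three filters at once."""
    # Assumption: occurrence is counted per family over all matches.
    sequences_per_family = defaultdict(set)
    for m in matches:
        sequences_per_family[m.family].add(m.sequence_name)

    return [
        m for m in matches
        if m.family in checked_families                              # family filter
        and len(sequences_per_family[m.family]) >= min_sequences     # occurrence filter
        and m.matrix_sim >= round(m.search_threshold + offset, 3)    # threshold filter
    ]
```

A match that fails any one of the three conditions is hidden, even if it satisfies the other two.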
You can display the match summary table by clicking the corresponding button in the toolbar. The table shows the list of common TF sites, the number of sequences in which they occur, and how often they match in each input sequence. Additionally, a significance value (p-value) is given for each common TF site.
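The counts shown in the match summary table can be illustrated with a short Python sketch that uses the (sequence, family) pairs from the example export further down this page. The function name is hypothetical, and the p-value is left out because this page does not describe how it is calculated.

```python
from collections import Counter

# (sequence name, family) pairs taken from the example export on this page.
MATCHES = [
    ("hs23527/CKM", "V$AP2F"), ("hs23527/CKM", "V$AP4R"), ("hs23527/CKM", "V$AP4R"),
    ("mm24253/Ckm", "V$AP4R"), ("mm24253/Ckm", "V$AP4R"), ("mm24253/Ckm", "V$AP2F"),
    ("rn7910/Ckm", "V$AP4R"), ("rn7910/Ckm", "V$AP2F"), ("rn7910/Ckm", "V$AP4R"),
]


def summarize(matches):
    """Map each family to (number of sequences, {sequence: match count})."""
    per_family = {}
    for sequence_name, family in matches:
        per_family.setdefault(family, Counter())[sequence_name] += 1
    return {family: (len(counts), dict(counts))
            for family, counts in per_family.items()}


SUMMARY = summarize(MATCHES)
for family, (n_sequences, per_sequence) in sorted(SUMMARY.items()):
    print(family, n_sequences, per_sequence)
```

In this example both families occur in all three sequences, with one V$AP2F match and two V$AP4R matches per sequence.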

Clicking a column header re-sorts the table by the values of that column.
The common transcription factor binding sites identified by MatInspector can be exported to a tab-delimited file by using one of the "Export" buttons below the match summary table. The files are saved to your local disk and can be opened directly with Microsoft Excel.
| Seq. name | Family/Matrix | Opt. thresh. | Start pos. | End pos. | Strand | Core sim. | Matrix sim. | Sequence |
|---|---|---|---|---|---|---|---|---|
| hs23527/CKM | V$AP2F/AP2.01 | 0.89 | 220 | 232 | - | 0.976 | 0.918 | ggCCCCccgccgg |
| hs23527/CKM | V$AP4R/AP4.01 | 0.97 | 226 | 242 | - | 1.000 | 0.970 | ggggaCAGCtggccccc |
| hs23527/CKM | V$AP4R/AP4.01 | 0.97 | 398 | 414 | + | 1.000 | 0.977 | aggctCAGCtgccctcc |
| mm24253/Ckm | V$AP4R/AP4.01 | 0.97 | 61 | 77 | - | 1.000 | 0.976 | attagCAGCtgggagaa |
| mm24253/Ckm | V$AP4R/AP4.01 | 0.97 | 246 | 262 | - | 1.000 | 0.970 | ggggaCAGCtggccctt |
| mm24253/Ckm | V$AP2F/AP2.01 | 0.89 | 330 | 342 | - | 0.976 | 0.893 | agCCCCatggcct |
| rn7910/Ckm | V$AP4R/AP4.01 | 0.97 | 85 | 101 | - | 1.000 | 0.976 | atcagCAGCtgggagaa |
| rn7910/Ckm | V$AP2F/AP2.01 | 0.89 | 238 | 250 | - | 0.976 | 0.918 | ggCCCCccgccgg |
| rn7910/Ckm | V$AP4R/AP4.01 | 0.97 | 244 | 260 | - | 1.000 | 0.970 | ggggaCAGCtggccccc |

Note: With both export options the common TF sites identified originally will be exported. Filtering in the graphics has no influence on the matches that are exported to Excel.
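If you want to process an exported file with a script instead of Excel, a tab-delimited file can be read with Python's standard `csv` module. The sketch below assumes that the export has one header row with the column names shown in the example above and one match per row. The actual file layout may differ, and the function name and dictionary keys are hypothetical.

```python
import csv
import io

# Two rows from the example above, laid out as an assumed tab-delimited export.
EXPORT = (
    "Seq. name\tFamily/Matrix\tOpt. thresh.\tStart pos.\tEnd pos.\tStrand\t"
    "Core sim.\tMatrix sim.\tSequence\n"
    "hs23527/CKM\tV$AP2F/AP2.01\t0.89\t220\t232\t-\t0.976\t0.918\tggCCCCccgccgg\n"
    "hs23527/CKM\tV$AP4R/AP4.01\t0.97\t398\t414\t+\t1.000\t0.977\taggctCAGCtgccctcc\n"
)


def read_export(text):
    """Parse the export into dictionaries with typed numeric fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        family, matrix = row["Family/Matrix"].split("/", 1)
        rows.append({
            "sequence_name": row["Seq. name"],
            "family": family,
            "matrix": matrix,
            "start": int(row["Start pos."]),
            "end": int(row["End pos."]),
            "strand": row["Strand"],
            "matrix_sim": float(row["Matrix sim."]),
            "site": row["Sequence"],
        })
    return rows


ROWS = read_export(EXPORT)
for r in ROWS:
    # In the example data, start and end are both inclusive,
    # so the site length equals end - start + 1.
    assert r["end"] - r["start"] + 1 == len(r["site"])
```

To read a real file, the same `csv.DictReader` call can be applied to an opened file object instead of the `io.StringIO` wrapper.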
© 1998-2010 Genomatix Software GmbH - All rights reserved