
Search for common TF sites in multiple sequences


[Introduction] [Input] [Parameters] [Output and Export] [Interactive Graphics]

Introduction

This task displays transcription factor (TF) binding sites that are common to a set of sequences. The sequences are scanned for matches to TF binding sites by MatInspector, and the common sites are displayed graphically and in table form. The task accepts at most 1000 input sequences, and the combined length of all input sequences must not exceed 1 million base pairs.
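
The two input limits can be checked before submitting a job. The following is only a minimal sketch: the function name `check_input_limits` and the representation of sequences as plain strings are assumptions made for illustration and are not part of the Genomatix software.

```python
MAX_SEQUENCES = 1000        # limit on the number of input sequences (from the text above)
MAX_TOTAL_BP = 1_000_000    # limit on the combined length in base pairs (from the text above)


def check_input_limits(sequences):
    """Return a list of problems; an empty list means the input is within both limits.

    `sequences` is assumed to be a list of nucleotide strings (hypothetical format).
    """
    problems = []
    if len(sequences) > MAX_SEQUENCES:
        problems.append(
            f"{len(sequences)} sequences exceed the limit of {MAX_SEQUENCES}"
        )
    total_bp = sum(len(s) for s in sequences)
    if total_bp > MAX_TOTAL_BP:
        problems.append(
            f"{total_bp} bp in total exceed the limit of {MAX_TOTAL_BP}"
        )
    return problems
```

An empty result indicates that the set of sequences should be accepted with respect to these two limits; other input requirements are not covered by this sketch.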


Input

For the search for common TF sites, the same sequence input options as in MatInspector are available, except for the database search option.


Parameters

Library selection
Libraries: Available libraries are the MatInspector weight matrix library of transcription factor binding sites and a plant IUPAC library based on PLACE.
Matrix Search Parameters
Matrix / IUPAC parameters

Depending on the type of the selected library, further search parameters such as matrix group or matrix filters can be entered.

The parameters correspond to the MatInspector matrix parameters or IUPAC parameters.

Matrix filters: Matrices used for the analysis can be filtered by the tissues they are associated with. Select one or more tissues (e.g. blood cells or liver) from the list. The tissue associations can be viewed via the link "Show all tissue associations".

List of available tissues:

Adipose Tissue; Adrenal Glands; Antibody-Producing Cells; Antigen-Presenting Cells;
Bladder; Blastomeres; Blood Cells; Blood Platelets;
Bone Marrow Cells; Bone and Bones; Brain; Breast;
Cardiovascular System; Cartilage; Central Nervous System; Connective Tissue;
Digestive System; Ear; Embryonic Structures; Endocrine System;
Erythrocytes; Eye; Germ Cells; Granulocytes;
Heart; Hematopoietic System; Hemocytes; Immune System;
Integumentary System; Islets of Langerhans; Kidney; Leukocytes;
Leydig Cells; Liver; Lung; Luteal Cells;
Lymphocytes; Monocytes; Muscle, Skeletal; Muscle, Smooth;
Muscles; Myeloid Cells; Myocardium; Nervous System;
Neuroglia; Neurons; Nose; Ovary;
Pancreas; Parathyroid Glands; Phagocytes; Pineal Gland;
Pituitary Gland; Prostate; Respiratory System; Skeleton;
Spinal Cord; Testis; Thymus Gland; Thyroid Gland;
Ubiquitous; Urogenital System

Tissues are assigned to matrix families, not individual matrices. The tissue associations of matrix families are determined by automatic evaluation of all PubMed abstracts (co-citations of transcription factors and tissues) and subsequent manual curation.

Note: Tissue filtering is currently available only for vertebrate matrices.

These parameters are hidden by default. Clicking on the corresponding icon will reveal them.
TF sites common to: The minimum number of sequences within the input set that must contain a TF site for it to count as common. The default is the absolute number of sequences that corresponds to at least 85% of the input sequences.
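
As an illustration of how the default value relates to the number of input sequences, the sketch below rounds 85% of the sequence count up to the next whole number. The rounding direction is an assumption derived from the phrase "at least 85%"; the function name `default_min_sequences` is hypothetical.

```python
def default_min_sequences(n_sequences, percent=85):
    """Smallest whole number of sequences that is at least `percent` % of the input.

    Integer arithmetic is used to avoid floating-point rounding surprises.
    Rounding up is an assumption based on the wording "at least 85%".
    """
    if n_sequences < 1:
        raise ValueError("at least one input sequence is required")
    # Ceiling division: (a + b - 1) // b
    return (n_sequences * percent + 99) // 100
```

Under this assumption, 5 input sequences would give a default of 5 (85% of 5 is 4.25), and 20 input sequences would give a default of 17.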

Output of the task "Common TF sites"

Transcription factor binding sites common to the set of input sequences are displayed graphically and in tables.

Note: If the input consists of more than 50 sequences, no graphic is displayed because it would not be legible.


Interactive graphics:

[Figure: common transcription factors]

The interactive graphic consists of four parts:

The main sequence panel

The black line represents the sequence. Depending on the zoom level, a scale in base pairs is shown.

Each sequence is labeled with a colored box containing information on the sequence, in particular:

The color of the box represents the organism. Note that not all of this information may be available for every input sequence. If, for example, no organism information can be provided, the box is shown in a default color.
The sequences are vertically aligned along their start positions.

Matrix matches: Each matrix match is represented by a semicircular symbol. Matches found on the positive strand are drawn above the sequence line, and matches found on the negative strand are drawn below it. Each matrix (or IUPAC) family has its own color, i.e. matches to matrices of the same family are drawn in the same color.

Match annotation: Left-clicking on a matrix match symbol shows a tooltip with information on that match: the name of the matrix, the matrix family to which it belongs, the position of the match (relative to the sequence), the genomic position (if available), the matrix similarity, and the nucleotide sequence that was matched.


An arrow symbol on the sequence marks a transcription start site (TSS) or putative TSS. Note that a sequence can have several TSS, or none.

The navigation panel

The view of the sequence can be changed with the zoom and scroll element in the lower part of the graphic. The navigation panel contains a scaled-down version of the sequence. By default, the whole sequence is displayed.

To zoom in, click into the navigation bar and, while holding down the mouse button, drag over the region you want to zoom to. A grey box then appears that marks the part of the sequence currently visible in the main sequence panel above.

[Figure: navigation]

To zoom further in or out, click on the grey box in the navigation panel and adjust the size of the box via its handles (the mouse pointer changes when hovering over the edges of the box). The sequence in the main panel will adjust to the selected window.

If you want to scroll along the sequence, move the grey box within the navigation panel by sliding it with the mouse to the desired position.

Clicking anywhere inside the scrollbar will reset the view to the default, i.e. the complete sequence is shown.

The toolbar including filter options

The toolbar contains buttons to export the currently displayed graphics to either PNG or JPG format. Pressing one of the two buttons opens a new window or tab with the image in the selected format. A right-click on this image opens a context-menu with the option to save the image on your computer.
If the button labeled "Enable dragging" is pressed, an additional option for changing the view of a sequence is available: selecting a sequence (left-click) and dragging it left or right changes the view accordingly.
The remaining buttons are for filtering of the matrix/IUPAC matches: You can filter the matrix matches by threshold, occurrence and family.

[Figure: toolbar]

Each matrix is provided with a search threshold, which is one of the adjustable Matrix Parameters. When a matrix matches a sequence, the level of similarity is expressed as a number called the "matrix similarity" or "matrix threshold". This number is always greater than or equal to the search threshold for the matrix.
You can filter the matches by their matrix similarity relative to the search threshold by changing the value of the text field. Left-click the up or down arrow on the right-hand side of the text field to increase or decrease the value by 0.01. The value "0" (the initial value) stands for the search threshold itself. The displayed value is a cutoff, which means that only matches whose matrix similarity exceeds the search threshold by at least this value are displayed. The possible values for this text field range from "0" to "0.05".

Whenever there is more than one sequence, you can also filter by the occurrence of matches. This means that only those matches are displayed whose matrices belong to a family that is found on at least the displayed number of sequences. You can change the number by clicking on the arrow and selecting a value from the drop-down list. The largest possible value is the total number of sequences; the smallest possible value is 1. The initial value is the number of sequences you entered as the sequence threshold (parameter "TF sites common to").

The "Select all" and "Deselect all" buttons can be used to check or uncheck all checkboxes for the matrix/IUPAC families at once.

The checkbox list

Using the checkboxes, it is possible to filter by family. Only those matches whose matrix belongs to a selected family are visible. Each checkbox has a border of the same color as the corresponding matrix/IUPAC match symbols.
If a checkbox and the associated family name appear transparent, this is an indication that the corresponding matches are currently not visible because they do not satisfy the current conditions regarding occurrence and/or threshold.
To facilitate the selection, one can use the "Select all" and "Deselect all" buttons from the toolbar.

Note that the visibility of a matrix match is controlled by the conjunctive combination of the filters. Thus, for a matrix match symbol to be displayed, the corresponding matrix family's checkbox must be "checked", there must be family matches on at least the specified number of sequences and the match itself must have a similarity of at least the chosen threshold.
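
The conjunctive combination of the three filters can be sketched as a single predicate. This is only an illustration under stated assumptions, not the implementation used by the graphic: the `Match` record, the function `visible_matches` and the family names in the usage note are hypothetical, the threshold field is read as an offset added to each matrix's search threshold, and the occurrence of a family is counted over all of its matches regardless of the threshold filter.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """Hypothetical record for one matrix match."""
    sequence_id: str
    family: str
    similarity: float        # matrix similarity of this match
    search_threshold: float  # search threshold of the matched matrix


EPS = 1e-9  # tolerance for floating-point comparison


def visible_matches(matches, checked_families, min_sequences, offset=0.0):
    """Return the matches that pass all three filters at once."""
    # Occurrence: number of distinct sequences on which each family matches
    # (assumption: counted over all matches, before threshold filtering).
    sequences_per_family = defaultdict(set)
    for m in matches:
        sequences_per_family[m.family].add(m.sequence_id)

    return [
        m for m in matches
        if m.family in checked_families                               # family checkbox
        and len(sequences_per_family[m.family]) >= min_sequences      # occurrence
        and m.similarity + EPS >= m.search_threshold + offset         # threshold
    ]
```

For example, with matches of a hypothetical family "FAM_A" on two sequences and of "FAM_B" on one sequence, calling `visible_matches(matches, {"FAM_A", "FAM_B"}, 2)` would keep only the "FAM_A" matches, and raising `offset` would additionally hide those "FAM_A" matches whose similarity lies too close to the search threshold.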


Match summary table

This table shows the list of common TF sites, in how many sequences they occur and how often they match in each input sequence. Additionally, a significance value (p-value) is given for each common TF site.
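
The counting part of such a summary can be sketched as follows. The input format (pairs of sequence identifier and family name), the function name `summarize` and the row layout are assumptions made for illustration; the p-value is left out because its calculation is not described on this page.

```python
from collections import Counter, defaultdict


def summarize(matches, min_sequences):
    """Build summary rows for families found on at least `min_sequences` sequences.

    `matches` is assumed to be an iterable of (sequence_id, family) pairs,
    one pair per matrix match.
    """
    # family -> Counter mapping sequence_id to its number of matches
    per_family = defaultdict(Counter)
    for sequence_id, family in matches:
        per_family[family][sequence_id] += 1

    rows = []
    for family, counts in per_family.items():
        if len(counts) >= min_sequences:
            rows.append({
                "family": family,
                "n_sequences": len(counts),             # sequences containing the family
                "matches_per_sequence": dict(counts),   # match count in each sequence
            })
    # Most widespread families first, ties broken by family name
    rows.sort(key=lambda row: (-row["n_sequences"], row["family"]))
    return rows
```

Each returned row corresponds to one common TF site family, with the number of sequences in which it occurs and the number of matches in each of those sequences.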

Example:

[Figure: match summary table]

Clicking on one of the column headers will sort the table by the values of this column (ascending or descending, as indicated by the little yellow arrows).

Because at most 30 sequences or columns can be displayed in the interactive table, the complete summary table can also be saved in Microsoft Excel™ format by clicking on the button below the table.


Common matches table

This table lists the TF sites that fulfill the criterion set by the user, e.g. that the matrix family occurs in at least 3 of 5 sequences.
For each input sequence, the matrix name, the family name, the position, the core similarity, and the matrix similarity are given.

Example:

[Figure: common match table]

Clicking on one of the column headers will sort the table by the values of this column (ascending or descending, as indicated by the little yellow arrows).

For a detailed description of the columns, please see the MatInspector help page.

The complete common match table can also be saved in Microsoft Excel™ format by clicking on the button below the table.

Note: With both export options the common TF sites identified with the original user settings will be exported. Filtering in the graphics has no influence on the matches that are exported to Excel.