
Search for phylogenetically conserved promoter models



Introduction

This task searches for transcription factor binding site models that are conserved in orthologous promoter sequences of several vertebrate species. Conservation provides further evidence for the functionality of a promoter model, because TF models are more likely to be functional when they have been preserved during evolution.

The Genomatix homology groups (defined by a proprietary algorithm, see Comparative Genomics) include 16 vertebrate species (Homo sapiens, Pan troglodytes, Macaca mulatta, Mus musculus, Oryctolagus cuniculus, Rattus norvegicus, Equus caballus, Canis lupus familiaris, Bos taurus, Sus scrofa, Monodelphis domestica, Ornithorhynchus anatinus, Xenopus tropicalis, Danio rerio, Gallus gallus, Taeniopygia guttata). A promoter model is considered conserved if, for every selected organism, it is found in at least one of the alternative promoter sequences of the genes in a homology group.
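
The conservation criterion can be illustrated with a short sketch. This is not the Genomatix implementation: the data layout (organism, then promoter ID, then the set of model names matching that promoter), the promoter IDs and the function name `is_conserved` are all hypothetical and only serve to make the definition concrete.

```python
from typing import Dict, Iterable, Set

# Hypothetical layout: organism -> promoter ID -> set of model names matched there.
HomologyGroup = Dict[str, Dict[str, Set[str]]]


def is_conserved(model: str, group: HomologyGroup, selected: Iterable[str]) -> bool:
    """Return True if `model` matches at least one promoter in every selected organism."""
    for organism in selected:
        promoters = group.get(organism, {})
        # One matching alternative promoter per organism is sufficient.
        if not any(model in matches for matches in promoters.values()):
            return False
    return True


# Made-up example data (promoter IDs are placeholders, not real ElDorado IDs).
group = {
    "Homo sapiens": {"promoter_a": {"CEBP_HNF1_02"}, "promoter_b": set()},
    "Mus musculus": {"promoter_c": {"CEBP_HNF1_02"}},
    "Rattus norvegicus": {"promoter_d": set()},
}

print(is_conserved("CEBP_HNF1_02", group, ["Homo sapiens", "Mus musculus"]))
print(is_conserved("CEBP_HNF1_02", group, ["Homo sapiens", "Rattus norvegicus"]))
```

In this made-up data the model counts as conserved between human and mouse, because each has at least one matching promoter, but not between human and rat.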


Homology Group Selection

Select organism

Select the organisms for which the orthologous promoter sequences should be analyzed. The homology groups always correspond to the latest ElDorado release.

You can select all vertebrates or at least two individual organisms from the following list:

  • Homo sapiens
  • Macaca mulatta
  • Pan troglodytes
  • Mus musculus
  • Oryctolagus cuniculus
  • Rattus norvegicus
  • Equus caballus
  • Canis lupus familiaris
  • Bos taurus
  • Sus scrofa
  • Monodelphis domestica
  • Ornithorhynchus anatinus
  • Xenopus tropicalis
  • Danio rerio
  • Gallus gallus
  • Taeniopygia guttata
Constraints

You can set one or more mandatory organisms in which promoter model matches have to be present. Please note that all mandatory organisms must be among the organisms selected for the search.

The second constraint is the minimum number of organisms for which promoter model matches have to be found in the same homology group. This number must be less than or equal to the number of organisms selected.
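
The rules for a consistent selection can be summarized in a short sketch. The function names `check_constraints` and `satisfies_constraints` and their parameters are hypothetical; the code only restates the rules described on this page and does not reflect how the server validates the form.

```python
from typing import Iterable, List, Set


def check_constraints(selected: Iterable[str], mandatory: Iterable[str],
                      min_organisms: int) -> List[str]:
    """Return a list of problems; an empty list means the settings are consistent."""
    selected_set, mandatory_set = set(selected), set(mandatory)
    problems = []
    if len(selected_set) < 2:
        problems.append("select at least two organisms")
    missing = sorted(mandatory_set - selected_set)
    if missing:
        problems.append("mandatory organisms not selected: " + ", ".join(missing))
    if min_organisms > len(selected_set):
        problems.append("minimum number of organisms exceeds the number selected")
    return problems


def satisfies_constraints(organisms_with_match: Set[str], mandatory: Iterable[str],
                          min_organisms: int) -> bool:
    """Check the matches of one model in one homology group against both constraints."""
    return (set(mandatory) <= organisms_with_match
            and len(organisms_with_match) >= min_organisms)


selected = ["Homo sapiens", "Mus musculus", "Rattus norvegicus"]
print(check_constraints(selected, ["Homo sapiens"], 2))
print(check_constraints(selected, ["Gallus gallus"], 4))
print(satisfies_constraints({"Homo sapiens", "Mus musculus"}, ["Homo sapiens"], 2))
```

The first call reports no problems, while the second reports both a mandatory organism that was not selected and a minimum that is too high.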


Parameters

Model groups

Please choose one or more of the available Genomatix model libraries.
If you have created your own models with FastM or FrameWorker, they can be found in the "User-defined models" library.

You can decide if you want to

  • use all the models in the chosen libraries
  • use previously defined model subsets or
  • continue with a subset selection.

In the third case, there will be a separate page with a list of all models in the chosen libraries and you can select your model subset by clicking the checkboxes for each model.

Further parameters

All further parameters (e.g. "Max. number of matches") correspond to the ModelInspector search and output parameters.

Email address

Here you can choose between two methods for receiving the results:
  • Show result directly in browser window
    With this option, the URL of the result is shown directly in your browser window.

    Warning: Please use this option only for analyses that can be completed in a short time.
    If the analysis takes longer than the timeout of the web server, the connection will be terminated and you will receive an error message (e.g. "The document contained no data."). In this case the results will not be available; please restart the analysis using the option "Send the URL of the result to" below.

  • Send the URL of the result via email
    With this option, an email containing the URL of the results is sent to the email address you provide once the analysis is finished.

The results will be available on our server for a limited time. For details on how long your results will be kept, please see the result email. After that period they will be deleted unless they are protected in the project management!


Output

Three output files are generated: the match overview, the detailed output, and the statistics file.

1. Match overview

The first output file contains:


2. Detailed output

The second output file contains detailed information for each individual element of the model:

Example:

GXP_294008: Alb, GXL_247281, GeneID: 24186, Rattus norvegicus, albumin

Model: CEBP_HNF1_02 (410 - 472 (+))

Matrix element    Position   Str  Sequence           Core sim.  Mat. sim.   Distance to
Model element                                        ---        Model sim.  next element
V$CEBP/CEBPB.01   410 - 424  (+)  ATGATTTTGTAATGG    0.940      0.959       47 bp
V$HNF1/HNF1.02    456 - 472  (+)  GGTTAATGATCTACAGT  1.000      0.923       ---


3. Statistics

The third output file contains statistics on the model matches and detailed information for your own models:

Example for the statistics:

Model Name    # matches  in # homology groups  in # organisms
                                               Homo sapiens  Mus musculus  Rattus norvegicus  Gallus gallus  Canis familiaris  Bos taurus
CEBP_HNF1_02  6          1                     1             1             1                  1              1                 1