This task identifies regions in genomes of different species that are orthologous to the regions in the input file (input sequences should be > 50 bp).
To identify orthologous regions in a target species, a proprietary algorithm is used.
In the first step, the ElDorado database is searched for homologous loci in the target organisms (see Comparative Genomics). If no such loci are found, the flanking genes (up to 20 loci in both directions) are used to find a syntenic region in the target organism. A region counts as syntenic only if the two homologous genes in the target organism are on the same contig and show the same relative strand orientation as the genes in the source organism.
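
The synteny condition described above can be illustrated with a minimal sketch. The actual algorithm is proprietary and not documented here, so the `Gene` type, the `is_syntenic` function, and the reading of "same relative strand orientation" as "the two genes are on the same strand in both organisms, or on opposite strands in both" are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record (not an ElDorado data type)."""
    name: str
    contig: str
    strand: str  # "+" or "-"


def is_syntenic(src_left: Gene, src_right: Gene,
                tgt_left: Gene, tgt_right: Gene) -> bool:
    """Check the two conditions stated in the text for a flanking gene pair.

    1. Both homologous genes in the target lie on the same contig.
    2. Their relative strand orientation matches that of the source genes
       (assumed here to mean: same-strand vs. opposite-strand is preserved).
    """
    same_contig = tgt_left.contig == tgt_right.contig
    src_same_strand = src_left.strand == src_right.strand
    tgt_same_strand = tgt_left.strand == tgt_right.strand
    return same_contig and (src_same_strand == tgt_same_strand)


# Source flanking genes on opposite strands
src_a = Gene("geneA", "chr1", "+")
src_b = Gene("geneB", "chr1", "-")

# Target homologs: whole block inverted, but relative orientation preserved
tgt_a = Gene("geneA_h", "contig7", "-")
tgt_b = Gene("geneB_h", "contig7", "+")

# Target homologs on different contigs
tgt_c = Gene("geneB_h2", "contig9", "+")

print(is_syntenic(src_a, src_b, tgt_a, tgt_b))
print(is_syntenic(src_a, src_b, tgt_a, tgt_c))
```

Under this reading, an inverted block in the target still qualifies, because only the orientation of the two genes relative to each other is compared.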
In a second step, the input sequence is aligned to the syntenic region using a Smith-Waterman alignment. If the alignment fulfills the following criteria, the target region is listed in the output:
Sets of orthologous sequences can be saved, as well as analyzed for common TFBS patterns with FrameWorker or DiAlignTF to identify phylogenetically conserved regulatory structures.
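
For readers unfamiliar with the Smith-Waterman alignment used in the second step, the following is a minimal sketch of the local alignment score computation. The scoring values (`match`, `mismatch`, `gap`) and the linear gap penalty are illustrative assumptions; the parameters and acceptance criteria actually used by the program are not stated on this page.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Classic Smith-Waterman recurrence with a linear gap penalty.
    Only the previous row is kept, so memory use is O(len(b)).
    Scoring values are assumed for illustration.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + sub,  # match or mismatch
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# A shared core ("ACGT") embedded in otherwise unrelated flanks
print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Because scores are floored at zero, the method finds the best-matching subregion rather than forcing an end-to-end alignment, which suits locating a short input sequence inside a larger syntenic region.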
New in RegionMiner Release 3.2 (Dec. 2009):
Significantly improved algorithm that finds orthologous regions in cases where none were found previously.
| Input | |
|---|---|
| Input |
Input data are accepted as a tab-delimited file in BED / bigBed format. Each input region must be specified
at least by chromosome number, start position and end position (in this order).
When adding a new file, a new window will open, asking you to either
For the new BED files, you will have to select the correct organism, as the
organism and the genome build are associated with the BED file for future use
(the default is your latest choice in the current session).
Note that BED files critically depend on the underlying genome build, which can be changed by selecting a different ElDorado version on the top right of the page before uploading a BED file. You can see the list of genomes available in ElDorado.
Almost all browsers have a general upload limit of 2 GB, so BED files larger than this should be zipped before uploading from your local computer. This restriction does not apply when using the direct import from the GGA/GMS.
Optionally, you can specify a name for saving uploaded BED files on the server; otherwise the name of the uploaded file is used. If several files are uploaded, the string given here is used as a prefix for each BED file name.
If any regions in the input file cannot be completely assigned to the selected genome (e.g. wrong chromosome numbering or wrong positions within a chromosome), an error message appears and those regions are skipped. If no valid region is found in an uploaded file, the complete file is skipped.
Once one or more BED files have been uploaded successfully and the popup window has been closed, the list of available BED files is updated automatically.
Uploaded BED files can be deleted from the project at any time via the project management. |
| Target | |
| Target species | The program searches the genomes of the species you select here for sequences which are orthologous to the regions in the input file. Depending on the source organism only a certain selection of target organisms is available (i.e. orthologs can be searched only within vertebrates, plants, or insects respectively). |
| Output | |
| Result | Here, you can edit the default name of the result file. |
| Email address | Here you can choose between two methods for receiving
the results:
The results will be available on our server for a limited time. For details of how long your results will be kept, please see the result email. After that period they will be deleted unless protected in the project management! We recommend using the email option for more than approximately 500 input regions. |
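
The input requirements in the table above (tab-delimited BED with chromosome, start and end in this order, and regions longer than 50 bp) can be checked locally before upload. The sketch below is one possible pre-check, not part of the program; the function name `read_bed_regions` and the skip reasons are hypothetical. It relies on the standard BED convention that the start is 0-based and the end is exclusive, so the region length is `end - start`.

```python
def read_bed_regions(lines, min_length=51):
    """Split BED lines into usable regions and skipped line numbers.

    min_length=51 reflects the "> 50 bp" requirement stated for input
    sequences. Returns (regions, skipped), where regions is a list of
    (chrom, start, end) tuples and skipped is a list of
    (line_number, reason) tuples.
    """
    regions, skipped = [], []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        # Ignore blank lines, comments and browser/track header lines
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        try:
            chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        except (IndexError, ValueError):
            skipped.append((number, "malformed line"))
            continue
        if start < 0 or end <= start:
            skipped.append((number, "invalid coordinates"))
            continue
        if end - start < min_length:
            skipped.append((number, "region too short"))
            continue
        regions.append((chrom, start, end))
    return regions, skipped


example = [
    "# example input\n",
    "chr1\t100\t200\n",   # 100 bp, accepted
    "chr1\t100\t150\n",   # exactly 50 bp, too short
    "chr2 500 900\n",     # space-separated, malformed
    "chr3\t900\t500\n",   # end before start
]
regions, skipped = read_bed_regions(example)
print(regions)
print(skipped)
```

A check like this cannot detect regions that do not fit the selected genome build (for example wrong chromosome numbering); those are only reported after upload.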
The output has three sections:

For each input region, the output lists those species in which orthologs are found together with an alignment score.
Ortholog sets can be analyzed for common TFBS patterns with FrameWorker or DiAlignTF to identify phylogenetically conserved regulatory structures.
By combining the region selection via the "select-regions" buttons with an organism selection, almost any combination of sequences can be extracted into one sequence file for further analysis. Input sequences and orthologous regions can be saved in the local file system or in your personal sequence directory.

| © 1998-2011 Genomatix Software GmbH - All rights reserved |