This task calculates distance correlations between elements from at least two input files (called Anchor Set and Partner Set(s)),
or between elements in one input file and annotated genomic elements, e.g. promoters.
The output is a correlation graph showing the distribution of elements from the Partner Set(s) in correlation
to the anchors of the Anchor Set.
The elements of the Anchor Set are defined to be at position 0 in the graph, and for each Partner Set a
separate curve will be displayed.
Additionally, the correlations and the elements of the sets can be extracted based on the distance between the elements,
which makes it possible to extract from large sets those elements that fulfill certain distance requirements.

| Input data | |
|---|---|
| Anchor Set (Genomic elements used as anchor for correlation) |
Input data are accepted as a tab delimited file in BED / bigBed file format containing the input regions specified at
least by chromosome number, start position and end position (in this order).
When adding a new file, a new window will open, asking you to either
For the new BED files, you will have to select the correct organism, as the
organism and the genome build are associated with the BED file for future use
(the default is your latest choice in the current session).
Note that BED files critically depend on the underlying genome build, which can be changed by selecting a different ElDorado version on the top right of the page before uploading a BED file. You can see the list of genomes available in ElDorado.
Note that almost all browsers have a general upload limit of 2 GB, so BED files bigger than this should be zipped before uploading from your local computer. This restriction does not apply when using the direct import from the GGA/GMS.
Optionally, you can specify a name for saving uploaded BED files on the server; otherwise the name of the uploaded file will be used. If several files are uploaded, the string given here will be used as a prefix for each BED file name.
If any of the regions in the input file cannot be completely assigned to the selected genome (e.g. wrong chromosome numbering or wrong positions within a chromosome), an error message will appear and those regions will be skipped. If no valid region is found in an uploaded file, the complete file will be skipped.
After one or several BED files have been uploaded successfully and the popup window has been closed,
the list of available BED files will be updated automatically.
Uploaded BED files can be deleted from the project anytime via the project management. Alternatively, the position data of a set of genomic elements from the ElDorado database can be selected. Available elements are:
|
| Partner Set(s) (to be checked for correlations to Anchor Set) |
The data for the Partner Set(s) of genomic elements can be uploaded or selected
as described above for the Anchor Set. Up to 5 Partner Sets can be selected here (this will result in 5 curves in the output graph), i.e. several BED files can be selected from the "previously uploaded"-list or several different genomic elements can be selected. |
| Output | |
| Organism | If all input sets for GenomeInspector are genomic elements from ElDorado, the organism must be selected here. Otherwise, the species is inferred from the uploaded BED file(s) and the selection here is ignored. E.g. if a correlation between Primary Transcripts and MicroRNAs is analysed, the species must be given here. |
| Options | Distance: This is the distance between an element in the Anchor Set and an element in the Partner Set that will be analyzed. The default is 1000 bp, resulting in a window of 2000 bp (-1000 to +1000 from the anchor position in the Anchor Set elements) being displayed in the output graph. Required calculation time increases with maximum distance. Note that very long distances and large input sets can lead to a server timeout. Anchor for elements of Anchor Set
Use only distinct elements from Anchor Set These parameters are hidden by default. You can use the
|
| Colors |
For each correlation graph to appear in the output (currently up to 5)
a color can be selected, allowing a user-defined
combination of colors.
These parameters are hidden by default. You can use the
|
| Nucleotide Content | As optional additional output, the combined GC content, as well as the individual contents of A, C, G, and T can be displayed.
Percentages are calculated for each position based on the alignment of the Anchor Set elements at their anchor. Note that nucleotide content statistics will slow down the program, especially for long distance ranges. These parameters are hidden by default. You can use the
|
| Result | Here, you can edit the default name of the result file. |
The correlation table summarizes the number of correlations. It lists, for each combination of Anchor Set with Partner Set,
The correlation graph shows the distance distribution of elements in the Partner Set(s) to
the anchor points of elements in Anchor Set.
All anchor points are aligned at position 0 in the graph.
For elements in the Partner Set(s), the whole length of the elements is used for display.
Therefore, a partner element of length 100 situated between 501 and 600 bps downstream of the anchor point
in a correlation pair will increase the values of the correlation curve by 1 at all positions from 501 to 600.
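The counting rule described above can be illustrated with a minimal sketch. It is not the GenomeInspector implementation; the function name `correlation_curve`, the inclusive coordinate convention, and the example positions are assumptions made for illustration, and strand orientation is ignored.

```python
def correlation_curve(anchors, partners, max_distance=1000):
    """Count, for each offset from the anchor, how many partner elements cover it.

    anchors: list of anchor positions on one chromosome (hypothetical input).
    partners: list of (start, end) intervals, both ends inclusive,
              on the same chromosome as the anchors.
    Returns a dict mapping offset (-max_distance..+max_distance) to a count.
    """
    counts = {offset: 0 for offset in range(-max_distance, max_distance + 1)}
    for anchor in anchors:
        for start, end in partners:
            # Clip the partner element to the displayed window around the anchor.
            lo = max(start - anchor, -max_distance)
            hi = min(end - anchor, max_distance)
            # The whole length of the partner element contributes to the curve.
            for offset in range(lo, hi + 1):
                counts[offset] += 1
    return counts


# One anchor at position 10,000 and one partner element of length 100
# located 501 to 600 bp downstream of it.
curve = correlation_curve(anchors=[10_000], partners=[(10_501, 10_600)])
print(curve[500], curve[501], curve[600], curve[601])
```

In this sketch, a partner element lying entirely outside the window contributes nothing, because the clipped range is empty.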
A, C, G, T, and GC contents are optionally shown in green, red, yellow, blue, and magenta, respectively.
The left ordinate shows the correlation count, and the right ordinate shows the optional nucleotide content.
If only one Partner Set was selected, the mean correlation count +/- standard deviation is additionally indicated by blue lines intersecting the left ordinate.
The button "Download Graph Values" lets you export the values from the graph in tab-separated format for import into other programs (e.g. Excel).
Consider a correlation of transcripts (as Anchor Set, with the start of the transcript as anchor point) with uploaded regions from a ChIP-Chip experiment as Partner Set. Setting the distance parameters for extraction to -500 to -300 allows you to extract transcripts that have a correlated ChIP-Chip region overlapping the region 500 to 300 bp upstream of their TSS. Similarly, the corresponding ChIP-Chip regions can be extracted.
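As a rough illustration of this kind of distance-based extraction, the following sketch selects anchor/partner pairs whose partner region overlaps a given distance window. The function name `extract_pairs`, the element names, and the coordinates are hypothetical, and the sketch assumes that all elements lie on the same chromosome and that anchors are on the plus strand (so negative distances mean upstream). It does not reproduce the program's actual extraction logic.

```python
def extract_pairs(anchors, partners, dist_from=-500, dist_to=-300):
    """Return (anchor_name, partner_name) pairs within a distance window.

    anchors: dict mapping element name to anchor position (e.g. a TSS).
    partners: dict mapping element name to an inclusive (start, end) interval.
    A pair is reported when the partner overlaps the window
    [anchor + dist_from, anchor + dist_to].
    """
    pairs = []
    for anchor_name, pos in anchors.items():
        win_start, win_end = pos + dist_from, pos + dist_to
        for partner_name, (start, end) in partners.items():
            # Standard interval-overlap test on inclusive coordinates.
            if start <= win_end and end >= win_start:
                pairs.append((anchor_name, partner_name))
    return pairs


# Hypothetical transcripts (TSS positions) and ChIP-Chip regions.
transcripts = {"tx1": 10_000, "tx2": 50_000}
chip_regions = {"peakA": (9_550, 9_650), "peakB": (49_900, 50_100)}

pairs = extract_pairs(transcripts, chip_regions)
print(pairs)
```

Here only `tx1` is paired with `peakA`: `peakB` overlaps the TSS of `tx2` itself, but not the window 500 to 300 bp upstream of it.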

A corresponding correlation table might look like this:

GenomeInspector is described in the following publications:
| © 1998-2011 Genomatix Software GmbH - All rights reserved |