
VCF File Toolbox



Introduction

The VCF file toolbox provides a number of tools that are often needed when handling VCF files. The following actions can be performed:
  1. Filter VCF files
    • by a list of genes
    • by a list of positions
  2. Merge/concatenate several VCF files
  3. Convert VCF to BED/bigBed

General Parameters for the VCF file toolbox

Input
Input data are accepted in VCF file format. Within this section you can either
  • choose from previously uploaded VCF files
  • or add a new VCF file to the list (by clicking "Add VCF file...")

When adding a new file, a new window will open, asking you to either

  • upload one or several VCF files from your local computer
  • or import one or several VCF files from the GMS (see more details)
  • or import one or several VCF files from the GGA (see more details)

Note that the interpretation of a VCF file depends critically on the underlying genome build. To change the build, select a different ElDorado version at the top right of the page before uploading the file.

Tasks
Filter VCF files
Filter VCF files to retain only the records that match the specified genes or positions.

Filter by a list of genes

Here you can either paste a list of Entrez Gene IDs (e.g. 30818) or Ensembl Gene IDs (e.g. ENSG00000115041) into the input field or upload a file with Gene IDs.
In both cases the Gene IDs can be separated by commas, blanks, or new lines.
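
As an illustration of the accepted separators, the following minimal Python sketch splits a pasted or uploaded list into Gene IDs. The function name parse_gene_ids and the patterns used to tell Entrez and Ensembl IDs apart are assumptions made for this example; they do not describe how the toolbox itself parses the input.

```python
import re

def parse_gene_ids(text):
    """Split text on commas, blanks, or new lines and sort the tokens by ID type.

    Hypothetical helper: the ID patterns below are assumptions for illustration.
    """
    tokens = [t for t in re.split(r"[,\s]+", text) if t]
    # Entrez Gene IDs are plain integers, e.g. 30818
    entrez = [t for t in tokens if t.isdigit()]
    # Ensembl Gene IDs look like ENSG00000115041 (or ENSMUSG... for mouse)
    ensembl = [t for t in tokens if re.fullmatch(r"ENS[A-Z]*G\d+", t)]
    # Anything else is reported separately instead of being dropped silently
    unknown = [t for t in tokens if t not in entrez and t not in ensembl]
    return entrez, ensembl, unknown

entrez_ids, ensembl_ids, unknown_ids = parse_gene_ids(
    "30818, ENSG00000115041\n12345 foo"
)
```

Collecting unrecognized tokens in a separate list makes it easy to spot typos before the filter is started.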

Note that the algorithm takes the selected species into account, so Gene IDs from another species (e.g. mouse) will not be found if "Homo sapiens" is selected as the current genome (top right).

Filter by a list of positions

You can also filter VCF files by a list of positions from a BED file. To do so, simply select one of the previously uploaded BED files in the list or upload a new one.

Note: The chromosome notation (including MT) will automatically be detected and adapted.
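
The idea of a position filter with harmonized chromosome names can be sketched as follows. This is a simplified illustration under stated assumptions, not the toolbox's implementation: the helper names are hypothetical, both notations are mapped to a common form without the "chr" prefix (with "M" mapped to "MT"), and only the POS base of each VCF record is tested against the BED regions.

```python
def normalize_chrom(name):
    """Map 'chr1' -> '1' and 'chrM'/'M' -> 'MT' (assumed common notation)."""
    name = name.strip()
    if name.lower().startswith("chr"):
        name = name[3:]
    return "MT" if name.upper() in ("M", "MT") else name

def load_bed(lines):
    """Collect BED regions per chromosome as (start, end) tuples."""
    regions = {}
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(normalize_chrom(chrom), []).append((int(start), int(end)))
    return regions

def filter_vcf(vcf_lines, regions):
    """Keep header lines and records whose POS falls inside a BED region."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        chrom, pos = line.split("\t")[:2]
        pos0 = int(pos) - 1  # VCF is 1-based; BED is 0-based, half-open
        hits = regions.get(normalize_chrom(chrom), [])
        if any(start <= pos0 < end for start, end in hits):
            kept.append(line)
    return kept
```

A linear scan over the regions is sufficient for small BED files; for large region lists an interval index would be the more appropriate choice.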

Merging/concatenating VCF files
Here, two or more VCF files can be concatenated.
The sample columns of each added file must match those of the VCF file chosen in the top selection in both count and names; otherwise the file will be skipped.
Check the 'allow overlaps (-a)' checkbox to allow the first coordinates of one file to precede the last record of another.
In addition, duplicates will be removed and the resulting VCF file will be sorted.
Note that you need to select the first file in the selection at the top of the page, and the file(s) to add to the first file here.
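
The rules above (matching sample columns, duplicate removal, sorting) can be illustrated with a minimal Python sketch. All function names are hypothetical, duplicates are assumed to be byte-identical record lines, and records are sorted by chromosome name and position as plain strings and integers; the toolbox may define duplicates and sort order differently.

```python
def split_vcf(lines):
    """Separate '##' meta lines, the '#CHROM' header line, and the records."""
    meta = [l for l in lines if l.startswith("##")]
    header = next(l for l in lines if l.startswith("#CHROM"))
    records = [l for l in lines if l.strip() and not l.startswith("#")]
    return meta, header, records

def sample_columns(header_line):
    """Return the sample names, i.e. all columns after FORMAT."""
    return header_line.rstrip("\n").split("\t")[9:]

def concat_vcfs(first, others):
    """Append records of 'others' to 'first'; skip files with other samples."""
    meta, header, records = split_vcf(first)
    skipped = []
    for index, lines in enumerate(others):
        _, other_header, other_records = split_vcf(lines)
        if sample_columns(other_header) != sample_columns(header):
            skipped.append(index)  # sample count or names differ
            continue
        records.extend(other_records)

    def sort_key(line):
        chrom, pos = line.split("\t")[:2]
        return (chrom, int(pos), line)

    # set() drops exact duplicate lines (an assumption about "duplicates")
    merged = sorted(set(records), key=sort_key)
    return meta + [header] + merged, skipped
```

The function returns the indices of the skipped files together with the merged lines, so that a caller can report which inputs were left out.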

Note: The chromosome notation (including MT) will automatically be detected and adapted.

Conversion from VCF to BED/bigBed format
Upload a VCF file or select an uploaded VCF file to convert it to BED or bigBed format.
The conversion turns 1-based, closed VCF coordinates into 0-based, half-open BED coordinates. One BED record will be printed for each ALT allele.
For deletions, the end position of the BED record is shifted by the length difference between the REF and the ALT allele.
A total of six columns will be printed per BED record:
#chrom	chromStart	chromEnd	name	score/quality	strand
The chrom, name, and score/quality values are taken 'as is' from the VCF file (columns #CHROM, ID, and QUAL).
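
The coordinate conversion can be sketched in Python as shown below. The function name vcf_record_to_bed is hypothetical, and two details are assumptions because this page does not specify them: the end position is computed as chromStart + 1, extended by len(REF) - len(ALT) for deletions, and the strand column is filled with the placeholder ".".

```python
def vcf_record_to_bed(line, strand="."):
    """Convert one VCF record into one six-column BED line per ALT allele.

    Assumptions: end = start + 1, plus len(REF) - len(ALT) for deletions;
    the strand value '.' is a placeholder.
    """
    chrom, pos, vid, ref, alt_field, qual = line.rstrip("\n").split("\t")[:6]
    start = int(pos) - 1  # 1-based, closed -> 0-based, half-open
    rows = []
    for alt in alt_field.split(","):
        end = start + 1
        if len(ref) > len(alt):  # deletion: extend by the length difference
            end += len(ref) - len(alt)
        rows.append("\t".join([chrom, str(start), str(end), vid, qual, strand]))
    return rows

snv_rows = vcf_record_to_bed("1\t100\trs1\tA\tG,T\t50\tPASS\t.")
deletion_rows = vcf_record_to_bed("2\t10\t.\tACGT\tA\t30\tPASS\t.")
```

Under these assumptions the SNV at position 100 yields two identical BED records spanning 99 to 100 (one per ALT allele), and the three-base deletion at position 10 yields one record spanning 9 to 13.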

Output
Result
Here you can edit the default name of the result file.

Email option
Email address
Here you can choose between two methods for receiving the results:
  • Show result directly in browser window
    With this option, the URL of the result is shown directly in your browser window.

    Warning: Please use this option only for analyses that can be completed in a short time.
    If the analysis takes longer than the timeout of the web server, the connection will be terminated and you will receive an error message (e.g. "The document contained no data."). In this case the results will not be available; please restart the analysis using the email option below.

  • Send the URL of the result via email
    With this option, an email containing the URL of the results will be sent to the email address you provide once the analysis has finished.

The results will be available on our server for a limited time. For details on how long your results will be kept, please see the result email. After that period they will be deleted unless they are protected in the project management.

We recommend using the email option for inputs with more than approx. 1,000 regions.