Sharing knowledge helps to grow it.

Strategy Seminar and Workshop Next Generation Sequencing Data Analysis

Intended audience

Users who want to apply Next Generation Sequencing for ChIP or expression studies. All analyses are done on the Genomatix Mining Station (GMS) and Genomatix Genome Analyzer (GGA).

Training concept

Lectures on theoretical background alternate with practical hands-on examples under the guidance of an instructor. The second day is dedicated to the analysis of your own NGS data.


Duration

2 days, approx. 6-7 hrs. per day, start on first day: 10:00 am

Number of participants

3 (minimum) up to 6 (maximum)


Intrexon Bioinformatics Germany GmbH, Bayerstr. 85a, 80335 Munich, GERMANY


How do I book?

Contact us to book a seat or to learn more about our seminar schedule.

Please note that there is a 50% discount for academic participants.

Seminar contents

Day 1

  1. Methodological background and Genomatix Mining Station (GMS): mapping, SNP detection, read classification.
  2. Methodological background and hands-on examples using Genomatix Genome Analyzer (GGA):
    • Expression analysis
    • Assessment of related biology
    • Data visualization
    • Peak detection and classification
    • TF binding site analysis in ChIP peaks
    • De novo definition of common sequence motifs from ChIP data
    • Next-neighbor analysis and regulatory target prediction for ChIP regions
    • Meta analysis of different data sets
    • SNP analysis

Day 2

  1. The second day is reserved for the analysis of your own data.

    In order for you to be able to use your data in the course, please make sure that:
    • The format and content of your data are supported by the Genomatix system. Please consult the file SuitableDataFormats.pdf for supported data formats and source organisms. Note that most analyses in the Genomatix system depend on genome annotation. They are therefore available for a number of eukaryotic species, which are listed in the above document, but presently not for prokaryotes.
    • Sequence data that need to be mapped must reach us at least ten days before the course. Please see the detailed description below of how to handle and upload your data.


Important information concerning the handling and analysis of your own data during the course.

For the analysis of NGS data, we will use the Genomatix Mining Station (GMS) and the Genomatix Genome Analyzer (GGA). On the morning of the first day, there will be a live demonstration of the GMS, which is used for the first-level analysis of the read sequences (including mapping). On day two, you will have the opportunity to analyze your own data using the GGA. For most applications, the GGA accepts mapped data in the BED or bigBed file format. The BAM and SAM formats are presently not supported. These are the options for bringing your own data:

  1. If your read data are already mapped and in BED or bigBed file format, please upload your data at least one week before the seminar starts. This leaves us time to contact you in case there are any compatibility issues with your data. For a description of these formats, please refer to the file SuitableDataFormats.pdf. Please verify that the species and the genome build your reads have been mapped to match the list of supported genomes, which you will also find there. Some older genome builds (e.g. hg18) are also available. Please enquire beforehand if you plan to send data that have been mapped to a previous genome version, or if you cannot provide your mapped data in BED or bigBed format.
  2. If you want to analyze SNPs in your data, you have two options. First, you can map your reads, call the SNPs using your software of choice, and bring the data to the course. In this case, the SNP data must be provided as text files in the variant call format (VCF), or as a list of dbSNP IDs, for SNP analysis on the GGA (effects on TF binding sites, canonical splice sites, and protein sequences). Please make sure that the genome version your SNP data are based on is supported by the Genomatix system, as listed in SuitableDataFormats.pdf. Alternatively, upload your sequence data as described below and notify us of your plans to do SNP analysis - we will then call SNPs for you on the GMS. Please note that SNP analysis cannot be done on data in BED or bigBed format.
  3. If your read data are available as sequence files, we will map them for you in advance using the GMS and make them available to you during the course. In this case, please upload your data to the Genomatix ftp site at least 10 days before the seminar starts (earlier is better in case you encounter connectivity problems). For organizational reasons, we may decline data that we receive later. Please consult the file SuitableDataFormats.pdf for supported sequence file formats and source species. There is a limit to the amount of data we can accept from each participant: please do not upload more than approximately 50 million Illumina or SOLiD reads, or 25 million 454 or IonTorrent reads. If your reads contain barcodes or linkers, please trim them accordingly, or send us the barcode/linker sequences so that we can mask them when mapping your data.
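
To check the read limits above before uploading, a short script along the following lines may help. It is only a sketch: it assumes your reads are stored as FASTQ with exactly four lines per record (optionally gzip-compressed), and the names `READ_LIMITS`, `count_fastq_reads` and `within_limit` are made up for this illustration. Other sequence formats would need a different counting rule.

```python
import gzip

# Approximate per-participant limits quoted in the text above.
READ_LIMITS = {
    "illumina": 50_000_000,
    "solid": 50_000_000,
    "454": 25_000_000,
    "iontorrent": 25_000_000,
}


def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming four lines per record."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def within_limit(total_reads, platform):
    """Return True if the read total does not exceed the platform limit."""
    return total_reads <= READ_LIMITS[platform.lower()]
```

For several files, sum the counts first, e.g. `within_limit(sum(count_fastq_reads(p) for p in paths), "Illumina")`.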

Please observe the following guidelines for uploading your sequence data for mapping by Genomatix:

Before you upload please make sure that you:

  • Include only files that are necessary and suitable for mapping (see descriptions of accepted file formats in SuitableDataFormats.pdf)
  • Include only data from supported species (see the list of accepted species in SuitableDataFormats.pdf)
  • Compress your files into an archive file (zip, tar.gz, tar.bz2, rar, or 7z)
  • Select a filename that facilitates identification, e.g. including your name
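
The last two points can be handled in one step. The following sketch uses Python's standard `tarfile` module to pack the selected files into a tar.gz archive whose name starts with your name. The function name `make_upload_archive` and the `_ngs_data` suffix are arbitrary choices for this example, not a naming scheme required by Genomatix.

```python
import tarfile
from pathlib import Path


def make_upload_archive(files, owner, out_dir="."):
    """Pack the given files into <owner>_ngs_data.tar.gz and return its path."""
    archive = Path(out_dir) / f"{owner}_ngs_data.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for f in files:
            # Store each file under its bare name, without local directories.
            tar.add(f, arcname=Path(f).name)
    return archive
```

A call such as `make_upload_archive(["sample1.fastq", "control.fastq"], "jane_doe")` would write `jane_doe_ngs_data.tar.gz` to the current directory (the file names here are placeholders).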

How to upload your data via FTP:

  • Please open a command line console and go into the directory where your data files are.
  • Start the command "ftp".
  • You will be asked for a username - use "ftp" as username and your e-mail address as password.
  • After the welcome message, please enter the commands "bin" for binary transmission, "pass" for passive connections, "prompt" to be able to use wildcards later on without being asked for confirmation of every single file, and "hash" to see the upload progress.
  • Then please change into the uploads directory by issuing the command "cd uploads".
  • Now you can start the upload by issuing "mput *" - this wildcard will upload ALL files in your current directory. To limit the upload e.g. to zip files, use "mput *.zip" instead.
  • You should see several lines of hash signs showing the progress of your upload.
  • After the prompt returns (which will take quite a while) you can leave the program by closing the window or entering "quit".
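
If you prefer a script to the interactive client, the same steps can be sketched with Python's standard `ftplib` module. This is an illustration only: the host name is not given in this text, so `ftp.example.org` below is a placeholder, and the function names `upload_files` and `upload_to_server` are invented for the example. The interactive "prompt" and "hash" settings have no counterpart here because the script neither asks for confirmation nor prints progress marks.

```python
from ftplib import FTP
from pathlib import Path


def upload_files(ftp, paths, remote_dir="uploads"):
    """Upload files over an already logged-in FTP connection."""
    ftp.set_pasv(True)       # corresponds to "pass" (passive connections)
    ftp.cwd(remote_dir)      # corresponds to "cd uploads"
    uploaded = []
    for path in paths:
        path = Path(path)
        with path.open("rb") as fh:
            # storbinary transfers in binary mode, like "bin" followed by "put"
            ftp.storbinary(f"STOR {path.name}", fh)
        uploaded.append(path.name)
    return uploaded


def upload_to_server(host, email, paths):
    """Log in as user "ftp" with the e-mail address as password, then upload."""
    with FTP(host) as ftp:
        ftp.login(user="ftp", passwd=email)
        return upload_files(ftp, paths)


# Hypothetical usage (placeholder host and address, not real values):
# upload_to_server("ftp.example.org", "you@example.org", Path(".").glob("*.zip"))
```

Keeping the transfer logic in `upload_files` separate from the login makes it possible to try the script against a test server or a stand-in object before sending large files.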

After the upload has completed, please notify us by e-mail. Please make sure you include the following information:

  • File name of your uploaded file
  • Species your data originate from
  • Type of sequencer used
  • Read length (if fixed)
  • Barcode/linker sequences where applicable
  • Short description of the data (e.g. is it RNA- or ChIP-Seq, what is experiment/control) and of the aim of the analysis
  • Your phone number in case we need to get back to you
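
To avoid forgetting one of these items, the notification text can be assembled from a small record. The sketch below is one possible way to do this; the class name `UploadNotice`, its field names and the label wording are assumptions made for this example, not a format required by Genomatix. Read length and barcode/linker sequences are optional, matching the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UploadNotice:
    file_name: str
    species: str
    sequencer: str
    description: str
    phone: str
    read_length: Optional[int] = None   # only if the read length is fixed
    barcodes: Optional[str] = None      # only where applicable

    def body(self):
        """Return the notification text, one item per line."""
        lines = [
            f"File name: {self.file_name}",
            f"Species: {self.species}",
            f"Sequencer: {self.sequencer}",
        ]
        if self.read_length is not None:
            lines.append(f"Read length: {self.read_length}")
        if self.barcodes:
            lines.append(f"Barcode/linker sequences: {self.barcodes}")
        lines.append(f"Description: {self.description}")
        lines.append(f"Phone: {self.phone}")
        return "\n".join(lines)
```

Calling `print(UploadNotice(...).body())` gives a block of text that can be pasted into the e-mail.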