
Genomatix: Expression Analysis for RNASeq Data
(only available on GGA)



Introduction

This task analyzes tags from RNA-Seq experiments and calculates expression values for all transcripts/loci available in ElDorado.
If two data sets are supplied (e.g. treatment vs. control, or condition A vs. condition B, or tissue A vs. tissue B), the task can perform a differential analysis, i.e. calculate lists of up- and down-regulated transcripts/loci between the data sets.
If replicates for treatment and control data are available, the user can select from different methods like 'DESeq2', 'DESeq' or 'edgeR' to calculate the differential expression.
Optionally, the input reads can also be classified by their association with features from the ElDorado genome annotation (read statistics for exon, intron, partial, promoter, intergenic region).

Definition: There are two options for this task, locus-based and transcript-based expression analysis (for details see below). To keep the following descriptions as simple as possible, we use the terms transcript/locus or transcripts/loci to denote the two types of regions used for the RNA analysis, depending on the parameter setting.

Analysis of one data set

The reads in the input data set are analyzed, and for each transcript/locus in the respective genome the RPKM value (reads per kilobase transcript per million reads, Mortazavi et al., 2008) and additionally a normalized expression value (NE) are calculated from the read distribution. The NE-value is based on the number of reads located in the exons of the transcript/locus and is normalized to the length of the transcript/locus and the density of the data set.

Normalized expression/enrichment value (NE-value)
The NE-value is calculated based on the following formula:
NE = c * #reads_region / (#reads_mapped * length_region)
where NE is the normalized expression or enrichment value,
#reads_region: the reads (sum of base pairs) falling into either the transcript or the cluster region,
#reads_mapped: all mapped reads (in base pairs),
length_region: the transcript or cluster length in base pairs,
and c: a normalization constant set to 10^7.
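
As a minimal sketch of the formula above, the NE-value and the RPKM value could be computed as follows. The function and parameter names are hypothetical, the RPKM formula is the standard definition (reads per kilobase per million mapped reads), and the read length of 100 bp in the illustration is an assumption, not a property of the task.

```python
def ne_value(region_bases, mapped_bases, region_length, c=1e7):
    """NE = c * #reads_region / (#reads_mapped * length_region).

    region_bases  -- base pairs of reads falling into the transcript/locus
    mapped_bases  -- base pairs of all mapped reads
    region_length -- transcript/locus length in base pairs
    c             -- normalization constant (10^7)
    """
    if mapped_bases <= 0 or region_length <= 0:
        raise ValueError("mapped_bases and region_length must be positive")
    return c * region_bases / (mapped_bases * region_length)


def rpkm(region_reads, mapped_reads, region_length):
    """Reads per kilobase of transcript per million mapped reads."""
    if mapped_reads <= 0 or region_length <= 0:
        raise ValueError("mapped_reads and region_length must be positive")
    return 1e9 * region_reads / (mapped_reads * region_length)


# Illustration only: 50 reads of 100 bp (assumed read length) in a
# 2,000 bp transcript, with 10 million mapped reads in the data set.
read_length = 100
ne = ne_value(50 * read_length, 10_000_000 * read_length, 2000)
rp = rpkm(50, 10_000_000, 2000)
print(f"NE = {ne:.3f}, RPKM = {rp:.1f}")
```

Because the NE-value works on base pairs rather than read counts, the read length cancels out when all reads have the same length.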

NE-values are provided for the whole transcript/locus and for the most and least expressed exon of the transcript/locus. The results are summarized in a statistical overview.
Additionally a list with the GeneIds of the genes with the highest expression values can be downloaded or directly be used within the Genomatix Pathway System.
If replicates for one condition are provided, the analysis is done separately for each input file. This allows comparison of the expression across replicates.

Differential Analysis

If the expression values for two different conditions (here called "treatment" and "control" for simplicity) are to be compared, the following statistical testing methods for evaluating differential expression are available:

  • DESeq
  • DESeq2
  • edgeR
  • the procedure introduced by Audic and Claverie

While the Audic-Claverie method does not handle replicates, 'DESeq2', 'DESeq' and 'edgeR' were developed specifically for replicate data. Moreover, edgeR cannot be used if there are no replicates available.

Audic and Claverie introduced a formula to compute a conditional probability for observing N reads (treatment) in a class given that M reads were observed before (control). These p-values, in combination with the Genomatix normalized expression (NE) value are used to evaluate differential expression.
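
The conditional probability published by Audic and Claverie can be sketched as shown here. The formula is p(y|x) = (N2/N1)^y * (x+y)! / (x! * y! * (1 + N2/N1)^(x+y+1)), where x reads were observed in a first library of N1 reads and y reads in a second library of N2 reads. How this task turns the probability into its reported p-value is not described on this page, so the one-sided tail sum below is an assumption, and all function names are hypothetical.

```python
import math


def audic_claverie_probability(y, x, n1, n2):
    """p(y | x): probability of y reads (library size n2) given x reads (size n1)."""
    ratio = n2 / n1
    # Work in log space so that large read counts do not overflow.
    log_p = (
        y * math.log(ratio)
        + math.lgamma(x + y + 1)
        - math.lgamma(x + 1)
        - math.lgamma(y + 1)
        - (x + y + 1) * math.log1p(ratio)
    )
    return math.exp(log_p)


def upper_tail(y, x, n1, n2):
    """P(Y >= y | x), an assumed one-sided summary of the distribution."""
    below = sum(audic_claverie_probability(k, x, n1, n2) for k in range(y))
    return max(0.0, 1.0 - below)


# Illustration: 5 control reads, 20 treatment reads, equal library sizes.
print(upper_tail(20, 5, 1_000_000, 1_000_000))
```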

The 'DESeq2', 'DESeq' and 'edgeR' methods model count data (here the number of reads from an RNA-Seq experiment mapped to a transcript) by a negative binomial distribution. The parameters of the distribution (mean and dispersion) are estimated from the data, i.e. from the read counts in the input files. Each method computes a measure of read abundance, i.e. expression level (called 'base mean' or 'mean of normalized counts' in DESeq/DESeq2, and 'concentration' or 'counts-per-million' in edgeR) for each transcript and applies a hypothesis test to each transcript to evaluate differential expression. In particular, the three methods determine an (adjusted) p-value and a log2 fold change (in expression level) for each transcript.
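
To illustrate what the dispersion parameter means, here is a small sketch using the usual negative binomial parametrization (variance = mean + dispersion * mean^2). The function names are hypothetical. The square root of the dispersion is the biological coefficient of variation (BCV) that edgeR shows in its BCV plot.

```python
import math


def nb_variance(mean, dispersion):
    """Variance of a negative binomial count with the given mean and dispersion."""
    return mean + dispersion * mean ** 2


def bcv(dispersion):
    """Biological coefficient of variation, i.e. the square root of the dispersion."""
    return math.sqrt(dispersion)


# With dispersion 0 the model reduces to a Poisson distribution (variance == mean).
for dispersion in (0.0, 0.01, 0.1):
    print(dispersion, nb_variance(100, dispersion), bcv(dispersion))
```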
One parameter can be set for DESeq: the dispersion estimates are found by fitting a curve through the per-transcript dispersion estimates. The way this fitting is done can be specified to be either 'parametric' (the default in DESeq) or 'local'. Default settings are used for the other parameters; in particular, single pooled values are used as empirical dispersion estimates, and the maximum of the empirical and fitted values is used as the dispersion for a transcript or cluster. If there are no replicates, the settings are changed to the 'blind' method for computing the empirical dispersion estimates, and the fitted dispersion values are used. Sometimes the parametric fitting fails; in this case, the analysis should be rerun with the 'fitting method' set to 'local'. For details please refer to the DESeq vignette.
For DESeq2, two parameters are settable: The testing for differential expression can either be done with a Wald test or a Likelihood-ratio test. The former is the default testing method in DESeq2, while the latter is the one in use for DESeq. The other settable parameter is - as for DESeq - the fitting method used in dispersion estimation. See the DESeq2 vignette for details.
edgeR normalizes the count data using the TMM (trimmed mean of M-values) method introduced by Robinson and Oshlack. All parameters used in the edgeR algorithm are set to their respective default values. In particular, tagwise (i.e. per-transcript) dispersion estimation is used, with the tagwise dispersions squeezed towards the common dispersion, as described in the edgeR vignette.
Before the analysis, all transcripts without any mapped reads, i.e. with a read count of '0' in all samples, are removed from the input file 91.input_replicate_analysis. In the output file 92.output_replicate_analysis, these transcripts are listed with the value 'NA' in every output column (p-value, fold change etc.) except the 'id' column.
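
The filtering step described above can be sketched as follows. This is an illustration only: the data layout (a dictionary of per-sample read counts), the function names and the transcript IDs are hypothetical and do not reflect the actual file format of 91.input_replicate_analysis.

```python
def remove_unexpressed(counts):
    """Split transcripts into those with reads in at least one sample and the rest.

    counts -- dict mapping a transcript ID to a list of per-sample read counts
    """
    kept, removed = {}, []
    for transcript_id, per_sample in counts.items():
        if any(n > 0 for n in per_sample):
            kept[transcript_id] = per_sample
        else:
            removed.append(transcript_id)
    return kept, removed


def na_row(transcript_id, n_value_columns):
    """Output row for a removed transcript: the ID followed by 'NA' values."""
    return [transcript_id] + ["NA"] * n_value_columns


counts = {"T1": [0, 0, 0, 2, 2, 0], "T2": [0, 0, 0, 0, 0, 0]}
kept, removed = remove_unexpressed(counts)
print(kept)
print([na_row(tid, 3) for tid in removed])
```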

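For defining up- and down-regulated transcripts between two conditions or samples, the following criteria are used (parameters set by the user):

  • an adjusted p-value threshold
  • a log2 fold change threshold (set separately for up- and down-regulation)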

Note that the first input set is regarded as "treatment", whereas the second input file is used as "control", i.e. "up-regulation" refers to a higher expression in set1 than in set2. Also note that the direction of up- and down-regulation will change if the two data sets are exchanged in the input.

To calculate the list of up-regulated genes, all up-regulated alternative transcripts of a gene are used to calculate a mean log2 fold change in expression level. The gene list containing GeneId, Symbol and mean log2 fold change is then sorted by the highest log2 fold change. The top 50 genes are displayed in the output, the complete list can be downloaded and can be used as input data e.g. for the Genomatix Pathway System.
The list of down-regulated genes is calculated correspondingly, using all down-regulated alternative transcripts of a gene.
The program also gives the list of genes that are both up- and down-regulated, i.e. those genes where some alternative transcripts are up-regulated and some others are down-regulated at the same time.
See more details in the program output section below.
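
The gene-level summary described above can be sketched as follows. The input layout (one tuple per transcript), the function name and the gene IDs and symbols in the illustration are hypothetical. Infinite log2 fold changes are not treated specially here, which is an assumption.

```python
from collections import defaultdict
from statistics import mean


def summarize_genes(transcripts):
    """Build the up-, down- and both-regulated gene lists.

    transcripts -- iterable of (gene_id, symbol, log2_fold_change, regulation),
                   where regulation is "up", "down" or "no"
    """
    up, down = defaultdict(list), defaultdict(list)
    for gene_id, symbol, log2_fc, regulation in transcripts:
        if regulation == "up":
            up[(gene_id, symbol)].append(log2_fc)
        elif regulation == "down":
            down[(gene_id, symbol)].append(log2_fc)

    # Mean log2 fold change over the regulated transcripts of each gene.
    up_genes = sorted(
        ((g, s, mean(v)) for (g, s), v in up.items()),
        key=lambda row: row[2], reverse=True,
    )
    down_genes = sorted(
        ((g, s, mean(v)) for (g, s), v in down.items()),
        key=lambda row: row[2],
    )
    # Genes with both up- and down-regulated alternative transcripts.
    both = sorted(g for g, _ in set(up) & set(down))
    return up_genes, down_genes, both


transcripts = [
    (101, "GENE_A", 1.0, "up"),
    (101, "GENE_A", 3.0, "up"),
    (101, "GENE_A", -1.0, "down"),
    (102, "GENE_B", -2.0, "down"),
    (103, "GENE_C", 0.5, "up"),
    (104, "GENE_D", 0.1, "no"),
]
up_genes, down_genes, both = summarize_genes(transcripts)
print(up_genes[:50])  # the output page displays the top 50 genes
```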


Part of this task and functionality is described in:

Sultan M, et al (2008)
A Global View of Gene Activity and Alternative Splicing by Deep Sequencing of the Human Transcriptome
Science 321 (5891), 956-60


Parameters

Input
Input file(s) with read positions from RNA-Seq

Input data are accepted in BED / bigBed file format or BAM file format containing the input regions. For some tasks BAM support might not be available.
The maximum amount of input regions and their maximum length can differ for the various tasks. The limits are usually shown on top of the input pages.

Within this section you can either
  • choose from previously uploaded BED/BAM files
  • or add a new BED or BAM file to the list (by clicking "Add BED/BAM file...")
For those tasks that allow choosing replicate data as input, you can use shift/ctrl-keys to select multiple files from the list. All selected files will then be treated as replicates.

When adding a new file, a new window will open, asking you to either

  • upload one or several BED/BAM files from your local computer
  • or import one or several BED/BAM files from the GMS (see more details)
  • or import one or several BED/BAM files from the GGA (see more details)
For the new BED/BAM files, you will have to select the correct organism, as the organism and the genome build are associated with the BED file for future use (the default is your latest choice in the current session).
Note that files critically depend on the underlying genome build, which can be changed by selecting a different ElDorado version on the top right of the page before uploading a file. You can see the list of genomes available in ElDorado.

Note that almost all browsers have a general upload limit of 2 GB, i.e. files bigger than this size should be zipped before uploading from your local computer. This restriction does not apply when using the direct import from the GGA/GMS.

Optionally you can specify a name for saving uploaded files on the server, otherwise the name of the uploaded file will be used. If several files are uploaded, the string given here will be used as prefix for each file name.

If any of the regions in the input file cannot be completely assigned to the selected genome (e.g. wrong chromosome numbering or wrong positions within a chromosome), an error message will appear and the regions will be skipped. If no valid region is found in an uploaded file, the complete file will be skipped.

After one or several BED/BAM files were uploaded successfully, and after closing the popup window, the list of available BED/BAM files will be automatically updated.

Uploaded BED or BAM files can be deleted from the project anytime via the project management.

Optional
control file(s) for differential analysis

If additional input data is available (e.g. data from a different condition or tissue, here called "control" data), it can be selected or uploaded here. After the tickbox is checked, an additional selection will appear (same options as for the "treatment" file(s), see above).
If several BED/BAM files are selected within the scrolling box, they are treated as replicates for the "control" condition.

Differential Analysis Parameters

The differential analysis parameter section will only appear if at least one control file was uploaded in the section above.
There are four available algorithms for calculating the differential expression/enrichment values:

  • DESeq (recommended for replicate data, but does work on non-replicates, too)
    It is possible to select the 'fitting type' parameter for DESeq, i.e. the way the curve is fitted through the dispersion estimates. For details on the meaning of this parameter please refer to the DESeq vignette.
  • DESeq2 (recommended for replicate data, but does work on non-replicates, too)
    As for DESeq, it is possible to select the 'fitting type' parameter for DESeq2, i.e. the way the curve is fitted through the dispersion estimates.
    Additionally, DESeq2 offers two alternative methods for testing for differential expression: Wald test and Likelihood-ratio test (with Wald test being the default).
    For details on the meaning of the parameters please refer to the DESeq2 vignette.
  • edgeR (recommended for replicate data, does not work on non-replicates)
  • the procedure introduced by Audic and Claverie (if no replicates are available)
For a short introduction to the different methods, see above in the Introduction to the Differential Analysis.

The thresholds that define a transcript as differentially expressed (or a region as enriched/depleted) can be set here. Two criteria are combined; both must be satisfied for differential expression/enrichment:

  • an adjusted p-value threshold for the significance of observing the detected change
    Note that the p-values calculated by the different methods (DESeq/DESeq2, edgeR, Audic-Claverie) can differ.
    Also note that setting the p-value threshold to 1 effectively skips this criterion.
  • a threshold for the log2 fold change of expression/enrichment level
    A log2 ratio of 1 is a fold change of 2; a log2 ratio of 0.585 is a fold change of 1.5; e.g. if the log2 fold change of expression/enrichment is set to ≥ 1, the expression values must go up by at least 100% to appear in the differentially expressed transcripts/enriched regions list.
    The log2 fold change thresholds can be set separately for up- and down-regulation (enrichment/depletion).
    Note that if the log2 fold change thresholds are set to 0, fold changes are ignored in the analysis.
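
The combination of the two criteria can be sketched as follows. The function name, the default threshold values and the choice of inclusive comparisons are assumptions for illustration. The threshold for down-regulation is given here as a positive magnitude, which may differ from how the input form expects it.

```python
import math


def classify_regulation(adj_p_value, log2_fc,
                        max_adj_p=0.05, min_log2_fc_up=1.0, min_log2_fc_down=1.0):
    """Return "up", "down" or "no" for one transcript/locus."""
    if math.isnan(adj_p_value) or math.isnan(log2_fc):
        return "no"  # no test result available (e.g. 'NA' in the output)
    if adj_p_value > max_adj_p:
        return "no"  # change is not significant
    if log2_fc > 0 and log2_fc >= min_log2_fc_up:
        return "up"
    if log2_fc < 0 and -log2_fc >= min_log2_fc_down:
        return "down"
    return "no"


def fold_change(log2_fc):
    """Convert a log2 ratio into a fold change, e.g. 1 -> 2 and 0.585 -> about 1.5."""
    return 2.0 ** log2_fc


print(classify_regulation(0.01, 1.3))
# p-value threshold 1 and fold change thresholds 0 switch both criteria off:
print(classify_regulation(0.966, -0.18, max_adj_p=1.0,
                          min_log2_fc_up=0.0, min_log2_fc_down=0.0))
```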

Analysis Options
Transcript/Locus
The expression analysis can be based on different units of underlying data:
  • Locus-based expression analysis:
    The exons of all transcripts with the same GeneID within a Genomatix locus are taken together and this "gene body" is used for counting reads (i.e. reads in overlapping exons of transcripts within the same locus are counted once)
  • Transcript-based expression analysis:
    All transcripts are considered separately when counting reads in exons (and reads within overlapping transcripts/exons might be counted several times)
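
The difference between the two counting modes can be sketched as follows. This is a simplified model, not the task's implementation: reads are reduced to single positions, exon coordinates are treated as inclusive, and all names and coordinates are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into a non-overlapping list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def count_reads(intervals, read_positions):
    """Number of reads whose position lies in at least one interval."""
    return sum(
        any(start <= pos <= end for start, end in intervals)
        for pos in read_positions
    )


def transcript_counts(transcripts, read_positions):
    """Transcript-based: every transcript is counted separately."""
    return {tid: count_reads(exons, read_positions)
            for tid, exons in transcripts.items()}


def locus_count(transcripts, read_positions):
    """Locus-based: exons of all transcripts form one gene body, reads count once."""
    body = merge_intervals(
        [exon for exons in transcripts.values() for exon in exons]
    )
    return count_reads(body, read_positions)


transcripts = {"t1": [(100, 200), (300, 400)], "t2": [(150, 250)]}
reads = [120, 180, 220, 350, 500]
print(transcript_counts(transcripts, reads))
print(locus_count(transcripts, reads))
```

In this illustration the read at position 180 falls into exons of both transcripts, so it is counted twice in the transcript-based mode but only once in the locus-based mode.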
If the transcript-based expression analysis is checked, the transcripts used for expression analysis can additionally be constrained by their source (e.g. NCBI RefSeq). By default, all non-redundant transcripts available in ElDorado are used. Depending on the organism, several transcript sources are available. For example, human and mouse transcripts are available from
  • NCBI RefSeq
  • Ensembl
  • NCBI GenBank
For plants, additional sources may be available (e.g. Phytozome for Glycine max).
Read Classification
When checked, a read classification is done for each input file from the input data: The number of input reads overlapping genomic elements like exons, introns, promoters and intergenic regions will be given in the result.
Strand Specificity

Check this box if the sequencing experiment was done in a strand specific manner (depending on library preparation).

Output
Result
Here, you can edit the default name of the result file.
Email address
Here you can choose between two methods for receiving the results:
  • Show result directly in browser window
    In this option the URL of the result is directly shown in your browser window.

    Warning: Please use this option only for analyses which can be performed in a short time.
    If the analysis takes longer than the timeout of the webserver, the connection will be terminated and you will receive an error message (e.g. "The document contained no data."). In this case, the results will not be available. Please restart the analysis using the option "Send the URL of the result via email" below.

  • Send the URL of the result via email
    In this option an email with the URL of the results will be sent to the user provided email address, when the analysis is finished.

The results will be available for a limited time on our server. For details of how long your results will be kept please see the result-email. After that period they will be deleted unless protected in the project management!

We recommend using the email option for this task!


Output

The output has a number of sections, depending on the input (one or two data sets) and parameters:

  1. Analysis Parameters
  2. Differential Expression Analysis (only if both treatment and control data sets were provided)
  3. Read Classification Statistics for all input files (if optional parameter was set)
  4. Expression Analysis for each input data set
  5. Download of Data Files
The result sections are described in detail below.

1. Analysis Parameters


2. Differential Expression Analysis

For the differential expression analysis, a comparison of the expression values of the two input data sets (treatment versus control, possibly each with replicate data) is done. First the comparison is done on transcript/locus level by the selected method (DESeq, DESeq2, edgeR, or Audic/Claverie), i.e. each transcript/locus is checked for whether it fulfills the user-defined thresholds for the adjusted p-value and the log2 fold change.
Only for the transcript-based analysis option:
In a second step, the analysis is done on gene level (here, a gene denotes a locus with a corresponding GeneId). To calculate the list of up-regulated genes, all up-regulated alternative transcripts of a gene are used to calculate a mean log2 fold change of expression level. The resulting gene list containing GeneId, Symbol and the mean log2 fold change is then sorted by the highest log2 fold change and the top genes are displayed in the output (see below).

Note that the log2 fold change values cannot be calculated under certain conditions (e.g. if no expression is detected for a transcript/locus in the control set). Such cases are indicated by a "-Inf", "Inf" or "NA" value in the output.
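
The special values can be illustrated with a small sketch. The function name is hypothetical, and returning "NA" (represented as NaN) when both conditions show no expression is an assumption about how the task behaves.

```python
import math


def log2_fold_change(treatment, control):
    """log2(treatment / control) with the special cases Inf, -Inf and NA (NaN)."""
    if treatment == 0 and control == 0:
        return math.nan       # "NA": no expression in either condition
    if control == 0:
        return math.inf       # "Inf": expression only in treatment
    if treatment == 0:
        return -math.inf      # "-Inf": expression only in control
    return math.log2(treatment / control)


print(log2_fold_change(4.0, 1.0))
print(log2_fold_change(0.0, 2.0))
print(log2_fold_change(2.0, 0.0))
```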

Differential Expression Overview

The numbers are given in an overview table.

The download links below the numbers give access to the corresponding data files. The formats of the files are described below.

Additionally all data files can be downloaded as a compressed archive in the download-section at the very end of the result page.

Here is a short overview of results and their corresponding file names (for format details see the section "Download of Data Files" below):

Details on                                  Filename
all analyzed transcripts/loci               10.transcript_summary or 10.loci_summary (.tsv)
differentially expressed transcripts/loci   11.diff_expressed_transcripts or 11.diff_expressed_loci (.tsv, .bed for up/down)
both up- and down-regulated genes           12.diff_expressed_genes_up_and_down (.tsv)
up-regulated genes                          13.diff_expressed_genes_up (.tsv, .list)
down-regulated genes                        14.diff_expressed_genes_down (.tsv, .list)
all analyzed genes                          15.genes_summary (.tsv)

Up-Regulation

Only for the transcript-based analysis option:
To calculate the list of up-regulated genes, all up-regulated alternative transcripts of a gene are used to calculate a mean log2 fold change of expression level. The resulting gene list containing GeneId, Symbol and the mean log2 fold change is then sorted by the highest log2 fold change.
The top 50 up-regulated genes are displayed in the output; the file with details can be downloaded from the overview table (see above).
Note, that not all alternative transcripts of a gene are necessarily regulated in the same direction, thus the total number of alternative transcripts for a gene and the number of up-regulated transcripts for a gene can differ. Here, the mean log2 fold change refers to all up-regulated alternative transcripts of a certain gene.

Down-Regulation

The list of down-regulated genes is calculated as for the up-regulated genes, just using all down-regulated alternative transcripts of a gene and sorting by the lowest mean log2 fold change in expression.

Pathway and Network Analysis

Any subset of the up- and down-regulated gene lists can be used to start Genomatix Pathway System (GePS). In GePS, the log2 values will be used for coloring the genes in the pathways.
If the expression analysis was not done in the human system, an additional option allows using the orthologous genes in human. This allows a transfer to the human canonical Signal Transduction Pathways available in GePS.

Plots

Note: The graphics are at first depicted as smaller icons, which can be enlarged by clicking them.

3. Read Classification Statistics

In case the read statistics option has been checked, statistics regarding the distribution of the reads for each of the input data sets is given.

Here, each input read is classified as exon, intron, partial or intergenic.

Additionally, a region can be classified as promoter.

A graphical representation shows the enrichment of reads in the classes above compared to the genomic background (tab "Enrichment") and the distribution as a pie chart in comparison with the genomic background (tab "General").

A table lists the number of reads overlapping the genomic classes.

Additionally, a table with the distribution of the reads on the different chromosomes of the genome is available. The content of this table is hidden by default, but can be shown by clicking the "Show details" link in the header.

If a detailed read classification is desired, this can be done for the input files with the task Annotation & Statistics.


4. Expression Analysis for each input data set

A table with an overview of the number of loci and transcripts found to be expressed is shown. Additionally, the minimum, average and maximum Normalized Expression values are given. (The NE-value is based on the number of reads located in the exons of the transcript and is normalized to the length of the transcript and the density of the data set (for details see above). For loci, the maximum NE value of all transcripts within the locus is used.)


Details on NE value distribution

The histogram shows how many transcripts are expressed with a specific intensity. The histogram displays 50 classes of NE values. Note that the last class sums up all NE values larger than 1.
This section is hidden by default and can be shown by clicking on the >>>show details<<< link.


Expression Profile for Transcripts

A file with the complete annotation for each transcript as described in the expression profile section of the NGS-Analyzer page can be downloaded.
Additionally, a BED file with a region for each transcript annotated in ElDorado with the NE-value as score can be downloaded or saved directly into the Genomatix Suite project management, to be available for further analysis with other tasks.

Expression Profile for Genes

Here, the 5 genes with the highest expression values are displayed (the list can be extended to 50).
The complete gene list can be downloaded; it contains the GeneIds and Gene symbol of the genes together with the number of transcripts for this gene and their normalized expression values (NE value and RPKM value of the highest expressed transcript as well as the mean NE and mean RPKM values for all transcripts of the gene).
Additionally, the Genomatix Pathway System can be started directly from here, allowing the user to select the number of genes to use as input.

5. Download of Data Files

All resulting data and graphics files can be downloaded as an archive (tar-file).
The expression analysis results for each input data set can be found in the sub-directories called either "sample_<nr>" or "ctrl_<nr>".
Also, some additional data files are available if one of the test methods 'edgeR', 'DESeq' or 'DESeq2' was selected.
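
As a sketch of how the downloaded archive could be inspected, the following groups the files by their "sample_<nr>" or "ctrl_<nr>" sub-directory. The function name and the top-level directory "result/" are hypothetical, and the illustration builds a tiny archive in memory instead of using a real download.

```python
import io
import re
import tarfile
from collections import defaultdict

SAMPLE_DIR = re.compile(r"^(sample|ctrl)_\d+$")


def files_by_sample(archive):
    """Group the files of an open tar archive by their sample/ctrl directory."""
    grouped = defaultdict(list)
    for member in archive.getmembers():
        if not member.isfile():
            continue
        parts = member.name.split("/")
        for index, part in enumerate(parts[:-1]):
            if SAMPLE_DIR.match(part):
                grouped[part].append("/".join(parts[index + 1:]))
                break
    return dict(grouped)


def demo_archive():
    """Build a small in-memory tar file (stand-in for the real download)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name in ("result/sample_1/07.expression_profile",
                     "result/ctrl_1/07.expression_profile",
                     "result/10.transcript_summary.tsv"):
            info = tarfile.TarInfo(name)
            info.size = 0
            tar.addfile(info, io.BytesIO(b""))
    buffer.seek(0)
    return buffer


with tarfile.open(fileobj=demo_archive(), mode="r") as tar:
    grouped = files_by_sample(tar)
print(grouped)
```

For a real download, the archive would be opened with `tarfile.open(path)` using the path of the saved tar-file.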

Here is an overview of results and their corresponding file names (for format details see below):

Details on                                              Filename

For each input data set:
expression values for each transcript in this sample    sample-dir/07.expression_profile
statistics on the distribution of NE values             sample-dir/08.expression_statistics

Main results:
all analyzed transcripts/loci                           10.transcript_summary or 10.loci_summary (.tsv)
differentially expressed transcripts/loci               11.diff_expressed_transcripts or 11.diff_expressed_loci (.tsv, .bed for up/down)
both up- and down-regulated genes                       12.diff_expressed_genes_up_and_down (.tsv)
up-regulated genes                                      13.diff_expressed_genes_up (.tsv, .list)
down-regulated genes                                    14.diff_expressed_genes_down (.tsv, .list)
all analyzed genes                                      15.genes_summary (.tsv)

If differential analysis with the test methods 'edgeR', 'DESeq', or 'DESeq2' was selected:
count data used for the test method                     91.input_replicate_analysis
library size, i.e. the total read numbers               91.input_replicate_analysis.libsize
result from the test method                             92.output_replicate_analysis

Plots:
MA plot                                                 93.MA_plot.png
Volcano plot of adjusted p-values                       94.volcano_plot.png
Histogram of unadjusted p-values                        95.pvalue_histogram.png
Dispersion plot (for DESeq/DESeq2)                      96.dispersion_plot.png
BCV plot (for edgeR)                                    96.bcv_plot.png

Data files for all analyzed transcripts/loci and for differentially expressed transcripts/loci

The differentially expressed transcripts/loci are the subset of all transcripts/loci, containing only those transcripts/loci that fulfill the user-defined criteria for the (adjusted) p-value and log2 fold change threshold.
Both the tab-separated file for all analyzed transcripts/loci (default name "10.transcript_summary.tsv" or "10.loci_summary.tsv", respectively) and the data file for the differentially expressed transcripts/loci ("11.diff_expressed_transcripts.tsv" or "11.diff_expressed_loci.tsv") contain the following information.
Note: columns marked with a (*) are only available with the transcript-based analysis option
  1:(*) transcript ID (ElDorado)
  2:(*) accession number of the transcript (external e.g. RefSeq, GenBank, Ensembl)
  3:(*) locus ID (ElDorado)
  4:  symbol of the gene
  5:  gene ID (NCBI Entrez Gene, 0 if not available, -2 if ambiguous)
  6:  contig/chromosome accession number
  7:  chromosome
  8:  strand
  9:  start position of the transcript/locus
 10:  end position of the transcript/locus (start < end)
 11:  length of the transcript/locus (sum of exons)
 12:  number of exons
 13:  p-value (depends on the selected method)
 14:  adjusted p-value (depends on the selected method)
 15:  log2(fold change), logarithmic (base 2) fold-change in read abundance/expression level in treatment 
      over control (> 0 is enrichment in treatment, < 0 is decrease in treatment). 
      The computation of the log2(fold change) value depends on the selected method:
        Audic-Claverie: log2 (NE value of treatment data set / NE value of control data set)
        DESeq/DESeq2: based on base mean values of treatment and control
        edgeR: based on concentration values of treatment and control
      Note, that the log2(fold change) value can be -Inf/+Inf if one of the conditions shows no expression
 16:  Regulation of treatment (set1) compared to control (set2), (values can be "up", "down", "no")

 the following columns depend on the number of input files:
 - number of reads for each replicate from the treatment sets and the control sets
 - normalized expression value for each replicate from the treatment sets and the control sets
 - the mean normalized expression value across the treatment replicates
 - the standard deviation of the normalized expression values across the treatment replicates
 - the mean normalized expression value across the control replicates
 - the standard deviation of the normalized expression values across the control replicates
 - RPKM for each replicate from the treatment sets and the control sets
 - the mean RPKM across the treatment replicates
 - the standard deviation of the RPKM values across the treatment replicates
 - the mean RPKM value across the control replicates
 - the standard deviation of the RPKM values across the control replicates
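
As a sketch of how such a file could be read in a script, the following parses the fixed columns of a transcript-based summary file by position. It assumes that the file starts with a header row and that numbers are written as in the example below (e.g. "5.86E-001", "-Inf", "NA"). The function names and the header labels in the illustration are hypothetical. For the locus-based option the columns marked with (*) are missing, so the positions would shift.

```python
import csv
import io
import math


def parse_number(text):
    """Convert a table cell to float; 'NA' and empty cells become NaN."""
    text = text.strip()
    if text in ("NA", ""):
        return math.nan
    return float(text)  # float() also accepts "Inf" and "-Inf"


def read_transcript_summary(handle):
    """Read columns 1, 4 and 13-16 of a transcript-based summary file."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row (assumed to be present)
    rows = []
    for fields in reader:
        rows.append({
            "transcript_id": fields[0],
            "symbol": fields[3],
            "p_value": parse_number(fields[12]),
            "adj_p_value": parse_number(fields[13]),
            "log2_fold_change": parse_number(fields[14]),
            "regulation": fields[15],
        })
    return rows


# Illustration with one row taken from the example data file below.
header = "\t".join(f"col{i}" for i in range(1, 17))
line = "\t".join([
    "GXT_21962908", "NM_174360", "GXL_353253", "CXCR2", "281863",
    "NC_007300", "chr2", "-", "110616093", "110617795",
    "1703", "1", "7.42E-002", "5.73E-001", "-Inf", "down",
])
rows = read_transcript_summary(io.StringIO(header + "\n" + line + "\n"))
print(rows[0])
```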

Example transcript data file (three replicates for treatment and control each, method 'DESeq')

TranscriptId(*)    Accn(*)        LocusId(*)      Symbol   GeneId   ContigAccn  Chromosome   Strand  Start      End         ...
GXT_21962895     NM_001034592 GXL_353276    DNAJB2   533668   NC_007300   chr2         +       111629112  111636393   ...
GXT_21962908     NM_174360    GXL_353253    CXCR2    281863   NC_007300   chr2         -       110616093  110617795   ...
GXT_21962913     XM_592166    GXL_353073    LCT      514332   NC_007300   chr2         +       64499763   64548198    ...

... Transcript length   #exons       p-value      adj. p-value    log2(fold change)  Regulation    ...
...    1894                10       5.86E-001      9.66E-001        -0.18               down       ...
...    1703                 1       7.42E-002      5.73E-001        -Inf                down       ...
...    5784                17       1.60E-001      7.93E-001         0.66               up         ...

... #reads treat1   #reads treat2   #reads treat3   #reads ctrl1   #reads ctrl2   #reads ctrl3     ...
...      1124           2563            629              2619            1105          1100        ...
...         0              0              0                 2               2             0        ...
...         0             11              2                 0               4             0        ...

...  NE treat1       NE treat2       NE treat3       NE ctrl1       NE ctrl2       NE ctrl3        ...
...     0.29           0.25            0.25            0.28           0.34           0.3           ...
...     0              0               0               0              0              0             ...
...     0              0               0               0              0              0             ...

... mean NE(treat)     stddev NE(treat)     mean NE(ctrl)     stddev NE(ctrl)                      ...
...     0.26                 0.02               0.31              0.03                             ...
...     0                    0                  0                 0                                ...
...     0                    0                  0                 0                                ...

...  RPKM treat1       RPKM treat2     RPKM treat3     RPKM ctrl1     RPKM ctrl2     RPKM ctrl3    ...
...     0.29             0.25            0.25            0.28           0.34           0.3         ...
...     0                0               0               0              0              0           ...
...     0                0               0               0              0              0           ...

... mean RPKM(treat)     stddev RPKM(treat)     mean RPKM(ctrl)     stddev RPKM(ctrl)
...     0.26                 0.02                  0.31                0.03
...     0                    0                     0                   0
...     0                    0                     0                   0

Data files for up-regulated genes and for down-regulated genes

The data files for the up-/down-regulated genes (default names "13.diff_expressed_genes_up.tsv" and "14.diff_expressed_genes_down.tsv") contain data allowing to assess variance of expression within the alternative transcripts of the gene. The following columns are available (tab-separated format):
Note: columns marked with a (*) are only available with the transcript-based analysis option
  1:  gene ID (NCBI Entrez Gene)
  2:  symbol of the gene
  3:(*) number of alternative transcripts for this gene that are up-/down-regulated
  4:(*) total number of alternative transcripts available in the Genomatix annotation for this gene
  5: (mean)(*) log2 fold change of up-/down-regulated transcripts/loci
  6:(*) min log2 fold change of up-/down-regulated transcripts/loci
  7:(*) max log2 fold change of up-/down-regulated transcripts/loci
  8:(*) standard deviation across the log2 fold change values of the regulated alternative transcripts
  9: (minimum)(*) p-value for the regulated transcripts/loci
 10: mean NE(treat.reg.):   mean normalized expression value for the regulated transcripts/loci in the 
                            treatment data
 11: stddev NE(treat.reg.): standard deviation across the NE values for the regulated transcripts/loci in the 
                            treatment data
 12: mean NE(ctrl.reg.):    mean normalized expression value for the regulated transcripts/loci in the 
                            control data
 13: stddev NE(ctrl.reg.):  standard deviation across the NE values for the regulated transcripts/loci in the 
                            control data
 14: mean RPKM(treat.reg.):   mean RPKM value for the regulated transcripts/loci in the treatment data
 15: stddev RPKM(treat.reg.): standard deviation across the RPKM values for the regulated transcripts/loci in the 
                              treatment data
 16: mean RPKM(ctrl.reg.):    mean RPKM value for the regulated transcripts/loci in the control data
 17: stddev RPKM(ctrl.reg.):  standard deviation across the RPKM values for the regulated transcripts/loci in the 
                              control data
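
The column layout above can be read by position, as in the following minimal Python sketch. It is an illustration only and not part of the task itself: it assumes the transcript-based analysis option (all 17 columns present) and that the first line of the file is a header line. The names RegulatedGene and read_regulated_genes are hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class RegulatedGene:
    """Columns 1-9 of an up-/down-regulated gene file (transcript-based option)."""
    gene_id: str
    symbol: str
    n_regulated: int
    n_total: int
    mean_log2fc: float
    min_log2fc: float
    max_log2fc: float
    sd_log2fc: float
    min_p_value: float


def read_regulated_genes(handle):
    """Parse a tab-separated gene file from an open text handle."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # assumption: the first line is a header line
    genes = []
    for row in reader:
        if len(row) < 9:
            continue  # skip blank or truncated lines
        genes.append(RegulatedGene(
            gene_id=row[0], symbol=row[1],
            n_regulated=int(row[2]), n_total=int(row[3]),
            mean_log2fc=float(row[4]), min_log2fc=float(row[5]),
            max_log2fc=float(row[6]), sd_log2fc=float(row[7]),
            min_p_value=float(row[8]),
        ))
    return genes


if __name__ == "__main__":
    # One data row, header replaced by a placeholder line.
    demo = io.StringIO(
        "header\n"
        "8074\tFGF23\t2\t2\t5.916\t5.916\t5.916\t0.000\t1.29e-04\t"
        "0.09296\t0.116\t0.00190\t0.002\t9.05212\t11.249\t0.18425\t0.178\n"
    )
    for gene in read_regulated_genes(demo):
        print(gene.symbol, f"{gene.n_regulated}/{gene.n_total}")
```

For a real result file, pass an open file handle instead, e.g. open("13.diff_expressed_genes_up.tsv", newline="").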

Example data file for up-regulated genes (13.diff_expressed_genes_up.tsv)

GeneId    Symbol  #transcripts regulated(*)  total #transcripts for gene(*)     ...
8074      FGF23               2                         2                   ...
4490      MT1B                3                         6                   ...
212       ALAS2               9                        14                   ...

...  mean log2(fold change) of reg. trans.(*)     min fold change of reg. trans.(*)    max fold change of reg. trans.(*)    ...
...       5.916                                       5.916                         5.916                        ...
...       5.068                                       5.059                         5.073                        ...
...       4.513                                       4.256                         5.064                        ...

...  fc stddev(*)   (min)(*) p_value    mean NE(treat.reg.)  stddev NE(treat.reg.)   mean NE(ctrl.reg.)    ...
...    0.000      1.29e-04          0.09296               0.116             0.00190          ...
...    0.007      5.20e-04          0.10983               0.076             0.00175          ...
...    0.364      3.22e-03          0.00795               0.004             0.00028          ...

...  stddev NE(ctrl.reg.)   mean RPKM(treat.reg.)   stddev RPKM(treat.reg.)   mean RPKM(ctrl.reg.)   stddev RPKM(ctrl.reg.)
...         0.002             9.05212               11.249                 0.18425                  0.178
...         0.002            10.76112                7.458                 0.25606                  0.208
...         0.000             0.79321                0.382                 0.02748                  0.041
The corresponding .list files (default names "13.diff_expressed_genes_up.list" and "14.diff_expressed_genes_down.list") contain a short format with only the following columns:
 1: gene Id (NCBI Entrez Gene)
 2: mean log2 fold change of up-/down-regulated transcripts
 3: symbol of the gene

Data file for up- and down-regulated genes

This file is only available for the transcript-based analysis option.
The tab-separated file for the up- and down-regulated genes (default name "12.diff_expressed_genes_up_and_down.tsv") contains the following columns:
 1: gene Id (NCBI Entrez Gene)
 2: symbol of the gene
 3: total number of alternative transcripts for this gene
 4: number of up-regulated transcripts for this gene
 5: mean log2 fold change of up-regulated transcripts
 6: number of down-regulated transcripts for this gene
 7: mean log2 fold change of down-regulated transcripts
 

Example data file for up- and down-regulated genes

GeneId     Symbol    total #transcripts for gene   ...
407173     JSP.1           10                      ...

... #up-regulated transcripts   mean log2(fold change up)    #down-regulated transcripts    mean log2(fold change down)
...      1                        2.11                                3                        -2.17
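
A gene such as the one in the example, with both up- and down-regulated transcripts, can be picked out of this file by checking columns 4 and 6. The following Python sketch shows one possible way to do this. It assumes that the first line is a header line; the function name genes_with_opposite_regulation is hypothetical. Only the two count columns are parsed, because this page does not say what the mean fold change columns contain when a count is 0.

```python
import csv
import io


def genes_with_opposite_regulation(handle):
    """Yield (gene_id, symbol, n_up, n_down) for genes that have
    both up- and down-regulated transcripts."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # assumption: the first line is a header line
    for row in reader:
        if len(row) < 7:
            continue  # skip blank or truncated lines
        n_up, n_down = int(row[3]), int(row[5])  # columns 4 and 6
        if n_up > 0 and n_down > 0:
            yield row[0], row[1], n_up, n_down


if __name__ == "__main__":
    # Data row taken from the example above, header replaced by a placeholder.
    demo = io.StringIO("header\n407173\tJSP.1\t10\t1\t2.11\t3\t-2.17\n")
    for hit in genes_with_opposite_regulation(demo):
        print(hit)
```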

Data file for all genes

The data file for all genes (default name "15.genes_summary.tsv") contains data for assessing the variance of expression among the alternative transcripts of all genes (this is mainly relevant for the transcript-based analysis). The following columns are available (tab-separated format):
Note: columns marked with (*) are only available with the transcript-based analysis option.
 1: gene Id (NCBI Entrez Gene)
 2: symbol of the gene
 3:(*) total number of alternative transcripts for this gene
 4: mean normalized expression in all transcripts/loci in treatment file(s)
    Please note that the mean NE is calculated across ALL transcripts/loci of the gene and across ALL replicates.
 5: standard deviation of normalized expression of all transcripts/loci in treatment (possibly replicates)
 6: mean NE of all transcripts/loci in control file(s)
 7: standard deviation of normalized expression of all transcripts/loci in control (possibly replicates)
 8: mean RPKM in all transcripts/loci in treatment file(s)
    Please note that the mean RPKM is calculated across ALL transcripts/loci of the gene and across ALL replicates.
 9: standard deviation of RPKM of all transcripts/loci in treatment (possibly replicates)
10: mean RPKM in all transcripts/loci in control file(s)
11: standard deviation of RPKM of all transcripts/loci in control (possibly replicates)

Example data file for all genes

GeneId  Symbol  total #transcripts for gene(*)     mean NE(treat)  stddev NE(treat)  mean NE(ctrl)  stddev NE(ctrl)   ...
1       A1BG                13                     5.55547         2.737             5.71738        2.709             ...
2       A2M                 15                     2.09609         0.930             2.01791        0.832             ...
3       A2MP1               5                      0.00171         0.001             0.00196        0.002             ...

... mean RPKM(treat)   stddev RPKM(treat)   mean RPKM(ctrl)   stddev RPKM(ctrl)
...      547.45967         269.023           564.79435          266.654
...      216.99577          96.342           207.91488           85.830
...        0.16746           0.130             0.19160            0.157
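
One way to use the mean and standard deviation columns is to relate them to each other, e.g. as a coefficient of variation (stddev / mean) per gene. This measure is not calculated by the task; it is only an illustration of how the columns can be combined. The Python sketch below assumes the transcript-based analysis option (column 3 present) and a header line as the first line; the function name ne_variation is hypothetical.

```python
import csv
import io


def ne_variation(handle):
    """Return {symbol: (cv_treat, cv_ctrl)} computed from columns 4-7
    (mean/stddev NE for treatment and control)."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # assumption: the first line is a header line
    result = {}
    for row in reader:
        if len(row) < 7:
            continue  # skip blank or truncated lines
        mean_t, sd_t, mean_c, sd_c = (float(v) for v in row[3:7])
        # The ratio is undefined for genes without expression, so use None.
        cv_t = sd_t / mean_t if mean_t > 0 else None
        cv_c = sd_c / mean_c if mean_c > 0 else None
        result[row[1]] = (cv_t, cv_c)
    return result


if __name__ == "__main__":
    # Data row taken from the example above, header replaced by a placeholder.
    demo = io.StringIO(
        "header\n"
        "1\tA1BG\t13\t5.55547\t2.737\t5.71738\t2.709\t"
        "547.45967\t269.023\t564.79435\t266.654\n"
    )
    for symbol, (cv_treat, cv_ctrl) in ne_variation(demo).items():
        print(symbol, round(cv_treat, 3), round(cv_ctrl, 3))
```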

Data files if one of the test methods 'edgeR', 'DESeq' or 'DESeq2' was selected