
Genomatix Transcriptome Viewer

Genomatix Transcriptome Viewer (TViewer) allows the comprehensive visualisation of gene structures, i.e. transcript and promoter annotation from ElDorado. It also provides a visual framework for the integration of paired-end analysis data.

Transcripts are represented by exons (colored by normalized expression (NE) values), splice-junction lines (with thickness reflecting the number of reads for transcriptome mappings) and promoters (colored by percentage of methylation). Transcript-wide coverage plots for paired-end transcriptome mappings are shown in a dedicated viewer: the coverage can be plotted at read or fragment level, and the paired-end distances are provided in addition. All transcripts can be drawn separately or in a splicing graph layout where shared exons are merged.




Getting started

The Transcriptome Viewer can be started either via the Genomatix Software Suite navigation bar or with a preloaded gene locus from ElDorado under "Alternative Transcripts" or under "More Gene Info". The loci for each organism from the ElDorado database can be loaded via the Analyses menu on the top right. RNA-seq, bisulfite or gene fusion data can also be loaded there, either from a connected GMS (Genomatix Mining Station) or as an imported BAM file. All loci from the selected organism will be displayed in a tab in the panel "Analyses". A locus can then be selected, and all its transcripts are shown by default as a splicing graph, where shared exons are merged. If RNA-seq data has been loaded, the exons and splice junctions are drawn according to the number of mapped reads.

Fig. 1a: The Analyses menu in the upper navigation bar displays each organism from the ElDorado database.
Fig. 1b: After an analysis has been loaded, all loci are displayed in the left panel "Analyses". Once a locus has been selected from the panel "Analyses", its transcripts are displayed by default as a splicing graph. Additionally, all transcripts are listed in the right transcript panel.

A transcript can be highlighted by clicking on its exons and splice junctions or by selecting the transcript in the transcript panel on the lower right. Promoters can be added to the view by enabling them in the view settings. Furthermore, a promoter panel similar to the transcript panel can be enabled in the top menu bar.

The navigation buttons

This menu contains all organisms from the ElDorado database, all imported BAM files and all RNA-seq (post-processed for the TViewer), Bisulfite-seq or gene fusion analyses from a connected GMS.
Each locus can be exported as an image, a text or graphML file.
This combo box opens the various panels.
Toggle between the classic view (left button) and the splicing graph view (right button), where shared exons are merged.
Toggle between the unique view (left button) and the unique+multiple view (right button). In the unique view, only reads that could be mapped uniquely are considered. In the unique+multiple view, all mapped reads are considered. This toggle is only available for RNA-seq analyses from transcriptome mappings.
Table 1: Navigation buttons and combo boxes located in the top navigation bar.
Opens the analyses panel in which the loci of the loaded analyses can be opened.
Opens a settings panel in which the promoter can be added or removed in the displayed graph.
Opens the transcripts panel in which all transcripts for the loaded locus are listed.
Opens the promoter panel in which all promoters for the loaded locus are listed.
Opens the fused transcripts panel in which all transcript fusions for the loaded gene fusion are listed.
Opens the "Histogram of mapped reads on splice junctions".
Opens the "paired-end view panel" in which coverage and distances of paired-reads are plotted.
Table 2: The navigation buttons in the combo box "Panels" located in the top navigation bar.
Note that navigation buttons concerning RNA-seq data might not be shown if no RNA-seq data is available.

The classic view

In the classic view each transcript is drawn separately by drawing linear sequences of exons and splice junctions.

Fig. 2: Classic view of the locus from gene SERPINF2. Each transcript is drawn separately.

The splicing graph view

In the classic view, shared exons are drawn multiple times. For most gene loci this produces a highly redundant picture. Therefore, in the splicing graph view, shared exons are merged to create a so-called splicing graph, in which each transcript is essentially described by a path through the graph. Two exons can be connected by multiple identical splice junctions if both exons occur together in multiple transcripts. Clicking on an exon or on a splice junction highlights all involved transcripts in the graph as well as in the transcript panel.

Fig. 3: Splicing graph view of the locus from the human gene SERPINF2. Shared exons are merged to create a so called splicing graph.

Exon positions

The layout of the classic and the splicing graph view preserves the positioning of the exons. Thus, exons that start or stop at the same position are aligned accordingly. To obtain a condensed view, introns do not necessarily reflect their original length. To additionally describe the sequential arrangement of exons, each exon is labeled with two numbers: the first number describes the relative start position and the second number describes the relative stop position of the exon. If two exons have the same start or stop position on the genome, they both get the same number for that position.

Fig. 4: Image detail of the splicing graph of the locus from the human gene SERPINA6. The exons 4-3 and 4-4 start at the same position, but 4-3 ends at a previous position. The exons 9-5 and 10-6 lie within the exon 8-7.
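
The numbering scheme described above can be sketched as a ranking of the distinct start and stop coordinates. This is only an illustration, assuming that the numbers are 1-based ranks assigned independently to the start and the stop positions of all exons in the locus; the function name `label_exons` and the coordinates are hypothetical and not part of the TViewer.

```python
def label_exons(exons):
    """Return a "start-stop" label for each exon.

    exons: list of (start, stop) genomic coordinates.
    Assumption: labels are 1-based ranks of the distinct start
    coordinates and of the distinct stop coordinates, computed
    independently of each other.
    """
    distinct_starts = sorted({start for start, _ in exons})
    distinct_stops = sorted({stop for _, stop in exons})
    start_rank = {pos: i + 1 for i, pos in enumerate(distinct_starts)}
    stop_rank = {pos: i + 1 for i, pos in enumerate(distinct_stops)}
    return [f"{start_rank[start]}-{stop_rank[stop]}" for start, stop in exons]


# Two exons sharing a start position receive the same first number.
labels = label_exons([(100, 200), (100, 300), (400, 500)])
print(labels)
```

Under this assumption, exons that share a start position share the first number, and an exon that ends earlier receives the smaller second number.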

The transcript panel

The transcript panel lists all transcripts for the loaded locus with several attributes. Clicking on a listed transcript highlights it in the view. Several transcripts can be highlighted by holding the Ctrl key (Cmd key on Mac OS X). All transcripts can be highlighted by selecting all transcripts in the panel or by clicking in the white area of the graph display.

Each transcript can be added or removed from the view by clicking on the respective checkbox. Combo boxes at the bottom of the panel allow filtering the transcripts in the view by their quality and their source. The display of attributes of the transcript panel can be added or removed with the "+"- and the "-"-button.

Fig. 5: All transcripts of the locus of the human gene SERPINA6.

Attributes of the transcript panel:

Promoters

Promoters can be enabled/disabled for display by clicking on the view settings button in the top bar and then selecting the checkbox. Promoters are depicted by slightly pointed oblongs and are only drawn for the region upstream of the exon. Shared exons which are the first exon of one transcript and also an inner exon of other transcripts are not merged, in order to give a better view of the promoters. More information on the definition of promoter regions can be found here.

Fig. 6: All promoters of the locus of the human gene SERPINA6 are drawn in front of the first exons of all transcripts.

The promoter panel with details can be enabled by clicking on the promoter button in the panels combo box. For each promoter its associated transcripts are listed, together with a link to the MatInspector search for transcription factor binding sites. With a click on a promoter the associated transcripts are highlighted in the view.

Fig. 7: All promoters of the locus of the human gene SERPINA6.


RNA-seq mapping (only available on GGA with licensed NGS Mapping tasks)

The Transcriptome Viewer allows interactive inspection of transcript expression from RNA-seq data, such as transcriptome mappings or genome mappings from imported BAM files.

Fig. 8: Overview of the Genomatix Transcriptome Viewer with RNA-seq data. The top panel is used to load RNA-seq data, change the settings, open the paired-end view, and change the transcript drawing modes. The left panel depicts all genes expressed in the current analysis. A filter helps to quickly find genes of interest, and the compare function calculates fold-changes between any two analyses. The transcripts panel contains all transcripts of the gene currently loaded, with additional information like average and minimal expression values. The overview panel in the upper right can be used to zoom and pan through the main canvas. In the main view all transcripts of the currently loaded gene locus are shown. Each exon is colored by its NE value, and the thickness of the splice junctions is drawn according to their read coverage.

Please note that paired-end data and splice junction coverage from BAM files are not supported yet.

Loading RNA-seq data

RNA-seq data can be loaded either as an imported BAM file (genome mapping) from the GGA or as a processed transcriptome/splice junction mapping. The imported BAM file is directly available in the analyses combo box, and exon and transcript NEs are calculated on the fly with the NGSAnalyzer. However, the transcriptome mapping and the optional splice junction mapping first have to be preprocessed with the "TViewer data generation module" on the GMS. After this, the available RNA-seq projects are listed in the Analyses menu (the analyses may have to be reloaded first). In this menu a whole project or individually selected RNA-seq analyses can be loaded. Each analysis is displayed as a tab in the panel "Analyses". Please note that you can only load different analyses from the same organism and ElDorado version.
Clicking on the refresh button reloads the Analyses menu, e.g. if new analyses have become available from the GMS or a BAM file has been imported.

Fig. 9: The Analyses menu in the upper navigation bar displays each organism from the ElDorado database and all RNA-seq projects which can be loaded in the Transcriptome Viewer.

Representation of RNA-seq data

For the coloring of exons and the thickness of the splice-junction lines, two modes are available: the "unique+multiple view" and the "unique view" (please note that the distinction between unique and multiple hits is only made for transcriptome and splice junction mappings). The "unique+multiple view" considers all single reads that map to a transcript, even if they also map to other transcripts (e.g., because of exon sharing). The "unique view" only considers single reads that are uniquely mapped. The tooltip displays both expression values. The exons are colored according to their normalized expression (NE) value. The thickness of a splice-junction line indicates the number of reads spanning the two adjacent exons. The expression gradient in Fig. 11a depicts the coloring of the exons from zero to the maximum NE value of all exons in the loaded locus. This maximum NE value is determined across all opened analyses.

If the reads were also mapped against the splice junction library, novel splice junctions are also included in the view. Novel splice junctions by cis-splicing are drawn as dashed lines and novel splice junctions by trans-splicing are drawn as dotted lines. The minimum number of reads a novel splice junction must have to be included in the view can be defined in the settings panel.
Please note that, by default, novel splice junctions are filtered out if they have fewer or the same number of mapped reads as known splice junctions with the same exon boundaries. This filter can be disabled in the settings panel.

Fig. 10: The dotted line from exon 3-3 to exon 6-5 indicates a novel splice-junction by trans-splicing and the dashed line from exon 3-3 to exon 6-6 indicates a novel splice junction by cis-splicing.

Normalized Expression Values

Normalized expression values (NE) are calculated from the read distribution for transcripts as well as for exons.
The NE-value for an exon/transcript is based on the number of nucleotides of all reads which cover the exon/transcript and is normalized to the length of the exon/transcript and the total number of nucleotides of all mapped reads in the data set.

Normalized expression/enrichment value (NE-value)
The NE-value is calculated based on the following formula:
$$\mathrm{NE} = c \cdot \frac{\#\mathrm{nucleotides}_{\mathrm{region}}}{\#\mathrm{nucleotides}_{\mathrm{mapped}} \cdot \mathrm{length}_{\mathrm{region}}}$$
where NE is the normalized expression value,
$\#\mathrm{nucleotides}_{\mathrm{region}}$ the number of nucleotides of reads falling into the exon/transcript,
$\#\mathrm{nucleotides}_{\mathrm{mapped}}$ the number of nucleotides of all mapped reads,
$\mathrm{length}_{\mathrm{region}}$ the exon/transcript length in base pairs
and $c$ a normalization constant set to $10^7$.
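
As a worked illustration of the NE formula, the following minimal sketch computes the value for a single exon or transcript. It assumes the normalization constant c is 10^7 and that the region length is given in base pairs; the function and parameter names are hypothetical and do not correspond to any Genomatix API.

```python
NORMALIZATION_CONSTANT = 10**7  # c in the NE formula (assumed to be 10^7)


def normalized_expression(nucleotides_region, nucleotides_mapped, length_region):
    """Compute the NE value of one exon or transcript.

    nucleotides_region: nucleotides of reads falling into the region
    nucleotides_mapped: nucleotides of all mapped reads in the data set
    length_region:      region length in base pairs
    """
    if nucleotides_mapped <= 0 or length_region <= 0:
        raise ValueError("mapped nucleotides and region length must be positive")
    return (NORMALIZATION_CONSTANT * nucleotides_region
            / (nucleotides_mapped * length_region))


# Example: 5,000 nucleotides of reads on a 1,000 bp exon,
# with 10^9 mapped nucleotides in the whole data set.
print(normalized_expression(5_000, 10**9, 1_000))
```

Because the value is divided by both the region length and the total number of mapped nucleotides, NE values are comparable between exons of different length and between data sets of different sequencing depth.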

Comparison of RNA-seq analyses

Multiple analyses can be loaded via the Analyses menu. A slider panel is used to select the analysis by which the current graph is colored. Moving the slider changes not only the coloring of the graph, but also the data in the transcript panel and the paired-end view.

Furthermore, two analyses can be compared by selecting two analyses with the compare function (compare button in the lower right of the analyses panel). Then log fold-changes between all transcripts are calculated on the basis of their NE (unique+multiple) and the loci are listed with the maximum fold-change of their transcripts as new tab in the panel "Analyses". From this tab a locus with its transcripts can be loaded. Note that the two compared analyses are not available from the slider panel.
Then the graph is colored according to the fold-changes of their exons and their splice-junctions. High over-expression of an exon or splice-junction in the first analysis is indicated by stronger coloring towards red. A strong under-expression of an exon or splice-junction in the first analysis is depicted by coloring towards blue (see figures below).

Fig. 11a: Expression and positive fold-change gradient (from zero to max).
Fig. 11b: Negative fold-change gradient (from zero to max).
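
The comparison of two analyses can be sketched as follows. The source does not state the base of the logarithm, how zero NE values are handled, or whether the "maximum" fold-change is taken by magnitude, so the base 2, the small pseudocount and the use of the absolute value below are all assumptions, and every name is hypothetical.

```python
import math


def log2_fold_change(ne_first, ne_second, pseudocount=1e-3):
    """Log fold-change of the first analysis relative to the second.

    Assumptions: base-2 logarithm; a pseudocount avoids division by zero.
    Positive values mean over-expression in the first analysis.
    """
    return math.log2((ne_first + pseudocount) / (ne_second + pseudocount))


def max_fold_change_per_locus(loci_first, loci_second, pseudocount=1e-3):
    """Map each locus to the transcript fold-change of largest magnitude.

    loci_first / loci_second: {locus: {transcript: NE (unique+multiple)}}
    """
    result = {}
    for locus, transcripts in loci_first.items():
        other = loci_second.get(locus, {})
        changes = [
            log2_fold_change(ne, other.get(transcript, 0.0), pseudocount)
            for transcript, ne in transcripts.items()
        ]
        if changes:
            result[locus] = max(changes, key=abs)
    return result


first = {"locusA": {"t1": 4.0, "t2": 1.0}}
second = {"locusA": {"t1": 1.0, "t2": 8.0}}
print(max_fold_change_per_locus(first, second, pseudocount=0.0))
```

With this sign convention, positive values correspond to the red end of the gradient and negative values to the blue end.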

Paired-end view

Paired-end data of a transcriptome mapping is visualized in a dedicated plot. The coverage of fragments can be plotted for each transcript. Furthermore, the distances between the paired reads are displayed by plotting the average distances along the transcript. The distances can also be visualized separately for first and second read by selecting "Directional distances". Here, the distance for the first read is drawn as blue plus-sign and the distance for the second read is drawn as red minus-sign.
Note that all paired-end reads are transcript-unique, i.e. all paired-end reads could only be mapped once within the transcript.

This plot shows whether the complete transcript is covered with paired-end reads and whether the reads are within the expected distance. A deviation from the expected distance could, for example, be an indication of exon skipping (Fig. 12). Exons are displayed alternately in grey and white.

Fig. 12: Paired-end view. Paired-end coverage and distance plot of transcript NM_202002 from gene FOXM1. Exon 9 shows a drastic drop in read coverage and the paired-end reads in exon 8 and 10 have increased distances. These two lines of evidence support the hypothesis that this transcript is not expressed.

Histogram of mapped reads on splice junctions

The histogram shows the distribution of mapped reads on splice junctions for transcriptome mappings. The mapped reads are plotted for all splice junctions (known and novel) as well as for the novel splice junctions only. The numbers of mapped reads are binned, and the height of a bar represents the number of splice junctions with the corresponding number of mapped reads. The slider at the bottom of the panel limits the range of visualized mapped reads.

Fig. 13: Histogram of mapped reads on splice junctions.
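
The binning behind such a histogram can be sketched as below. The bin width and the range limits (standing in for the slider) are assumptions chosen for illustration, and the function name is hypothetical.

```python
from collections import Counter


def junction_read_histogram(read_counts, bin_width=1, min_reads=0, max_reads=None):
    """Count splice junctions per bin of mapped reads.

    read_counts: number of mapped reads for each splice junction
    bin_width:   width of one bin (assumed parameter)
    min_reads / max_reads: visible range, analogous to the slider
    Returns {bin lower bound: number of junctions in that bin}.
    """
    histogram = Counter()
    for reads in read_counts:
        if reads < min_reads:
            continue
        if max_reads is not None and reads > max_reads:
            continue
        histogram[(reads // bin_width) * bin_width] += 1
    return dict(sorted(histogram.items()))


counts = [1, 2, 2, 5, 12, 13]
print(junction_read_histogram(counts, bin_width=5))
print(junction_read_histogram(counts, bin_width=5, max_reads=5))
```

Running the same function once on all junctions and once on the novel junctions only gives the two distributions described above.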

Bisulfite-seq mapping (only available on GGA with GMS)

The Transcriptome Viewer allows interactive inspection not only of transcript expression but also of promoter methylation states from Bisulfite-Seq mappings on the GMS.

Fig. 14: A splicing graph with methylated promoters at the top and a promoter panel with methylation information at the bottom.

Loading Bisulfite-Seq data

The Analyses menu can be used to load Bisulfite-Seq analyses directly and to assign a Bisulfite-Seq analysis to an RNA-Seq analysis.

The assignment of a Bisulfite-Seq analysis to an RNA-Seq analysis can be done with a simple drag-and-drop procedure: dragging a Bisulfite-Seq analysis and dropping it on the corresponding RNA-Seq analysis combines the two analyses for display. If the assignment is possible, the cursor shows a green plus; otherwise, e.g. if the ElDorado version or the organism is not compatible, the cursor shows a red cross.

Please note that the assignment of analyses is only stored for the current session. The assignments will be cleared, once you have logged out.

Fig. 15a: Dragging a Bisulfite-Seq analysis on an RNA-Seq analysis and dropping it.
Fig. 15b: The analyses after applying the assignment.

Representation of Bisulfite-seq data

The Bisulfite-Seq data will be visualized as colored promoters. The higher the percentage of methylation, the more towards red the promoter will appear. A tooltip shows the weighted percentage of methylation. This value is calculated as follows:

$$\frac{(\mathrm{meth\_plus} \times \mathrm{cov\_plus}) + (\mathrm{meth\_minus} \times \mathrm{cov\_minus})}{\mathrm{cov\_plus} + \mathrm{cov\_minus}}$$

Formula 1: meth_plus = average of the methylation percentage levels of all covered CpG sites on the plus strand, cov_plus = number of covered CpG sites on the plus strand, meth_minus = average of the methylation percentage levels of all covered CpG sites on the minus strand, cov_minus = number of covered CpG sites on the minus strand
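
The weighted percentage can be computed as in the following minimal sketch, which uses the four quantities defined in the formula legend. The function name and the choice to return `None` for a promoter without any covered CpG site are assumptions made for illustration.

```python
def weighted_methylation(meth_plus, cov_plus, meth_minus, cov_minus):
    """Weighted percentage of methylation of one promoter.

    meth_plus / meth_minus: average methylation percentage of the covered
                            CpG sites on the plus / minus strand
    cov_plus / cov_minus:   number of covered CpG sites on each strand
    Returns None if no CpG site is covered (assumed convention).
    """
    total_sites = cov_plus + cov_minus
    if total_sites == 0:
        return None
    return (meth_plus * cov_plus + meth_minus * cov_minus) / total_sites


# Three covered sites at 80 % on the plus strand and one at 40 % on the minus strand.
print(weighted_methylation(80.0, 3, 40.0, 1))
```

The strand with more covered CpG sites therefore contributes more strongly to the value shown in the tooltip.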

Fig. 16: The splicing graph has three methylated promoters and two unmethylated promoters. The splicing graph shows only little expression which might result from the methylated promoters.

Promoter panel

The assigned Bisulfite-Seq analyses are also shown in detail in the promoter panel, including promoter attributes like coverage and number of CpG sites. Attributes of the "Promoter panel" can be added or removed with the "+"- and the "-"-button.

Fig. 17: The promoter panel shows a whole range of mapping information, if a Bisulfite-Seq analysis has been assigned.

Comparison of Bisulfite-seq analyses

Bisulfite-Seq analyses can be compared like RNA-Seq analyses (see Comparison of RNA-seq analyses) by selecting two Bisulfite-seq analyses with the compare function (compare button in the lower right of the analyses panel). Differences between all promoters are then calculated on the basis of their methylation value (weighted average of all methylation percentage levels, see Formula 1), and the loci are listed with the maximum difference of their promoters as a new tab in the panel "Analyses". The promoters of these loci are colored according to the differences. The coloring of the differences is shown in Fig. 11a and 11b.

RNA-seq analyses with assigned Bisulfite-seq analyses (see Loading Bisulfite-Seq data) can also be compared, which yields methylation differences in addition to NE fold-changes. To do so, compare the RNA-seq analyses to which the Bisulfite-seq analyses have been assigned.


Gene Fusion analyses (only available on GGA with GMS)

The Transcriptome Viewer allows interactive inspection of transcript fusion expression from Gene Fusion analyses on the connected GMS. Please note that most of the Gene Fusion features are only available in the splicing graph view.

Fig. 18: The left "Analyses panel" displays all detected Gene Fusions. A selected Gene Fusion is shown in the middle as splicing graph. All detected transcript fusions are shown in the panel at the right.

Loading a Gene Fusion analysis

Gene Fusion analyses are listed in the Analyses menu under their corresponding RNA-Seq analysis. The Gene Fusions of a selected analysis are then shown in the "Analyses panel". The Gene Fusions are grouped into "Gene fusions within" (mate pairs align to different genes on the same chromosome), "Gene fusions across" (mate pairs align to different genes on different chromosomes) and "Readthroughs" (mate pairs align to adjacent genes).
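
The three groups can be expressed as a simple decision rule. This is only a sketch: how the Gene Fusion analysis decides that two genes are adjacent is not described here, so adjacency is passed in as a precomputed flag, and the function name is hypothetical.

```python
def classify_fusion(chromosome_a, chromosome_b, genes_adjacent=False):
    """Assign a gene fusion to one of the three groups in the Analyses panel.

    chromosome_a / chromosome_b: chromosomes of the two genes hit by the mates
    genes_adjacent: True if the two genes are neighbours on the chromosome
                    (assumed to be determined elsewhere)
    """
    if chromosome_a != chromosome_b:
        return "Gene fusions across"
    if genes_adjacent:
        return "Readthroughs"
    return "Gene fusions within"


print(classify_fusion("chr20", "chr17"))
print(classify_fusion("chr12", "chr12", genes_adjacent=True))
print(classify_fusion("chr12", "chr12"))
```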

Fig. 19: The Analyses menu in the upper left of the TViewer lists all projects with RNA-Seq analyses. Gene Fusion analyses (marked with a red circle in the figure) are listed under each corresponding RNA-Seq analysis.

Representation of Gene Fusions

Gene Fusions are shown as splicing graphs by default and contain only the transcripts of the detected transcript fusions. Transcripts not supported by mate pairs are not shown. The option "Fuse transcripts" (selected by default) in the "Settings panel" removes all exons which are completely located between both breakpoint regions. The breakpoint region is the sequence within a transcript between the outermost 5' and 3' mates which align to a gene fusion cluster. The breakpoint regions are drawn as grey areas. If a breakpoint is identified, it is drawn as a black vertical line. More information about the Gene Fusion analysis details can be found in the GMS manual. The "Fuse transcripts" option also connects transcripts with a splice junction if both transcripts are detected as a fusion and are supported by enough read pairs. If the "Fuse transcripts" option is not selected, both genes/loci are drawn completely with all exons, and the transcripts of the fusion are not connected with splice junctions.

Fig. 20: The splicing graph shows the detected Gene Fusion between BCAS4 and BCAS3 in the breast cancer cell line MCF7. The left part of the graph shows the exons of the BCAS4 transcripts and the right part shows the exons of the BCAS3 transcripts. The transcripts are connected with splice junctions if both transcripts are detected as fusion and are supported with enough read pairs. The grey areas mark the breakpoint regions and the black vertical lines mark the breakpoint.
Fig. 21: The loci are drawn completely if the option "Fuse transcripts" has not been selected.

In the panel "Settings" you can also define the minimum number of non-redundant read pairs and the breakpoint score that have to support a shown transcript fusion. The option "Merge SJs" merges transcript fusion SJs with existing ones to depict a clearer view.

Fig. 22: The same splicing graph as in Fig. 20, but with no merged splice junction. If the option "Merge SJs" is not selected, then every splice junction of every transcript fusion is drawn. Otherwise splice junctions for transcript fusion are merged with canonical splice junctions, if possible.

Fused transcripts panel

The panel "Fused transcripts" contains all detected transcript fusions passing the two filters "Minimal cluster size" and "Minimal average breakpoint score". Selecting one or several transcript fusions will highlight them in the splicing graph. The attributes in the panel "Fused transcripts" can be added or removed with the "+"- and the "-"-button.

Fig. 23: This panel shows detail information about all transcript fusions.

Attributes of the transcript fusion panel:

Transcript Fusion Sequence panel

The button "load" in the "transcript fusion panel" opens the "Transcript Fusion Sequence panel" and loads the cDNA, coding and amino acid sequence for the transcript fusion. The cDNA sequence is determined by the exons upstream of and within the 5' breakpoint cluster and the exons downstream of and within the 3' breakpoint cluster. If a breakpoint has been detected, the exons upstream of the 5' breakpoint and downstream of the 3' breakpoint are used. If the breakpoint lies within an exon, only the sequence up to the breakpoint is used. The coding sequence is determined by the longest open reading frame in the cDNA sequence, and the respective start and stop codon positions are shown below. If no stop codon can be detected, the coding sequence ranges from the first start codon to the end of the cDNA sequence. Furthermore, the amino acid sequence translated from the coding sequence is shown.
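
The rule for choosing the coding sequence can be sketched as a scan for the longest open reading frame. This is a simplified illustration rather than the TViewer implementation: it assumes the standard start codon ATG and stop codons TAA, TAG and TGA, searches the forward strand only, includes the stop codon in the reported range, and applies the fallback only when no open reading frame with a stop codon exists at all. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(cdna):
    """Return (start, end) of the coding sequence as 0-based, end-exclusive
    coordinates, or None if the sequence contains no start codon."""
    cdna = cdna.upper()
    best = None          # longest ORF that ends with an in-frame stop codon
    first_start = None   # position of the first ATG, used for the fallback
    for start in range(len(cdna) - 2):
        if cdna[start:start + 3] != "ATG":
            continue
        if first_start is None:
            first_start = start
        # Walk codon by codon until the first in-frame stop codon.
        for pos in range(start + 3, len(cdna) - 2, 3):
            if cdna[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
                break
    if best is not None:
        return best
    if first_start is not None:
        # No stop codon found: first start codon to the end of the cDNA.
        return (first_start, len(cdna))
    return None


print(find_coding_sequence("CCATGAAATAGGG"))
print(find_coding_sequence("GGATGAAACC"))
```

The amino acid sequence would then be obtained by translating the returned range codon by codon.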

Fig. 24: This panel shows sequence information of a transcript fusion.

Paired-end view

The paired-end view for a transcript fusion shows the coverage and distances for both complete transcripts. The skipped exons are shaded and enclosed by two dashed vertical lines. The central dashed vertical line denotes the boundary between the two transcripts. The two continuous vertical lines denote the positions of the breakpoints (if available) and can be superimposed on the previously mentioned dashed vertical lines.

Fig. 25: The paired-end view for a transcript fusion shows the coverage and distances for both complete transcripts.

Export

A number of export options are available. Each locus can be exported as an image, text or graphML file. Gene fusions are exported only as text files, covering all transcripts of both loci. The expression data is usually exported for the current analysis (the analysis selected with the slider or the opened differential analysis); expression data for exons and transcripts can, however, also be exported for all opened analyses.