RegionMiner

RegionMiner subtask: MACS - Model-based Analysis of ChIP-Seq
(only available on GGA)



Introduction

The RegionMiner task Clustering NGS Data lets you select the MACS algorithm, which analyzes tags from ChIP-Seq data sets and finds significant regions for further analysis with other RegionMiner tasks. Like NGSAnalyzer, it thus serves as a clustering algorithm for ChIP-Seq data.

Details of the algorithm are described in

Zhang et al.: Model-based Analysis of ChIP-Seq (MACS).
Genome Biol (2008) vol. 9 (9) pp. R137

This RegionMiner task automatically sets a number of parameters for MACS (e.g. genome size), which simplifies its use. The output comprises the original MACS output as well as a visualisation of the peak model. The resulting clusters can be downloaded as BED files or saved directly to the project management for use with other RegionMiner tasks.


Parameters

MACS parameters
Tag size: The length of the input tags/reads. By default (-1), this value is determined from the input BED file by reading the first 100 BED regions and calculating the average region length.
p-value: P-value cutoff for peak detection. The default is 1e-5.
Bandwidth: This value is used while building the shifting model. The default is 300.
mfold: The upper model fold value, which MACS uses to select regions with a high-confidence enrichment ratio against background for building the model. The lower model fold value is automatically set to 1. If no model is found, MACS automatically uses the no-model option.
Redundancy threshold: The number of copies of identical reads allowed in a library. Values can be 'auto', 'all' or an integer:
  • 'auto': MACS calculates the maximum number of tags allowed at the exact same location based on a binomial distribution, using 1e-5 as p-value cutoff
  • 'all': all input tags are kept
  • integer: at most this number of tags will be kept at the same location
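
The default tag size rule and the integer and 'all' settings of the redundancy threshold can be illustrated with a minimal Python sketch. This is not the code used by RegionMiner or MACS: the function names, the rounding of the average to an integer, and the representation of a tag as a (chromosome, start, end, strand) tuple are assumptions made for illustration. The 'auto' setting is not covered here.

```python
from collections import Counter


def estimate_tag_size(bed_lines, max_regions=100):
    """Average region length over the first `max_regions` BED records."""
    lengths = []
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and optional BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        lengths.append(end - start)  # BED is zero-based and half-open
        if len(lengths) == max_regions:
            break
    if not lengths:
        raise ValueError("no BED regions found")
    # Rounding to an integer is an assumption of this sketch.
    return round(sum(lengths) / len(lengths))


def limit_duplicates(tags, threshold):
    """Keep at most `threshold` tags per identical location ('all' keeps everything)."""
    if threshold == "all":
        return list(tags)
    seen = Counter()
    kept = []
    for tag in tags:
        seen[tag] += 1
        if seen[tag] <= threshold:
            kept.append(tag)
    return kept


if __name__ == "__main__":
    bed = ["chr1\t100\t136\tr1\t0\t+", "chr1\t100\t136\tr2\t0\t+", "chr1\t500\t536\tr3\t0\t-"]
    print(estimate_tag_size(bed))
    tags = [("chr1", 100, 136, "+"), ("chr1", 100, 136, "+"), ("chr1", 500, 536, "-")]
    print(limit_duplicates(tags, 1))
```

With a threshold of 1, the second copy of the read at chr1:100-136 is dropped, while 'all' returns the input unchanged.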

Output ('Complete Clustering Results')

Analysis Parameters

This section shows the analysis parameters, such as the input files, result name, database version (i.e. underlying genome), MACS version and MACS parameters.

Cluster detection

A summary with cluster statistics is displayed, including the number of clusters found, the number of reads within clusters and the average cluster length.

MACS Output

The original MACS output is listed.

Peak Model

The peak model graphic is displayed (here as a PNG image). It is generated by running the R script that MACS produces.

Direct download of result files for further analysis with RegionMiner

The resulting BED file containing the clusters/peaks can be downloaded or saved directly into the project management for further analysis with other RegionMiner tasks. The BED file includes the PHRED-like quality score (see below) for each region.
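
As a hedged illustration of what a PHRED-like score means: MACS reports peak significance as -10*log10(p-value), and the sketch below assumes that the score column of the BED file follows the same convention. That mapping is an assumption of this example, and the function names are hypothetical. Under this convention, the default p-value cutoff of 1e-5 corresponds to a score of 50.

```python
import math


def phred_like_score(p_value):
    """Convert a p-value to a PHRED-like score, -10 * log10(p)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in the interval (0, 1]")
    return -10.0 * math.log10(p_value)


def p_value_from_score(score):
    """Invert the PHRED-like score back to a p-value."""
    return 10.0 ** (-score / 10.0)


if __name__ == "__main__":
    print(phred_like_score(1e-5))
    print(p_value_from_score(100.0))
```

Higher scores therefore indicate more significant regions, and every 10 points correspond to one order of magnitude in the p-value.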

The Excel™ file with details produced by MACS can also be downloaded here. It contains information about the called peaks. You can open it in Excel™ and sort or filter it using Excel functions. Information includes:

Note that the BED file format is zero-based and half-open, whereas positions in the tab-separated Excel™ file are 1-based and include the end position.
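
The difference between the two coordinate conventions can be sketched in a few lines of Python. The function names are hypothetical and chosen for illustration only.

```python
def bed_to_one_based(start, end):
    """Zero-based, half-open [start, end) -> 1-based, inclusive [start + 1, end]."""
    return start + 1, end


def one_based_to_bed(start, end):
    """1-based, inclusive [start, end] -> zero-based, half-open [start - 1, end)."""
    return start - 1, end


if __name__ == "__main__":
    # The first 100 bases of a chromosome in both conventions.
    print(bed_to_one_based(0, 100))
    print(one_based_to_bed(1, 100))
```

Only the start position shifts by one. The region length is end - start in BED coordinates and end - start + 1 in 1-based inclusive coordinates, so both describe the same number of bases.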

Example Output

MACS output

Copyright

The original MACS program is available under the terms of the Artistic License at Harvard.
Copyright (c) 2008 Yong Zhang, Tao Liu (taoliu@jimmy.harvard.edu)