"Weight Matrix"
or "Nucleotide Distribution Matrix"
Weight matrices are selective descriptions of DNA patterns.
They are based on the nucleotide distribution observed
in aligned DNA sites. MatDefine generates weight matrices
fully automatically from such alignments.
The Genomatix program MatInspector scans
genomic sequences for matches to these weight matrices. Both tools are integrated
into GEMS Launcher.
| Aligned sequences of a transcription factor binding site (yeast ABF) |
| Name | Alignment |
| SCPLASM | TATCTTTGTTAACGA |
| SCCOXCH2 | GATCATTCCCAACGA |
| SCS33AA_INV | GGTCACTCTAGACGG |
| M28606_INV | TATCATTGCAAACGT |
| SCPHO5_INV | CATCGTTAATGACGT |
| SCRGL2 | TATCACGTCACACGA |
| SCPK01 | CATCTCTCGCAACGG |
| SCUBCOX8_INV | AGTCACGTGGAACGG |
| SCBAF1 | CATCCCCATTAACGA |
| SCANB1RE_1 | AATCATATTCGACGA |
| SCHIS3G_DED2 | TGTCATTCTGAACGA |
| SCTMC1A | AATCGTTTTGTACGT |
| SCHIS3G_DED1 | CATCATTCTATACGT |
| SCRPO31 | CATCACTATATACGT |
| SCANB1RE_2 | TGTCGTCTCACACGG |
| SCMAT3_INV | TATCGCCATATACGA |
| SCRPC40_INV | AGTCACTATAAACGG |
| SCBTUB_2 | GGTCACGATATACGT |
| SCBTUB_1_INV | GGTCACTGTACACGT |
| SCMAT4 | CATCATAAAATACGA |
| CHRIII_2 | AATCACGAGCGACGG |
| SCENOC | TGTCACTAACGACGT |
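As a minimal sketch of how aligned sites turn into per-position nucleotide counts, the Python snippet below tallies each column of the ABF alignment. The names `SITES` and `count_matrix` are illustrative assumptions; this is not a description of how MatDefine works internally.

```python
from collections import Counter

# The 22 aligned ABF sites from the table above (15 positions each).
SITES = [
    "TATCTTTGTTAACGA", "GATCATTCCCAACGA", "GGTCACTCTAGACGG",
    "TATCATTGCAAACGT", "CATCGTTAATGACGT", "TATCACGTCACACGA",
    "CATCTCTCGCAACGG", "AGTCACGTGGAACGG", "CATCCCCATTAACGA",
    "AATCATATTCGACGA", "TGTCATTCTGAACGA", "AATCGTTTTGTACGT",
    "CATCATTCTATACGT", "CATCACTATATACGT", "TGTCGTCTCACACGG",
    "TATCGCCATATACGA", "AGTCACTATAAACGG", "GGTCACGATATACGT",
    "GGTCACTGTACACGT", "CATCATAAAATACGA", "AATCACGAGCGACGG",
    "TGTCACTAACGACGT",
]


def count_matrix(sites):
    """Return {nucleotide: [count at position 1, 2, ...]} for aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all aligned sites must have the same length")
    # zip(*sites) walks the alignment column by column.
    columns = [Counter(column) for column in zip(*sites)]
    return {nt: [column[nt] for column in columns] for nt in "ACGT"}


counts = count_matrix(SITES)
for nt in "ACGT":
    print(nt, counts[nt])
```

Each column sums to 22, the number of aligned sites, and the four rows can be compared directly with the A, C, G, and T rows of the weight matrix table.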
| Profile of the nucleotide distribution matrix |
| IUPAC: | n n r t c a y t n t n n A C G N n n n |
| Weight matrix or nucleotide distribution matrix |
| Pos. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
| A | 5 | 14 | 0 | 0 | 15 | 0 | 2 | 9 | 3 | 11 | 8 | 22 | 0 | 0 | 8 |
| C | 6 | 0 | 0 | 22 | 1 | 12 | 3 | 5 | 4 | 5 | 3 | 0 | 22 | 0 | 0 |
| G | 4 | 8 | 0 | 0 | 4 | 0 | 4 | 3 | 3 | 3 | 5 | 0 | 0 | 22 | 6 |
| T | 7 | 0 | 22 | 0 | 2 | 10 | 13 | 5 | 12 | 3 | 6 | 0 | 0 | 0 | 8 |
| IUPAC | N | R | T | C | A | Y | T | N | T | N | N | A | C | G | N |
| Ci | 15.2 | 59.3 | 100.0 | 100.0 | 42.2 | 57.2 | 31.0 | 18.6 | 26.4 | 23.8 | 17.3 | 100.0 | 100.0 | 100.0 | 32.3 |
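This page does not define the Ci row. As an assumption, its values are reproduced by an entropy-based conservation measure, Ci(i) = 100 / ln 5 * (sum over b of P(i,b) * ln P(i,b) + ln 5), where P(i,b) is the relative frequency of nucleotide b at position i and terms with P(i,b) = 0 contribute nothing. The sketch below applies that formula to the counts in the table; `COUNTS` and `conservation_index` are illustrative names.

```python
import math

# Nucleotide counts per position, copied from the weight matrix table.
COUNTS = {
    "A": [5, 14, 0, 0, 15, 0, 2, 9, 3, 11, 8, 22, 0, 0, 8],
    "C": [6, 0, 0, 22, 1, 12, 3, 5, 4, 5, 3, 0, 22, 0, 0],
    "G": [4, 8, 0, 0, 4, 0, 4, 3, 3, 3, 5, 0, 0, 22, 6],
    "T": [7, 0, 22, 0, 2, 10, 13, 5, 12, 3, 6, 0, 0, 0, 8],
}


def conservation_index(counts):
    """Return one conservation value (0..100) per matrix position.

    Assumed formula: 100 / ln 5 * (sum_b P * ln P + ln 5).
    A fully conserved position gives 100.
    """
    values = []
    for i in range(len(counts["A"])):
        column = [counts[nt][i] for nt in "ACGT"]
        total = sum(column)
        # Zero counts are skipped because p * ln(p) tends to 0 as p -> 0.
        plogp = sum((c / total) * math.log(c / total) for c in column if c > 0)
        values.append(100.0 / math.log(5) * (plogp + math.log(5)))
    return values


ci = conservation_index(COUNTS)
print([round(value, 1) for value in ci])
```

With four nucleotides the lowest possible value is above zero, since the normalization uses ln 5 rather than ln 4; an evenly distributed position would score roughly 14.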
The profile plot of a weight matrix visualizes how strongly the nucleotides
are conserved at each position. The more asterisks (*), the higher the
conservation. Typically, a weight matrix of a transcription factor binding
site consists of a highly conserved core (red) and additional, less conserved
positions.
The weight matrix table shows the frequencies of the nucleotides
(A, C, G, T) at each position (Pos.) of the aligned sequences. The
respective IUPAC string
shown under the weight matrix profile and in the weight matrix table is only a
very rough description of a DNA pattern. For instance, compare the height of
the profile at the three positions that result in a "T" in the IUPAC string:
in the table, T occurs in 22, 13, and 12 of the 22 sequences at positions 3, 7,
and 9, and the Ci values are 100.0, 31.0, and 26.4.
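To illustrate how much information the IUPAC string discards, here is a sketch that collapses the count matrix into a consensus string. The page does not state which rule was used; the snippet assumes a common convention (a single nucleotide if it occurs in more than 50% of the sequences and more than twice as often as the runner-up, a two-nucleotide code if the top two together exceed 75%, otherwise N). The names `COUNTS`, `TWO_BASE_CODES`, and `iupac_consensus` are illustrative.

```python
# Nucleotide counts per position, copied from the weight matrix table.
COUNTS = {
    "A": [5, 14, 0, 0, 15, 0, 2, 9, 3, 11, 8, 22, 0, 0, 8],
    "C": [6, 0, 0, 22, 1, 12, 3, 5, 4, 5, 3, 0, 22, 0, 0],
    "G": [4, 8, 0, 0, 4, 0, 4, 3, 3, 3, 5, 0, 0, 22, 6],
    "T": [7, 0, 22, 0, 2, 10, 13, 5, 12, 3, 6, 0, 0, 0, 8],
}

# Standard IUPAC codes for two-nucleotide ambiguities.
TWO_BASE_CODES = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("AC"): "M",
    frozenset("GT"): "K", frozenset("AT"): "W", frozenset("CG"): "S",
}


def iupac_consensus(counts):
    """Collapse a count matrix into an IUPAC string (assumed rule set)."""
    letters = []
    for i in range(len(counts["A"])):
        column = {nt: counts[nt][i] for nt in "ACGT"}
        total = sum(column.values())
        ranked = sorted(column, key=column.get, reverse=True)
        first, second = column[ranked[0]], column[ranked[1]]
        if first > 0.5 * total and first > 2 * second:
            letters.append(ranked[0])          # one clearly dominant nucleotide
        elif first + second > 0.75 * total:
            letters.append(TWO_BASE_CODES[frozenset(ranked[:2])])
        else:
            letters.append("N")                # no usable consensus
    return "".join(letters)


consensus = iupac_consensus(COUNTS)
print(consensus)
```

Under these assumed rules, positions 3, 7, and 9 all collapse to the same letter even though their counts differ considerably, which is exactly the loss of detail described above.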
Usually, how strongly a position within a protein binding site is conserved
reflects the function of the binding site. More highly
conserved positions in the DNA site have a greater impact on the binding strength
and on the specificity for the respective protein (a transcription
factor, for instance). A weight matrix captures this profile of a binding site
by weighting each position according to the observed biological
conservation. Therefore, a weight matrix describes a binding site far more
accurately than an IUPAC string.
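The idea of weighting each position by its conservation can be sketched as a simple similarity score. The page does not specify how MatInspector scores a match, so the function below is only an assumption for illustration: each position contributes the matrix count of the observed nucleotide multiplied by a conservation weight, and the sum is divided by the best achievable sum. The conservation weights use the entropy-based formula assumed to underlie the Ci row; `COUNTS`, `conservation_index`, and `matrix_similarity` are hypothetical names.

```python
import math

# Nucleotide counts per position, copied from the weight matrix table.
COUNTS = {
    "A": [5, 14, 0, 0, 15, 0, 2, 9, 3, 11, 8, 22, 0, 0, 8],
    "C": [6, 0, 0, 22, 1, 12, 3, 5, 4, 5, 3, 0, 22, 0, 0],
    "G": [4, 8, 0, 0, 4, 0, 4, 3, 3, 3, 5, 0, 0, 22, 6],
    "T": [7, 0, 22, 0, 2, 10, 13, 5, 12, 3, 6, 0, 0, 0, 8],
}


def conservation_index(counts):
    """Assumed Ci formula: 100 / ln 5 * (sum_b P * ln P + ln 5)."""
    values = []
    for i in range(len(counts["A"])):
        column = [counts[nt][i] for nt in "ACGT"]
        total = sum(column)
        plogp = sum((c / total) * math.log(c / total) for c in column if c > 0)
        values.append(100.0 / math.log(5) * (plogp + math.log(5)))
    return values


def matrix_similarity(site, counts, ci):
    """Conservation-weighted score in [0, 1]; 1.0 means best possible match."""
    if len(site) != len(ci) or set(site) - set("ACGT"):
        raise ValueError("site must be an ACGT string of matrix length")
    observed = sum(ci[j] * counts[nt][j] for j, nt in enumerate(site))
    best = sum(ci[j] * max(counts[nt][j] for nt in "ACGT")
               for j in range(len(ci)))
    return observed / best


ci = conservation_index(COUNTS)
original = matrix_similarity("TATCTTTGTTAACGA", COUNTS, ci)    # SCPLASM site
core_hit = matrix_similarity("TAACTTTGTTAACGA", COUNTS, ci)    # T->A at pos. 3
flank_hit = matrix_similarity("GATCTTTGTTAACGA", COUNTS, ci)   # T->G at pos. 1
print(round(original, 3), round(core_hit, 3), round(flank_hit, 3))
```

A mismatch at the fully conserved position 3 lowers the score far more than a mismatch at the weakly conserved position 1, which mirrors the reasoning in the paragraph above.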
| © 1998-2013 Genomatix Software GmbH - All rights
reserved |