
Transcript Information


[Identification] [Tissues] [Mapping quality] [Sequences]

The Transcript Info page presents information about a single transcript, identified by its Genomatix TranscriptID. The data include general information such as the name, the genomic location, and the lengths of the UTRs and the coding sequence, as well as the tissues in which the transcript is expressed. The page also shows details of the cDNA mapping used for the annotation process and the sequences of the transcript and the resulting protein.

Identification

The Identification table lists the key features of the transcript, e.g. identifiers, length, number of exons, genomic location, and the Genomatix transcript quality. The lengths of the UTRs and the coding sequence (CDS) are derived from the transcript sequence by in silico methods. Conflicts with the UTR/CDS annotation of the related RefSeq/GenBank sequence are indicated.

Identification
Organism Homo sapiens
LocusID GXL_36015
Symbol SFRS1
Description splicing factor, arginine/serine-rich 1 (splicing factor 2, alternate splicing factor)
Transcript length 2026 bp
Number of exons 4
Transcript quality gold (experimentally verified 5' complete transcript)
Genomic location NC_000017 (-) 53435853 - 53439584
5'UTR 87 bp
CDS 606 bp (201 aa)
3'UTR 1333 bp
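
The lengths in this table are mutually consistent, which a few lines of arithmetic can confirm. The following is a minimal Python sketch, not part of ElDorado; the function name is hypothetical, and it assumes that the CDS length includes the stop codon (the example values suggest this, since 606 bp correspond to 202 codons for a 201 aa protein).

```python
def check_transcript_lengths(transcript_len, utr5, cds, utr3, protein_len):
    """Return True if the UTR/CDS lengths add up and the CDS matches the protein."""
    # The three regions must tile the transcript without gaps or overlaps.
    regions_ok = utr5 + cds + utr3 == transcript_len
    # Assumption: the CDS length counts the stop codon, so codons = aa + 1.
    cds_ok = cds % 3 == 0 and cds // 3 == protein_len + 1
    return regions_ok and cds_ok


# Values taken from the Identification table above.
print(check_transcript_lengths(2026, 87, 606, 1333, 201))
```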

Tissues

For each transcript, two different types of tissue association are provided.

  1. CAGE tags located up to 20 bp upstream or downstream of the TSS of the transcript are used for annotation. CAGE tags are currently available for several mouse and human tissues.
  2. ESTs are assigned to a Genomatix locus if one of their exons overlaps with an exon of the transcript. If the exon/intron structure of the EST fits perfectly into the exon/intron structure of a transcript, the respective tissue annotation (if available) is assigned to the transcript.

The numbers following the tissue names indicate how often the respective tissue was annotated for this transcript.
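
The two assignment rules above can be sketched in Python. This is an illustration under stated assumptions, not the ElDorado implementation: all function names and coordinates are hypothetical, exons are given as sorted (start, end) pairs with inclusive coordinates, and "fits perfectly" is interpreted as "the EST introns are a contiguous run of the transcript introns and the EST ends stay inside the flanking transcript exons".

```python
def cage_tag_near_tss(tag_pos, tss_pos, window=20):
    """True if a CAGE tag lies within `window` bp upstream or downstream of the TSS."""
    return abs(tag_pos - tss_pos) <= window


def exons_overlap(est_exons, transcript_exons):
    """True if any EST exon overlaps any transcript exon (inclusive intervals)."""
    return any(e_start <= t_end and t_start <= e_end
               for e_start, e_end in est_exons
               for t_start, t_end in transcript_exons)


def introns(exons):
    """Introns implied by a sorted exon list, as inclusive (start, end) pairs."""
    return [(a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


def est_fits_transcript(est_exons, transcript_exons):
    """Assumed reading of 'fits perfectly': shared introns, ends inside exons."""
    est_introns = introns(est_exons)
    tx_introns = introns(transcript_exons)
    est_start, est_end = est_exons[0][0], est_exons[-1][1]
    if not est_introns:
        # Unspliced EST: it must lie entirely within one transcript exon.
        return any(t_start <= est_start and est_end <= t_end
                   for t_start, t_end in transcript_exons)
    if est_introns[0] not in tx_introns:
        return False
    i = tx_introns.index(est_introns[0])
    n = len(est_introns)
    if tx_introns[i:i + n] != est_introns:
        return False
    # The EST may start and end anywhere inside the flanking transcript exons.
    return (transcript_exons[i][0] <= est_start
            and est_end <= transcript_exons[i + n][1])


# Hypothetical coordinates for illustration only.
transcript = [(100, 200), (300, 400), (500, 600)]
print(cage_tag_near_tss(112, 100))
print(est_fits_transcript([(150, 200), (300, 350)], transcript))
```

In this reading, an EST that merely overlaps an exon links it to the locus, while the stricter structural check decides whether its tissue annotation is transferred to the transcript.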

Tissues
CAGE tags liver(69), adrenal gland(38), bone marrow(31), lung(24), large intestine(23), cerebrum(18), colon(17), embryo(13), cecum(12), brain(10), thymus(10), blood(7), kidney(6), adipose(6), muscle(5), rectum(4), small intestine(4), mammary gland(3), testis(3), frontal lobe(2), parietal lobe(1), spleen(1), cerebellum(1)
ESTs na

Mapping quality

Transcripts in ElDorado are annotated by mapping the sequence of a cDNA from e.g. RefSeq or GenBank to the respective genomic sequence. The cDNA sequences may differ from the genomic sequences either by point mutations or by insertions/deletions. Thus the resulting transcript annotated in ElDorado is not necessarily identical to the cDNA sequence used for the mapping. Gaps in either the cDNA or the genomic sequence that are located in the CDS of the transcript may result in frameshifts and the loss of the coding potential of the transcript. Details about these discrepancies are given in this table. The detailed output of the mapping process can be accessed via the link 'detailed mapping' in the last line of the table.

cDNA Mapping
The annotation of GXT_22214872 is based on mapping of AK225711.
5' not aligned 0 bp
3' not aligned 16 bp
Point mutations 1
Gaps in cDNA 0
Gaps in genomic sequence 0
detailed mapping
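
Why a gap in the CDS can destroy the coding potential can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the (position, length) representation of gaps and the function name are hypothetical and do not reflect the ElDorado output format, and each gap is judged on its own, so two gaps that compensate each other are still reported.

```python
def frameshifting_gaps(gaps, cds_start, cds_end):
    """Return the gaps that lie inside the CDS and shift the reading frame.

    gaps: list of (position, length) pairs in 1-based transcript coordinates.
    A gap shifts the frame when its length is not a multiple of 3.
    """
    return [(pos, length) for pos, length in gaps
            if cds_start <= pos <= cds_end and length % 3 != 0]


# CDS boundaries derived from the example transcript (5'UTR 87 bp, CDS 606 bp):
# the CDS spans positions 88 to 693. The gaps below are made up for illustration;
# the example mapping itself reports no gaps.
example_gaps = [(50, 1), (200, 3), (400, 2)]
print(frameshifting_gaps(example_gaps, 88, 693))
```

Here the 1 bp gap lies in the 5'UTR and the 3 bp gap keeps the frame intact, so only the 2 bp gap at position 400 would be flagged.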

Sequences

The Sequences table provides the nucleotide sequence of the transcript and the protein sequence derived from it. The exons in the transcript are alternately colored black and blue. Nucleotides in lower case indicate the UTRs. The coding sequence (CDS) is given in upper case. Nucleotides associated with a SNP are highlighted in blue. The known alleles are shown on 'mouse over'. The nucleotide sequence can be downloaded for further analysis either to your local disk or to your personal sequence directory on our server.

Sequences
cDNA sequence Protein sequence
   1 agacgtggtg ccgctgcggg ctcgctctgc cgtgcgctag gcttggtggg aaggcctgtt
  61 ctcgagtccg cgcttttcgt caccgccATG TCGGGAGGTG GTGTGATTCG TGGCCCCGCA
 121 GGGAACAACG ATTGCCGCAT CTACGTGGGT AACTTACCTC CAGACATCCG AACCAAGGAC
 181 ATTGAGGACG TGTTCTACAA ATACGGCGCT ATCCGCGACA TCGACCTCAA GAATCGCCGC
 241 GGGGGACCGC CCTTCGCCTT CGTTGAGTTC GAGGACCCGC GAGACGCGGA AGACGCGGTG
 301 TATGGTCGCG ACGGCTATGA TTACGATGGG TACCGTCTGC GGGTGGAGTT TCCTCGAAGC
 361 GGCCGTGGAA CAGGCCGAGG CGGCGGCGGG GGTGGAGGTG GCGGAGCTCC CCGAGGTCGC
 421 TATGGCCCCC CATCCAGGCG GTCTGAAAAC AGAGTGGTTG TCTCTGGACT GCCTCCAAGT
 481 GGAAGTTGGC AGGATTTAAA GGATCACATG CGTGAAGCAG GTGATGTATG TTATGCTGAT
 541 GTTTACCGAG ATGGCACTGG TGTCGTGGAG TTTGTACGGA AAGAAGATAT GACCTATGCA
 601 GTTCGAAAAC TGGATAACAC TAAGTTTAGA TCTCATGAGG TAGGTTATAC ACGTATTCTT
 661 TTCTTTGACC AGAATTGGAT ACAGTGGTCT TAAcagtgga atttcaaggt aaggattcag
 721 gcaaggttgt ccaagtaaat tgccagattt ctggttttag ttacattgta ttcattcagc
 781 atgtctgaag atagatgaaa gcttagatct ttcaatggaa agttctgtct atccaatagg
 841 gagaaactgc ctacatccgg gttaaagttg atgggcccag aagtccaagt tatggaagat
 901 ctcgatctcg aagccgtagt cgtagcagaa gccgtagcag aagcaacagc aggagtcgca
 961 gttactcccc aaggagaagc agaggatcac cacgctattc tccccgtcat agcagatctc
1021 gctctcgtac ataagatgat tggtgacact ttttgtagaa cccatgttgt atacagtttt
1081 cctttattca gtacaatctt ttcatttttt aattcaaact gttttgttca gaatgggcta
1141 aagtgttgaa ttgcattctt gtaatatccc cttgctccta acatctacat tcccttcgtg
1201 tctttgataa attgtatttt aagtgatgtc atagacagga ttgtttaaat ttagttaact
1261 ccatactctt cagactgtga tattgtgtaa atgtctatct gccctggttt gtgtgaactg
1321 ggatgttggg ggtgtttgtg gttatcttac ctggggaagt tcttatgttt atcttgcttt
1381 tcatgtgtct ttctgtagac atatctgaag agatggatta agaatgcttt ggattaagga
1441 ttgtggagca catttcaatc attttaggat tgtcaaaagg aggattgagg aggatcagat
1501 caataatgga ggcaatggtt tggattggag agggctcact ggatcccaat ccttggagct
1561 ggatcattgg attcaaatca taatgtggat aggataggga ggatgaatta ccaggattca
1621 tggagcggga tcagattacc aggaacatag gagtggattc ctgccccaac caaaccgcat
1681 tcgtgtggat ttttttattc aacttaattg gctattccaa agattttttt tttcctattt
1741 ttgacgattg gagcccttaa gatgcacgat ggaattgtgt tttgcgtttt ttggtaaaag
1801 gagcaaagcg aggacctgga gataaacgct ggagcaatct ccttggaagg attcagcacg
1861 agtagatggt aaacatttaa aggggaaagg gggggtttgt ttaaaatagt aaatcagtaa
1921 gtcacttcta aatttaaaga aaacaaaatt ggagttgaag aataagtagg tttccaattg
1981 gctattgccg ttttctttga aaaaataaac attttttaaa aaacta

  1 MSGGGVIRGP AGNNDCRIYV GNLPPDIRTK DIEDVFYKYG AIRDIDLKNR RGGPPFAFVE
 61 FEDPRDAEDA VYGRDGYDYD GYRLRVEFPR SGRGTGRGGG GGGGGGAPRG RYGPPSRRSE
121 NRVVVSGLPP SGSWQDLKDH MREAGDVCYA DVYRDGTGVV EFVRKEDMTY AVRKLDNTKF
181 RSHEVGYTRI LFFDQNWIQW S
Legend:
Exons are alternately colored black and blue.
Nucleotides in lower case indicate the UTRs.
The CDS is given in upper case.
Nucleotides highlighted in blue indicate SNPs (mouse over shows alleles).
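
Because the UTRs and the CDS are distinguished by case, a plain-text copy of the displayed sequence can be split into its regions and translated with a short script. The following Python sketch assumes that the copied text keeps the upper/lower case of the display and uses the standard genetic code; the function names are hypothetical and not part of ElDorado. Only the first two lines of the sequence above are used, so the CDS is truncated.

```python
import re

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_by_case(display_text):
    """Split a numbered sequence listing into (5'UTR, CDS, 3'UTR) by letter case."""
    seq = re.sub(r"[\d\s]", "", display_text)  # drop position numbers and blanks
    match = re.fullmatch(r"([a-z]*)([A-Z]*)([a-z]*)", seq)
    if match is None:
        raise ValueError("expected lower-case UTR, upper-case CDS, lower-case UTR")
    return match.groups()


def translate(cds):
    """Translate an upper-case CDS, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First two lines of the cDNA sequence shown above (CDS truncated).
fragment = """
   1 agacgtggtg ccgctgcggg ctcgctctgc cgtgcgctag gcttggtggg aaggcctgtt
  61 ctcgagtccg cgcttttcgt caccgccATG TCGGGAGGTG GTGTGATTCG TGGCCCCGCA
"""
utr5, cds, utr3 = split_by_case(fragment)
print(len(utr5), translate(cds))
```

For this fragment the 5'UTR length is 87, in agreement with the Identification table, and the translated residues match the beginning of the protein sequence shown above.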