GPAT: Retrieval of genomic annotation from large genomic position datasets

Cited by: 23
Authors
Krebs, Arnaud [1]
Frontini, Mattia [1]
Tora, Laszlo [1]
Affiliations
[1] Univ Strasbourg 1, Dept Funct Genom, IGBMC, CNRS UMR 7104, INSERM U 596, CU Strasbourg, F-67070 Strasbourg, France
DOI
10.1186/1471-2105-9-533
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Recent genome-wide techniques for mapping transcription factor binding sites or chromatin modifications, such as chromatin immunoprecipitation (ChIP) linked to DNA microarray analysis (ChIP-on-chip) or ChIP coupled to high-throughput sequencing (ChIP-seq), generate tremendous amounts of genomic location data in the form of one-dimensional series of signals. After pre-analysis of these data (signal pre-clearing, relevant binding site detection), biologists need to search for the biological relevance of the detected genomic positions, which represent transcription regulation or chromatin modification events.
Results: To address this problem, we have developed a Genomic Position Annotation Tool (GPAT) with a simple web interface that allows the rapid and systematic labelling of thousands of genomic positions with several types of annotations. GPAT automatically extracts gene annotation information around the submitted positions from different public databases (RefSeq or ENSEMBL). In addition, GPAT provides access to the expression status of the corresponding genes, either from existing transcriptomic databases or from user-generated expression data sets. Furthermore, GPAT allows the localisation of the genomic coordinates relative to the chromosome bands and the well-characterised ENCODE regions. We successfully used GPAT to analyse ChIP-on-chip data and to identify genes functionally regulated by the TATA binding protein (TBP).
Conclusion: GPAT provides a quick, convenient and flexible way to annotate large sets of genomic positions obtained after pre-analysis of ChIP-on-chip, ChIP-seq or other high-throughput sequencing-based techniques. Through the different annotation data displayed, GPAT facilitates the interpretation of genome-wide datasets for molecular biologists.
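The annotation step described in the abstract, labelling each submitted position with the genes found around it, can be illustrated with a minimal sketch. This is not GPAT's implementation: the toy gene table, the function names `build_index` and `annotate`, and the 1500 bp window are all assumptions made for illustration, and a real run would load RefSeq or ENSEMBL annotation instead.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# Hypothetical toy annotation: (gene name, chromosome, TSS coordinate, strand).
GENES = [
    ("geneA", "chr1", 1000, "+"),
    ("geneB", "chr1", 5000, "-"),
    ("geneC", "chr2", 2000, "+"),
]

def build_index(genes):
    """Group genes by chromosome and sort them by TSS for binary search."""
    index = defaultdict(list)
    for name, chrom, tss, strand in genes:
        index[chrom].append((tss, name, strand))
    for entries in index.values():
        entries.sort()
    return index

def annotate(positions, index, window=1500):
    """For each (chrom, pos), list the genes whose TSS lies within `window` bp."""
    results = []
    for chrom, pos in positions:
        entries = index.get(chrom, [])
        tss_values = [entry[0] for entry in entries]
        lo = bisect_left(tss_values, pos - window)
        hi = bisect_right(tss_values, pos + window)
        hits = []
        for tss, name, strand in entries[lo:hi]:
            # Signed offset: negative means upstream of the TSS on the gene's strand.
            offset = pos - tss if strand == "+" else tss - pos
            hits.append((name, offset))
        results.append((chrom, pos, hits))
    return results

index = build_index(GENES)
annotated = annotate([("chr1", 800), ("chr1", 5200), ("chr3", 10)], index)
```

Sorting each chromosome's transcription start sites once and using binary search keeps the lookup fast for thousands of positions. Positions on chromosomes missing from the annotation simply return an empty hit list.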
Pages: 6