Identifying functional single nucleotide polymorphisms in the human CArGome

Cited by: 41
Authors
Benson, Craig C. [2 ]
Zhou, Qian [1 ]
Long, Xiaochun [1 ]
Miano, Joseph M. [1 ]
Affiliations
[1] Univ Rochester, Sch Med & Dent, Aab Cardiovasc Res Inst, Rochester, NY 14642 USA
[2] Univ Rochester, Ctr Med, Combined Internal Med Pediat Residency Program, Rochester, NY 14642 USA
Keywords
transcription factor binding site; bioinformatics; serum response factor; CArG box; TRANSCRIPTION-FACTOR-BINDING; SERUM-RESPONSE FACTOR; GENOME-WIDE ASSOCIATION; OPEN-ACCESS DATABASE; REGULATORY POLYMORPHISMS; IDENTIFICATION; SITES; ELEMENTS; ACTIVATION; SNPS;
DOI
10.1152/physiolgenomics.00098.2011
Chinese Library Classification number
Q2 [Cell Biology];
Subject classification code
071009 ; 090102 ;
Abstract
Benson CC, Zhou Q, Long X, Miano JM. Identifying functional single nucleotide polymorphisms in the human CArGome. Physiol Genomics 43: 1038-1048, 2011. First published July 19, 2011; doi:10.1152/physiolgenomics.00098.2011.

Regulatory SNPs (rSNPs) reside primarily within the non-protein-coding genome and are thought to disturb normal patterns of gene expression by altering DNA binding of transcription factors. Nevertheless, despite the explosive rise in SNP association studies, there is little information as to the function of rSNPs in human disease. Serum response factor (SRF) is a widely expressed DNA-binding transcription factor that has variable affinity for at least 1,216 permutations of a 10 bp transcription factor binding site (TFBS) known as the CArG box. We developed a robust in silico bioinformatics screening method to evaluate sequences around RefSeq genes for conserved CArG boxes. Utilizing a predetermined phastCons threshold score, we identified 8,252 strand-specific CArGs within an 8 kb window around the transcription start site of 5,213 genes, including all previously defined SRF target genes. We then interrogated this CArG dataset for the presence of previously annotated common polymorphisms. We found a total of 118 unique CArG boxes harboring a SNP within the 10 bp CArG sequence and 1,130 CArG boxes with SNPs located just outside the CArG element. Gel shift and luciferase reporter assays validated SRF binding and functional activity of several new CArG boxes. Importantly, SNPs within or just outside the CArG box often resulted in altered SRF binding and activity. Collectively, these findings demonstrate a powerful approach to computationally define rSNPs in the human CArGome and provide a foundation for similar analyses of other TFBS. Such information may find utility in genetic association studies of human disease, where little is known about the functionality of rSNPs.
Pages: 1038-1048
Page count: 11
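
The abstract's screening idea (enumerate CArG permutations, scan sequence for them, then check whether annotated SNPs fall inside or just outside a hit) can be illustrated with a small sketch. This is not the authors' pipeline. It assumes the CArG consensus is CC[A/T]6GG and that the 1,216 permutations are the 64 consensus 10-mers plus every 10-mer differing from one of them at exactly one position; that assumption is not stated in the abstract, although it does reproduce the quoted count. The function names, the toy sequence, and the `flank` parameter are hypothetical, and the phastCons conservation filter and the 8 kb window around the transcription start site are omitted.

```python
from itertools import product

SITE_LEN = 10  # the CArG box is a 10 bp element


def carg_permutations():
    """Return (consensus, sites): CC[A/T]6GG 10-mers and their 1-bp variants.

    Assumption: the CArG-like set is the consensus plus every sequence that
    deviates from a consensus 10-mer at exactly one position.
    """
    consensus = {"CC" + "".join(core) + "GG" for core in product("AT", repeat=6)}
    sites = set(consensus)
    for site in consensus:
        for i, base in enumerate(site):
            for alt in "ACGT":
                if alt != base:
                    sites.add(site[:i] + alt + site[i + 1:])
    return consensus, sites


def find_cargs(seq, sites):
    """Scan the forward strand and return (start, matched 10-mer) tuples.

    Under the assumption above the site set is closed under reverse
    complement, so a forward-strand scan finds every location.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - SITE_LEN + 1):
        window = seq[start:start + SITE_LEN]
        if window in sites:
            hits.append((start, window))
    return hits


def classify_snp(snp_pos, hits, flank=5):
    """Label a 0-based SNP position as 'inside', 'flanking', or 'none'.

    `flank` is a hypothetical stand-in for "just outside the CArG element";
    the abstract does not give the distance used.
    """
    label = "none"
    for start, _site in hits:
        end = start + SITE_LEN
        if start <= snp_pos < end:
            return "inside"
        if start - flank <= snp_pos < end + flank:
            label = "flanking"
    return label


CONSENSUS, SITES = carg_permutations()

# Toy sequence (made up) with one consensus CArG box, CCATATTAGG, at offset 7.
TOY_SEQ = "GATTACA" + "CCATATTAGG" + "TTGACGT"
HITS = find_cargs(TOY_SEQ, SITES)

if __name__ == "__main__":
    print(len(CONSENSUS), len(SITES))
    print(HITS)
    for pos in (0, 9, 18):
        print(pos, classify_snp(pos, HITS))
```

Under this counting, the 64 consensus sequences contribute 4 × 3 × 64 = 768 variants at the flanking CC/GG positions and 6 × 2 × 32 = 384 variants at the A/T core positions, giving 64 + 768 + 384 = 1,216. A real screen would additionally require a conservation score above threshold for each hit and would use genomic coordinates for SNPs rather than offsets into a toy string.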