Separation of phylogenetic and functional associations in biological sequences by using the parametric bootstrap

Times cited: 95
Authors
Wollenberg, KR [1]
Atchley, WR [1]
Affiliations
[1] N Carolina State Univ, Dept Genet, Raleigh, NC 27695 USA
DOI
10.1073/pnas.070154797
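Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09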
Abstract
Quantitative analyses of biological sequences generally proceed under the assumption that individual DNA or protein sequence elements vary independently. However, this assumption is not biologically realistic because sequence elements often vary in a concerted manner resulting from common ancestry and structural or functional constraints. We calculated intersite associations among aligned protein sequences by using mutual information. To discriminate associations resulting from common ancestry from those resulting from structural or functional constraints, we used a parametric bootstrap algorithm to construct replicate data sets. These data are expected to have intersite associations resulting solely from phylogeny. By comparing the distribution of our association statistic for the replicate data against that calculated for empirical data, we were able to assign a probability that two sites covaried as a result of structural or functional constraint rather than phylogeny. We tested our method by using an alignment of 237 basic helix-loop-helix (bHLH) protein domains. Comparison of our results against a solved three-dimensional structure confirmed the identification of several sites important to function and structure of the bHLH domain. This analytical procedure has broad utility as a first step in the identification of sites that are important to biological macromolecular structure and function when a solved structure is unavailable.
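The two quantities named in the abstract, mutual information between alignment sites and a comparison against bootstrap replicates, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `mutual_information` and `bootstrap_p_value` are hypothetical, the estimator uses plain observed frequencies, and the replicate values are assumed to come from data sets simulated along a phylogeny elsewhere; that simulation step is not shown.

```python
from collections import Counter
from math import log2


def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns.

    Each column is a sequence of residues, one per aligned sequence.
    Probabilities are estimated from observed frequencies (an assumption;
    the abstract does not specify the estimator).
    """
    if len(col_x) != len(col_y):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[a] * count_y[b]))
    return mi


def bootstrap_p_value(observed, replicate_values):
    """Fraction of replicate MI values at least as large as the observed one.

    replicate_values is assumed to hold MI for the same site pair computed
    on replicate data sets whose associations arise only from phylogeny.
    """
    if not replicate_values:
        raise ValueError("need at least one replicate value")
    exceed = sum(1 for v in replicate_values if v >= observed)
    return exceed / len(replicate_values)
```

A small value from `bootstrap_p_value` would suggest that the observed association is larger than phylogeny alone tends to produce in the replicates, which is the kind of evidence the abstract attributes to structural or functional constraint.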
Pages: 3288-3291
Page count: 4