Functional impact of missense variants in BRCA1 predicted by supervised learning

Cited by: 50
Authors
Karchin, Rachel [1]
Monteiro, Alvaro N. A.
Tavtigian, Sean V.
Carvalho, Marcelo A.
Sali, Andrej
Affiliations
[1] Johns Hopkins Univ, Dept Biomed Engn, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Inst Computat Med, Baltimore, MD 21218 USA
[3] H Lee Moffitt Canc Ctr & Res Inst, Risk Assessment Detect & Intervent Program, Tampa, FL USA
[4] Int Agcy Res Canc, F-69372 Lyon, France
[5] Univ Calif San Francisco, Dept Pharmaceut Chem, San Francisco, CA 94143 USA
[6] Univ Calif San Francisco, Calif Inst Quantitat Biomed Res, San Francisco, CA 94143 USA
DOI
10.1371/journal.pcbi.0030026
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Many individuals tested for inherited cancer susceptibility at the BRCA1 gene locus are discovered to have variants of unknown clinical significance (UCVs). Most UCVs cause a single amino acid residue (missense) change in the BRCA1 protein. They can be biochemically assayed, but such evaluations are time-consuming and labor-intensive. Computational methods that classify and suggest explanations for UCV impact on protein function can complement functional tests. Here we describe a supervised learning approach to classification of BRCA1 UCVs. Using a novel combination of 16 predictive features, the algorithms were applied to retrospectively classify the impact of 36 BRCA1 C-terminal (BRCT) domain UCVs biochemically assayed to measure transactivation function and to blindly classify 54 documented UCVs. Majority vote of three supervised learning algorithms is in agreement with the assay for more than 94% of the UCVs. Two UCVs found deleterious by both the assay and the classifiers reveal a previously uncharacterized putative binding site. Clinicians may soon be able to use computational classifiers such as those described here to better inform patients. These classifiers can be adapted to other cancer susceptibility genes and systematically applied to prioritize the growing number of potential causative loci and variants found by large-scale disease association studies.
Pages: 268-281
Page count: 14