Inferring interaction partners from protein sequences
Cited by: 92
Authors:
Bitbol, Anne-Florence [1,2,3]
Dwyer, Robert S. [4]
Colwell, Lucy J. [5]
Wingreen, Ned S. [1,4]
Affiliations:
[1] Princeton Univ, Lewis Sigler Inst Integrat Genom, Princeton, NJ 08544 USA
[2] Princeton Univ, Dept Phys, Princeton, NJ 08544 USA
[3] Univ Paris 06, Sorbonne Univ, CNRS, Lab Jean Perrin,UMR 8237, F-75005 Paris, France
[4] Princeton Univ, Dept Mol Biol, Princeton, NJ 08544 USA
[5] Univ Cambridge, Dept Chem, Lensfield Rd, Cambridge CB2 1EW, England
Source:
Funding:
US National Science Foundation;
US National Institutes of Health;
Keywords:
protein-protein interactions;
coevolution;
paralogs;
maximum entropy;
direct coupling analysis;
STATISTICAL-MECHANICS;
PREDICTION;
FAMILIES;
SYSTEMS;
INFORMATION;
SPECIFICITY;
ALIGNMENTS;
LANDSCAPE;
PATHWAYS;
CONTACTS;
DOI:
10.1073/pnas.1606762113
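Chinese Library Classification:
O [Mathematical sciences and chemistry];
P [Astronomy and earth sciences];
Q [Biological sciences];
N [General natural sciences];
Discipline classification codes:
070301 [Inorganic chemistry];
070403 [Astrophysics];
070507 [Natural resources and territorial spatial planning];
090105 [Crop production systems and ecological engineering];
Abstract: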
Specific protein-protein interactions are crucial in the cell, both to ensure the formation and stability of multiprotein complexes and to enable signal transduction in various pathways. Functional interactions between proteins result in coevolution between the interaction partners, causing their sequences to be correlated. Here we exploit these correlations to accurately identify, from sequence data alone, which proteins are specific interaction partners. Our general approach, which employs a pairwise maximum entropy model to infer couplings between residues, has been successfully used to predict the 3D structures of proteins from sequences. Thus inspired, we introduce an iterative algorithm to predict specific interaction partners from two protein families whose members are known to interact. We first assess the algorithm's performance on histidine kinases and response regulators from bacterial two-component signaling systems. We obtain a striking 0.93 true positive fraction on our complete dataset without any a priori knowledge of interaction partners, and we uncover the origin of this success. We then apply the algorithm to proteins from ATP-binding cassette (ABC) transporter complexes, and obtain accurate predictions in these systems as well. Finally, we present two metrics that accurately distinguish interacting protein families from noninteracting ones, using only sequence data.
Pages: 12180-12185
Number of pages: 6