Protein interaction databases;
Database comparisons;
Protein interactions;
Molecular networks;
Systems biology;
Database and software selection;
INTERACTION NETWORKS;
CURATION;
VIEW;
DOI:
10.1016/j.jbi.2020.103380
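Chinese Library Classification number:
TP39 [Computer applications];
Discipline classification code:
080201 [Mechanical manufacturing and automation];
Abstract: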
In the absence of periodic systematic comparisons, biologists/bioinformaticians may be forced to make a subjective selection among the many protein-protein interaction (PPI) databases and tools. We conducted a comprehensive compilation and comparison of such resources. We compiled 375 PPI resources, short-listed 125 important ones (both lists are available at startbioinfo.com), and compared the features and coverage of 16 carefully selected databases related to human PPIs. We quantitatively compared the coverage of 'experimentally verified' as well as 'total' (experimentally verified and predicted) PPIs for these 16 databases. Coverage was compared in two ways: (a) PPIs obtained in response to gene queries using the web interfaces were compared. As a query set, 108 genes expressed differently across tissues (specific to kidney, testis, and uterus, and ubiquitous, i.e., expressed in 43 human normal tissues) or associated with certain diseases (breast cancer, lung cancer, Alzheimer's, cystic fibrosis, diabetes, and cardiomyopathy) were chosen. The coverage was also compared for the well-studied genes versus the less-studied ones. The coverage of the databases for high-quality interactions was separately assessed using a set of literature-curated, experimentally proven PPIs (gold standard PPI-set); (b) the back-end data from 15 PPI databases were downloaded and compared. Combined results from STRING and UniHI covered around 84% of 'experimentally verified' PPIs. Approximately 94% of the 'total' PPIs available across the databases were retrieved by the combined use of hPRINT, STRING, and IID. Among the experimentally verified PPIs found exclusively in each database, STRING contributed around 71% of the hits. The coverage of certain databases was skewed for some gene-types. Analysis with the gold-standard PPI-set revealed that GPS-Prot, STRING, APID, and HIPPIE each covered 70% of the curated interactions.
The database usage frequencies did not always correlate with their respective advantages, thereby justifying the need for more frequent studies of this nature.
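The coverage figures above (single-database coverage, combined coverage of two or three databases, and hits found exclusively in one database) can be pictured as set operations over interaction pairs. The following is only a minimal sketch under stated assumptions, not the authors' pipeline: the database names `DB_A`, `DB_B`, `DB_C`, the gene labels `g1` to `g8`, and all helper functions are hypothetical, and coverage is measured here against the pooled union of the toy databases.

```python
from itertools import combinations


def normalize(pair):
    """Treat A-B and B-A as the same undirected interaction (case-insensitive)."""
    a, b = pair
    return frozenset((a.upper(), b.upper()))


def coverage(db_pairs, reference):
    """Fraction of the reference interactions that db_pairs contains."""
    if not reference:
        return 0.0
    return len(db_pairs & reference) / len(reference)


def best_combination(databases, k):
    """Return (names, coverage) for the k-database combination whose
    union covers the largest share of the pooled interactions."""
    pooled = set().union(*databases.values())

    def union_of(names):
        return set().union(*(databases[n] for n in names))

    best = max(combinations(sorted(databases), k),
               key=lambda names: len(union_of(names)))
    return best, coverage(union_of(best), pooled)


def exclusive(databases):
    """Interactions that appear in exactly one database, keyed by database."""
    result = {}
    for name, pairs in databases.items():
        others = set().union(*(p for n, p in databases.items() if n != name))
        result[name] = pairs - others
    return result


# Hypothetical toy data: placeholder gene labels, not real interactions.
raw = {
    "DB_A": [("g1", "g2"), ("g3", "g4"), ("g5", "g6")],
    "DB_B": [("G2", "G1"), ("g7", "g8")],
    "DB_C": [("g3", "g4")],
}
dbs = {name: {normalize(p) for p in pairs} for name, pairs in raw.items()}
pooled = set().union(*dbs.values())

for name in sorted(dbs):
    print(name, coverage(dbs[name], pooled))
print(best_combination(dbs, 2))
print({name: len(pairs) for name, pairs in exclusive(dbs).items()})
```

Normalizing each pair into an unordered set before comparison matters in practice, because the same interaction may be listed as A-B in one database and B-A in another; real data would additionally need identifier mapping between naming schemes, which this sketch leaves out.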
Affiliation:
Univ Roma Tor Vergata, Dept Biol, I-00173 Rome, Italy
Calderone, Alberto
;
Castagnoli, Luisa
Affiliation:
Univ Roma Tor Vergata, Dept Biol, I-00173 Rome, Italy
Castagnoli, Luisa
;
Cesareni, Gianni
Affiliation:
Univ Roma Tor Vergata, Dept Biol, I-00173 Rome, Italy
Fdn Santa Lucia Ist Ricovero & Cura Carattere Sci, Rome, Italy