Significant Subgraph Detection in Multi-omics Networks for Disease Pathway Identification

Cited by: 2
Authors
Abdel-Hafiz, Mohamed [1]
Najafi, Mesbah [2]
Helmi, Shahab [1]
Pratte, Katherine A. [3]
Zhuang, Yonghua [4]
Liu, Weixuan [4]
Kechris, Katerina J. [4]
Bowler, Russell P. [3, 5]
Lange, Leslie [6]
Banaei-Kashani, Farnoush [1]
Affiliations
[1] Univ Colorado Denver, Coll Engn, Dept Comp Sci & Engn, Big Data Management & Min Lab, Denver, CO 80204 USA
[2] Univ Colorado Denver, Coll Liberal Arts & Sci, Dept Math, Denver, CO USA
[3] Natl Jewish Hlth, Denver, CO USA
[4] Univ Colorado, Colorado Sch Publ Hlth, Dept Biostat & Informat, Anschutz Med Campus, Aurora, CO USA
[5] Univ Colorado, Sch Med, Anschutz Med Campus, Aurora, CO USA
[6] Univ Colorado, Dept Med, Div Biomed Informat & Personalized Med, Anschutz Med Campus, Aurora, CO USA
Source
FRONTIERS IN BIG DATA | 2022 / Vol. 5
Funding
National Institutes of Health (USA);
Keywords
graph clustering; PageRank; Louvain; multi-omics graph; subgraph detection;
DOI
10.3389/fdata.2022.894632
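Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
080201 [Mechanical manufacturing and automation];
Abstract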
Chronic obstructive pulmonary disease (COPD) is one of the leading causes of death in the United States. COPD represents one of many areas of research where identifying complex pathways and networks of interacting biomarkers is an important avenue toward studying disease progression and potentially discovering cures. Recently, sparse multiple canonical correlation network analysis (SmCCNet) was developed to identify complex relationships between omics associated with a disease phenotype, such as lung function. SmCCNet uses two omics datasets and an associated output phenotype to generate a multi-omics graph, which can then be used to explore relationships between omics in the context of a disease. Detecting significant subgraphs within this multi-omics network, i.e., subgraphs that exhibit high correlation to a disease phenotype and high inter-connectivity, can help clinicians identify complex biological relationships involved in disease progression. The current approach to identifying significant subgraphs relies on hierarchical clustering, which can be used to inform clinicians about important pathways involved in the disease or phenotype of interest. Reliance on hierarchical clustering can hinder subgraph quality by biasing the search toward more compact subgraphs and removing larger significant subgraphs. This study aims to introduce new significant subgraph detection techniques. In particular, we introduce two subgraph detection methods, dubbed Correlated PageRank and Correlated Louvain, by extending the Personalized PageRank Clustering and Louvain algorithms, as well as a hybrid approach combining the two proposed methods, and compare them to the hierarchical method currently in use. The proposed methods show significant improvement in the quality of the subgraphs produced when compared to the current state of the art.
Pages: 20