Identification and ranking of key persons in a Social Networking Website using Hadoop & Big Data Analytics

Cited by: 3
Authors
Agarwal, Prerna [1 ]
Ahmed, Rafeeq [1 ]
Ahmad, Tanvir [1 ]
Affiliation
[1] Jamia Millia Islamia, Comp Engn, Delhi, India
Source
INTERNATIONAL CONFERENCE ON ADVANCES IN INFORMATION COMMUNICATION TECHNOLOGY & COMPUTING, 2016
Keywords
Betweenness Centrality; Closeness Centrality; Degree Centrality; Big Data; MapReduce; key persons; ranking
DOI
10.1145/2979779.2979844
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Big Data refers to vast amounts of structured and unstructured data that are challenging to process with traditional algorithms, both because of their sheer size and because of the lack of high-speed processing techniques. Nowadays, vast amounts of digital data are gathered from many important areas, including social networking websites such as Facebook and Twitter, and it is important to mine this data for analysis. One important analysis in this domain is finding the key nodes in a social graph, which can be the major spreaders of information. Node centrality measures can be used in many graph applications, such as searching and ranking nodes. Traditional centrality algorithms, such as degree centrality, betweenness centrality and closeness centrality, were not designed for data at this scale and are therefore difficult to apply to very large graphs. In this paper, we calculate centrality measures for big graphs with a huge number of edges and nodes by parallelizing the traditional centrality algorithms so that they remain efficient as the graph grows. We use MapReduce and Hadoop to implement these algorithms for parallel and distributed data processing. We present the results and anomalies of these algorithms, and compare the processing time taken on normal systems with that taken on Hadoop systems.
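
As an illustration only, the idea of expressing a centrality measure as map and reduce steps can be sketched for degree centrality, the simplest of the three measures named in the abstract. This is not the authors' Hadoop implementation. It is a single-process Python sketch that assumes an undirected edge list, and every function name in it (`map_edge`, `shuffle`, `reduce_degree`, `degree_centrality`, `rank_key_nodes`) is hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_edge(edge):
    """Map step: emit (node, 1) for each endpoint of an undirected edge."""
    u, v = edge
    return [(u, 1), (v, 1)]


def shuffle(pairs):
    """Group emitted values by key, imitating the shuffle phase in-process."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_degree(node, counts):
    """Reduce step: sum the emitted ones to obtain the node's degree."""
    return node, sum(counts)


def degree_centrality(edges):
    """Normalized degree centrality: degree / (n - 1) for a graph with n nodes."""
    mapped = chain.from_iterable(map_edge(e) for e in edges)
    grouped = shuffle(mapped)
    degrees = dict(reduce_degree(node, counts) for node, counts in grouped.items())
    n = len(degrees)
    if n <= 1:
        return {node: 0.0 for node in degrees}
    return {node: d / (n - 1) for node, d in degrees.items()}


def rank_key_nodes(edges, top_k=3):
    """Rank nodes by centrality, highest first; ties are broken by node name."""
    scores = degree_centrality(edges)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    # Toy graph (made-up data, not from the paper).
    toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
    for node, score in rank_key_nodes(toy_edges):
        print(node, round(score, 3))
```

In a real Hadoop job the map and reduce functions would run on separate workers over partitions of the edge list. Betweenness and closeness centrality need shortest-path computations and are considerably harder to parallelize than this degree example.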
Pages: 6