High Performance Big Data Clustering

Cited by: 4

Authors:

Agrawal, Ankit [1]

Patwary, Md. Mostofa Ali [1]

Hendrix, William [1]

Liao, Wei-keng [1]

Choudhary, Alok [1]

Affiliation:

[1] Northwestern Univ, Dept EECS, Evanston, IL 60208 USA

Source:

CLOUD COMPUTING AND BIG DATA | 2013 / Vol. 23

Keywords:

big data; clustering; density-based clustering; hierarchical clustering; DBSCAN algorithm; parallel

DOI:

10.3233/978-1-61499-322-3-192

Chinese Library Classification:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Scientific advances are collectively exploding the amount, diversity, and complexity of available data. Our ability to collect huge amounts of data has greatly surpassed our analytical capacity to make sense of it. Efficient use of high performance computing techniques is critical for the success of the data-driven paradigm of scientific discovery. Data clustering is one of the fundamental analytics tasks heavily relied upon in many application domains, such as astrophysics, climate science, and bioinformatics. In this book chapter, we illustrate the challenges and opportunities in mining big data using two recently developed scalable parallel clustering algorithms. Experimental results on millions of high-dimensional data points clustered in parallel on thousands of processor cores are also presented.


Pages: 192-211

Page count: 20

