Very Fast Interactive Visualization of Large Sets of High-dimensional Data

Cited by: 9
Authors
Dzwinel, Witold [1]
Wcislo, Rafal [1]
Affiliations
[1] AGH Univ Sci & Technol, PL-30059 Krakow, Poland
Source
INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2015 COMPUTATIONAL SCIENCE AT THE GATES OF NATURE | 2015 / Vol. 51
Keywords
multidimensional scaling; particle-based stress minimization; interactive visualization
DOI
10.1016/j.procs.2015.05.325
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Embedding high-dimensional data into 2D/3D space is the most popular approach to data visualization. Despite recent advances in developing very accurate dimensionality reduction algorithms, such as BH-SNE, Q-SNE and LoCH, their relatively high computational complexity remains an obstacle to interactive visualization of truly large datasets consisting of $M \sim 10^{6+}$ high-dimensional ($N \sim 10^{3+}$) feature vectors. We show that a new variant of multidimensional scaling (MDS), nr-MDS, can be up to two orders of magnitude faster than modern dimensionality reduction algorithms. We postulate that its computational and memory complexities are linear, $O(M)$. At the same time, our method preserves high separability of the data in the 2D/3D target spaces, similar to that obtained by state-of-the-art dimensionality reduction algorithms. We present the results of applying nr-MDS to the visualization of data repositories such as 20 Newsgroups ($M = 1.8 \cdot 10^{4}$), MNIST ($M = 7 \cdot 10^{4}$) and REUTERS ($M = 2.67 \cdot 10^{5}$).
Pages: 572-581
Page count: 10
Related papers
20 in total
[1] Molecular dynamics multidimensional scaling [J].
Andrecut, M.
PHYSICS LETTERS A, 2009, 373 (23-24): 2001-2006
[2] [Anonymous], 2013, P 16 INT C ARTIFICIA
[3] De Leeuw Jan, 2011, MULTIDIMENSIONAL SCA
[4] Method of particles in visual clustering of multi-dimensional and large data sets [J].
Dzwinel, W;
Blasiak, J.
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF GRID COMPUTING AND ESCIENCE, 1999, 15 (03): 365-379
[5] LoCH: A neighborhood-based multidimensional projection technique for high-dimensional sparse spaces [J].
Fadel, Samuel G.;
Fatore, Francisco M.;
Duarte, Felipe S. L. G.;
Paulovich, Fernando V.
NEUROCOMPUTING, 2015, 150: 546-556
[6] Two-Way Multidimensional Scaling: A Review [J].
France, Stephen L.;
Carroll, J. Douglas.
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2011, 41 (05): 644-661
[7] Reducing the dimensionality of data with neural networks [J].
Hinton, G. E.;
Salakhutdinov, R. R.
SCIENCE, 2006, 313 (5786): 504-507
[8] Dimensionality reduction for documents with nearest neighbor queries [J].
Ingram, Stephen;
Munzner, Tamara.
NEUROCOMPUTING, 2015, 150: 557-569
[9] Glimmer: Multilevel MDS on the GPU [J].
Ingram, Stephen;
Munzner, Tamara;
Olano, Marc.
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2009, 15 (02): 249-261
[10] Big-Data Visualization [J].
Keim, Daniel;
Qu, Huamin;
Ma, Kwan-Liu.
IEEE COMPUTER GRAPHICS AND APPLICATIONS, 2013, 33 (04): 20-21