STXXL: standard template library for XXL data sets

被引:50
作者
Dementiev, R. [1 ]
Kettner, L. [2 ]
Sanders, P. [1 ]
机构
[1] Univ Karlsruhe, Fak Informat, Karlsruhe, Germany
[2] Max Planck Inst Informat, Saarbrucken, Germany
关键词
very large data sets; software library; C plus plus standard template library; algorithm engineering;
D O I
10.1002/spe.844
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
We present the software library STXXL that is an implementation of the C++ standard template library (STL) for processing huge data sets that can fit only on hard disks. It supports parallel disks, overlapping between disk I/O and computation and it is the first I/O-efficient algorithm library that supports the pipelining technique that can save more than half of the I/Os. STXXL has been applied both in academic and industrial environments for a range of problems including text processing, graph algorithms, computational geometry, Gaussian elimination, visualization, and analysis of microscopic images, differential cryptographic analysis, etc. The performance of STXXL and its applications are evaluated on synthetic and real-world inputs. We present the design of the library, how its performance features are supported, and demonstrate how the library integrates with STL. Copyright (c) 2007 John Wiley & Sons, Ltd.
引用
收藏
页码:589 / 637
页数:49
相关论文
共 52 条
[1]   THE INPUT OUTPUT COMPLEXITY OF SORTING AND RELATED PROBLEMS [J].
AGGARWAL, A ;
VITTER, JS .
COMMUNICATIONS OF THE ACM, 1988, 31 (09) :1116-1127
[2]  
Albers S., 1998, Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, P454, DOI 10.1145/276698.276858
[3]  
[Anonymous], 8 INT S ADV SPAT TEM
[4]  
[Anonymous], ACM J EXPT ALGORITHM
[5]  
[Anonymous], 2002, 34 S THEORY COMPUTIN
[6]  
Arge L, 2002, LECT NOTES COMPUT SC, V2461, P88
[7]  
Arge L, 1999, LECT NOTES COMPUT SC, V1619, P328
[8]  
Arge L, 1995, LECT NOTES COMPUT SC, V955, P334
[9]   Simple randomized merge/sort on parallel disks [J].
Barve, RD ;
Grove, EF ;
Vitter, JS .
PARALLEL COMPUTING, 1997, 23 (4-5) :601-631
[10]  
Bayer R., 1972, Acta Informatica, V1, P173, DOI 10.1007/BF00288683