Distributed processing of very large datasets with DataCutter

Cited by: 97
Authors
Beynon, MD
Kurc, T
Catalyurek, U
Chang, CL
Sussman, A
Saltz, J [1]
Affiliations
[1] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
[2] Johns Hopkins Med Inst, Dept Pathol, Baltimore, MD 21287 USA
Funding
National Science Foundation (USA);
Keywords
multi-dimensional datasets; data analysis; distributed computing; runtime systems; component architectures;
DOI
10.1016/S0167-8191(01)00099-0
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
We describe a framework, called DataCutter, that is designed to provide support for subsetting and processing of datasets in a distributed and heterogeneous environment. We illustrate the use of DataCutter with several data-intensive applications from diverse fields, and present experimental results. (C) 2001 Published by Elsevier Science B.V.
Pages: 1457-1478
Number of pages: 22