The design of Discovery Net: Towards open grid services for knowledge discovery

Cited by: 51
Authors
AlSairafi, S [1]
Emmanouil, FS [1]
Ghanem, M [1]
Giannadakis, N [1]
Guo, Y [1]
Kalaitzopoulos, D [1]
Osmond, M [1]
Rowe, A [1]
Syed, J [1]
Wendel, P [1]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Comp, London SW7 2BZ, England
Keywords
DOI
10.1177/1094342003173003
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
With the emergence of distributed resources and grid technologies, there is a need for higher-level informatics infrastructures that allow scientists to easily create and execute meaningful data integration and analysis processes that take advantage of the distributed nature of the available resources. These resources typically include heterogeneous data sources, computational resources for task execution, and various application-specific services. The effort of the high-performance computing community has so far focused mainly on delivering low-level informatics infrastructures that meet the basic needs of grid applications. Such infrastructures are essential but do not directly help end users create generic and reusable applications. In this paper, we present the Discovery Net architecture for building grid-based knowledge discovery applications. Our architecture enables the creation of high-level, reusable, and distributed application workflows that use a variety of common types of distributed resources. It is built on top of standard protocols and standard infrastructures such as Globus, but also defines its own protocols, such as the Discovery Process Markup Language for data flow management. We discuss an implementation of our architecture and evaluate it by building a real-time genome annotation environment on top of it.
Pages: 297-315
Page count: 19