Biological data integration: Wrapping data and tools

Times cited: 19
Authors
Lacroix, Z [1]
Affiliations
[1] Arizona State Univ, Tempe, AZ 85287 USA
Source
IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE | 2002 / Vol. 6 / No. 02
Keywords
biological data integration; database view; eXtensible Markup Language (XML); mediation; web data sources;
DOI
10.1109/TITB.2002.1006299
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data accessing, analyzing, and visualization tools. Building a digital library for scientific data requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, as well as data generated by software. We present an approach to wrapping web data sources, databases, flat files, or data generated by tools through a database view mechanism. Generally, a wrapper has two tasks: first, it sends a query to the source to retrieve data and, second, it builds the expected output with respect to the virtual structure. Our wrappers are composed of a retrieval component, based on an intermediate object view mechanism called search views that maps the source capabilities to attributes, and an eXtensible Markup Language (XML) engine, which perform these two tasks respectively. The originality of the approach consists of: 1) a generic view mechanism to seamlessly access data sources with limited capabilities and 2) the ability to wrap data sources as well as the useful specific tools they may provide. Our approach has been developed and demonstrated as part of the multidatabase system supporting queries via uniform object protocol model (OPM) interfaces.
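The two wrapper tasks described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `SearchView` class, the `to_xml` and `wrap` functions, the sample records, and the element names are all hypothetical, and an in-memory list stands in for a web source, database, or flat file. The sketch only shows the general shape of a retrieval component that exposes a source's limited search capabilities as attributes, followed by an XML-building step.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "source" standing in for a web source, database, or flat file.
RECORDS = [
    {"id": "P1", "organism": "human", "name": "kinase A"},
    {"id": "P2", "organism": "mouse", "name": "kinase B"},
    {"id": "P3", "organism": "human", "name": "phosphatase C"},
]


class SearchView:
    """Hypothetical retrieval component: exposes only the attributes the source can search on."""

    def __init__(self, records, searchable):
        self.records = records
        self.searchable = set(searchable)  # the source's limited capabilities

    def retrieve(self, **conditions):
        # Reject conditions the source cannot evaluate.
        unsupported = set(conditions) - self.searchable
        if unsupported:
            raise ValueError(f"source cannot search on: {sorted(unsupported)}")
        return [
            record for record in self.records
            if all(record.get(key) == value for key, value in conditions.items())
        ]


def to_xml(records, root_tag="results", item_tag="entry"):
    """Hypothetical XML step: build the output according to a fixed virtual structure."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag, id=record["id"])
        for key, value in record.items():
            if key != "id":
                ET.SubElement(item, key).text = value
    return ET.tostring(root, encoding="unicode")


def wrap(view, **conditions):
    """Task 1: query the source. Task 2: build the XML output."""
    return to_xml(view.retrieve(**conditions))


view = SearchView(RECORDS, searchable=["organism"])
print(wrap(view, organism="human"))
```

In this sketch, a query on an attribute the source does not support (for example `name`) raises an error rather than silently returning everything, which is one simple way to model a source with limited capabilities.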
Pages: 123 / 128
Number of pages: 6
Related papers
42 in total
[1] Abiteboul S, 1997, LECT NOTES COMPUT SC, V1186, P1
[2] ABITEBOUL S, 1997, J DIGITAL LIB
[3] ALTSCHUL S, 1990, J MOL BIOL, P403
[4] ASHISH N, 1997, ACM SIGMOD WORKSH MA
[5] The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999 [J]. Bairoch, A; Apweiler, R. NUCLEIC ACIDS RESEARCH, 1999, 27 (01): 49-54
[6] BAKER P, 1998, P 6 INT C INT SYST M
[7] Bartels D., 1997, OBJECT DATABASE STAN
[8] GenBank [J]. Benson, DA; Karsch-Mizrachi, I; Lipman, DJ; Ostell, J; Rapp, BA; Wheeler, DL. NUCLEIC ACIDS RESEARCH, 2000, 28 (01): 15-18
[9] BIRON PV, 2001, XML SCHEMA 2
[10] BRAY T, 2000, EXTENSIBLE MARKUP LA