K2/Kleisli and GUS: Experiments in integrated access to genomic data sources

Cited by: 95
Authors
Davidson, SB [1]
Crabtree, J
Brunk, BP
Schug, J
Tannen, V
Overton, GC
Stoeckert, CJ
Affiliations
[1] Univ Penn, Ctr Bioinformat, 200 S 33rd St, Philadelphia, PA 19104 USA
[2] Univ Penn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
[3] Univ Penn, Dept Genet, Philadelphia, PA 19104 USA
DOI
10.1147/sj.402.0512
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The integrated access to heterogeneous data sources is a major challenge for the biomedical community. Several solution strategies have been explored: link-driven federation of databases, view integration, and warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: K2, a view integration implementation, and GUS, a data warehouse. Although the view integration and the warehouse approaches each have advantages, there is no clear "winner." Therefore, in selecting the best strategy for a particular application, users must consider the data characteristics, the performance guarantees required, and the programming resources available. Our experiences also point to some practical tips on how database updates should be published, and how XML can be used to facilitate the processing of updates in a warehousing environment.
Pages: 512-531
Page count: 20