caCORE: A common infrastructure for cancer informatics

Cited by: 95
Authors
Covitz, PA [1]
Hartel, F [1]
Schaefer, C [1]
De Coronado, S [1]
Fragoso, G [1]
Sahni, H [1]
Gustafson, S [1]
Buetow, KH [1]
Affiliation
[1] NCI, Ctr Bioinformat, NIH, US Dept HHS, Rockville, MD 20852 USA
Funding
National Institutes of Health (USA)
DOI
10.1093/bioinformatics/btg335
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Sites with substantive bioinformatics operations are challenged to build data processing and delivery infrastructure that provides reliable access and enables data integration. Locally generated data must be processed and stored such that relationships to external data sources can be presented. Consistency and comparability across data sets require annotation with controlled vocabularies and, further, metadata standards for data representation. Programmatic access to the processed data should be supported to ensure the maximum possible value is extracted. Confronted with these challenges at the National Cancer Institute Center for Bioinformatics, we decided to develop a robust infrastructure for data management and integration that supports advanced biomedical applications.

Results: We have developed an interconnected set of software and services called caCORE. Enterprise Vocabulary Services (EVS) provide controlled vocabulary, dictionary and thesaurus services. The Cancer Data Standards Repository (caDSR) provides a metadata registry for common data elements. Cancer Bioinformatics Infrastructure Objects (caBIO) implements an object-oriented model of the biomedical domain and provides Java, Simple Object Access Protocol (SOAP) and HTTP-XML application programming interfaces. caCORE has been used to develop scientific applications that bring together data from distinct genomic and clinical science sources.
Pages: 2404-2412
Page count: 9