Structured digital tables on the Semantic Web: toward a structured digital literature

Cited by: 6
Authors
Cheung, Kei-Hoi [1, 2, 3, 4]
Samwald, Matthias [5, 6]
Auerbach, Raymond K.
Gerstein, Mark B. [4, 7]
Affiliations
[1] Yale Univ, Yale Ctr Med Informat, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[2] Yale Univ, Sch Med, Ctr Med Informat, New Haven, CT 06520 USA
[3] Yale Univ, Sch Med, Dept Genet, New Haven, CT 06520 USA
[4] Yale Univ, Dept Comp Sci, New Haven, CT 06520 USA
[5] Natl Univ Ireland Galway, Digital Enterprise Res Inst, Galway, Ireland
[6] Konrad Lorenz Inst Evolut & Cognit Res, Altenberg, Austria
[7] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
Keywords
bioinformatics; data integration; semantic publishing; Semantic Web; triplification; INFORMATION-RETRIEVAL; MINIMUM INFORMATION; ONTOLOGY
DOI
10.1038/msb.2010.45
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
In parallel to the growth in bioscience databases, biomedical publications have increased exponentially in the past decade. However, the extraction of high-quality information from the corpus of scientific literature has been hampered by the lack of machine-interpretable content, despite text-mining advances. To address this, we propose creating a structured digital table as part of an overall effort in developing machine-readable, structured digital literature. In particular, we envision transforming publication tables into standardized triples using Semantic Web approaches. We identify three canonical types of tables (conveying information about properties, networks, and concept hierarchies) and show how more complex tables can be built from these basic types. We envision that authors would first create tables using the structured triples for canonical types and then have them visually rendered for publication, and we present examples for converting representative tables into triples. Finally, we discuss how 'stub' versions of structured digital tables could be a useful bridge for connecting the literature with databases, allowing the former to more precisely document the latter.
Molecular Systems Biology 6: 403; published online 24 August 2010; doi:10.1038/msb.2010.45
Subject Categories: bioinformatics; computational methods
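As a rough illustration of the idea in the abstract, the three canonical table types (properties, networks, concept hierarchies) can each be flattened into subject-predicate-object triples. The sketch below is not the authors' implementation: the function names, the plain-tuple triple representation, and all sample identifiers (geneA, interactsWith, and so on) are hypothetical placeholders chosen for this example, and a real system would likely use URIs and an RDF library instead.

```python
# Minimal sketch, assuming triples are plain (subject, predicate, object) tuples.
# All names and sample values are hypothetical, not taken from the paper.
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def property_table_to_triples(rows: List[Dict[str, str]], key: str) -> List[Triple]:
    """Property table: one row per entity, one column per property.
    Each non-key cell becomes (row entity, column name, cell value)."""
    triples: List[Triple] = []
    for row in rows:
        subject = row[key]
        for column, value in row.items():
            if column != key:
                triples.append((subject, column, value))
    return triples


def network_table_to_triples(edges: List[Tuple[str, str]], predicate: str) -> List[Triple]:
    """Network table: each (source, target) pair becomes one edge triple."""
    return [(source, predicate, target) for source, target in edges]


def hierarchy_table_to_triples(parent_of: Dict[str, str],
                               predicate: str = "subClassOf") -> List[Triple]:
    """Concept hierarchy: each child/parent pair becomes one triple."""
    return [(child, predicate, parent) for child, parent in parent_of.items()]


if __name__ == "__main__":
    # Hypothetical property table with two rows.
    table = [
        {"gene": "geneA", "organism": "yeast", "chromosome": "IV"},
        {"gene": "geneB", "organism": "yeast", "chromosome": "XII"},
    ]
    for triple in property_table_to_triples(table, key="gene"):
        print(triple)
    print(network_table_to_triples([("geneA", "geneB")], "interactsWith"))
    print(hierarchy_table_to_triples({"kinase": "enzyme"}))
```

A more complex table could then be expressed as the union of the triple lists produced for its property, network, and hierarchy parts, which mirrors the abstract's point that complex tables are built from the basic types.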
Pages: 13