The curation paradigm and application tool used for manual curation of the scientific literature at the Comparative Toxicogenomics Database

Cited by: 38
Authors
Davis, Allan Peter [1]
Wiegers, Thomas C. [1]
Murphy, Cynthia G. [1]
Mattingly, Carolyn J. [1]
Affiliations
[1] Mt Desert Isl Biol Lab, Dept Bioinformat, Salsbury Cove, ME 04672 USA
Source
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION | 2011
Keywords
CHEMICALS
DOI
10.1093/database/bar034
Chinese Library Classification (CLC) code
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding of the effects of environmental chemicals on human health. CTD biocurators read the scientific literature and convert free-text information into a structured format using official nomenclature, integrating third-party controlled vocabularies for chemicals, genes, diseases and organisms, and a novel controlled vocabulary for molecular interactions. Manual curation produces a robust, richly annotated dataset of highly accurate and detailed information. Currently, CTD describes over 349 000 molecular interactions between 6800 chemicals, 20 900 genes (for 330 organisms) and 4300 diseases that have been manually curated from over 25 400 peer-reviewed articles. These manually curated data are further integrated with other third-party data (e.g. Gene Ontology, KEGG and Reactome annotations) to generate a wealth of toxicogenomic relationships. Here, we describe our approach to manual curation, which uses a powerful and efficient paradigm involving mnemonic codes. This strategy allows biocurators to quickly capture detailed information from articles by generating simple statements that use codes to represent the relationships between data types. The paradigm is versatile, expandable and able to accommodate new data challenges as they arise. We have incorporated this strategy into a web-based curation tool to further increase efficiency and productivity, implement quality control in real time and accommodate biocurators working remotely. Database URL: http://ctd.mdibl.org
Pages: 12