Literature mining for the biologist: from information retrieval to biological discovery

Cited by: 427
Authors
Jensen, LJ [1]
Saric, J
Bork, P
Affiliations
[1] European Mol Biol Lab, D-69117 Heidelberg, Germany
[2] EML Res gGmbH, D-69118 Heidelberg, Germany
[3] Max Delbruck Ctr Mol Med, D-13092 Berlin, Germany
Keywords
DOI
10.1038/nrg1768
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
For the average biologist, hands-on literature mining currently means a keyword search in PubMed. However, methods for extracting biomedical facts from the scientific literature have improved considerably, and the associated tools will probably soon be used in many laboratories to automatically annotate and analyse the growing number of system-wide experimental data sets. Owing to the increasing body of text and the open-access policies of many journals, literature mining is also becoming useful for both hypothesis generation and biological discovery. However, the latter will require the integration of literature and high-throughput data, which should encourage close collaborations between biologists and computational linguists.
Pages: 119-129
Page count: 11
Related papers
115 in total
  • [81] SARIC J, 2005, BIOINFORMATICS 0726, DOI 10.1093/BIOINFORMATICS/BTI597
  • [82] Thesaurus-based disambiguation of gene symbols
    Schijvenaars, BJA
    Mons, B
    Weeber, M
    Schuemie, MJ
    van Mulligen, EM
    Wain, HM
    Kors, JA
    [J]. BMC BIOINFORMATICS, 2005, 6 (1)
  • [83] From gene networks to gene function
    Schlitt, T
    Palin, K
    Rung, J
    Dietmann, S
    Lappe, M
    Ukkonen, E
    Brazma, A
    [J]. GENOME RESEARCH, 2003, 13 (12) : 2568 - 2576
  • [84] Distribution of information in biomedical abstracts and full-text publications
    Schuemie, MJ
    Weeber, M
    Schijvenaars, BJA
    van Mulligen, EM
    van der Eijk, CC
    Jelier, R
    Mons, B
    Kors, JA
    [J]. BIOINFORMATICS, 2004, 20 (16) : 2597 - 2604
  • [85] ABNER: an open source tool for automatically tagging genes, proteins and other entity names in text
    Settles, B
    [J]. BIOINFORMATICS, 2005, 21 (14) : 3191 - 3192
  • [86] Extraction of transcript diversity from scientific literature
    Shah, PK
    Jensen, LJ
    Boué, S
    Bork, P
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2005, 1 (01) : 67 - 73
  • [87] Information extraction from full text scientific articles: Where are the keywords?
    Shah, PK
    Perez-Iratxeta, C
    Bork, P
    Andrade, MA
    [J]. BMC BIOINFORMATICS, 2003, 4 (1)
  • [88] SMALHEISER NR, 1994, NEUROSCI RES COMMUN, V15, P1
  • [89] Linking estrogen to Alzheimer's disease: An informatics approach
    Smalheiser, NR
    Swanson, DR
    [J]. NEUROLOGY, 1996, 47 (03) : 809 - 810
  • [90] Mining MEDLINE for implicit links between dietary substances and diseases
    Srinivasan, Padmini
    Libbus, Bisharah
    [J]. BIOINFORMATICS, 2004, 20 : 290 - 296