Automated extraction of information in molecular biology

Cited by: 66
Authors
Andrade, MA
Bork, P
Affiliations
[1] European Mol Biol Lab, D-69012 Heidelberg, Germany
[2] Max Delbruck Ctr Mol Med, Dept Bioinformat, D-13092 Berlin, Germany
DOI
10.1016/S0014-5793(00)01661-6
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
We review data mining techniques in molecular biology, specifically those that extract information from the scientific literature itself. As more of the biological literature is published electronically, there is an opportunity, and even a need, to automatically summarize the literature in a customized way, for example by associating keywords with a topic. These keywords can be extracted from relevant publications. The process of keyword extraction can be automated and optimized to keep literature pointers automatically up to date or to filter relevant information from the literature. To illustrate these points, OMIM (Online Mendelian Inheritance in Man), a database of human inherited diseases, was linked to the literature, and keywords were derived that covered distinct aspects such as genetic information on the one hand and disease-specific protein and phenotypic information on the other. These keywords were used to extract information that is helpful for keeping entries about diseases up to date. © 2000 Federation of European Biochemical Societies. Published by Elsevier Science B.V. All rights reserved.
Pages: 12-17
Page count: 6