DIAN: A novel algorithm for genome ontological classification

Cited by: 17
Authors
Pouliot, Y [1]
Gao, J [1]
Su, QJJ [1]
Liu, GZG [1]
Ling, XFB [1]
Affiliations
[1] DoubleTwist Inc, Oakland, CA 94612 USA
DOI
10.1101/gr.183301
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
With many genomes now completely sequenced, computational biology faces the challenge of interpreting the significance of these data sets. A multiplicity of data-related problems impedes this goal: biological annotations associated with raw data are often not normalized, the data themselves are often poorly interrelated, and their interpretation is unclear. Together, these problems make the interpretation of genomic databases increasingly difficult. With the current explosion of sequences available from the human genome and from model organisms, sorting this vast amount of conceptually unstructured source data into a limited universe of genes, proteins, functions, structures, and pathways has become a bottleneck for the field. To address this problem, we have developed a way of interrelating data sources by applying a novel method of associating biological objects with ontologies. We have developed an intelligent knowledge-based algorithm, DIAN, to support biological knowledge mapping and, in particular, to facilitate the interpretation of genomic data. The method makes it possible to inventory genomes by collapsing multiple types of annotations and normalizing them to various ontologies. By relying on a conceptual view of the genome, researchers can navigate the human genome in a biologically intuitive, scientifically accurate manner.
Pages: 1766-1779
Page count: 14