Association of genes to genetically inherited diseases using data mining

Cited by: 240
Authors
Perez-Iratxeta, C
Bork, P
Andrade, MA
Affiliations
[1] European Mol Biol Lab, D-69012 Heidelberg, Germany
[2] Max Delbruck Ctr Mol Med, Dept Bioinformat, D-13092 Berlin, Germany
DOI
10.1038/ng895
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Although approximately one-quarter of the roughly 4,000 genetically inherited diseases currently recorded in respective databases (LocusLink, ref. 1; OMIM, ref. 2) are already linked to a region of the human genome, about 450 have no known associated gene. Finding disease-related genes requires laborious examination of hundreds of possible candidate genes (sometimes, these are not even annotated; see, for example, refs 3,4). The public availability of the human genome draft sequence (ref. 5) has fostered new strategies to map molecular functional features of gene products to complex phenotypic descriptions, such as those of genetically inherited diseases. Owing to recent progress in the systematic annotation of genes using controlled vocabularies (ref. 6), we have developed a scoring system for the possible functional relationships of human genes to 455 genetically inherited diseases that have been mapped to chromosomal regions without assignment of a particular gene. In a benchmark of the system with 100 known disease-associated genes, the disease-associated gene was among the 8 best-scoring genes with a 25% chance, and among the best 30 genes with a 50% chance, showing that there is a relationship between the score of a gene and its likelihood of being associated with a particular disease. The scoring also indicates that for some diseases, the chance of identifying the underlying gene is higher.
Pages: 316-319
Page count: 4
Related papers
12 items in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] Gene Ontology: tool for the unification of biology
    Ashburner, M
    Ball, CA
    Blake, JA
    Botstein, D
    Butler, H
    Cherry, JM
    Davis, AP
    Dolinski, K
    Dwight, SS
    Eppig, JT
    Harris, MA
    Hill, DP
    Issel-Tarver, L
    Kasarskis, A
    Lewis, S
    Matese, JC
    Richardson, JE
    Ringwald, M
    Rubin, GM
    Sherlock, G
    [J]. NATURE GENETICS, 2000, 25 (01) : 25 - 29
  • [3] Characterization of single-nucleotide polymorphisms in coding regions of human genes
    Cargill, M
    Altshuler, D
    Ireland, J
    Sklar, P
    Ardlie, K
    Patil, N
    Lane, CR
    Lim, EP
    Kalyanaraman, N
    Nemesh, J
    Ziaugra, L
    Friedland, L
    Rolfe, A
    Warrington, J
    Lipshutz, R
    Daley, GQ
    Lander, ES
    [J]. NATURE GENETICS, 1999, 22 (03) : 231 - 238
  • [4] Autosomal recessive hypercholesterolemia caused by mutations in a putative LDL receptor adaptor protein
    Garcia, CK
    Wilund, K
    Arca, M
    Zuliani, G
    Fellin, R
    Maioli, M
    Calandra, S
    Bertolini, S
    Cossu, F
    Grishin, N
    Barnes, R
    Cohen, JC
    Hobbs, HH
    [J]. SCIENCE, 2001, 292 (5520) : 1394 - 1398
  • [5] Hamosh A, 2000, HUM MUTAT, V15, P57, DOI 10.1002/(SICI)1098-1004(200001)15:1<57::AID-HUMU12>3.0.CO;2-G
  • [7] A comparison of the Celera and Ensembl predicted gene sets reveals little overlap in novel genes
    Hogenesch, JB
    Ching, KA
    Batalov, S
    Su, AI
    Walker, JR
    Zhou, YY
    Kay, SA
    Schultz, PG
    Cooke, MP
    [J]. CELL, 2001, 106 (04) : 413 - 415
  • [8] Initial sequencing and analysis of the human genome
    Lander, ES
    Int Human Genome Sequencing Consortium
    Linton, LM
    Birren, B
    Nusbaum, C
    Zody, MC
    Baldwin, J
    Devon, K
    Dewar, K
    Doyle, M
    FitzHugh, W
    Funke, R
    Gage, D
    Harris, K
    Heaford, A
    Howland, J
    Kann, L
    Lehoczky, J
    LeVine, R
    McEwan, P
    McKernan, K
    Meldrim, J
    Mesirov, JP
    Miranda, C
    Morris, W
    Naylor, J
    Raymond, C
    Rosetti, M
    Santos, R
    Sheridan, A
    Sougnez, C
    Stange-Thomann, N
    Stojanovic, N
    Subramanian, A
    Wyman, D
    Rogers, J
    Sulston, J
    Ainscough, R
    Beck, S
    Bentley, D
    Burton, J
    Clee, C
    Carter, N
    Coulson, A
    Deadman, R
    Deloukas, P
    Dunham, A
    Dunham, I
    Durbin, R
    French, L
    [J]. NATURE, 2001, 409 (6822) : 860 - 921
  • [9] PLAITAKIS A, 1993, CAN J NEUROL SCI, V20, P5109
  • [10] RefSeq and LocusLink: NCBI gene-centered resources
    Pruitt, KD
    Maglott, DR
    [J]. NUCLEIC ACIDS RESEARCH, 2001, 29 (01) : 137 - 140