Clustering microarray-derived gene lists through implicit literature relationships

Cited by: 14
Authors
Burkart, Mark F.
Wren, Jonathan D.
Herschkowitz, Jason I.
Perou, Charles M.
Garner, Harold R.
Affiliations
[1] Univ Texas, SW Med Ctr, Div Translat Res, McDermott Ctr Human Growth & Dev, Dept Internal Med, Dallas, TX 75390 USA
[2] Univ Texas, SW Med Ctr, Div Translat Res, McDermott Ctr Human Growth & Dev, Dept Biochem, Dallas, TX 75390 USA
[3] Oklahoma Med Res Fdn, Arthritis & Immunol Program, Oklahoma City, OK 73104 USA
[4] Univ N Carolina, Lineberger Comprehens Canc Ctr, Chapel Hill, NC 27599 USA
[5] Univ N Carolina, Dept Genet, Chapel Hill, NC 27599 USA
[6] Univ N Carolina, Dept Pathol & Lab Med, Chapel Hill, NC 27599 USA
DOI
10.1093/bioinformatics/btm261
CLC number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
Motivation: Microarrays rapidly generate large quantities of gene expression information, but interpreting such data within a biological context is still relatively complex and laborious. New methods that can identify functionally related genes via shared literature concepts will be useful in addressing these needs.
Results: We have developed a novel method that uses implicit literature relationships (concepts related via shared, intermediate concepts) to cluster related genes. Genes are evaluated for implicit connections within a network of biomedical objects (other genes, ontological concepts and diseases) that are connected via their co-occurrences in Medline titles and/or abstracts. On the basis of these implicit relationships, individual gene pairs are scored using a probability-based algorithm. Scores are generated for all pairwise combinations of genes, which are then clustered based on the scores. We applied this method to a test set composed of nine functional groups with known relationships. The method scored highly for all nine groups and significantly better than a benchmark co-occurrence-based method for six groups. We then applied this method to gene sets specific to two previously defined breast tumor subtypes. Analysis of the results recapitulated known biological relationships and identified novel pathway relationships unique to each tumor subtype. We demonstrate that this method provides a valuable new means of identifying and visualizing significantly related genes within gene lists via their implicit relationships in the literature.
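The pipeline outlined in the abstract (a co-occurrence network, pairwise scores from shared intermediate concepts, then clustering on those scores) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the probability-based scoring, so a plain Jaccard overlap of shared intermediates stands in for it, and a simple threshold-based single-linkage grouping stands in for the clustering step. The function names, the threshold and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(records):
    """Map each term to the set of terms it co-occurs with in any record."""
    neighbors = defaultdict(set)
    for terms in records:
        for a, b in combinations(sorted(set(terms)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def implicit_score(neighbors, g1, g2):
    """Stand-in score: Jaccard overlap of the intermediates of g1 and g2.

    Assumption: the paper uses a probability-based score; this is only a
    placeholder with the same input (shared intermediate concepts).
    """
    n1 = neighbors[g1] - {g2}
    n2 = neighbors[g2] - {g1}
    union = n1 | n2
    return len(n1 & n2) / len(union) if union else 0.0


def cluster(genes, scores, threshold):
    """Group genes that are linked by any pair scoring at least threshold."""
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), score in scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for g in genes:
        groups[find(g)].add(g)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical toy "abstracts", each reduced to the set of terms it mentions.
records = [
    {"BRCA1", "DNA repair"},
    {"BACH1", "DNA repair"},
    {"BRCA1", "breast cancer"},
    {"BACH1", "breast cancer"},
    {"MKI67", "cell proliferation"},
    {"CCNB1", "cell proliferation"},
]
genes = ["BRCA1", "BACH1", "MKI67", "CCNB1"]

neighbors = build_cooccurrence(records)
scores = {
    (a, b): implicit_score(neighbors, a, b) for a, b in combinations(genes, 2)
}
clusters = cluster(genes, scores, threshold=0.5)
print(clusters)
```

In this toy input no two genes ever appear in the same record, so a direct co-occurrence method would find no links at all; the genes are grouped only through the intermediate concepts they share, which is the sense of "implicit" used in the abstract.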
Pages: 1995-2003
Page count: 9