PubMatrix: a tool for multiplex literature mining

Cited by: 150
Authors
Becker, KG [1]
Hosack, DA
Dennis, G
Lempicki, RA
Bright, TJ
Cheadle, C
Engel, J
Affiliations
[1] NIH, Gene Express & Genom Unit, Baltimore, MD USA
[2] NIA, NCTS, NIH, Baltimore, MD 21224 USA
[3] SAIC Frederick Inc, Lab Immunopathogenesis & Bioinformat, Frederick, MD USA
DOI
10.1186/1471-2105-4-61
CLC number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Molecular experiments using multiplex strategies such as cDNA microarrays or proteomic approaches generate large datasets requiring biological interpretation. Text-based data mining tools have recently been developed to query large biological datasets of this type. PubMatrix is a web-based tool that allows simple text-based mining of the NCBI literature search service PubMed using any two lists of keyword terms, resulting in a frequency matrix of term co-occurrence.

Results: For example, a simple term selection procedure allows automatic pairwise comparisons of approximately 1-100 search terms versus approximately 1-10 modifier terms, resulting in up to 1,000 pairwise comparisons. The matrix table of pairwise comparisons can then be surveyed, queried individually, and archived. Lists of keywords can include any terms currently capable of being searched in PubMed. In the context of cDNA microarray studies, this may be used for the annotation of gene lists from clusters of genes that are expressed coordinately. An associated PubMatrix public archive provides previous searches using common, useful lists of keyword terms.

Conclusions: In this way, lists of terms, such as gene names or functional assignments, can be assigned genetic, biological, or clinical relevance in a rapid, flexible, systematic fashion. http://pubmatrix.grc.nia.nih.gov/
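The pairwise comparison described in the abstract can be illustrated with a minimal Python sketch. It is not PubMatrix's actual implementation, which the abstract does not describe. The function names (`build_query`, `cooccurrence_matrix`, `pubmed_count`), the AND-style query syntax, and the use of the NCBI E-utilities `esearch` endpoint to obtain hit counts are all assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request
from itertools import product
from typing import Callable, Dict, List, Tuple


def build_query(search_term: str, modifier_term: str) -> str:
    """Combine one search term and one modifier term into a single AND query.

    The quoting and AND syntax are an assumption, not PubMatrix's documented format.
    """
    return f'("{search_term}") AND ("{modifier_term}")'


def cooccurrence_matrix(
    search_terms: List[str],
    modifier_terms: List[str],
    count_hits: Callable[[str], int],
) -> Dict[Tuple[str, str], int]:
    """Return {(search_term, modifier_term): hit_count} for every pair of terms."""
    return {
        (s, m): count_hits(build_query(s, m))
        for s, m in product(search_terms, modifier_terms)
    }


def pubmed_count(query: str) -> int:
    """Hypothetical hit counter using NCBI E-utilities esearch (needs network access)."""
    params = urllib.parse.urlencode(
        {"db": "pubmed", "term": query, "retmode": "json"}
    )
    url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?" + params
    with urllib.request.urlopen(url) as resp:
        return int(json.load(resp)["esearchresult"]["count"])
```

Calling `cooccurrence_matrix(genes, modifiers, pubmed_count)` with 100 search terms and 10 modifier terms would issue 1,000 queries, matching the upper bound given in the abstract. Passing the counter as a parameter keeps the matrix logic testable offline with a stand-in function.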
Pages: 6