Identifying good diagnostic gene groups from gene expression profiles using the concept of emerging patterns

被引:94
作者
Li, JY [1 ]
Wong, LS [1 ]
机构
[1] Kent Ridge Digital Labs, Singapore 119613, Singapore
关键词
D O I
10.1093/bioinformatics/18.5.725
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivations and Results: Gene groups that are significantly related to a disease can be detected by conducting a series of gene expression experiments. This work is aimed at discovering special types of gene groups that satisfy the following property. In each group, its member genes are found to be one-to-one contained in pre-determined intervals of gene expression level with a large frequency in one class of cells but are never found unanimously in these intervals in the other class of cells. We call these gene groups emerging patterns, to emphasize the patterns' frequency changes between two classes of cells. We use effective discretization and gene selection methods to obtain the most discriminatory genes. We also use efficient algorithms to derive the patterns from these genes. According to our studies on the ALL/AML dataset and the colon tumor dataset, some patterns, which consist of one or more genes, can reach a high frequency of 90%, or even 100%. In other words, they nearly or fully dominate one class of cells, even though they rarely occur in the other class. The discovered patterns are used to classify new cells with a higher accuracy than other reported methods. Based on these patterns, we also conjecture the possibility of a personalized treatment plan which converts colon tumor cells into normal cells by modulating the expression levels of a few genes.
引用
收藏
页码:725 / 734
页数:10
相关论文
共 20 条
  • [1] Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays
    Alon, U
    Barkai, N
    Notterman, DA
    Gish, K
    Ybarra, S
    Mack, D
    Levine, AJ
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1999, 96 (12) : 6745 - 6750
  • [2] [Anonymous], 1999, Proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining p, DOI [10.1145/312129., DOI 10.1145/312129, 10.1145/312129, 10.1145/312129.312191]
  • [3] [Anonymous], 1993, P 13 INT JOINT C ART
  • [4] The transcriptional program of sporulation in budding yeast
    Chu, S
    DeRisi, J
    Eisen, M
    Mulholland, J
    Botstein, D
    Brown, PO
    Herskowitz, I
    [J]. SCIENCE, 1998, 282 (5389) : 699 - 705
  • [5] Exploring the metabolic and genetic control of gene expression on a genomic scale
    DeRisi, JL
    Iyer, VR
    Brown, PO
    [J]. SCIENCE, 1997, 278 (5338) : 680 - 686
  • [6] Dong GZ, 1998, LECT NOTES ARTIF INT, V1394, P72
  • [7] DOUGHERTY J, 1995, P 12 INT C MACH LEAR, P94
  • [8] Support vector machine classification and validation of cancer tissue samples using microarray expression data
    Furey, TS
    Cristianini, N
    Duffy, N
    Bednarski, DW
    Schummer, M
    Haussler, D
    [J]. BIOINFORMATICS, 2000, 16 (10) : 906 - 914
  • [9] Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring
    Golub, TR
    Slonim, DK
    Tamayo, P
    Huard, C
    Gaasenbeek, M
    Mesirov, JP
    Coller, H
    Loh, ML
    Downing, JR
    Caligiuri, MA
    Bloomfield, CD
    Lander, ES
    [J]. SCIENCE, 1999, 286 (5439) : 531 - 537
  • [10] KOHAVI R, 1994, PROC INT C TOOLS ART, P740, DOI 10.1109/TAI.1994.346412