Feature (gene) selection in gene expression-based tumor classification

Cited by: 93
Authors
Xiong, MM
Li, WJ
Zhao, JY
Jin, L
Boerwinkle, E
Affiliations
[1] Univ Texas, Hlth Sci Ctr, Ctr Human Genet, Houston, TX 77225 USA
[2] Univ Texas, Hlth Sci Ctr, Inst Mol Med, Houston, TX 77225 USA
Keywords
gene expression; gene selection; Monte Carlo; microarray; tumor classification
DOI
10.1006/mgme.2001.3193
Chinese Library Classification number
R5 [Internal Medicine]
Subject classification code
1002; 100201
Abstract
There is increasing interest in changing the emphasis of tumor classification from morphologic to molecular. Gene expression profiles may offer more information than morphology and provide an alternative to morphology-based tumor classification systems. Gene selection involves a search for gene subsets that are able to discriminate tumor tissue from normal tissue, and may have either clear biological interpretation or some implication in the molecular mechanism of the tumorigenesis. Gene selection is a fundamental issue in gene expression-based tumor classification. In the formation of a discriminant rule, the number of genes is large relative to the number of tissue samples. Too many genes can harm the performance of the tumor classification system and increase the cost as well. In this report, we discuss criteria and illustrate techniques for reducing the number of genes and selecting an optimal (or near optimal) subset of genes from an initial set of genes for tumor classification. The practical advantages of gene selection over other methods of reducing the dimensionality (e.g., principal components) include its simplicity, future cost savings, and higher likelihood of being adopted in a clinical setting. We analyze the expression profiles of 2000 genes in 22 normal and 40 colon tumor tissues, 5776 sequences in 14 human mammary epithelial cells and 13 breast tumors, and 6817 genes in 47 acute lymphoblastic leukemia and 25 acute myeloid leukemia samples. Through these three examples, we show that using 2 or 3 genes can achieve more than 90% accuracy of classification. This result implies that after initial investigation of tumor classification using microarrays, a small number of selected genes may be used as biomarkers for tumor classification, or may have some relevance in tumor development and serve as potential drug targets.
In this report we also show that stepwise Fisher's linear discriminant function is a practicable method for gene expression-based tumor classification. (C) 2001 Academic Press.
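The abstract names stepwise Fisher's linear discriminant as the selection and classification method but does not spell out the procedure. The following is a minimal sketch of one common reading of that idea, not the authors' implementation: greedy forward selection that adds, at each step, the gene that most increases the two-class Fisher criterion, followed by a linear rule on the chosen genes. The function names, the ridge term, and the midpoint threshold are all assumptions made for illustration.

```python
import numpy as np


def fisher_criterion(X0, X1, ridge=1e-6):
    """Return (m1 - m0)^T Sw^{-1} (m1 - m0) for two classes (rows = samples)."""
    diff = X1.mean(axis=0) - X0.mean(axis=0)
    C0 = X0 - X0.mean(axis=0)
    C1 = X1 - X1.mean(axis=0)
    # Pooled within-class covariance. The ridge term is an assumption added
    # here to keep the matrix invertible when samples are few.
    Sw = (C0.T @ C0 + C1.T @ C1) / (len(X0) + len(X1) - 2)
    Sw += ridge * np.eye(Sw.shape[0])
    return float(diff @ np.linalg.solve(Sw, diff))


def stepwise_select(X, y, n_genes):
    """Greedy forward selection of gene (column) indices; y holds 0/1 labels."""
    X0, X1 = X[y == 0], X[y == 1]
    selected = []
    for _ in range(n_genes):
        remaining = [g for g in range(X.shape[1]) if g not in selected]
        # Score each candidate jointly with the genes already chosen.
        best = max(
            remaining,
            key=lambda g: fisher_criterion(X0[:, selected + [g]],
                                           X1[:, selected + [g]]),
        )
        selected.append(best)
    return selected


def fit_fisher_lda(X, y, ridge=1e-6):
    """Fit Fisher's linear discriminant; returns (weights, threshold)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    C0, C1 = X0 - m0, X1 - m1
    Sw = (C0.T @ C0 + C1.T @ C1) / (len(X) - 2) + ridge * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)
    # Midpoint between projected class means (assumes equal priors and costs).
    threshold = w @ (m0 + m1) / 2.0
    return w, threshold


def predict(X, w, threshold):
    """Assign class 1 when the projection exceeds the threshold, else class 0."""
    return (X @ w > threshold).astype(int)
```

With an expression matrix `X` (samples by genes) and 0/1 labels `y`, `stepwise_select(X, y, 3)` returns three column indices, and `fit_fisher_lda(X[:, genes], y)` gives the rule to apply with `predict`. Accuracy measured on the same samples used for selection is optimistic, so a held-out or cross-validated estimate is the safer check.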
Pages: 239-247
Page count: 9