Prediction of biologically significant components from microarray data: Independently Consistent Expression Discriminator (ICED)

Times cited: 21
Authors
Bijlani, R
Cheng, YH
Pearce, DA
Brooks, AI [1]
Ogihara, M
Affiliations
[1] Univ Rochester, Sch Med & Dent, Ctr Funct Genom, Rochester, NY 14642 USA
[2] Univ Rochester, Sch Med & Dent, Dept Comp Sci, Rochester, NY 14642 USA
[3] Univ Rochester, Sch Med & Dent, Dept Environm Med, Rochester, NY 14642 USA
[4] Univ Rochester, Sch Med & Dent, Ctr Aging & Dev Biol, Rochester, NY 14642 USA
[5] Univ Rochester, Sch Med & Dent, Dept Biochem & Biophys, Rochester, NY 14642 USA
DOI
10.1093/bioinformatics/19.1.62
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Class distinction is a supervised learning approach that has been successfully employed in the analysis of high-throughput gene expression data. Identifying a set of genes that predicts differential biological states allows for the development of basic and clinical scientific approaches to the diagnosis of disease. The Independently Consistent Expression Discriminator (ICED) was designed to provide a more biologically relevant search criterion during predictor selection by embracing the inherent variability of gene expression in any biological state. ICED has four components: (i) normalization of raw data; (ii) assignment of weights to genes from both classes; (iii) counting of votes to determine the optimal number of predictor genes for class distinction; (iv) calculation of prediction strengths for classification results. The search criterion employed by ICED is designed to identify not only genes that are consistently expressed at one level in one class and at a consistently different level in the other class, but also genes that are variable in one class and consistent in the other. The result is a novel approach to accurately select biologically relevant predictors of differential disease states from a small number of microarray samples.
Results: ICED was used to analyze the large AML/ALL training and test data set (Golub et al., 1999, Science, 286, 531-537) and a smaller data set, generated for this study, from an animal model of the childhood neurodegenerative disorder Batten disease. Both analyses correctly predicted biologically relevant perturbations that can be used for disease classification, irrespective of sample size. Furthermore, the results provided candidate proteins for future study in understanding the disease process and in identifying potential targets for therapeutic intervention.
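The four components listed in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the ICED implementation: the abstract gives no formulas, so the z-score normalization, the signal-to-noise weight, and the prediction-strength ratio below are stand-ins in the style of the weighted-voting scheme of Golub et al. (1999). All function names are hypothetical. In particular, this stand-in weight does not capture ICED's criterion of favoring genes that are variable in one class and consistent in the other.

```python
from statistics import mean, pstdev

def normalize(sample):
    """Step (i): z-score one sample's expression values (assumed scheme)."""
    mu, sd = mean(sample), pstdev(sample)
    return [(x - mu) / sd if sd else 0.0 for x in sample]

def gene_weights(class_a, class_b):
    """Step (ii): per-gene (weight, midpoint) pairs.

    class_a and class_b are lists of samples; each sample is a list of
    expression values in the same gene order. The weight is a
    signal-to-noise ratio, an assumption and not ICED's own formula.
    """
    weights = []
    for a_vals, b_vals in zip(zip(*class_a), zip(*class_b)):
        mu_a, mu_b = mean(a_vals), mean(b_vals)
        spread = pstdev(a_vals) + pstdev(b_vals)
        w = (mu_a - mu_b) / spread if spread else 0.0
        weights.append((w, (mu_a + mu_b) / 2))
    return weights

def classify(sample, weights, n_genes):
    """Steps (iii)-(iv): let the n_genes highest-|weight| genes vote,
    then return (label, prediction strength in [0, 1])."""
    top = sorted(range(len(weights)),
                 key=lambda g: abs(weights[g][0]), reverse=True)[:n_genes]
    votes_a = votes_b = 0.0
    for g in top:
        w, midpoint = weights[g]
        vote = w * (sample[g] - midpoint)  # positive votes favor class A
        if vote > 0:
            votes_a += vote
        else:
            votes_b -= vote
    total = votes_a + votes_b
    label = "A" if votes_a >= votes_b else "B"
    strength = abs(votes_a - votes_b) / total if total else 0.0
    return label, strength
```

In a sketch like this, the number of predictor genes (`n_genes`) would be chosen by trying several values and keeping the one that classifies held-out samples best; how ICED itself makes that choice is described in the paper, not in this abstract.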
Pages: 62-70
Page count: 9
Related papers
28 in total
  • [1] Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays
    Alon, U
    Barkai, N
    Notterman, DA
    Gish, K
    Ybarra, S
    Mack, D
    Levine, AJ
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1999, 96 (12) : 6745 - 6750
  • [2] Tissue classification with gene expression profiles
    Ben-Dor, A
    Bruhn, L
    Friedman, N
    Nachman, I
    Schummer, M
    Yakhini, Z
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2000, 7 (3-4) : 559 - 583
  • [3] Knowledge-based analysis of microarray gene expression data by using support vector machines
    Brown, MPS
    Grundy, WN
    Lin, D
    Cristianini, N
    Sugnet, CW
    Furey, TS
    Ares, M
    Haussler, D
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (01) : 262 - 267
  • [4] CALIFANO A, 2000, ISMB, V8, P75
  • [5] An autoantibody inhibitory to glutamic acid decarboxylase in the neurodegenerative disorder Batten disease
    Chattopadhyay, S
    Ito, M
    Cooper, JD
    Brooks, AI
    Curran, TM
    Powers, JM
    Pearce, DA
    [J]. HUMAN MOLECULAR GENETICS, 2002, 11 (12) : 1421 - 1431
  • [6] Efron B., 1982, SOC IND APPL MATH CB, V38, DOI 10.1137/1.9781611970319
  • [7] Cluster analysis and display of genome-wide expression patterns
    Eisen, MB
    Spellman, PT
    Brown, PO
    Botstein, D
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (25) : 14863 - 14868
  • [8] Specific delay in the degradation of mitochondrial ATP synthase subunit c in late infantile neuronal ceroid lipofuscinosis is derived from cellular proteolytic dysfunction rather than structural alteration of subunit c
    Ezaki, J
    Wolfe, LS
    Kominami, E
    [J]. JOURNAL OF NEUROCHEMISTRY, 1996, 67 (04) : 1677 - 1687
  • [9] Serum cystatin C in patients with myeloma
    Finney, H
    Williams, AH
    Price, CP
    [J]. CLINICA CHIMICA ACTA, 2001, 309 (01) : 1 - 6
  • [10] Cathepsin B acts as a dominant execution protease in tumor cell apoptosis induced by tumor necrosis factor
    Foghsgaard, L
    Wissing, D
    Mauch, D
    Lademann, U
    Bastholm, L
    Boes, M
    Elling, F
    Leist, M
    Jäättelä, M
    [J]. JOURNAL OF CELL BIOLOGY, 2001, 153 (05) : 999 - 1009