Discovering local structure in gene expression data: The order-preserving submatrix problem

Cited by: 261
Authors
Ben-Dor, A
Chor, B
Karp, R
Yakhini, Z
Affiliations
[1] Agilent Labs, Bellevue, WA 98005 USA
[2] Tel Aviv Univ, Sch Comp Sci, IL-69978 Tel Aviv, Israel
[3] Int Comp Sci Inst, Berkeley, CA 94704 USA
[4] Agilent Labs, IL-67060 Tel Aviv, Israel
Keywords
gene expression; data analysis; local structure; local pattern; non-parametric methods;
DOI
10.1089/10665270360688075
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
This paper concerns the discovery of patterns in gene expression matrices, in which each element gives the expression level of a given gene in a given experiment. Most existing methods for pattern discovery in such matrices are based on clustering genes by comparing their expression levels in all experiments, or clustering experiments by comparing their expression levels for all genes. Our work goes beyond such global approaches by looking for local patterns that manifest themselves when we focus simultaneously on a subset G of the genes and a subset T of the experiments. Specifically, we look for order-preserving submatrices (OPSMs), in which the expression levels of all genes induce the same linear ordering of the experiments (we show that the OPSM search problem is NP-hard in the worst case). Such a pattern might arise, for example, if the experiments in T represent distinct stages in the progress of a disease or in a cellular process and the expression levels of all genes in G vary across the stages in the same way. We define a probabilistic model in which an OPSM is hidden within an otherwise random matrix. Guided by this model, we develop an efficient algorithm for finding the hidden OPSM in the random matrix. In data generated according to the model, the algorithm recovers the hidden OPSM with a very high success rate. Application of the methods to breast cancer data seems to reveal significant local patterns.
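The OPSM definition in the abstract can be illustrated with a small check. The sketch below is not the paper's search algorithm; it only tests whether a chosen gene subset G and experiment subset T form an order-preserving submatrix. The function name `is_order_preserving`, the toy matrix `expr`, and the decision to treat ties as violations (a strict ordering is assumed) are all illustrative assumptions, not taken from the source.

```python
from typing import List, Sequence


def is_order_preserving(
    matrix: Sequence[Sequence[float]],
    genes: Sequence[int],
    experiments: Sequence[int],
) -> bool:
    """Return True if every gene in `genes` induces the same strict
    linear ordering on the columns listed in `experiments`."""
    if not genes:
        return True
    # Order the chosen experiments by the first gene's expression levels.
    first_row = matrix[genes[0]]
    order: List[int] = sorted(experiments, key=lambda col: first_row[col])
    # Every gene (including the first) must be strictly increasing
    # along that ordering; ties are treated as violations (assumption).
    for g in genes:
        row = matrix[g]
        if any(row[a] >= row[b] for a, b in zip(order, order[1:])):
            return False
    return True


# Hypothetical toy expression matrix: rows are genes, columns are experiments.
expr = [
    [0.2, 1.5, 0.9, 3.1],  # gene 0
    [5.0, 7.2, 6.1, 9.9],  # gene 1
    [4.0, 1.0, 3.0, 2.0],  # gene 2
    [0.1, 0.8, 0.3, 0.2],  # gene 3
]

# Genes 0 and 1 rank all four experiments identically (0 < 2 < 1 < 3).
all_columns_ok = is_order_preserving(expr, [0, 1], [0, 1, 2, 3])      # True
# Gene 3 agrees only on a subset of the experiments: a local pattern.
local_ok = is_order_preserving(expr, [0, 1, 3], [0, 1, 2])            # True
local_fails_globally = is_order_preserving(expr, [0, 1, 3], [0, 1, 2, 3])  # False
```

Verifying a candidate pair (G, T) in this way is cheap; the hardness result stated in the abstract concerns searching over all such pairs.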
Pages: 373-384
Page count: 12
Related papers
15 in total
  • [1] Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling
    Alizadeh, AA
    Eisen, MB
    Davis, RE
    Ma, C
    Lossos, IS
    Rosenwald, A
    Boldrick, JG
    Sabet, H
    Tran, T
    Yu, X
    Powell, JI
    Yang, LM
    Marti, GE
    Moore, T
    Hudson, J
    Lu, LS
    Lewis, DB
    Tibshirani, R
    Sherlock, G
    Chan, WC
    Greiner, TC
    Weisenburger, DD
    Armitage, JO
    Warnke, R
    Levy, R
    Wilson, W
    Grever, MR
    Byrd, JC
    Botstein, D
    Brown, PO
    Staudt, LM
    [J]. NATURE, 2000, 403 (6769) : 503 - 511
  • [2] [Anonymous], 2000, P 8 INT C INT SYST M
  • [3] [Anonymous], 1979, Computers and Intractability: A Guide to the Theory of NP-Completeness
  • [4] BASSETT D, 1999, NAT GENET, V21, P3
  • [5] Clustering gene expression patterns
    Ben-Dor, A
    Shamir, R
    Yakhini, Z
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 1999, 6 (3-4) : 281 - 297
  • [6] BENDOR A, 2001, RECOMB
  • [7] Molecular classification of cutaneous malignant melanoma by gene expression profiling
    Bittner, M
    Meltzer, P
    Chen, Y
    Jiang, Y
    Seftor, E
    Hendrix, M
    Radmacher, M
    Simon, R
    Yakhini, Z
    Ben-Dor, A
    Sampas, N
    Dougherty, E
    Wang, E
    Marincola, F
    Gooden, C
    Lueders, J
    Glatfelter, A
    Pollock, P
    Carpten, J
    Gillanders, E
    Leja, D
    Dietrich, K
    Beaudry, C
    Berens, M
    Alberts, D
    Sondak, V
    Hayward, N
    Trent, J
    [J]. NATURE, 2000, 406 (6795) : 536 - 540
  • [8] Cluster analysis and display of genome-wide expression patterns
    Eisen, MB
    Spellman, PT
    Brown, PO
    Botstein, D
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (25) : 14863 - 14868
  • [9] Making the most of microarray data
    Gaasterland, T
    Bekiranov, S
    [J]. NATURE GENETICS, 2000, 24 (3) : 204 - 206
  • [10] Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring
    Golub, TR
    Slonim, DK
    Tamayo, P
    Huard, C
    Gaasenbeek, M
    Mesirov, JP
    Coller, H
    Loh, ML
    Downing, JR
    Caligiuri, MA
    Bloomfield, CD
    Lander, ES
    [J]. SCIENCE, 1999, 286 (5439) : 531 - 537