Clustering analysis of gene expression data based on semi-supervised visual clustering algorithm

Cited by: 13
Authors
Chung, Fu-lai
Wang, Shitong [1]
Deng, Zhaohong
Shu, Chen
Hu, D.
Affiliations
[1] So Yangtze Univ, Sch Informat Engn, Wuxi, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
[3] Natl Def Univ Sci & Technol, Sch Automat, Changsha, Peoples R China
Keywords
semi-supervised learning; visual clustering; clustering analysis; gene expression data; gradient system
DOI
10.1007/s00500-005-0025-7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When a gene expression dataset contains some labeled samples, the label information should be incorporated into the clustering algorithm so that more reasonable clustering results can be obtained. This paper presents a novel semi-supervised clustering algorithm, the Semi-supervised Iterative Visual Clustering Algorithm (Semi-IVCA), to handle such datasets. The algorithm first constructs the visual sampling image of the dataset based on the visual theorem and obtains its attractors using gradient learning rules, where each attractor denotes a cluster of the dataset. It then introduces an iterative clustering procedure to realize semi-supervised learning. Semi-IVCA generalizes the Visual Clustering Algorithm (VCA) previously presented by the authors. In addition to using labeled data effectively in clustering, Semi-IVCA is robust, insensitive to initialization, has strong parameter-learning capability, and yields readily interpretable clustering results. Experiments on artificial and real gene expression datasets confirm these advantages.
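The abstract gives no formulas, so the following is only a rough sketch of the general idea it describes, not the authors' Semi-IVCA. It assumes that the "visual sampling image" can be approximated by a Gaussian-smoothed density of the data at scale `sigma`, that the gradient rule can be stood in for by the mean-shift fixed-point update (a form of gradient ascent on that density), and that the iterative semi-supervised step amounts to trying several scales until the clustering agrees with the labeled samples. The function names, the `merge_tol` threshold, and the scale-selection rule are all hypothetical.

```python
import math

def find_attractor(x, data, sigma, steps=200, tol=1e-6):
    """Climb the Gaussian-smoothed density from x (mean-shift update)."""
    for _ in range(steps):
        # Gaussian weight of every data point as seen from x.
        weights = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * sigma ** 2))
            for p in data
        ]
        total = sum(weights)
        new_x = tuple(
            sum(w * p[d] for w, p in zip(weights, data)) / total
            for d in range(len(x))
        )
        shift = math.dist(x, new_x)
        x = new_x
        if shift < tol:
            break
    return x

def cluster_by_attractors(data, sigma, merge_tol=1e-2):
    """Assign each point to the attractor it converges to (assumed rule)."""
    attractors, labels = [], []
    for p in data:
        a = find_attractor(p, data, sigma)
        for k, c in enumerate(attractors):
            if math.dist(a, c) < merge_tol:
                labels.append(k)
                break
        else:
            attractors.append(a)
            labels.append(len(attractors) - 1)
    return attractors, labels

def agrees_with_labels(labels, known):
    """True if labeled samples share a cluster exactly when they share a class."""
    idx = list(known)
    return all(
        (labels[i] == labels[j]) == (known[i] == known[j])
        for i in idx for j in idx
    )

def select_scale(data, known, sigmas):
    """Hypothetical semi-supervised loop: first scale consistent with the labels."""
    for sigma in sigmas:
        attractors, labels = cluster_by_attractors(data, sigma)
        if agrees_with_labels(labels, known):
            return sigma, attractors, labels
    return None
```

Here `data` is a list of equal-length tuples and `known` maps sample indices to class labels. Trying scales from coarse to fine, as in `select_scale(data, {0: "A", 3: "B"}, [10.0, 0.5])`, stops at the first scale that separates samples of different classes.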
Pages: 981-993
Page count: 13
Related papers
31 in total
  • [1] Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays
    Alon, U
    Barkai, N
    Notterman, DA
    Gish, K
    Ybarra, S
    Mack, D
    Levine, AJ
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1999, 96 (12) : 6745 - 6750
  • [2] [Anonymous], P 27 ANN INT C SIGIR
  • [3] [Anonymous], 1982, VISION COMPUTATIONAL
  • [4] Gene Ontology: tool for the unification of biology
    Ashburner, M
    Ball, CA
    Blake, JA
    Botstein, D
    Butler, H
    Cherry, JM
    Davis, AP
    Dolinski, K
    Dwight, SS
    Eppig, JT
    Harris, MA
    Hill, DP
    Issel-Tarver, L
    Kasarskis, A
    Lewis, S
    Matese, JC
    Richardson, JE
    Ringwald, M
    Rubin, GM
    Sherlock, G
    [J]. NATURE GENETICS, 2000, 25 (01) : 25 - 29
  • [5] Median correlation for the analysis of gene expression data
    Bloch, KM
    Arce, GR
    [J]. SIGNAL PROCESSING, 2003, 83 (04) : 811 - 823
  • [6] Knowledge-based analysis of microarray gene expression data by using support vector machines
    Brown, MPS
    Grundy, WN
    Lin, D
    Cristianini, N
    Sugnet, CW
    Furey, TS
    Ares, M
    Haussler, D
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (01) : 262 - 267
  • [7] Projective ART for clustering data sets in high dimensional spaces
    Cao, YQ
    Wu, JH
    [J]. NEURAL NETWORKS, 2002, 15 (01) : 105 - 120
  • [8] Fuzzy kernel perceptron
    Chen, JH
    Chen, CS
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 2002, 13 (06) : 1364 - 1373
  • [9] Integrated genomic and proteomic analyses of a systematically perturbed metabolic network
    Ideker, T
    Thorsson, V
    Ranish, JA
    Christmas, R
    Buhler, J
    Eng, JK
    Bumgarner, R
    Goodlett, DR
    Aebersold, R
    Hood, L
    [J]. SCIENCE, 2001, 292 (5518) : 929 - 934
  • [10] Data clustering: A review
    Jain, AK
    Murty, MN
    Flynn, PJ
    [J]. ACM COMPUTING SURVEYS, 1999, 31 (03) : 264 - 323