Sequential patterns mining and gene sequence visualization to discover novelty from microarray data

Cited by: 23
Authors
Sallaberry, A. [4]
Pecheur, N. [3]
Bringay, S. [2, 3]
Roche, M. [3]
Teisseire, M. [1]
Affiliations
[1] Irstea, UMR TETIS, Maison Teledetect, F-34093 Montpellier, France
[2] Univ Montpellier 3, MIAp Dept, F-34199 Montpellier 5, France
[3] Univ Montpellier 2, CNRS, LIRMM, F-34095 Montpellier 5, France
[4] INRIA Bordeaux Sud Ouest, LaBRI, F-33405 Talence, France
Keywords
Visualization; Data mining; Bioinformatics; Sequential patterns; Microarray data; Gene data; METHODOLOGY; SEARCH; MODEL;
DOI
10.1016/j.jbi.2011.04.002
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Data mining allows users to discover novelty in huge amounts of data. Frequent pattern methods have proved to be efficient, but the extracted patterns are often too numerous and thus difficult for end users to analyze. In this paper, we focus on sequential pattern mining and propose a new visualization system to help end users analyze the extracted knowledge and to highlight novelty according to databases of referenced biological documents. Our system is based on three visualization techniques: clouds, solar systems, and treemaps. We show that these techniques are very helpful for identifying associations and hierarchical relationships between patterns among related documents. Sequential patterns extracted from gene data using our system were successfully evaluated by two biology laboratories working on Alzheimer's disease and cancer. (C) 2011 Elsevier Inc. All rights reserved.
Pages: 760-774
Number of pages: 15
Related papers
40 in total
  • [1] Can evaluation studies benefit from triangulation? A case study
    Ammenwerth, E
    Iller, C
    Mansmann, U
    [J]. INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2003, 70 (2-3) : 237 - 248
  • [2] [Anonymous], 1952, Psychometrika
  • [3] Borg I., 1997, MODERN MULTIDIMENSIO
  • [4] Bruls M, 2000, SPRING COMP SCI, P33
  • [5] Flexible information visualization of multivariate data from biological sequence similarity searches
    Chi, EHH
    Riedl, J
    Shoop, E
    Carlis, JV
    Retzel, E
    Barry, P
    [J]. VISUALIZATION '96, PROCEEDINGS, 1996, : 133 - +
  • [6] Cong G., 2004, P 2004 ACM SIGMOD IN, P143, DOI 10.1145/1007568.1007587
  • [7] De Leeuw J., 1977, RECENT DEV STAT, V1, P133
  • [8] Delaunay P.B., 1934, Bulletin de l'Academie des Sciences de l'URSS, P793
  • [9] CONVERGENCE OF THE MAJORIZATION METHOD FOR MULTIDIMENSIONAL-SCALING
    DELEEUW, J
    [J]. JOURNAL OF CLASSIFICATION, 1988, 5 (02) : 163 - 180
  • [10] FEKETE JD, 1999, C HUM FACT COMP SYST, P512