Fuzzy and hard clustering analysis for thyroid disease

Cited by: 52
Authors
Azar, Ahmad Taher [1]
El-Said, Shaimaa Ahmed [2]
Hassanien, Aboul Ella [3]
Affiliations
[1] Benha Univ, Fac Comp & Informat, Cairo, Egypt
[2] Zagazig Univ, Fac Engn, Elect & Commun Dept, Zagazig, Sharkia, Egypt
[3] Cairo Univ, SRGE, Fac Comp & Informat, Cairo, Egypt
Keywords
Thyroid disease; K-means clustering; K-medoids clustering; Fuzzy C-means; Gustafson-Kessel algorithm; Gath-Geva algorithm; RECOGNITION SYSTEM AIRS; C-MEANS; VALIDITY; CLASSIFICATION; SEGMENTATION; ALGORITHMS; DIAGNOSIS;
DOI
10.1016/j.cmpb.2013.01.002
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Thyroid hormones produced by the thyroid gland help regulate the body's metabolism. A variety of methods have been proposed in the literature for thyroid disease classification. As far as we know, clustering techniques have not yet been applied to thyroid disease data sets. This paper compares hard and fuzzy clustering algorithms on a thyroid disease data set in order to find the optimal number of clusters. Different scalar validity measures are used to compare the performance of the clustering algorithms. To demonstrate the performance of each algorithm, the feature values that represent thyroid disease are used as input to the system. Several runs are carried out, each with a different specified number of clusters (between 2 and 11), and the so-called elbow criterion is applied to establish the optimal number. The experimental results revealed that, for all algorithms, the elbow was located at c = 3. The clustering results for all algorithms are then visualized with the Sammon mapping method, which finds a low-dimensional (normally 2D or 3D) representation of a set of points distributed in a high-dimensional pattern space. At the end of this study, some recommendations are formulated to improve the determination of the actual number of clusters present in the data set. (C) 2013 Elsevier Ireland Ltd. All rights reserved.
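The procedure in the abstract (run a clustering algorithm for c = 2 to 11, then apply the elbow criterion) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses only plain K-means, synthetic data with three well-separated groups instead of the thyroid data set, a deterministic farthest-point initialization, and a "largest second difference" rule as one possible way to locate the elbow. The function names `farthest_point_init`, `kmeans_cost`, and `elbow` are hypothetical.

```python
import numpy as np

def farthest_point_init(X, c):
    """Deterministic init (an assumption): start at X[0], then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [X[0]]
    d = ((X - centers[0]) ** 2).sum(axis=1)
    for _ in range(1, c):
        idx = int(d.argmax())
        centers.append(X[idx])
        d = np.minimum(d, ((X - X[idx]) ** 2).sum(axis=1))
    return np.array(centers)

def kmeans_cost(X, c, n_iter=100):
    """Lloyd's K-means; returns the within-cluster sum of squared distances."""
    centers = farthest_point_init(X, c)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(c)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float(d.min(axis=1).sum())

def elbow(cs, costs):
    """Return the c where the cost curve bends most sharply
    (largest second difference); one heuristic among several."""
    costs = np.asarray(costs)
    second_diff = costs[:-2] - 2 * costs[1:-1] + costs[2:]
    return cs[1 + int(second_diff.argmax())]

# Synthetic stand-in data: three Gaussian groups in 5 dimensions.
rng = np.random.default_rng(0)
means = np.zeros((3, 5))
means[1, 0] = 20.0
means[2, 1] = 20.0
X = np.vstack([m + rng.normal(size=(50, 5)) for m in means])

cs = list(range(2, 12))                    # c = 2, ..., 11 as in the abstract
costs = [kmeans_cost(X, c) for c in cs]
best_c = elbow(cs, costs)
print(best_c)
```

Because the synthetic data are built from three widely separated groups, the cost drops sharply from c = 2 to c = 3 and only slightly afterwards, so the bend falls at c = 3 by construction. On real data the curve is usually less clear-cut, which is why the abstract pairs the elbow criterion with several validity measures.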
Pages: 1-16
Page count: 16