Detecting arbitrarily shaped clusters using ant colony optimization

Cited by: 25
Authors
Pei, Tao [1]
Wan, You [1]
Jiang, Yong [2,3]
Qu, Chenxu [2,3]
Zhou, Chenghu [2,3]
Qiao, Youlin [2,3]
Affiliations
[1] Chinese Acad Sci, Inst Geog Sci & Nat Resources, State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China
[2] Chinese Acad Med Sci, Canc Inst & Hosp, Beijing 100730, Peoples R China
[3] Peking Union Med Coll, Beijing 100021, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
spatial data mining; spatial analysis;
DOI
10.1080/13658816.2010.533674
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
On a map of geo-referenced population and case counts, detecting the most likely cluster (MLC), which is made up of many connected polygons (e.g., the boundaries of census tracts), faces two difficulties: the irregular shape of the cluster and the heterogeneity of the cluster. A heterogeneous cluster is one that contains depression links. A polygon is a depression link if it satisfies two conditions: (1) the ratio of the case number to the population in the polygon is below the average ratio of the whole map; and (2) removing the polygon disconnects the cluster. Previous studies have successfully solved the problem of detecting arbitrarily shaped clusters that do not contain depression links. For a heterogeneous cluster, however, existing methods may make mistakes, for example, missing some parts of the cluster. In this article, a spatial scanning method based on ant colony optimization (AntScan) is proposed to improve the detection power. If each polygon is simplified to a node, the research area consisting of many polygons can be treated as a graph, and detecting the MLC becomes a search for the best subgraph (the one with the largest likelihood value) in that graph. A comparison between AntScan, GAScan (the spatial scan method based on genetic optimization), and SAScan (the spatial scan method based on simulated annealing optimization) indicates that (1) the performance of GAScan and SAScan is significantly influenced by the fraction value parameter (the maximum allowed size of the detected cluster), which can only be estimated by multiple trials, while no such parameter is needed in AntScan; and (2) AntScan shows superior power over GAScan and SAScan in detecting heterogeneous clusters. A case study on esophageal cancer in North China demonstrates that the cluster identified by AntScan has a larger likelihood value than that detected by SAScan and covers all high-risk regions of esophageal cancer, whereas SAScan misses some high-risk regions (the region in the southwest of Shandong province, eastern China) because of a depression link.
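The two-part definition of a depression link in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the adjacency-dictionary representation of the polygon graph, and the reading of "average ratio of the whole map" as total cases divided by total population are all assumptions made for illustration.

```python
from collections import deque


def is_connected(nodes, adjacency):
    """Return True if `nodes` induce a connected subgraph of `adjacency`."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor in nodes and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == nodes


def is_depression_link(polygon, cluster, adjacency, cases, population):
    """Check the two conditions from the abstract for one polygon.

    Assumption: the map-wide average ratio is total cases / total population.
    """
    cluster = set(cluster)
    if polygon not in cluster:
        return False
    overall_rate = sum(cases.values()) / sum(population.values())
    local_rate = cases[polygon] / population[polygon]
    # Condition (1): the polygon's case ratio is below the map-wide ratio.
    below_average = local_rate < overall_rate
    # Condition (2): removing the polygon disconnects the cluster.
    disconnects = not is_connected(cluster - {polygon}, adjacency)
    return below_average and disconnects


# Hypothetical toy map: a chain A - B - C plus an isolated polygon E.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "E": []}
cases = {"A": 30, "B": 1, "C": 25, "E": 2}
population = {"A": 1000, "B": 1000, "C": 1000, "E": 1000}
cluster = {"A", "B", "C"}

b_is_link = is_depression_link("B", cluster, adjacency, cases, population)
a_is_link = is_depression_link("A", cluster, adjacency, cases, population)
```

In this toy map, polygon B has a low case ratio and is the only bridge between the two high-risk polygons A and C, which is the situation the abstract describes as causing existing methods to miss part of a cluster.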
Pages: 1575-1595
Number of pages: 21