Unsupervised feature selection applied to content-based retrieval of lung images

Cited by: 174
Authors
Dy, JG [1]
Brodley, CE
Kak, A
Broderick, LS
Aisen, AM
Affiliations
[1] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
[2] Indiana Univ, Med Ctr, Dept Radiol, Indianapolis, IN 46202 USA
[3] Univ Wisconsin, Dept Radiol, Madison, WI 53706 USA
[4] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
Keywords
image retrieval; feature selection; clustering; expectation-maximization; unsupervised learning
DOI
10.1109/TPAMI.2003.1182100
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a new hierarchical approach to content-based image retrieval (CBIR) called the "customized-queries" approach (CQA). Unlike the single-feature-vector approach, which tries to classify the query and retrieve similar images in one step, CQA uses multiple feature sets and retrieves in two steps. The first step classifies the query according to the class labels of the images, using the features that best discriminate the classes. The second step then retrieves the most similar images within the predicted class, using features customized to distinguish "subclasses" within that class. The need to find a customized feature subset for each class led us to investigate feature selection for unsupervised learning. As a result, we developed a new algorithm called FSSEM (feature subset selection using expectation-maximization clustering). We applied our approach to a database of high-resolution computed tomography lung images and show that CQA radically improves retrieval precision over the single-feature-vector approach. To determine whether our CBIR system is helpful to physicians, we conducted an evaluation trial with eight radiologists. The results show that our system using CQA retrieval doubled the doctors' diagnostic accuracy.
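The two-step retrieval described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the nearest-centroid classifier, the Euclidean distance, the function name `cqa_retrieve`, and the toy data are all assumptions made for illustration. The per-class feature subsets are supplied by hand here, whereas the paper selects them with FSSEM, which this sketch does not implement.

```python
import numpy as np

def cqa_retrieve(query, images, labels, class_features, subclass_features, k=3):
    """Hypothetical two-step retrieval in the spirit of CQA.

    query: 1-D feature vector; images: (n, d) array; labels: (n,) class labels.
    class_features: feature indices used to classify the query (step 1).
    subclass_features: dict mapping class label -> feature indices for step 2.
    Returns (predicted class, indices of the k most similar images in it).
    """
    # Step 1: classify the query with the class-discriminating features.
    # Nearest centroid is an assumed stand-in for whatever classifier is used.
    classes = np.unique(labels)
    centroids = np.stack(
        [images[labels == c][:, class_features].mean(axis=0) for c in classes]
    )
    dists = np.linalg.norm(centroids - query[class_features], axis=1)
    predicted = classes[np.argmin(dists)].item()

    # Step 2: rank only the images of the predicted class, using the
    # feature subset customized for that class.
    members = np.flatnonzero(labels == predicted)
    feats = subclass_features[predicted]
    d = np.linalg.norm(images[members][:, feats] - query[feats], axis=1)
    order = np.argsort(d, kind="stable")[:k]
    return predicted, members[order]

# Toy data (invented): features 0-1 separate the classes, feature 2 the subclasses.
images = np.array([
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 5.0],
    [0.0, 0.1, 1.0],
    [5.0, 5.0, 0.0],
    [5.1, 5.0, 3.0],
])
labels = np.array([0, 0, 0, 1, 1])
query = np.array([0.05, 0.05, 0.9])

pred, hits = cqa_retrieve(
    query, images, labels,
    class_features=[0, 1],
    subclass_features={0: [2], 1: [2]},
    k=2,
)
print(pred, hits.tolist())
```

A single-feature-vector system would rank all five images with one distance over all three features; the sketch instead narrows the search to one class first and then ranks with that class's own features.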
Pages: 373-378
Page count: 6