Feature selection with limited datasets

Cited by: 48

Authors:

Kupinski, MA [1]

Giger, ML [1]

Affiliation:

[1] Univ Chicago, Dept Radiol, Kurt Rossmann Labs, Chicago, IL 60637 USA

Source:

MEDICAL PHYSICS | 1999 / Vol. 26 / No. 10

Keywords:

feature selection; classification; computer-aided diagnosis

DOI:

10.1118/1.598821

Chinese Library Classification:

R8 [Special Medicine]; R445 [Diagnostic Imaging]

Discipline classification codes:

1002; 100207; 1009

Abstract:

Computer-aided diagnosis has the potential of increasing diagnostic accuracy by providing a second reading to radiologists. In many computerized schemes, numerous features can be extracted to describe suspect image regions. A subset of these features is then employed in a data classifier to determine whether the suspect region is abnormal or normal. Different subsets of features will, in general, result in different classification performances. A feature selection method is often used to determine an "optimal" subset of features to use with a particular classifier. A classifier performance measure (such as the area under the receiver operating characteristic curve) must be incorporated into this feature selection process. With limited datasets, however, there is a distribution in the classifier performance measure for a given classifier and subset of features. In this paper, we investigate the variation in the selected subset of "optimal" features as compared with the true optimal subset of features caused by this distribution of classifier performance. We consider examples in which the probability that the optimal subset of features is selected can be analytically computed. We show the dependence of this probability on the dataset sample size, the total number of features from which to select, the number of features selected, and the performance of the true optimal subset. Once a subset of features has been selected, the parameters of the data classifier must be determined. We show that, with limited datasets and/or a large number of features from which to choose, bias is introduced if the classifier parameters are determined using the same data that were employed to select the "optimal" subset of features. © 1999 American Association of Physicists in Medicine. [S0094-2405(99)01010-X]
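The effect described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the analytic computation from the paper; it is a simplified simulation under assumed settings: subsets of size one, a single informative feature with unit-variance normal class distributions separated by a hypothetical `shift`, and pure-noise competitors. The function names `auc`, `simulate`, and `true_auc` are illustrative only.

```python
import math
import random


def auc(pos, neg):
    """Mann-Whitney estimate of the area under the ROC curve."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def simulate(n_per_class, n_features, shift=1.0, trials=200, seed=0):
    """Return (P[informative feature selected], mean AUC of the selected feature).

    Feature 0 is the only informative one: abnormal cases are drawn from
    N(shift, 1) and normal cases from N(0, 1). All other features are N(0, 1)
    noise for both classes. Every value here is a hypothetical setting.
    """
    rng = random.Random(seed)
    hits = 0
    selected_auc_sum = 0.0
    for _ in range(trials):
        best_auc, best_idx = -1.0, -1
        for j in range(n_features):
            mu = shift if j == 0 else 0.0
            pos = [rng.gauss(mu, 1.0) for _ in range(n_per_class)]
            neg = [rng.gauss(0.0, 1.0) for _ in range(n_per_class)]
            a = auc(pos, neg)
            if a > best_auc:
                best_auc, best_idx = a, j
        hits += best_idx == 0  # count trials that pick the truly best feature
        selected_auc_sum += best_auc  # AUC measured on the selection data itself
    return hits / trials, selected_auc_sum / trials


def true_auc(shift):
    """Population AUC for two unit-variance normals separated by `shift`."""
    return 0.5 * (1.0 + math.erf(shift / 2.0))
```

Calling `simulate(50, 5)` and `simulate(5, 50)` and comparing each result with `true_auc(1.0)` gives a rough feel for the two dependencies named in the abstract: fewer cases and more candidate features make it less likely that the truly best feature is chosen, and the AUC measured on the same data used for selection tends to overstate the true performance.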


Pages: 2176-2182

Page count: 7
