Specializing for predicting obesity and its co-morbidities

Cited by: 8
Authors
Goldstein, Ira [1 ]
Uzuner, Oezlem [1 ,2 ]
Affiliations
[1] SUNY Albany, Coll Comp & Informat, Albany, NY 12222 USA
[2] Middle E Tech Univ, Dept Comp Engn, TR-10 Kktc, Mersin, Turkey
Funding
National Institutes of Health (USA);
Keywords
Classification; Combination of classifiers; Natural language processing; Machine learning; CLASSIFIER; ALGORITHM; SYSTEM; COMBINATION;
DOI
10.1016/j.jbi.2008.11.001
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
We present specializing, a method for combining classifiers for multi-class classification. Specializing trains one specialist classifier per class and utilizes each specialist to distinguish that class from all others in a one-versus-all manner. It then supplements the specialist classifiers with a catch-all classifier that performs multi-class classification across all classes. We refer to the resulting combined classifier as a specializing classifier. We develop specializing to classify 16 diseases based on discharge summaries. For each discharge summary, we aim to predict whether each disease is present, absent, or questionable in the patient, or unmentioned in the discharge summary. We treat the classification of each disease as an independent multi-class classification task. For each disease, we develop one specialist classifier for each of the present, absent, questionable, and unmentioned classes; we supplement these specialist classifiers with a catch-all classifier that encompasses all of the classes for that disease. We evaluate specializing on each of the 16 diseases and show that it improves significantly over voting and stacking when used for multi-class classification on our data. (C) 2008 Elsevier Inc. All rights reserved.
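The abstract does not spell out how the specialist outputs and the catch-all output are merged into one decision, so the Python sketch below is only an illustration under a stated assumption: if exactly one specialist claims a document, its class is returned; otherwise the decision falls back to the catch-all classifier. The function `specializing_predict`, the names `toy_specialists` and `toy_catch_all`, and the keyword rules standing in for trained classifiers are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict

# The four classes named in the abstract.
CLASSES = ("present", "absent", "questionable", "unmentioned")

Specialist = Callable[[str], bool]  # one-versus-all: "is this my class?"
CatchAll = Callable[[str], str]     # multi-class: returns one of CLASSES


def specializing_predict(text: str,
                         specialists: Dict[str, Specialist],
                         catch_all: CatchAll) -> str:
    """ASSUMED combination rule (not confirmed by the abstract):
    if exactly one specialist claims the document, return its class;
    otherwise defer to the catch-all classifier."""
    claims = [label for label in CLASSES if specialists[label](text)]
    if len(claims) == 1:
        return claims[0]
    return catch_all(text)


# Hypothetical keyword rules standing in for trained classifiers
# for a single disease (obesity).
toy_specialists: Dict[str, Specialist] = {
    "present": lambda t: "obese" in t.lower() and "not obese" not in t.lower(),
    "absent": lambda t: "not obese" in t.lower(),
    "questionable": lambda t: "possible obesity" in t.lower(),
    "unmentioned": lambda t: "obes" not in t.lower(),
}


def toy_catch_all(text: str) -> str:
    """Hypothetical multi-class fallback covering all four classes."""
    t = text.lower()
    if "obes" not in t:
        return "unmentioned"
    if "possible" in t:
        return "questionable"
    if "not obese" in t:
        return "absent"
    return "present"


if __name__ == "__main__":
    for note in ("Patient is obese.",
                 "Patient is not obese.",
                 "Possible obesity; not obese by BMI.",
                 "Obesity noted.",
                 "Admitted for chest pain."):
        print(note, "->", specializing_predict(note, toy_specialists, toy_catch_all))
```

In this toy setup the third note is claimed by two specialists and the fourth by none, so both are resolved by the catch-all. Because the abstract treats each of the 16 diseases as an independent task, a full system under the same assumption would hold one such set of specialists plus a catch-all per disease.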
Pages: 873-886
Number of pages: 14