Assessment of the Risk Factors of Coronary Heart Events Based on Data Mining With Decision Trees

Cited by: 93
Authors
Karaolis, Minas A. [1 ]
Moutiris, Joseph A. [2 ]
Hadjipanayi, Demetra [1 ]
Pattichis, Constantinos S. [1 ]
Affiliations
[1] Univ Cyprus, Dept Comp Sci, CY-1678 Nicosia, Cyprus
[2] Paphos Gen Hosp, Dept Cardiol, CY-8100 Paphos, Cyprus
Source
IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE | 2010, Vol. 14, No. 3
Keywords
Coronary heart disease (CHD); data mining; decision trees; risk factors; CARDIOVASCULAR RISK; SELECTION; DISEASE;
DOI
10.1109/TITB.2009.2038906
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Coronary heart disease (CHD) is one of the major causes of disability in adults and one of the main causes of death in developed countries. Although significant progress has been made in the diagnosis and treatment of CHD, further investigation is still needed. The objective of this study was to develop a data-mining system for assessing heart event-related risk factors, with the aim of reducing CHD events. The risk factors investigated were: 1) before the event: a) nonmodifiable: age, sex, and family history of premature CHD; b) modifiable: smoking before the event, history of hypertension, and history of diabetes; and 2) after the event: modifiable: smoking after the event, systolic blood pressure, diastolic blood pressure, total cholesterol, high-density lipoprotein, low-density lipoprotein, triglycerides, and glucose. The events investigated were myocardial infarction (MI), percutaneous coronary intervention (PCI), and coronary artery bypass graft surgery (CABG). A total of 528 cases were collected from the Paphos district in Cyprus, most of them with more than one event. Data-mining analysis was carried out for these three events using the C4.5 decision tree algorithm with five different splitting criteria. The most important risk factors, as extracted from the classification rules analysis, were: 1) for MI: age, smoking, and history of hypertension; 2) for PCI: family history, history of hypertension, and history of diabetes; and 3) for CABG: age, history of hypertension, and smoking. Most of these risk factors have also been reported by other investigators. The highest percentages of correct classifications achieved were 66%, 75%, and 75% for the MI, PCI, and CABG models, respectively. It is anticipated that data mining could help identify high- and low-risk subgroups of subjects, a decisive factor in the selection of therapy, i.e., medical or surgical. However, further investigation with larger datasets is still needed.
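The abstract does not name the five splitting criteria that were compared. As a rough illustration of what a splitting criterion does, the sketch below computes the gain ratio, the criterion classically associated with C4.5, for categorical attributes. The function names, the attribute names (`smoker`, `hypertension`, `mi`), and the four toy records are hypothetical and are not taken from the study's data.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(rows, attr, target):
    """Gain ratio of splitting `rows` on categorical attribute `attr`.

    rows   : list of dicts (one dict per case)
    attr   : key of the candidate risk factor
    target : key of the class label (e.g. event yes/no)
    """
    n = len(rows)
    labels = [r[target] for r in rows]

    # Group the class labels by the value of the candidate attribute.
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])

    # Information gain = entropy before split - weighted entropy after split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder

    # Split information penalises attributes with many distinct values.
    split_info = entropy([r[attr] for r in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records, for illustration only (not study data).
toy_rows = [
    {"smoker": "yes", "hypertension": "yes", "mi": "yes"},
    {"smoker": "yes", "hypertension": "no", "mi": "yes"},
    {"smoker": "no", "hypertension": "yes", "mi": "no"},
    {"smoker": "no", "hypertension": "no", "mi": "no"},
]

for factor in ("smoker", "hypertension"):
    print(factor, gain_ratio(toy_rows, factor, "mi"))
```

In this toy set `smoker` separates the two classes perfectly while `hypertension` carries no information, so a tree builder using this criterion would split on `smoker` first. Real data would also need handling for continuous attributes such as age or blood pressure, which this sketch omits.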
Pages: 559-566
Number of pages: 8