Automatic classification using supervised learning in a medical document filtering application

Cited by: 39
Authors
Mostafa, J [1]
Lam, W
Affiliations
[1] Indiana Univ, Sch Lib & Informat Sci, Bloomington, IN 47405 USA
[2] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Shatin, Hong Kong, Peoples R China
Keywords
supervised learning; neural networks; document classification; information filtering
DOI
10.1016/S0306-4573(99)00033-3
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Document classifiers can play an intermediate role in multilevel filtering systems. The effectiveness of a classifier that uses supervised learning was analyzed in terms of its accuracy and ultimately its influence on filtering. The analysis was conducted in two phases. In the first phase, a multilayer feedforward neural network was trained to classify medical documents in the area of cell biology. The accuracy of the supervised classifier was established by comparing its performance with a baseline system that uses human classification information. A relatively high degree of accuracy was achieved by the supervised method; however, classification accuracy varied across classes. In the second phase, to clarify the impact of this performance on filtering, different types of user profiles were created by grouping subsets of classes based on their individual classification accuracy rates. Then, a filtering system with the neural network integrated into it was used to filter the medical documents, and this performance was compared with the filtering results achieved using the baseline system. The performance of the system using the neural network classifier was generally satisfactory and, as expected, the filtering performance varied with regard to the accuracy rates of classes. (C) 2000 Elsevier Science Ltd. All rights reserved.
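The second-phase idea in the abstract (measuring accuracy per class against human classification, then grouping classes by accuracy to form profiles) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the class names, the toy labels, the 0.75 threshold, and the function names `per_class_accuracy` and `group_profiles` are all hypothetical.

```python
from collections import defaultdict


def per_class_accuracy(human_labels, predicted_labels):
    """For each human-assigned class, return the fraction of its documents
    that the classifier assigned to the same class."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for truth, guess in zip(human_labels, predicted_labels):
        totals[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {cls: correct[cls] / totals[cls] for cls in totals}


def group_profiles(accuracy, threshold=0.75):
    """Split classes into two groups by accuracy.
    The threshold is an assumed value, not one reported in the paper."""
    high = sorted(c for c, a in accuracy.items() if a >= threshold)
    low = sorted(c for c, a in accuracy.items() if a < threshold)
    return {"high_accuracy": high, "low_accuracy": low}


# Hypothetical toy data: human (baseline) labels vs. classifier output.
human = ["membrane"] * 4 + ["nucleus"] * 4
predicted = ["membrane"] * 4 + ["nucleus", "nucleus", "membrane", "membrane"]

accuracy = per_class_accuracy(human, predicted)
profiles = group_profiles(accuracy)
print(accuracy)
print(profiles)
```

A profile built from the high-accuracy group would be expected to filter more like the human-classified baseline than one built from the low-accuracy group, which is the kind of comparison the abstract describes.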
Pages: 415-444
Number of pages: 30