Automated Classification to Improve the Efficiency of Weeding Library Collections

Cited by: 10
Authors
Wagstaff, Kiri L. [1]
Liu, Geoffrey Z. [1]
Affiliations
[1] San Jose State Univ, Sch Informat, One Washington Sq, San Jose, CA 95192 USA
Keywords
DESELECTION;
DOI
10.1016/j.acalib.2018.02.001
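Chinese Library Classification number
G25 [Library science, librarianship]; G35 [Information science, information work]
Subject classification codes
1205; 120501
Abstract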
Previous studies have shown that weeding a library collection benefits patrons and increases circulation rates. However, the time required to review the collection and make weeding decisions presents a formidable obstacle. This study empirically evaluated methods for automatically classifying weeding candidates. A data set containing 80,346 items from a large-scale weeding project running from 2011 to 2014 at Wesleyan University was used to train six machine learning classifiers to predict a weeding decision of either 'Keep' or 'Weed' for each candidate. The study found statistically significant agreement (p = 0.001) between classifier predictions and librarian judgments for all classifier types. The naive Bayes and linear support vector machine classifiers had the highest recall (fraction of items weeded by librarians that were identified by the algorithm), while the k-nearest neighbor classifier had the highest precision (fraction of recommended candidates that librarians had chosen to weed). The variables found to be most relevant were: librarian and faculty votes for retention, item age, and the presence of copies in other libraries.
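The recall and precision definitions given in the abstract can be illustrated with a minimal Python sketch. The function name `precision_recall` and the six toy labels are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
def precision_recall(librarian, predicted, positive="Weed"):
    """Compute precision and recall for the 'Weed' class.

    librarian: decisions made by librarians (treated as ground truth)
    predicted: decisions recommended by a classifier
    """
    pairs = list(zip(librarian, predicted))
    # Items the classifier recommended weeding that librarians also weeded
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    # Items the classifier recommended weeding that librarians kept
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    # Items librarians weeded that the classifier missed
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (assumed for illustration only, not taken from the data set)
librarian = ["Weed", "Weed", "Keep", "Weed", "Keep", "Keep"]
predicted = ["Weed", "Keep", "Keep", "Weed", "Weed", "Weed"]

precision, recall = precision_recall(librarian, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the classifier recommends four items, two of which librarians weeded (precision 0.50), and it finds two of the three items librarians weeded (recall 0.67).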
Pages: 238-247
Page count: 10