An Automatic Method for Extracting Citations From Google Books

Cited by: 37
Authors
Kousha, Kayvan [1]
Thelwall, Mike [1]
Affiliation
[1] Wolverhampton Univ, Sch Technol, Stat Cybermetr Res Grp, Wolverhampton WV1 1LY, W Midlands, England
Keywords
citation analysis; experiments; SOCIAL-SCIENCES; HUMANITIES; MONOGRAPHS; CHAPTERS; IMPACT; OUTPUT
DOI
10.1002/asi.23170
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Recent studies have shown that counting citations from books can help scholarly impact assessment and that Google Books (GB) is a useful source of such citation counts, despite its lack of a public citation index. Searching GB for citations produces approximate matches, however, and so its raw results need time-consuming human filtering. In response, this article introduces a method to automatically remove false and irrelevant matches from GB citation searches, in addition to introducing refinements to a previous GB manual citation extraction method. The method was evaluated by manually checking sampled GB results and by comparing citations to about 14,500 monographs in the Thomson Reuters Book Citation Index (BKCI) against automatically extracted citations from GB across 24 subject areas. GB citations were 103% to 137% as numerous as BKCI citations in the humanities, except for tourism (72%) and linguistics (91%), 46% to 85% in social sciences, but only 8% to 53% in the sciences. In all cases, however, GB had substantially more citing books than did BKCI, with BKCI's results coming predominantly from journal articles. However, the moderate correlations between the GB and BKCI citation counts in social sciences and humanities, with most BKCI results coming from journal articles rather than books, suggest that the two sources could measure different aspects of impact.
Pages: 309-320
Number of pages: 12