A Heuristic Approach to Author Name Disambiguation in Bibliometrics Databases for Large-Scale Research Assessments

Cited by: 141
Authors
D'Angelo, Ciriaco Andrea [1]
Giuffrida, Cristiano [2]
Abramo, Giovanni [3, 4]
Affiliations
[1] Univ Roma Tor Vergata, Lab Studies Res & Technol Transfer, I-00133 Rome, Italy
[2] Vrije Univ Amsterdam, Dept Comp Sci, NL-1081 HV Amsterdam, Netherlands
[3] Natl Res Council Italy, I-00133 Rome, Italy
[4] Univ Roma Tor Vergata, Dipartimento Ingn Impresa, I-00133 Rome, Italy
Source
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY | 2011, Vol. 62, No. 02
Keywords
RESEARCH PRODUCTIVITY; CITATIONS; IMPACT
DOI
10.1002/asi.21460
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
National exercises for the evaluation of research activity by universities are becoming regular practice in ever more countries. These exercises have mainly been conducted through the application of peer-review methods. Bibliometrics has not been able to offer a valid large-scale alternative because of almost overwhelming difficulties in identifying the true author of each publication. We will address this problem by presenting a heuristic approach to author name disambiguation in bibliometric datasets for large-scale research assessments. The application proposed concerns the Italian university system, comprising 80 universities and a research staff of over 60,000 scientists. The key advantage of the proposed approach is the ease of implementation. The algorithms are of practical application and have considerably better scalability and expandability properties than state-of-the-art unsupervised approaches. Moreover, the performance in terms of precision and recall, which can be further improved, seems thoroughly adequate for the typical needs of large-scale bibliometric research assessments.
Pages: 257 - 269
Page count: 13
Related papers
25 in total
  • [1] The measurement of Italian universities' research productivity by a non parametric-bibliometric methodology
    Abramo, Giovanni
    D'Angelo, Ciriaco Andrea
    Pugini, Fabio
    [J]. SCIENTOMETRICS, 2008, 76 (02) : 225 - 244
  • [2] Assessment of sectoral aggregation distortion in research productivity measurements
    Abramo, Giovanni
    D'Angelo, Ciriaco Andrea
    Di Costa, Flavia
    [J]. RESEARCH EVALUATION, 2008, 17 (02) : 111 - 121
  • [3] Assessing public-private research collaboration: is it possible to compare university performance?
    Abramo, Giovanni
    D'Angelo, Ciriaco Andrea
    Solazzi, Marco
    [J]. SCIENTOMETRICS, 2010, 84 (01) : 173 - 197
  • [4] When different persons have an identical author name. How frequent are homonyms?
    Aksnes, Dag W.
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2008, 59 (05): : 838 - 841
  • [5] Networks of inventors and the role of academia: an exploration of Italian patent data
    Balconi, M
    Breschi, S
    Lissoni, F
    [J]. RESEARCH POLICY, 2004, 33 (01) : 127 - 145
  • [6] DUPLICATION OF JAPANESE NAMES - A PROBLEM IN CITATIONS AND BIBLIOGRAPHIES
    CORNELL, LL
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, 1982, 33 (02) : 102 - 104
  • [7] CULOTTA A, 2007, P AAAI 6 INT WORKSH, P32
  • [8] Name disambiguation in author citations using a K-way spectral clustering method
    Han, H
    Zha, HY
    Giles, CL
    [J]. PROCEEDINGS OF THE 5TH ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES, PROCEEDINGS, 2005 : 334 - 343
  • [9] Two supervised learning approaches for name disambiguation in author citations
    Han, H
    Giles, L
    Zha, H
    Li, C
    Tsioutsiouliklis, K
    [J]. JCDL 2004: PROCEEDINGS OF THE FOURTH ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES: GLOBAL REACH AND DIVERSE IMPACT, 2004 : 296 - 305
  • [10] Han H., 2005, P 2005 ACM S APPL CO, P1065, DOI 10.1145/1066677.1066920