Disambiguation and co-authorship networks of the US patent inventor database (1975-2010)

Cited by: 267
Authors
Li, Guan-Cheng [1]
Lai, Ronald [2]
D'Amour, Alexander [3]
Doolin, David M. [4]
Sun, Ye [5]
Torvik, Vetle I. [6]
Yu, Amy Z. [7]
Fleming, Lee [1]
Affiliations
[1] UC Berkeley Coll Engn, Fung Inst Engn Leadership, Berkeley, CA 94550 USA
[2] Harvard Univ, Inst Quantitat Social Sci, Cambridge, MA 02138 USA
[3] Harvard Univ, Dept Stat, Cambridge, MA 02138 USA
[4] CloudPassage Inc, San Francisco, CA 94026 USA
[5] Grantham Mayo Van Otterloo & Co LLC, Boston, MA 02110 USA
[6] Univ Illinois, Grad Sch Lib & Informat Sci, Champaign, IL USA
[7] MIT, MIT Media Lab, Cambridge, MA 02139 USA
Funding
US National Science Foundation
Keywords
Disambiguation; Patents; Networks; Inventors; Careers; KNOWLEDGE; MOBILITY;
DOI
10.1016/j.respol.2014.01.012
Chinese Library Classification
C93 [Management];
Discipline classification codes
12 ; 1201 ; 1202 ; 120202 ;
Abstract
Research into invention, innovation policy, and technology strategy can greatly benefit from an accurate understanding of inventor careers. The United States Patent and Trademark Office does not provide unique inventor identifiers, however, making large-scale studies challenging. Many scholars of innovation have implemented ad-hoc disambiguation methods based on string similarity thresholds and string comparison matching; such methods have been shown to be vulnerable to a number of problems that can adversely affect research results. The authors address this issue by contributing (1) an application of the Author-ity disambiguation approach (Torvik et al., 2005; Torvik and Smalheiser, 2009) to the US utility patent database, (2) a new iterative blocking scheme that expands the match space of this algorithm while maintaining scalability, (3) a public posting of the algorithm and code, and (4) a public posting of the results of the algorithm in the form of a database of inventors and their associated patents. The paper provides an overview of the disambiguation method, assesses its accuracy, and calculates network measures based on co-authorship and collaboration variables. It illustrates the potential for large-scale innovation studies across time and space with visualizations of inventor mobility across the United States. The complete input and results data from the original disambiguation are available at (http://dvn.iq.harvard.edu/dvn/dv/patent); revised data described here are at (http://funglab.berkeley.edu/pub/disamb_no_postpolishing.csv); original and revised code is available at (https://github.com/funginstitute/disambiguator); visualizations of inventor mobility are at (http://funglab.berkeley.edu/mobility/). © 2014 Elsevier B.V. All rights reserved.
Pages: 941-955
Page count: 15
Related papers
38 entries in total
[31] Lone Inventors as Sources of Breakthroughs: Myth or Reality? [J]. Singh, Jasjit; Fleming, Lee. MANAGEMENT SCIENCE, 2010, 56 (01): 41-56
[32] Smalheiser NR, 2009, ANNU REV INFORM SCI, V43, P287
[33] Bibliometric fingerprints: name disambiguation based on approximate structure equivalence of cognitive maps [J]. Tang, Li; Walsh, John P. SCIENTOMETRICS, 2010, 84 (03): 763-784
[34] Author Name Disambiguation in MEDLINE [J]. Torvik, Vetle I.; Smalheiser, Neil R. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2009, 3 (03)
[35] A probabilistic similarity metric for Medline records: A model for author name disambiguation [J]. Torvik, VI; Weeber, M; Swanson, DR; Smalheiser, NR. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2005, 56 (02): 140-158
[36] Trajtenberg M., 2006, NAMES GAME HARNESSIN
[37] Collective dynamics of 'small-world' networks [J]. Watts, DJ; Strogatz, SH. NATURE, 1998, 393 (6684): 440-442
[38] Zhang H, 2004, P 17 INT FLORIDA ART