Effects of missing data in social networks

Cited by: 556
Author
Kossinets, Gueorgi
Affiliations
[1] Columbia Univ, Dept Sociol, New York, NY 10027 USA
[2] Columbia Univ, Inst Social & Econ Res & Policy, New York, NY 10027 USA
Keywords
missing data; sensitivity analysis; graph theory; collaboration networks; bipartite graphs
DOI
10.1016/j.socnet.2005.07.002
Chinese Library Classification number
Q98 [Anthropology]
Discipline classification code
030303
Abstract
We perform sensitivity analyses to assess the impact of missing data on the structural properties of social networks. The social network is conceived of as being generated by a bipartite graph, in which actors are linked together via multiple interaction contexts or affiliations. We discuss three principal missing data mechanisms: network boundary specification (non-inclusion of actors or affiliations), survey non-response, and censoring by vertex degree (fixed choice design), examining their impact on the scientific collaboration network from the Los Alamos E-print Archive as well as random bipartite graphs. The simulation results show that network boundary specification and fixed choice designs can dramatically alter estimates of network-level statistics. The observed clustering and assortativity coefficients are overestimated via omission of affiliations or fixed choice thereof, and underestimated via actor non-response, which results in inflated measurement error. We also find that social networks with multiple interaction contexts may have certain interesting properties due to the presence of overlapping cliques. In particular, assortativity by degree does not necessarily improve network robustness to random omission of nodes as predicted by current theory. (c) 2005 Elsevier B.V. All rights reserved.
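The bipartite construction described in the abstract can be illustrated with a minimal sketch. It is not the author's simulation code: the toy data, the function names (`project`, `drop_affiliations`, `drop_actors`, `fixed_choice`) and the simplified reading of each missing data mechanism are assumptions made here for illustration only. Actors are tied in the one-mode network whenever they share at least one affiliation, and each mechanism removes part of the bipartite data before projection.

```python
import random
from itertools import combinations

def project(affiliations):
    """One-mode projection: two actors are tied if they share an affiliation."""
    edges = set()
    for members in affiliations.values():
        for a, b in combinations(sorted(members), 2):
            edges.add((a, b))
    return edges

def drop_affiliations(affiliations, fraction, rng):
    """Boundary specification (assumed form): omit affiliations at random."""
    return {name: set(affiliations[name])
            for name in sorted(affiliations)
            if rng.random() >= fraction}

def drop_actors(affiliations, fraction, rng):
    """Boundary specification (assumed form): omit actors at random."""
    actors = sorted({a for members in affiliations.values() for a in members})
    missing = {a for a in actors if rng.random() < fraction}
    return {name: set(members) - missing
            for name, members in affiliations.items()}

def fixed_choice(affiliations, k, rng):
    """Censoring by degree (assumed form): each actor keeps at most k affiliations."""
    by_actor = {}
    for name in sorted(affiliations):
        for actor in sorted(affiliations[name]):
            by_actor.setdefault(actor, []).append(name)
    censored = {name: set() for name in affiliations}
    for actor, names in by_actor.items():
        for name in rng.sample(names, min(k, len(names))):
            censored[name].add(actor)
    return censored

# Hypothetical toy data: three papers (affiliations) and their authors (actors).
papers = {
    "p1": {"ann", "bob", "cat"},
    "p2": {"cat", "dan"},
    "p3": {"dan", "eve"},
}

rng = random.Random(0)
full = project(papers)  # 5 ties: the p1 triangle, cat-dan, dan-eve
observed = project(fixed_choice(papers, 1, rng))
print(len(full), len(observed))
```

Comparing a statistic on `full` with the same statistic on the degraded projections, averaged over many random draws, gives the kind of sensitivity curve the abstract refers to. Survey non-response, in which ties reported by respondents about non-respondents may survive, would need a separate treatment and is not modelled in this sketch.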
Pages: 247-268
Page count: 22