Improved Inference for Respondent-Driven Sampling Data With Application to HIV Prevalence Estimation

Cited by: 197
Authors
Gile, Krista J. [1]
Affiliations
[1] Univ Massachusetts, Dept Math & Stat, Amherst, MA 01003 USA
Keywords
Hidden population sampling; Link-tracing; Markov chain; Network sampling; PPSWOR; Snowball sampling; Social networks; Successive sampling; MAXIMUM-LIKELIHOOD-ESTIMATION; BEHAVIORAL SURVEILLANCE; DRUG-USERS; REPLACEMENT;
DOI
10.1198/jasa.2011.ap09475
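Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes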
020208 ; 070103 ; 0714 ;
Abstract
Respondent-driven sampling is a form of link-tracing network sampling, which is widely used to study hard-to-reach populations, often to estimate population proportions. Previous treatments of this process have used a with-replacement approximation, which we show induces bias in estimates for large sample fractions and differential network connectedness by characteristic of interest. We present a treatment of respondent-driven sampling as a successive sampling process. Unlike existing representations, our approach respects the essential without-replacement feature of the process, while converging to an existing with-replacement representation for small sample fractions, and to the sample mean for a full-population sample. We present a successive-sampling based estimator for population means based on respondent-driven sampling data, and demonstrate its superior performance when the size of the hidden population is known. We present sensitivity analyses for unknown population sizes. In addition, we note that like other existing estimators, our new estimator is subject to bias induced by the selection of the initial sample. Using data collected among three populations in two countries, we illustrate the application of this approach to populations with varying characteristics. We conclude that the successive sampling estimator improves on existing estimators, and can also be used as a diagnostic tool when population size is not known. This article has supplementary material online.
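The successive sampling idea in the abstract can be illustrated with a minimal Python sketch. It is not the estimator from the article. It only draws units without replacement with probability proportional to a size measure (PPSWOR, as listed in the keywords) and contrasts the plain sample mean with an inverse-size weighted mean, used here as a stand-in for the weighting a with-replacement approximation would imply. The function names, the use of `sizes` as a proxy for network degree, and the toy population are all assumptions made for illustration.

```python
import random


def successive_sample(sizes, n, rng):
    """Draw n distinct unit indices one at a time (PPSWOR).

    At each step a unit is chosen with probability proportional to its
    size among the units that have not been drawn yet.
    """
    remaining = list(range(len(sizes)))
    drawn = []
    for _ in range(n):
        weights = [sizes[i] for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(pick)
        drawn.append(pick)
    return drawn


def sample_mean(values, sample):
    """Unweighted mean of the sampled values."""
    return sum(values[i] for i in sample) / len(sample)


def inverse_size_weighted_mean(values, sizes, sample):
    """Ratio-type mean that weights each sampled unit by 1 / size.

    Illustrative stand-in for a with-replacement style weighting;
    it is NOT the successive sampling estimator of the article.
    """
    numerator = sum(values[i] / sizes[i] for i in sample)
    denominator = sum(1.0 / sizes[i] for i in sample)
    return numerator / denominator


# Hypothetical toy population: 30 units with the trait and size 10,
# 70 units without the trait and size 2 (true proportion 0.3).
toy_sizes = [10] * 30 + [2] * 70
toy_values = [1] * 30 + [0] * 70

rng = random.Random(2011)
full = successive_sample(toy_sizes, len(toy_sizes), rng)
print("sample mean, full sample:", sample_mean(toy_values, full))
print("inverse-size weighted mean, full sample:",
      inverse_size_weighted_mean(toy_values, toy_sizes, full))
```

When the whole population is sampled, every unit is included exactly once, so the plain sample mean recovers the true proportion, while the inverse-size weights keep down-weighting the larger units. This matches the abstract's point that with-replacement weighting becomes misleading at large sample fractions when size differs by the characteristic of interest.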
Pages: 135-146
Page count: 12