Bias decomposition and estimator performance in respondent-driven sampling
Journal article   Open access   Peer reviewed

Antonio D. Sirianni, Christopher J. Cameron, Yongren Shi and Douglas D. Heckathorn
Social Networks, Vol. 64, pp. 109-121
01/2021
DOI: 10.1016/j.socnet.2020.08.002
URL: https://doi.org/10.1016/j.socnet.2020.08.002
Published (Version of record) Open Access

Abstract

• Bias from respondent-driven samples derives mainly from differences in respondent degree, and differential recruitment.
• Two different ways of analytically decomposing bias into a degree component (DC) and a recruitment component (RC) are shown.
• Simulated RDS samples from empirical networks show that estimates of DC and RC can predict which RDS estimators perform best.
• A key implication is that data from the sample itself can be used for RDS estimator selection.

Respondent-Driven Sampling (RDS) is a method of network sampling that is used to sample hard-to-reach populations. The resultant sample is non-random, but different weighting methods can account for the over-sampling of (1) high-degree individuals and (2) homophilous groups that recruit members more effectively. While accounting for degree-bias is almost universally agreed upon, accounting for recruitment-bias has been debated as it can further increase estimate variance without substantially reducing bias. Simulation-based research has examined which weighting procedures perform best given underlying population network structures, group recruitment differences, and sampling processes. Yet, in the field, analysts do not have a priori knowledge of the network they are sampling. We show that the RDS sample data itself can determine whether a degree-based estimator is sufficient. Formulas derived from the decomposition of a ‘dual-component’ estimator can approximate the ‘recruitment component’ (RC) and ‘degree component’ (DC) of a sample’s bias. Simulations show that RC and DC values can predict the performance of different classes of estimators. Samples with extreme ‘RC’ values, a consequence of network homophily and differential recruitment, are better served by a classical estimator. The use of sample data to improve estimator selection is a promising innovation for RDS, as the population network features that should guide estimator selection are typically unknown.
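
To make the degree-based weighting mentioned in the abstract concrete, the following is a minimal sketch of an inverse-degree weighted proportion estimate (in the style of the RDS-II estimator), set beside the unweighted sample proportion. It is not the paper's DC/RC decomposition, whose formulas the abstract does not state. The function names and the toy sample are hypothetical and serve only as illustration.

```python
from typing import Sequence


def naive_proportion(in_group: Sequence[bool]) -> float:
    """Unweighted share of sampled respondents who belong to the group."""
    if not in_group:
        raise ValueError("sample is empty")
    return sum(in_group) / len(in_group)


def degree_weighted_proportion(
    in_group: Sequence[bool], degrees: Sequence[int]
) -> float:
    """Inverse-degree weighted share (RDS-II style sketch).

    Each respondent is weighted by 1 / degree, which down-weights
    high-degree individuals, who are more likely to be recruited.
    """
    if len(in_group) != len(degrees):
        raise ValueError("in_group and degrees must have the same length")
    if not degrees:
        raise ValueError("sample is empty")
    if any(d <= 0 for d in degrees):
        raise ValueError("degrees must be positive")

    weights = [1.0 / d for d in degrees]
    group_weight = sum(w for w, g in zip(weights, in_group) if g)
    return group_weight / sum(weights)


# Hypothetical toy sample: group members report larger personal networks.
toy_in_group = [True, True, False, False]
toy_degrees = [10, 10, 2, 2]

naive = naive_proportion(toy_in_group)
weighted = degree_weighted_proportion(toy_in_group, toy_degrees)
```

In this toy sample the two group members report degree 10 and the two non-members degree 2, so the weighted estimate (1/6) falls well below the naive value of 0.5. A recruitment-adjusted ('dual-component') estimator would additionally use who recruited whom, which this sketch does not model.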
Keywords: Network sampling; Respondent-driven sampling
