Evaluation of Respondent-Driven Sampling Estimators for Age Distribution in Hidden Populations: A Comparative Study of Naive, SH-RDS, VH-RDS, G-SS, and RDS_proposed Estimators

Authors

  • Y. Anjikwi
  • D. Jibasen
  • I.J. Dike
  • E. Torsen

Abstract

This study systematically evaluates the performance of five estimators (Naive, SH-RDS, VH-RDS, G-SS, and RDS_proposed) in estimating the age distribution of Facebook user responses across varying sample sizes and degree distributions. Results indicate that, for small sample sizes (n = 500), traditional estimators such as VH-RDS and SH-RDS exhibit lower bias, variance, and mean squared error (MSE), while the RDS_proposed estimator performs suboptimally. However, as the sample size increases from n = 1000 to beyond n = 2000, the RDS_proposed estimator demonstrates clear superiority, achieving minimal bias and variance along with the lowest MSE, and thus provides the most reliable estimates. Sectoral analysis further reveals that responses are more stable and precise in the Company and Politician sectors under the very high and moderate degree categories, whereas variability increases in the Government sector, particularly at very low degrees. These findings are consistent with existing literature, which emphasizes the influence of estimator selection, sample size, and network structure on the accuracy and precision of respondent-driven sampling results. Limitations include reliance on simulated datasets, a focus on age distribution as the primary parameter, and the fact that not every metric is reported for every sector. Overall, the study identifies the scenarios in which each estimator is best applied and highlights the critical role of degree distribution in ensuring robust network-based survey inferences.
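The comparison criteria named in the abstract (bias, variance, and MSE) can be illustrated with a minimal sketch. It assumes that VH-RDS refers to the Volz-Heckathorn estimator, which weights each respondent by the inverse of their reported degree; the abstract does not define the estimators, so this reading, the function names, and the toy numbers are all illustrative assumptions rather than the authors' implementation.

```python
from statistics import fmean, pvariance


def naive_mean(values):
    """Unweighted sample mean (the 'Naive' estimator under our assumption)."""
    return fmean(values)


def vh_mean(values, degrees):
    """Inverse-degree-weighted mean (assumed form of VH-RDS).

    Respondents with many network ties are more likely to be recruited,
    so each one is down-weighted by 1 / degree.
    """
    weights = [1.0 / d for d in degrees]
    weighted_sum = sum(w * y for w, y in zip(weights, values))
    return weighted_sum / sum(weights)


def summarize(estimates, true_value):
    """Bias, variance, and MSE of an estimator over simulation replicates.

    Uses the population variance so that mse == bias**2 + variance holds
    exactly (up to floating-point error).
    """
    bias = fmean(estimates) - true_value
    variance = pvariance(estimates)
    mse = fmean([(e - true_value) ** 2 for e in estimates])
    return {"bias": bias, "variance": variance, "mse": mse}


# Hypothetical toy sample: ages and reported degrees of three respondents.
ages = [20, 30, 40]
degrees = [1, 2, 4]
naive = naive_mean(ages)
vh = vh_mean(ages, degrees)

# Hypothetical estimates from four simulation replicates, true mean age 30.
stats = summarize([29, 31, 30, 34], true_value=30)
```

In this toy sample the older respondents have higher degrees, so the inverse-degree weighting pulls the estimate below the naive mean. The `summarize` helper reflects the decomposition MSE = bias² + variance, which is why an estimator can rank differently at n = 500 and at n > 2000 as its variance shrinks.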

Published

2026-05-13