
Improved Estimator Based on DCSBM in Respondent-Driven Sampling

JIANG Yan1,2,3, MENG Zhufeng2, WANG Tianjia2, LIU Xiaoyu2

  1. Center for Applied Statistics, Renmin University of China, Beijing 100872; 2. School of Statistics, Renmin University of China, Beijing 100872; 3. Institute of Survey Technology, Renmin University of China, Beijing 100872
  • Online: 2021-12-28 Published: 2021-12-28

JIANG Yan, MENG Zhufeng, WANG Tianjia, LIU Xiaoyu. Improved Estimator Based on DCSBM in Respondent-Driven Sampling[J]. Journal of Systems Science and Mathematical Sciences, 2022, 42(1): 85-99.

In the big data era, respondent-driven sampling (RDS) has been applied to network sample surveys. It addresses several problems of traditional sample surveys: the difficulty of obtaining a usable sampling frame, of reaching respondents, and of obtaining responses. It also allows network surveys to use probability sampling and to estimate population parameters within a certain margin of error. In practice, however, homophily (respondents tend to recruit peers who share their own attributes) inflates the variance of RDS estimators. To address this problem, this paper assumes that the target population follows the degree-corrected stochastic block model (DCSBM), post-stratifies the sample by block using the empirical transition probabilities between blocks, and proposes the PS-IPW estimator, which combines post-stratification with inverse probability weighting. Simulations of target-population social networks with different levels of homophily, together with RDS sampling on them, are used to compare the relative efficiency of the PS-IPW estimator. In an empirical analysis, the stratification variable is selected using the spectral properties of the sample block matrix, which further verifies the applicability of RDS and the effectiveness of the PS-IPW estimator.
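
To make the idea of combining post-stratification with inverse probability weighting concrete, the following is a minimal Python sketch and not the paper's estimator. It assumes that inclusion probabilities are proportional to reported degree (the usual RDS working assumption) and that the population share of each block is supplied by the caller; the abstract does not say how the paper derives these shares from the empirical transition probabilities. The function names `ipw_mean` and `ps_ipw_mean` and the toy data are hypothetical.

```python
from collections import defaultdict


def ipw_mean(values, degrees):
    """Hajek-style inverse-probability-weighted mean.

    Assumption: inclusion probability is proportional to degree,
    so each unit gets weight 1/degree.
    """
    weights = [1.0 / d for d in degrees]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def ps_ipw_mean(values, degrees, blocks, block_shares):
    """Post-stratified IPW mean.

    Computes the IPW mean inside each block, then combines the block
    means using caller-supplied population shares. Assumption: every
    sampled block has a share, and the shares sum to 1.
    """
    by_block = defaultdict(lambda: ([], []))
    for y, d, b in zip(values, degrees, blocks):
        by_block[b][0].append(y)
        by_block[b][1].append(d)
    return sum(
        block_shares[b] * ipw_mean(ys, ds) for b, (ys, ds) in by_block.items()
    )


# Toy data (made up for illustration): binary trait, degree, block label.
values = [1, 0, 1, 1, 0, 0]
degrees = [2, 4, 4, 2, 1, 2]
blocks = ["A", "A", "A", "B", "B", "B"]
shares = {"A": 0.5, "B": 0.5}  # hypothetical population block shares

print(ipw_mean(values, degrees))
print(ps_ipw_mean(values, degrees, blocks, shares))
```

When recruitment stays mostly within blocks, the sample's block composition can drift away from the population's; reweighting the block means by population shares is the generic way post-stratification corrects for that.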