Journey to the Centre of Cluster: Harnessing Interior Nodes for A/B Testing under Network Interference
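AI Summary
Proposes an A/B testing estimator built on interior nodes and corrects its bias with a predictor, improving estimation under network interference.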
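Main Contributions
- Proposes the mean-in-interior (MII) estimator, which reduces variance
- Uses a counterfactual predictor to correct the bias of the interior nodes
- Connects the estimator to the prediction-powered inference framework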
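Methodology
Builds the MII estimator by averaging over interior nodes, then uses a counterfactual predictor to adjust for covariate differences between the interior nodes and the full population.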
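The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names (`interior_mask`, `mii`, `augmented_mii`), the edge-list graph representation, and the inputs `pred1` and `pred0` (hypothetical per-node predictions of the outcome under global treatment and global control, produced by some predictor fitted elsewhere) are all introduced here for illustration. The augmented form shown is the standard prediction-powered shape (a plug-in term over all nodes plus a residual correction on the interior nodes); the paper's exact weighting may differ.

```python
import numpy as np

def interior_mask(edges, cluster):
    """Return a boolean mask of interior nodes.

    A node is interior when every one of its neighbors lies in its own
    cluster. `edges` is an iterable of (u, v) index pairs and `cluster`
    holds one cluster label per node (both are illustrative inputs).
    """
    cluster = np.asarray(cluster)
    mask = np.ones(cluster.shape[0], dtype=bool)
    for u, v in edges:
        if cluster[u] != cluster[v]:
            # An edge crossing clusters disqualifies both endpoints.
            mask[u] = False
            mask[v] = False
    return mask

def mii(y, z, interior):
    """Mean-in-interior sketch: difference of mean outcomes between
    interior nodes in treated clusters and in control clusters."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=bool)
    return y[interior & z].mean() - y[interior & ~z].mean()

def augmented_mii(y, z, interior, pred1, pred0):
    """Augmented sketch (assumed form): plug-in contrast over all nodes
    plus a residual correction computed on interior nodes only."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=bool)
    pred1 = np.asarray(pred1, dtype=float)
    pred0 = np.asarray(pred0, dtype=float)
    plug_in = (pred1 - pred0).mean()
    treated, control = interior & z, interior & ~z
    rectifier = (y[treated] - pred1[treated]).mean() - (
        y[control] - pred0[control]
    ).mean()
    return plug_in + rectifier
```

With constant predictions the plug-in term and the correction cancel, so `augmented_mii` reduces to `mii`; the predictor only matters when its predictions vary with covariates whose distribution differs between interior nodes and the full population.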
Original Abstract
A/B testing on platforms often faces challenges from network interference, where a unit's outcome depends not only on its own treatment but also on the treatments of its network neighbors. To address this, cluster-level randomization has become standard, enabling the use of network-aware estimators. These estimators typically trim the data to retain only a subset of informative units, achieving low bias under suitable conditions but often suffering from high variance. In this paper, we first demonstrate that the interior nodes - units whose neighbors all lie within the same cluster - constitute the vast majority of the post-trimming subpopulation. In light of this, we propose directly averaging over the interior nodes to construct the mean-in-interior (MII) estimator, which circumvents the delicate reweighting required by existing network-aware estimators and substantially reduces variance in classical settings. However, we show that interior nodes are often not representative of the full population, particularly in terms of network-dependent covariates, leading to notable bias. We then augment the MII estimator with a counterfactual predictor trained on the entire network, allowing us to adjust for covariate distribution shifts between the interior nodes and the full population. By rearranging the expression, we reveal that our augmented MII estimator embodies an analytical form of the point estimator within the prediction-powered inference framework. This insight motivates a semi-supervised lens, wherein interior nodes are treated as labeled data subject to selection bias. Extensive and challenging simulation studies demonstrate the outstanding performance of our augmented MII estimator across various settings.
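The rearrangement mentioned in the abstract can be illustrated as follows. The notation is an assumption made for this sketch and is not taken from the paper: $\mathcal{I}_1$ and $\mathcal{I}_0$ denote interior nodes in treated and control clusters, $N$ is the total number of nodes, and $\hat\mu_1, \hat\mu_0$ are counterfactual predictors of the outcome under global treatment and global control. The paper's exact estimator may use different weights.

```latex
\begin{align*}
\hat\tau_{\mathrm{MII}}
  &= \frac{1}{|\mathcal{I}_1|}\sum_{i\in\mathcal{I}_1} Y_i
   - \frac{1}{|\mathcal{I}_0|}\sum_{i\in\mathcal{I}_0} Y_i, \\[4pt]
\hat\tau_{\mathrm{aug}}
  &= \hat\tau_{\mathrm{MII}}
   + \Bigl[\frac{1}{N}\sum_{i=1}^{N}\hat\mu_1(X_i)
          - \frac{1}{|\mathcal{I}_1|}\sum_{i\in\mathcal{I}_1}\hat\mu_1(X_i)\Bigr]
   - \Bigl[\frac{1}{N}\sum_{i=1}^{N}\hat\mu_0(X_i)
          - \frac{1}{|\mathcal{I}_0|}\sum_{i\in\mathcal{I}_0}\hat\mu_0(X_i)\Bigr] \\[4pt]
  &= \underbrace{\frac{1}{N}\sum_{i=1}^{N}\bigl[\hat\mu_1(X_i)-\hat\mu_0(X_i)\bigr]}_{\text{plug-in over all nodes}}
   + \underbrace{\frac{1}{|\mathcal{I}_1|}\sum_{i\in\mathcal{I}_1}\bigl(Y_i-\hat\mu_1(X_i)\bigr)
   - \frac{1}{|\mathcal{I}_0|}\sum_{i\in\mathcal{I}_0}\bigl(Y_i-\hat\mu_0(X_i)\bigr)}_{\text{residual correction on interior nodes}}.
\end{align*}
```

The second line reads as a covariate-shift adjustment added to the MII estimator; the third line regroups the same terms into the prediction-powered shape, where interior nodes play the role of labeled data and all nodes supply the unlabeled predictions.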