Source: JMLR: Workshop and Conference Proceedings

Why Adaptively Collected Data Have Negative Bias and How to Correct for It


Abstract

From scientific experiments to online A/B testing, the previously observed data often affects how future experiments are performed, which in turn affects which data will be collected. Such adaptivity introduces complex correlations between the data and the collection procedure. In this paper, we prove that when the data collection procedure satisfies natural conditions, then sample means of the data have systematic negative biases. As an example, consider an adaptive clinical trial where additional data points are more likely to be tested for treatments that show initial promise. Our surprising result implies that the average observed treatment effects would underestimate the true effects of each treatment. We quantitatively analyze the magnitude and behavior of this negative bias in a variety of settings. We also propose a novel debiasing algorithm based on selective inference techniques. In experiments, our method can effectively reduce bias and estimation error.
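
To make the claimed negative bias concrete, here is a minimal simulation sketch. It is not the paper's analysis or its debiasing algorithm. It assumes a simple greedy collection rule (always sample the arm with the highest current sample mean), unit-variance Gaussian noise, and two arms with equal true means; the function names `run_greedy_trial` and `simulate_bias` are hypothetical and chosen only for illustration.

```python
import random


def run_greedy_trial(true_means, horizon, rng):
    """Run one greedy adaptive experiment; return each arm's final sample mean."""
    k = len(true_means)
    # Pull every arm once so that each sample mean is defined.
    sums = [rng.gauss(mu, 1.0) for mu in true_means]
    counts = [1] * k
    for _ in range(horizon - k):
        # Assumed adaptive rule: sample the arm whose current mean is highest.
        best = max(range(k), key=lambda a: sums[a] / counts[a])
        sums[best] += rng.gauss(true_means[best], 1.0)
        counts[best] += 1
    return [sums[a] / counts[a] for a in range(k)]


def simulate_bias(true_means=(0.0, 0.0), horizon=10, n_runs=5000, seed=0):
    """Estimate the bias of each arm's sample mean over many independent trials."""
    rng = random.Random(seed)
    k = len(true_means)
    totals = [0.0] * k
    for _ in range(n_runs):
        final_means = run_greedy_trial(true_means, horizon, rng)
        for a in range(k):
            totals[a] += final_means[a]
    # Bias = average final sample mean minus the true mean.
    return [totals[a] / n_runs - true_means[a] for a in range(k)]


if __name__ == "__main__":
    for arm, bias in enumerate(simulate_bias()):
        print(f"arm {arm}: estimated bias of sample mean = {bias:+.3f}")
```

Under these assumptions the estimated bias comes out negative for both arms. The intuition is that an arm with an unluckily low early draw stops being sampled and keeps its low mean, while an arm with a luckily high early draw keeps being sampled and drifts back toward its true mean.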
