Overcoming the winner’s curse: Leveraging Bayesian inference to improve estimates of the impact of features launched via A/B tests
2024
Many data-driven companies measure the impact of product groups and allocate resources across them based on the estimated impacts of features they launch via A/B tests. In this doc, we show that, when based on a standard frequentist estimator of the impact of features, this practice can significantly overstate the impact of product groups and distort the allocation of resources. When this practice is instead based on a Bayesian estimator of the impact of features, there are no such problems when the underlying prior beliefs regarding the distribution of true impacts are correctly specified. To help assess performance of the estimators in practice, we conduct simulations, allowing for different forms of misspecification in prior beliefs regarding the distribution of true impacts. In these simulations, we find that the Bayesian estimator generally outperforms the frequentist estimator, even under certain forms of misspecification. We use both the frequentist and Bayesian estimators to measure cumulative impacts across A/B tests at Amazon, highlighting differences in their overall magnitude and their distribution across product groups.
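The winner's curse described above can be illustrated with a small simulation. The following is a minimal sketch under assumptions that are not taken from the paper: true impacts are drawn from a zero-mean normal prior with standard deviation `tau`, each A/B test adds independent normal noise with standard deviation `sigma`, a feature launches only when its z-statistic exceeds `z_crit`, and the Bayesian estimator is the normal-normal posterior mean with a correctly specified prior. The function name `simulate` and all parameter values are hypothetical.

```python
import random


def simulate(n_tests=20000, tau=1.0, sigma=2.0, z_crit=1.96, seed=0):
    """Compare cumulative impact estimates over launched features.

    Assumed setup (illustrative only):
      true impact ~ Normal(0, tau^2)
      estimate    = true impact + Normal(0, sigma^2) noise
      a feature launches when estimate / sigma > z_crit
    """
    rng = random.Random(seed)
    # Posterior-mean shrinkage factor for a normal prior and normal noise.
    shrink = tau**2 / (tau**2 + sigma**2)

    launched = 0
    true_sum = 0.0   # cumulative true impact of launched features
    freq_sum = 0.0   # sum of raw A/B estimates (frequentist)
    bayes_sum = 0.0  # sum of posterior means (Bayesian)

    for _ in range(n_tests):
        true_effect = rng.gauss(0.0, tau)
        estimate = true_effect + rng.gauss(0.0, sigma)
        if estimate / sigma > z_crit:  # selection on the noisy estimate
            launched += 1
            true_sum += true_effect
            freq_sum += estimate
            bayes_sum += shrink * estimate

    return {
        "launched": launched,
        "true": true_sum,
        "frequentist": freq_sum,
        "bayesian": bayes_sum,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>12}: {value:.1f}")
```

Under these assumptions, the summed raw estimates should come out well above the cumulative true impact, because launching selects tests whose noise happened to be favorable. The summed posterior means should track the true total closely, since the prior used for shrinkage matches the distribution the impacts were drawn from. Changing `tau` inside the shrinkage factor only, while leaving the data-generating `tau` fixed, is one simple way to explore prior misspecification.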