Scalable heterogeneity detection in online experiments

Hammaad Adam; Merlin Heidemanns; Doug Hains; James McQueen

2024

Online sites typically evaluate the impact of new product features on customer behavior using online controlled experiments (or A/B tests). For many business applications, it is important to detect heterogeneity in these experiments [1], as new features often have a differential impact by customer segment, product group, and other variables. Understanding heterogeneity can provide key insights into causal pathways, enable differential launches to subgroups of customers and products, and motivate future experiments.

Recent methods have focused on estimating heterogeneous impacts [1], but they are difficult to apply to online experiments for two reasons. First, existing methods scale poorly in industry settings: with hundreds of thousands of experiments and billions of observations per year, many are impractical because they are computationally slow and need supervision. Second, existing methods typically detect heterogeneity based on customer features, yet heterogeneity also exists along other axes. For example, a new online retail feature may increase e-book revenue but reduce revenue from physical books, which reflects heterogeneity in a dimension of the outcome (i.e., revenue broken out by product group). Current approaches for such heterogeneity (e.g., fixed effects) scale poorly with large numbers of customers and categories. There is thus a need for a scalable method that can detect different types of heterogeneity in online experiments.
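
To make the e-book example concrete, the sketch below shows how a near-zero overall effect can hide opposite effects across product groups. It is only an illustration using a naive difference in means, not the method proposed in the paper. The records, group names, and helper functions (`effects_by_group`, `overall_effect`) are all hypothetical, and the revenue values are invented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (arm, product_group, revenue).
# All values are invented purely for illustration.
records = [
    ("control", "ebook", 10.0),
    ("control", "ebook", 12.0),
    ("control", "physical", 20.0),
    ("control", "physical", 22.0),
    ("treatment", "ebook", 14.0),
    ("treatment", "ebook", 16.0),
    ("treatment", "physical", 17.0),
    ("treatment", "physical", 19.0),
]


def effects_by_group(rows):
    """Difference in mean revenue (treatment minus control) per product group."""
    by_cell = defaultdict(list)
    for arm, group, revenue in rows:
        by_cell[(group, arm)].append(revenue)
    groups = sorted({group for _, group, _ in rows})
    return {
        g: mean(by_cell[(g, "treatment")]) - mean(by_cell[(g, "control")])
        for g in groups
    }


def overall_effect(rows):
    """Difference in mean revenue (treatment minus control), ignoring groups."""
    treated = [revenue for arm, _, revenue in rows if arm == "treatment"]
    control = [revenue for arm, _, revenue in rows if arm == "control"]
    return mean(treated) - mean(control)


group_effects = effects_by_group(records)
total_effect = overall_effect(records)

print("Overall effect:", total_effect)
for group, effect in group_effects.items():
    print(f"Effect for {group}:", effect)
```

In this toy data the pooled estimate is small while the two product groups move in opposite directions, which is the kind of outcome-dimension heterogeneity described above. A per-group loop like this one has no uncertainty quantification and would not scale to the volumes mentioned in the text; it only illustrates the phenomenon.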
