Value of stratification in cluster-randomized experiments
2023
Many experimental settings suffer from spillovers across units (customers, sellers, advertisers, etc.), for instance through network effects. Such effects introduce bias and prevent the experimenter from drawing trustworthy insights from the data. One approach to dealing with spillovers is to group units into clusters and randomize treatment status at the cluster level. Examples of clusters are groups of advertisers or sellers, or geographic clusters (ZIP codes, Designated Marketing Areas (DMAs)) that group customers. While clustering helps to reduce bias from spillover effects, on its own it can present other challenges. Clusters of advertisers, sellers, or customers are often highly skewed in size and other characteristics (Chung and Lu, 2006). Naively randomizing clusters to treatment or control can leave the experiment badly imbalanced if, for example, the largest clusters all land in the same arm. In this paper we assess the value of a stratified randomization approach in cluster-randomized experiments and demonstrate how it can improve statistical power while also reducing imbalance in the treatment allocation compared to traditional randomization schemes. In a stratified experiment, clusters of individual units (e.g., a single advertiser) are stratified (i.e., grouped or blocked) according to the values of a set of pre-experiment covariates. Clusters are then randomized within each stratum, with the goal of mitigating pre-period covariate imbalances between treated and control clusters. Because randomization is carried out separately within each stratum, the allocations in any two different strata are independent. Ultimately, stratification improves statistical power in the post-experiment analysis.
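The stratify-then-randomize procedure described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes a single pre-experiment covariate (cluster size), strata formed as equal-count blocks of the size-ordered clusters, and a roughly 50/50 split within each stratum. The function name `stratified_cluster_assignment` and its parameters are hypothetical.

```python
import random
from typing import Dict, List


def stratified_cluster_assignment(
    cluster_sizes: Dict[str, float], n_strata: int, seed: int = 0
) -> Dict[str, str]:
    """Assign each cluster to 'treatment' or 'control', randomizing within strata.

    Assumption (not from the paper): strata are equal-count blocks of clusters
    ordered by a single pre-experiment covariate, here cluster size.
    """
    rng = random.Random(seed)
    # Order clusters by the pre-experiment covariate.
    ordered: List[str] = sorted(cluster_sizes, key=lambda c: cluster_sizes[c])
    # Cut the ordered list into n_strata contiguous blocks of near-equal count.
    bounds = [i * len(ordered) // n_strata for i in range(n_strata + 1)]
    assignment: Dict[str, str] = {}
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        stratum = ordered[lo:hi]
        # Each stratum gets its own shuffle of its clusters.
        rng.shuffle(stratum)
        n_treated = len(stratum) // 2
        # For an odd-sized stratum, a coin flip decides which arm gets the extra cluster.
        if len(stratum) % 2 == 1 and rng.random() < 0.5:
            n_treated += 1
        for i, cluster in enumerate(stratum):
            assignment[cluster] = "treatment" if i < n_treated else "control"
    return assignment


# Hypothetical clusters with highly skewed sizes.
sizes = {"c1": 1000, "c2": 900, "c3": 50, "c4": 40, "c5": 30, "c6": 20}
assignment = stratified_cluster_assignment(sizes, n_strata=3, seed=42)
```

With three strata of two clusters each, the two largest clusters (`c1` and `c2`) share a stratum and therefore always end up in opposite arms, which is the imbalance that unstratified cluster randomization cannot rule out.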