Most linear experimental design problems assume homogeneous variance, even though heteroskedastic noise is present in many realistic settings. Let a learner have access to a finite set of measurement vectors X ⊂ ℝd that can be probed to receive noisy linear responses of the form y = x⊤θ* + η. Here θ* ∈ ℝd is an unknown parameter vector, and η is independent mean-zero σx2 -strictly-sub-Gaussian noise defined by a flexible heteroskedastic variance model, σx2 = x⊤Σ*x. Assuming that Σ* ∈ ℝdxd is an unknown matrix, we propose, analyze and empirically evaluate a novel design for uniformly bounding estimation error of the variance parameters, σx2. We demonstrate the benefits of this method with two adaptive experimental design problems under heteroskedastic noise, fixed-confidence transductive best-arm identification, and level-set identification; proving the first instance-dependent lower bounds in these settings. Lastly, we construct near-optimal algorithms and empirically demonstrate the large improvements in sample complexity gained from accounting for heteroskedastic variance in these designs.
Research areas