Last week, at the seventh Workshop on Automated Machine Learning (AutoML) at the International Conference on Machine Learning, Amazon researchers won a best-paper award for a paper titled “Fair Bayesian Optimization”.
The paper addresses the problem of ensuring the fairness of AI systems, a topic that has drawn increasing attention in recent years.
Most research on computational approaches to fairness in AI (as opposed to policy approaches or sociological approaches) has focused on specific criteria of fairness or particular types of machine learning systems.
The Amazon researchers — machine learning scientist Valerio Perrone, applied scientist Michele Donini, principal applied scientist Krishnaram Kenthapadi, and principal applied scientist Cédric Archambeau, all with Amazon Web Services (AWS) — set out to develop an approach that would work with any fairness criterion and any type of machine learning system.
Their approach is to enforce fairness constraints while optimizing a machine learning model’s hyperparameters: structural features of the model, or parameters of the model’s learning algorithm, that can be tuned to particular tasks or sets of training data.
In a neural-network model, for instance, the hyperparameters include the number of network layers and the number of processing nodes per layer; in a decision-tree-based model, by contrast, hyperparameters might include the number of decision trees or their depth.
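For concreteness, here is a minimal sketch of the kinds of hyperparameters in play, using scikit-learn model classes purely as an illustrative assumption; the paper does not prescribe a particular library or model.

```python
# Illustrative only: the scikit-learn models below are an assumption for this
# sketch, not the models used in the paper.
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Neural network: the number of layers and of nodes per layer are hyperparameters.
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), learning_rate_init=1e-3)

# Tree ensemble: the number of trees and their maximum depth are hyperparameters.
forest = RandomForestClassifier(n_estimators=200, max_depth=8)
```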
In their paper, the researchers show that modifying hyperparameters is enough to provide assurances of accuracy and fairness and that this approach is as effective as prior methods, such as enforcing fairness constraints during model training or preprocessing training data to reduce bias. Unlike those methods, however, it’s applicable across a spectrum of tasks and models.
Dual distributions
The hyperparameters of a machine learning model are often tuned using Bayesian optimization (BO), a method for finding the optimal value of an unknown and expensive-to-sample function. In the case of hyperparameter optimization, the function relates particular hyperparameter settings to the accuracy of the resulting model.
Because the function is unknown, the optimization procedure explores it by taking samples of input-output pairs. BO provides a way to guide the sampling procedure, so that it quickly converges on a likely optimal value.
In particular, BO assumes that the outputs of the function fit some probability distribution — often, a Gaussian distribution. As the sampling procedure proceeds, BO refines its estimate of the distribution, which in turn guides the selection of new sample points.
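The loop below is a generic sketch of that procedure, assuming a single one-dimensional hyperparameter, a scikit-learn Gaussian-process surrogate, and an expected-improvement sampling rule; none of these specifics come from the paper.

```python
# Generic Bayesian-optimization sketch (illustrative assumptions: a 1-D
# hyperparameter, a scikit-learn Gaussian-process surrogate, and an
# expected-improvement criterion for choosing the next sample).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def accuracy(hyperparam):
    # Stand-in for training a model with this hyperparameter setting and
    # measuring its validation accuracy (an expensive-to-sample function).
    return np.exp(-(hyperparam - 0.3) ** 2)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(3, 1))            # initial hyperparameter samples
y = np.array([accuracy(x[0]) for x in X])     # observed accuracies

for _ in range(20):
    gp = GaussianProcessRegressor().fit(X, y)           # refine the surrogate
    candidates = rng.uniform(0, 1, size=(100, 1))
    mu, sigma = gp.predict(candidates, return_std=True)
    # Expected improvement over the best accuracy observed so far.
    improvement = mu - y.max()
    z = improvement / np.maximum(sigma, 1e-9)
    ei = improvement * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = candidates[np.argmax(ei)]                   # most promising point
    X = np.vstack([X, x_next])
    y = np.append(y, accuracy(x_next[0]))

best_hyperparam = X[np.argmax(y), 0]
```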
The Amazon researchers modify this procedure so that it involves two distributions, one that describes the accuracy of the model output and one that describes the model’s score on some fairness measure. New samples are then selected with the aim of finding hyperparameters that simultaneously maximize accuracy and satisfy a specific fairness constraint.
Typically, fairness constraints require the model to exhibit similar performance across different subgroups of users. The researchers’ approach works with any fairness constraint that has this general form.
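As a rough picture of how accuracy and fairness surrogates can be combined, here is a generic constrained-BO proposal step — not the authors' algorithm — in which the fairness measure is a hypothetical cap on the gap in positive-prediction rates between two groups, and `propose` and `fairness_gap` are illustrative helpers.

```python
# Illustrative constrained-BO step: one surrogate for accuracy, one for a
# fairness violation, combined by weighting expected improvement with the
# probability that the fairness constraint is satisfied. A generic sketch,
# not the paper's algorithm.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

EPSILON = 0.05   # allowed gap in positive-prediction rates between two groups

def fairness_gap(y_pred, group):
    # Example fairness measure: difference in positive-prediction rates
    # between the two subgroups of users.
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def propose(X, acc, gap, candidates):
    gp_acc = GaussianProcessRegressor().fit(X, acc)    # models accuracy
    gp_gap = GaussianProcessRegressor().fit(X, gap)    # models the fairness gap
    mu_a, sd_a = gp_acc.predict(candidates, return_std=True)
    mu_g, sd_g = gp_gap.predict(candidates, return_std=True)
    # Expected improvement in accuracy over the best fair point seen so far.
    feasible = gap <= EPSILON
    best = acc[feasible].max() if feasible.any() else acc.max()
    z = (mu_a - best) / np.maximum(sd_a, 1e-9)
    ei = (mu_a - best) * norm.cdf(z) + sd_a * norm.pdf(z)
    # Probability that the candidate satisfies the fairness constraint.
    p_fair = norm.cdf((EPSILON - mu_g) / np.maximum(sd_g, 1e-9))
    return candidates[np.argmax(ei * p_fair)]
```

Weighting expected improvement by the probability of satisfying the constraint is one common way to steer sampling toward hyperparameters that are both accurate and fair; in this sketch, any fairness measure of the general form described above could be substituted for the positive-prediction-rate gap.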
In exploring BO with Gaussian processes to approximate function outputs, the researchers’ paper is a sequel of sorts to another paper honored at this year’s ICML, “Gaussian process optimization in the bandit setting: no regret and experimental design”, whose coauthors include Matthias Seeger, a principal applied scientist with AWS. Seeger’s paper won the conference’s test-of-time award, specifically because BO has proved so useful for hyperparameter optimization.