NimbleLearn: A scalable and fast batch-mode active learning approach

By Ruoyan Kong, Zhanlong Qiu, Yang Liu, Qi Zhao
2021
Batch-mode active learning iteratively selects a batch of unlabeled samples for labelling to maximize model performance and reduce total runtime. To select the most informative and diverse batch, existing methods usually calculate the correlation between samples within a batch, which leads to combinatorial optimization problems whose approximate solutions are inefficient, complex, and limited to linear models. In this paper, we propose NimbleLearn, a scalable deep imitation batch-mode active learning approach that addresses these drawbacks. For each batch, NimbleLearn sequentially predicts an “ideal sample” with a deep policy network. This ideal sample maximizes the model performance when combined with the labeled samples and the already-selected samples in the current batch. Unlike existing batch-mode active learning methods, which directly select one batch of samples from the unlabeled ones, NimbleLearn reduces the dimension of the policy network output to the number of features (assuming the number of unlabeled samples is much greater than the number of features). In addition, NimbleLearn is a general framework that can be applied to both linear and nonlinear models. Experiments conducted on four public datasets show that NimbleLearn achieves performance similar to or better than existing state-of-the-art (SOTA) algorithms, while reducing the number of labeled samples and the runtime by over 50%.
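The sequential selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how a predicted ideal sample is mapped back to a real unlabeled sample, so the nearest-neighbour matching below is an assumption. `select_batch` and `toy_policy` are hypothetical names, and `toy_policy` is only a placeholder for the deep policy network. The point to notice is that the policy returns a vector with one entry per feature, rather than a score for every unlabeled sample.

```python
import numpy as np

def toy_policy(X_labeled, X_selected):
    """Hypothetical stand-in for the deep policy network.

    Returns a d-dimensional "ideal sample". Here it is simply the negated
    mean of everything seen so far, which pushes later picks away from
    earlier ones. A trained network would replace this function.
    """
    seen = np.vstack([X_labeled, X_selected])
    return -seen.mean(axis=0)

def select_batch(policy, X_labeled, X_unlabeled, batch_size):
    """Build one batch by predicting ideal samples one at a time.

    Assumption: each ideal sample is matched to its nearest remaining
    unlabeled sample by Euclidean distance.
    """
    remaining = list(range(len(X_unlabeled)))
    chosen = []
    for _ in range(batch_size):
        # Samples already picked for the current batch, shape (k, d).
        selected = X_unlabeled[np.asarray(chosen, dtype=int)]
        # Policy output has d entries, independent of the pool size.
        ideal = policy(X_labeled, selected)
        dists = np.linalg.norm(X_unlabeled[remaining] - ideal, axis=1)
        chosen.append(remaining.pop(int(np.argmin(dists))))
    return chosen

# Usage on synthetic data: 5 labeled and 50 unlabeled samples, 3 features.
rng = np.random.default_rng(0)
X_l = rng.normal(size=(5, 3))
X_u = rng.normal(size=(50, 3))
batch = select_batch(toy_policy, X_l, X_u, batch_size=4)
```

Because each step conditions on the samples already chosen, diversity within the batch comes from the sequence of predictions rather than from a joint combinatorial search over candidate batches.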