On primes, log-loss scores and (no) privacy

Abhinav Aggarwal; Zekun Xu; Oluwaseyi Feyisetan; Nathanael Teissier

2020

A common metric for assessing the performance of binary classifiers is the Log-Loss score, a real number that measures the cross-entropy distance between the predicted distribution over the labels and the true distribution (a point distribution defined by the ground truth labels). In this paper, we show that a malicious modeler who obtains the Log-Loss scores on its predictions can exploit this information to infer all the ground truth labels of arbitrary test datasets with full accuracy. We provide an efficient algorithm to perform this inference. A particularly interesting application of this attack is breaching privacy in the setting of Membership Inference Attacks, which exploit the vulnerabilities that arise when models trained on customer data are exposed to queries made by an adversary. Privacy auditing tools for measuring leakage from sensitive datasets assess the total privacy leakage based on the adversary's predictions for datapoint membership. An instance of the proposed attack can hence cause a complete membership privacy breach, without requiring the adversary to train an attack model or to have access to side knowledge. Moreover, our algorithm is agnostic to the model under attack and hence enables perfect membership inference even for models that do not memorize or overfit. More broadly, our observations provide insight into how much information statistical aggregates leak and how that leakage can be exploited.
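The abstract does not spell out the inference algorithm, so the following is only a minimal sketch of one way prime numbers could make a single Log-Loss score reveal every label. It assumes the adversary submits the prediction q_i / (1 + q_i) for the i-th datapoint, where q_i is the i-th prime; this choice, and the helper names `first_primes`, `craft_predictions` and `infer_labels`, are illustrative assumptions and not necessarily the authors' construction. Under that assumption, exp(-N * score) multiplied by the product of all (1 + q_i) equals the product of the primes assigned to the positive labels, and unique factorization then identifies those labels.

```python
import math


def first_primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def log_loss(labels, preds):
    """Standard binary Log-Loss (mean cross entropy, natural log)."""
    n = len(labels)
    total = sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for y, p in zip(labels, preds)
    )
    return -total / n


def craft_predictions(n):
    """Assumed adversarial predictions: q / (1 + q) for the i-th prime q."""
    return [q / (1 + q) for q in first_primes(n)]


def infer_labels(score, n):
    """Recover all n labels from one Log-Loss score on the crafted predictions."""
    primes = first_primes(n)
    denom = math.prod(1 + q for q in primes)
    # exp(-n * score) is the likelihood: (product of primes with label 1) / denom.
    product = round(math.exp(-n * score) * denom)
    # A label is 1 exactly when its prime divides the recovered product.
    return [1 if product % q == 0 else 0 for q in primes]


# Hypothetical hidden labels, used only to simulate the scoring service.
hidden = [1, 0, 1, 1, 0, 0, 1, 0]
score = log_loss(hidden, craft_predictions(len(hidden)))
recovered = infer_labels(score, len(hidden))
```

This sketch relies on double-precision floats, so it is only reliable for small datasets; the product of primes grows quickly, and larger test sets would need a score reported with correspondingly higher precision and arbitrary-precision arithmetic on the adversary's side.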
