We consider the task of predicting subjective fashion traits from images using neural networks. Specifically, we are interested in training a network to rank outfits according to how well they fit the user. To capture the variability induced by human subjectivity, each training example is annotated by a panel of fashion experts. As in previous work on subjective data, the panel votes are converted to a classification or regression problem, and the corresponding network is trained and evaluated using standard objective metrics. This raises the question of which objective metric, if any, is most suitable for measuring the performance of a network trained for a subjective task. In this paper, we conduct human approval tests for outfit ranking networks trained using various objective metrics. We show that these metrics do not adequately estimate human approval on subjective tasks. Instead, we introduce a supervising network that, unlike objective metrics, is designed to capture the variability induced by human subjectivity. We use it to supervise our outfit ranking network and demonstrate empirically that training with the suggested supervising network achieves higher approval ratings from human subjects.