Generating diverse and informative natural language fashion feedback

Gil Sadeh; Lior Fritz; Gabi Shalev; Eduard Oks

2019

Recent advances in multi-modal vision and language tasks enable a new set of applications. In this paper, we consider the task of generating natural language fashion feedback on outfit images. We collect a unique dataset, which contains outfit images and corresponding positive and constructive fashion feedback. We treat each feedback type separately, and train deep generative encoder-decoder models with visual attention, similar to the standard image captioning pipeline. With this approach, however, the generated sentences tend to be too general and non-informative. We therefore propose an alternative decoding technique based on the Maximum Mutual Information objective function, which leads to more diverse and detailed responses. We evaluate our model with common language metrics and also report human evaluation results. This technology is applied within the “Alexa, how do I look?” feature, publicly available on Echo Look devices.
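
The abstract does not spell out the exact decoding objective, so the following is only a minimal sketch of one common form of Maximum Mutual Information decoding: reranking candidate sentences by log p(sentence | image) - λ · log p(sentence), which penalizes sentences that are likely regardless of the image. The function name `mmi_rerank`, the weight `lam`, and all log-probability values are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
from typing import List, Tuple

# Each candidate: (sentence, log p(sentence | image), log p(sentence)).
Candidate = Tuple[str, float, float]


def mmi_rerank(candidates: List[Candidate], lam: float = 0.5) -> List[Tuple[str, float]]:
    """Rank candidates by log p(T|I) - lam * log p(T), best first.

    lam = 0 recovers ordinary likelihood ranking; a larger lam penalizes
    generic sentences that a language model finds likely on their own.
    """
    scored = [(text, cond - lam * prior) for text, cond, prior in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up scores: the generic sentences have a high unconditional
# probability, the detailed one a low unconditional probability.
candidates = [
    ("Nice outfit.", -2.0, -1.5),
    ("The red scarf pairs well with the navy coat.", -6.5, -14.0),
    ("Great look.", -2.2, -1.8),
]

likelihood_ranking = mmi_rerank(candidates, lam=0.0)
mmi_ranking = mmi_rerank(candidates, lam=0.5)

print("likelihood:", [text for text, _ in likelihood_ranking])
print("MMI:       ", [text for text, _ in mmi_ranking])
```

In this toy setup, plain likelihood ranking favors the short generic sentences, while the MMI-style score moves the more specific sentence to the top. In a real system the two log-probabilities would come from the trained captioning model and a separate language model, and `lam` would need tuning.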
