Personalizing natural-language understanding using multi-armed bandits and implicit feedback
2020
Natural-language-understanding (NLU) models on voice-controlled speakers face several challenges. In particular, music streaming services have large catalogues, often containing millions of songs, artists, and albums, as well as several thousand custom playlists and stations. In many cases, carrier phrases and entity names are ambiguous and differ little in structure. In this work, we describe how we leveraged multi-armed bandits in combination with implicit customer feedback to improve the accuracy and personalization of responses to voice requests in the music domain. Our models are tested in a large-scale industrial system containing several other components. In particular, we focused on using this technology to correct errors made by upstream NLU models and to personalize responses based on customer preferences and music provider functionality. The models resulted in a significant improvement in playback rate for Amazon Music and are deployed in systems serving several countries and languages. We further used the customers' implicit feedback to generate weakly labeled training data for the NLU models. This improved the experience for customers using other music providers on all Alexa devices.
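The abstract does not say which bandit algorithm or reward signal was used. As a minimal sketch of the general idea, assuming Thompson sampling with a Beta-Bernoulli model, the example below treats each candidate interpretation of a request as an arm and uses a simulated binary "playback continued" signal as the implicit reward. The arm names, playback probabilities, and the `simulate` helper are hypothetical and chosen only for illustration.

```python
import random


class ThompsonBandit:
    """Beta-Bernoulli Thompson sampling over candidate interpretations (arms)."""

    def __init__(self, arms, seed=0):
        self.rng = random.Random(seed)
        # One [alpha, beta] pair per arm, starting from a uniform Beta(1, 1) prior.
        self.posterior = {arm: [1.0, 1.0] for arm in arms}

    def select(self):
        # Draw one sample per arm from its posterior and pick the largest draw.
        draws = {
            arm: self.rng.betavariate(a, b)
            for arm, (a, b) in self.posterior.items()
        }
        return max(draws, key=draws.get)

    def update(self, arm, played):
        # Implicit feedback (assumption): played=True means the customer kept listening.
        self.posterior[arm][0 if played else 1] += 1.0

    def mean(self, arm):
        # Posterior mean estimate of the arm's playback probability.
        a, b = self.posterior[arm]
        return a / (a + b)


def simulate(rounds=2000, seed=0):
    # Hypothetical playback probabilities for three interpretations of one request.
    playback_prob = {"play_song": 0.2, "play_artist": 0.7, "play_playlist": 0.4}
    bandit = ThompsonBandit(playback_prob, seed=seed)
    env = random.Random(seed + 1)  # separate stream for the simulated customer
    pulls = {arm: 0 for arm in playback_prob}
    for _ in range(rounds):
        arm = bandit.select()
        played = env.random() < playback_prob[arm]
        bandit.update(arm, played)
        pulls[arm] += 1
    return bandit, pulls


if __name__ == "__main__":
    bandit, pulls = simulate()
    for arm, count in pulls.items():
        print(arm, count, round(bandit.mean(arm), 3))
```

In this toy setting the bandit concentrates its choices on the interpretation that most often leads to playback, while still occasionally exploring the others. A production system would likely also condition on request and customer context, which this sketch omits.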