A search on a major eCommerce platform can return thousands of relevant products, making it impossible for an average customer to review all the results. Browsing the list of relevant items can be simplified by using search filters for specific requirements (e.g., to exclude shoes of the wrong size). The complete list of available filters is often overwhelming and hard to visualize. Thus, successful user interfaces aim to display only the filters relevant to a customer's query.
In this work, we frame the filter selection task as an extreme multi-label classification (XMLC) problem based on historical interactions with eCommerce sites. From customers’ clicks and purchases, we learn which subset of filters is most relevant to their queries, treating the relevant/not-relevant signal as binary labels.
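As a rough illustration of the binary-label framing, the sketch below turns an interaction log into one multi-hot label vector per query. The log format, the example queries, and the `build_binary_labels` helper are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical interaction log: (query, filter the customer clicked or purchased through).
interactions = [
    ("running shoes", "size"),
    ("running shoes", "brand"),
    ("running shoes", "size"),
    ("laptop", "screen size"),
    ("laptop", "brand"),
]

def build_binary_labels(log):
    """Return the filter vocabulary and one multi-hot label vector per query."""
    filters = sorted({f for _, f in log})
    index = {f: i for i, f in enumerate(filters)}
    labels = defaultdict(lambda: [0] * len(filters))
    for query, f in log:
        # A filter is marked relevant (1) if it was ever used with the query.
        labels[query][index[f]] = 1
    return filters, dict(labels)

filters, labels = build_binary_labels(interactions)
```

In a real XMLC setting the filter vocabulary would be far larger and the label vectors would be stored sparsely; the dense lists here only keep the example short.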
A common problem in classification settings with a large number of classes is that some classes are underrepresented, and these rare categories are difficult to predict. Building on previous work, we show that classification performance for rare classes can be improved by accounting for the language structure of the class labels. Furthermore, our results demonstrate that accounting for the language structure of category names enables relatively simple deep learning models to achieve better predictive performance than transformer networks with much higher capacity.
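One simple way to make label representations language-aware is to compose each label's embedding from the words in its name, so that a rare label shares parameters with frequent labels containing the same words. The sketch below assumes this mean-of-token-vectors composition and random stand-in vectors; it is not necessarily the architecture used in this work, and all names in it are hypothetical.

```python
import random

DIM = 8
rng = random.Random(0)
token_vectors = {}  # stand-in for learned token embeddings

def token_vector(token):
    """Look up (or lazily create) the vector for a single word."""
    if token not in token_vectors:
        token_vectors[token] = [rng.gauss(0, 1) for _ in range(DIM)]
    return token_vectors[token]

def label_embedding(label_name):
    """Embed a label as the mean of the vectors of the words in its name."""
    vecs = [token_vector(t) for t in label_name.lower().split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def score(query_vec, label_name):
    """Relevance score: dot product between a query vector and a label embedding."""
    return sum(q * l for q, l in zip(query_vec, label_embedding(label_name)))

# "shoe size" and "screen size" both reuse the vector for "size".
shoe_size = label_embedding("shoe size")
screen_size = label_embedding("screen size")
```

Because the vector for "size" is shared, any update driven by a frequent label such as "shoe size" would also move the embedding of a rarer label such as "screen size", which is the intuition behind using label text to help rare classes.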
Search filter ranking with language-aware label embeddings
2022