Query-to-Product Type (Q2PT) is a crucial e-commerce query understanding signal that directly influences search result relevance and customer experience. This imposes high standards on industrial Q2PT classification models, which must be regularly monitored for quality across all predicted product types and use cases at scale.
Existing solutions for Q2PT model evaluation rely on human-labeled datasets, which are usually small and costly to collect and refresh. Moreover, it is unrealistic to create ample human annotations for all e-commerce product categories, which can number in the thousands.
To address these drawbacks, we propose sQuIrRel (Query Intent from Relevance), a method for automatically collecting an evaluation dataset for monitoring e-commerce query classification models that enables large-scale analysis and full coverage of all existing category labels. sQuIrRel is constructed using distant supervision from a high-precision query-item relevance classifier, which allows query labels to be collected and refreshed quickly at scale.
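The distant-supervision idea can be illustrated with a minimal sketch. It assumes that each item carries a catalog product type and that a query inherits the product types of the items the relevance classifier judges relevant to it. The function name `collect_query_labels`, the `relevance_score` callable, the catalog mapping, and the 0.9 threshold are all hypothetical placeholders for illustration, not details taken from the paper.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Set, Tuple


def collect_query_labels(
    pairs: Iterable[Tuple[str, str]],
    item_product_type: Dict[str, str],
    relevance_score: Callable[[str, str], float],
    threshold: float = 0.9,  # assumed cut-off for "high-precision" relevance
) -> Dict[str, Set[str]]:
    """Assign product-type labels to queries via a relevance classifier.

    pairs: (query, item_id) candidates, e.g. drawn from search logs.
    item_product_type: hypothetical catalog mapping item_id -> product type.
    relevance_score: hypothetical classifier returning a score in [0, 1].
    """
    labels: Dict[str, Set[str]] = defaultdict(set)
    for query, item_id in pairs:
        product_type = item_product_type.get(item_id)
        if product_type is None:
            # Items without a catalog product type cannot provide a label.
            continue
        # Keep only pairs the classifier scores at or above the threshold.
        if relevance_score(query, item_id) >= threshold:
            labels[query].add(product_type)
    return dict(labels)


# Toy usage with a lookup table standing in for the relevance classifier.
toy_scores = {
    ("running shoes", "i1"): 0.97,
    ("running shoes", "i2"): 0.40,
    ("usb cable", "i3"): 0.95,
}
toy_catalog = {"i1": "SHOES", "i2": "SOCKS", "i3": "CABLE"}
toy_labels = collect_query_labels(
    toy_scores.keys(),
    toy_catalog,
    lambda query, item_id: toy_scores[(query, item_id)],
)
```

Because the labels come from a model rather than from annotators, a sketch like this can be rerun whenever the query traffic or catalog changes, which is the refresh property the abstract emphasizes.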
While the sQuIrRel method can be applied to any query classification task across various e-commerce stores, our study focuses on using sQuIrRel for Q2PT prediction. We provide comparisons with alternative dataset collection methods and show how the obtained dataset can be used to analyze the performance of a commercial Q2PT model.
sQuIrRel: Large-scale evaluation of e-commerce query classification models
2025