Query understanding plays a key role in the search process, and accurately understanding search queries is the first step toward high-quality search results on e-commerce websites. While head queries with abundant historical data are comparatively easy to interpret, tail queries pose a challenge to accurate understanding. To tackle this challenge, we focus on query rewriting: transforming a tail query into a query whose linguistic characteristics resemble those of head queries while preserving the shopping intent. In this work, we present a new training-data construction process and extend the vanilla Seq2Seq model with multiple auxiliary tasks to achieve desirable properties for e-commerce applications. For the training data, we rely only on widely available search logs to generate (source, target) query pairs along with additional shopping-intent information. This additional information introduces two auxiliary prediction tasks, on product name and product category, into our model to fully capture the shopping intent. For the model, we propose a query matching loss based on a novel co-attention scheme to improve the source-query representations, so that the overall model can be built and trained end-to-end with standard components and training protocols. The resulting model provides significant advantages over the vanilla Seq2Seq model in a range of experiments on rewriting quality. We also demonstrate the practical value of our query rewriting model with an application in sponsored search.
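As a rough illustration of the training-data construction from search logs, the sketch below pairs earlier queries in a session with the query that led to a purchase, attaching the purchased product's name and category as auxiliary labels. The log schema (`session`, `query`, `purchased`, `product_name`, `category`), the session grouping, and the use of the last purchase as the intent signal are illustrative assumptions, not the paper's exact procedure.

```python
from collections import defaultdict

def build_query_pairs(log_records):
    """Group search-log records by session and emit
    (source_query, target_query, product_name, category) tuples.

    Assumed record shape (hypothetical schema for illustration):
      {"session": id, "query": str, "purchased": bool,
       "product_name": str or None, "category": str or None}
    Records within a session are assumed to be time-ordered.
    """
    sessions = defaultdict(list)
    for rec in log_records:
        sessions[rec["session"]].append(rec)

    pairs = []
    for recs in sessions.values():
        # Treat the query that ended in a purchase as the rewrite target;
        # the purchased product supplies the auxiliary intent labels.
        purchases = [r for r in recs if r["purchased"]]
        if not purchases:
            continue
        target = purchases[-1]
        for r in recs:
            if r is target or r["query"] == target["query"]:
                continue  # skip the target itself and exact duplicates
            pairs.append((r["query"], target["query"],
                          target["product_name"], target["category"]))
    return pairs
```

Each emitted tuple provides one (source, target) pair for the Seq2Seq rewriter plus the product-name and category labels that feed the two auxiliary prediction tasks.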
Advancing query rewriting in e-commerce via shopping intent learning
2022