Enhancing catalog relationship problems with heterogeneous graphs and graph neural networks distillation
2023
Traditionally, catalog relationship problems in e-commerce stores have been handled as pairwise classification tasks, which limits the ability of machine learning models to learn from the diverse relationships among different entities in the catalog. In this paper, we leverage heterogeneous graphs and Graph Neural Networks (GNNs) to improve catalog relationship inference. We start by investigating how to create multi-entity, multi-relationship graphs from diverse relationship data sources, and then explore how to use GNNs to exploit the knowledge in the constructed graph in a self-supervised fashion. Finally, we propose a distillation approach that transfers the knowledge learned by the GNNs into a pairwise neural network, allowing seamless deployment in a catalog pipeline that relies on pairwise input for inductive relationship inference. Our experiments show that on two representative catalog relationship problems, Title Authority/Contributor Authority and Broken Variation, the proposed framework improves the recall at 95% precision of a pairwise baseline by up to 33.6% and 14.0%, respectively. Our findings highlight the effectiveness of this approach in advancing catalog quality maintenance and accurate relationship modeling, with potential for broader industry adoption.
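The headline metric above, recall at 95% precision, can be illustrated with a minimal sketch. It assumes each candidate pair has a model score and a binary label (1 for a true relationship); the function name `recall_at_precision` and the toy data are hypothetical and are not taken from the paper, whose exact evaluation protocol is not described here.

```python
from typing import Sequence


def recall_at_precision(
    scores: Sequence[float],
    labels: Sequence[int],
    min_precision: float = 0.95,
) -> float:
    """Return the highest recall over all score thresholds whose
    precision is at least `min_precision` (0.0 if none qualifies)."""
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0

    # Rank pairs from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)

    best_recall = 0.0
    true_pos = 0
    i, n = 0, len(ranked)
    while i < n:
        # Consume every pair tied at this score, so that each candidate
        # threshold corresponds to a well-defined set of predictions.
        j = i
        while j < n and ranked[j][0] == ranked[i][0]:
            true_pos += ranked[j][1]
            j += 1
        precision = true_pos / j  # j pairs are predicted positive
        if precision >= min_precision:
            best_recall = max(best_recall, true_pos / total_pos)
        i = j
    return best_recall


# Toy example (illustrative values only).
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 1, 0, 1, 0]
r95 = recall_at_precision(scores, labels, 0.95)
r75 = recall_at_precision(scores, labels, 0.75)
```

In the toy data, only the two top-scored pairs can be accepted before precision falls below 95%, so two of the three true relationships are recovered. Relaxing the precision target to 75% allows a lower threshold that recovers all three.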