Recent studies on pre-trained vision/language models such as BERT [6] and GPT [26] have demonstrated the benefit of a promising solution-building paradigm in which models are pre-trained on broad data describing a generic task space and then adapted to solve a wide range of downstream tasks, even when training data for the downstream task is limited. Inspired by such progress, we investigate the possibilities and challenges of adapting this paradigm to recommender systems, and propose a causal recommender model named PreRec. It captures generic interaction patterns by training on diverse user-item interaction data drawn from different domains, and can then be quickly adapted to improve zero- and few-shot learning performance in unseen domains (with no or limited data).
The key to learning a generalizable model lies in capturing knowledge grounded in a universal feature space. To bridge domain discrepancies, ZESRec [7] first proposed using generic textual descriptions of items to produce universal item embeddings, while the universal user embedding is computed by a sequential model that aggregates the universal embeddings of the items in the user's history. Despite the promising results, ZESRec limits itself to pre-training on a single source domain and inference on a single target domain. As a follow-up work, UniSRec [12] extends support to multi-domain pre-training and evaluates the pre-trained model on multiple target domains. However, UniSRec does not account for bias within each domain or across domains during pre-training, although both may lead to drift in user interests, item properties, and user behavioral patterns.
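To make the universal feature space idea concrete, the following is a minimal sketch and not the implementation used by ZESRec, UniSRec, or PreRec. It assumes a hashed bag-of-words function in place of a pre-trained text encoder and a recency-weighted average in place of a learned sequential model. The names `embed_text`, `embed_user`, `score`, and `DIM`, as well as the decay factor of 0.5, are hypothetical choices made only for illustration.

```python
import hashlib
import math
from typing import List

DIM = 16  # embedding width, chosen arbitrarily for this sketch


def embed_text(description: str) -> List[float]:
    """Hypothetical stand-in for a pre-trained text encoder (hashed bag of words)."""
    vec = [0.0] * DIM
    for token in description.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = digest[0] % DIM                       # bucket for this token
        sign = 1.0 if digest[1] % 2 == 0 else -1.0    # signed hashing
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    # L2-normalize so that dot products behave like cosine similarity.
    return [v / norm for v in vec] if norm > 0 else vec


def embed_user(history: List[str]) -> List[float]:
    """Assumed stand-in for the sequential model: recency-weighted mean of item embeddings."""
    if not history:
        return [0.0] * DIM
    # The most recent item gets weight 1, the one before it 0.5, and so on.
    weights = [0.5 ** (len(history) - 1 - i) for i in range(len(history))]
    total = sum(weights)
    user = [0.0] * DIM
    for weight, description in zip(weights, history):
        item = embed_text(description)
        for d in range(DIM):
            user[d] += weight * item[d] / total
    return user


def score(user: List[float], description: str) -> float:
    """Dot product between a user embedding and a candidate item's text embedding."""
    item = embed_text(description)
    return sum(u * i for u, i in zip(user, item))
```

Because both users and items are represented purely through item text, a candidate from a domain never seen during training can still be scored, for example with `score(embed_user(["wireless mouse", "usb keyboard"]), "gaming mouse pad")`. This is the property that makes zero-shot transfer possible in principle.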
In this work, we aim to design a universally generalizable recommender that can be pre-trained on multiple source domains and fine-tuned on different target domains. We start by identifying two types of bias. In-domain bias refers to noise that takes effect within each domain; one example is popularity bias, which affects not only the item exposure rate but also user behavioral patterns, because users tend to follow the majority and interact with trending items. Cross-domain bias refers to noise introduced by the unique properties of each domain. For instance, each domain has a distinctive user community, which causes user interests to shift across domains.
Pre-trained recommender systems: A causal perspective
2023