Self-supervised representation learning across sequential and tabular features using transformers
2022
Machine learning models used for predictive modeling tasks spanning across personalization, recommender systems, ad response prediction, fraud detection etc. typically require a variety of tabular as well as sequential activity features about the user. For tasks like click-through or conversion (purchase) rate prediction where labeled data is available at scale, popular methods use deep sequence models (sometimes pre-trained) to encode sequential inputs, followed by concatenation with tabular features and optimization of a supervised training objective. For tasks like bot and fraud detection, where labeled data is sparse and incomplete, the typical approach is to use self-supervision to learn user embeddings from their historical activity sequence. However, these models are not equipped to handle tabular input features during self-supervised learning. In this paper, we propose a novel Transformer architecture that can jointly learn embeddings on both sequential and tabular input features. Our model learns self-supervised user embeddings using masked token prediction objective on a rich variety of features without relying on any labeled data. We demonstrate that user embeddings generated by the proposed technique are able to successfully encode information from a combination of sequential and tabular features, improving AUC-ROC for linear separability for a downstream task label by 5% over embeddings generated using sequential features only. We also benchmark the efficacy of the embeddings on the bot detection task for a large-scale digital advertising program, where the proposed model improves recall over known bots by 10% over the sequential only baseline at the same False Positive Rate (FPR).
Research areas