Bi-CAT: Improving robustness of LLM-based text rankers to conditional distribution shifts

Sriram Srinivasan; Stephen Sheng; Rishabh Deshmukh; Chen Luo; Yesh Dattatreya; Subhajit Sanyal; S. V. N. Vishwanathan

2024

Retrieval and ranking lie at the heart of several applications such as search, question answering, and recommendation. The use of large language models (LLMs) such as BERT in these applications has shown promising results in recent times. Recent work on text-based retrievers and rankers shows promising results using a bi-encoder (BE) architecture with BERT-like LLMs for retrieval and a cross-attention transformer (CAT) architecture with BERT or other LLMs for ranking the retrieved results. Although using the CAT architecture for re-ranking improves ranking metrics, its robustness to data shifts is not guaranteed. In this work, we analyze the robustness of CAT-based rankers. Specifically, we show that CAT rankers are sensitive to item distribution shifts conditioned on a query; we refer to this as conditional item distribution shift (CIDS). CIDS occurs naturally in large online search systems as the retrievers keep evolving, making it challenging to consistently train and evaluate rankers with the same item distribution. In this paper, we formally define CIDS and show that while CAT rankers are sensitive to it, BE models are far more robust. We propose a simple yet effective approach, referred to as Bi-CAT, which augments BE model outputs with CAT rankers to significantly improve the robustness of CAT rankers without any drop in in-distribution performance. We conducted a series of experiments on two publicly available ranking datasets and one dataset from a large e-commerce store. Our results on datasets with CIDS demonstrate that the Bi-CAT model significantly improves the robustness of CAT rankers by roughly 100-1000 bps in F1 without any reduction in in-distribution model performance.
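
The abstract does not say how BE outputs are combined with the CAT ranker, so the following is only a toy sketch of the general idea, not the authors' method. It assumes the bi-encoder contributes a cosine similarity between precomputed query and item embeddings, the CAT ranker contributes a single relevance logit per query-item pair, and the two are mixed by a hypothetical logistic combiner. The names `bicat_score` and `rank_items` and the weights `w_cat`, `w_be`, and `bias` are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def bicat_score(cat_logit, be_score, w_cat=1.0, w_be=1.0, bias=0.0):
    """Hypothetical combiner: logistic function of a weighted sum of the two signals.

    In practice the weights would be learned; the defaults here are placeholders.
    """
    z = w_cat * cat_logit + w_be * be_score + bias
    return 1.0 / (1.0 + math.exp(-z))


def rank_items(query_vec, items):
    """Rank items by the combined score, highest first.

    `items` is a list of (item_id, item_vec, cat_logit) tuples, where `item_vec`
    stands in for a bi-encoder item embedding and `cat_logit` for the
    cross-attention ranker's output on the (query, item) pair.
    """
    scored = [
        (bicat_score(cat_logit, cosine(query_vec, item_vec)), item_id)
        for item_id, item_vec, cat_logit in items
    ]
    return [item_id for _, item_id in sorted(scored, reverse=True)]


# Toy data: made-up embeddings and CAT logits, for illustration only.
query = [1.0, 0.0]
candidates = [
    ("a", [1.0, 0.0], 0.0),
    ("b", [0.0, 1.0], 0.5),
    ("c", [1.0, 1.0], -1.0),
]
ranking = rank_items(query, candidates)
```

In this toy data the CAT logit alone would place item "b" first, while the combined score favors item "a" because its embedding aligns with the query. This illustrates how a BE signal could act as a stabilizing input, under the assumptions stated above.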
