Pre-trained language models (PLM) excel at capturing semantic similarity in language, while in e-commerce, customer shopping behavior data (e.g., clicks, add-to-cart, purchases) helps establish connections between similar queries based on behavior on products. This work addressed the challenges of using sparse behavior data to build a robust query-to-query similarity prediction model and apply it to a product