Large Language Models (LLMs) can be leveraged to improve performance at various stages of the search pipeline – the indexing stage, the query understanding stage, and the ranking or re-ranking stage. The latter two stages involve invoking an LLM at inference time, adding latency before the final ranked list of documents can be returned. Index enhancement, on the other hand, can be done in the indexing stage, in near