Offline Reinforcement Learning (RL) has emerged as a promising approach to real-world problems where online interaction with the environment is limited, risky, or costly. Although recent advances produce high-quality policies from offline data, there is currently no systematic methodology for continuing to improve them without resorting to online fine-tuning. This paper proposes to repurpose