Warning: This paper includes content that may be considered inappropriate or offensive to some readers. Viewer discretion is advised. Large Language Models (LLMs) have improved dramatically in the past few years, increasing their adoption and the scope of their capabilities over time. A significant amount of work is dedicated to “model alignment”, i.e., preventing LLMs from generating unsafe responses when