> Once we have identified the refusal direction, we can "ablate" it, effectively removing the model's ability to represent this feature.
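
To make the idea concrete, here is a minimal sketch of what ablating a direction can look like. It assumes that "ablation" means projecting the direction out of an activation vector, i.e. replacing `x` with `x - (x · r̂) r̂`, where `r̂` is the unit-normalized refusal direction. The function name `ablate_direction` and the toy vectors are hypothetical and chosen for illustration only. A real implementation would likely apply the same operation to tensors at one or more layers of the model.

```python
import math


def ablate_direction(activation, direction):
    """Remove the component of `activation` that lies along `direction`.

    Both arguments are plain lists of floats of equal length.
    This is an illustrative helper, not part of any library.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]

    # Scalar projection of the activation onto the unit direction.
    coeff = sum(a * u for a, u in zip(activation, unit))

    # Subtract the projected component; the result is orthogonal to `direction`.
    return [a - coeff * u for a, u in zip(activation, unit)]


def dot(x, y):
    """Dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))


# Toy data: a 3-dimensional "activation" and a made-up refusal direction.
activation = [2.0, -1.0, 0.5]
refusal_dir = [1.0, 1.0, 0.0]

ablated = ablate_direction(activation, refusal_dir)

# After ablation, the activation has (numerically) no component along the direction.
print(ablated)
print(dot(ablated, refusal_dir))
```

Because the projection is removed entirely, applying `ablate_direction` a second time with the same direction leaves the vector unchanged, which is a quick sanity check that the feature can no longer be expressed along that axis.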