←back to thread

586 points mizzao | 2 comments | | HN request time: 0.434s | source
1. seydor ◴[] No.40666318[source]
The word is 'ablation'. Do not butcher it
replies(1): >>40666485 #
2. girvo ◴[] No.40666485[source]
Amusingly, they explicitly call it this in the article itself:

> Once we have identified the refusal direction, we can "ablate" it, effectively removing the model's ability to represent this feature