←back to thread

Claude for Chrome

(www.anthropic.com)
795 points davidbarker | 1 comments | | HN request time: 0s | source
Show context
biggestfan ◴[] No.45030868[source]
According to their own blog post, even after mitigations, the model still has an 11% attack success rate. There's still no way I would feel comfortable giving this access to my main browser. I'm glad they're sticking to a very limited rollout for now. (Sidenote, why is this page so broken? Almost everything is hidden.)
replies(5): >>45030924 #>>45031456 #>>45031949 #>>45033353 #>>45034111 #
mark242 ◴[] No.45033353[source]
11% success rate for what is effectively a spear-phishing attempt isn't that terrible and tbh it'll be easier to train Claude not to get tricked than it is to train eg my parents.
replies(4): >>45033380 #>>45033454 #>>45033795 #>>45039212 #
1. asdff ◴[] No.45033380[source]
>Claude not to get tricked than it is to train eg my parents.

One would think but apparently from this blog post it is still succeptible to the same old prompt injections that have always been around. So I'm thinking it is not very easy to train Claude like this at all. Meanwhile with parents you could probably eliminate an entire security vector outright if you merely told them "bank at the local branch," or "call the number on the card for the bank don't try and look it up."