Doesn't this seem like a remarkably small set of tests? And the fact that it took this testing to realize that prompt injection and giving the reigns to the AI agent is dangerous strikes me as strange that this wasn't anticipated while building the tool in the first place, before it even went to their red team.
Move fast and break things I guess. Only it is the worlds largest browser and the risk of breaking things means financial ruin and/or the end of the internet as we know it as a human to human communication tool.
I think this is more akin to say a theoretical browser not implementing HTTPS properly so people's credentials/sessions can be stolen with MiTM attacks or something. Clearly the bad behavior is in the toolchain and not the user here, and I'm not sure how much you can wave away claiming "We told you it's not fully safe." You can't sell tomatoes that have a 10% chance of giving you food poisoning, even if you declare that chance on the label, you know?
To misquote the IRA - "[Scammers] only need to be lucky once, you need to be lucky every time." Even a 1% chance of getting pwned every time you get sent a malicious email is way too high. Plus the scammers aren't gonna rest on their laurels - they'll be iterating too.