Anthropic’s paper smells like bullshit

(djnn.sh)

1160 points vxvxvx | 1 comments | 16 Nov 25 11:32 UTC

Earlier thread: Disrupting the first reported AI-orchestrated cyber espionage campaign - https://news.ycombinator.com/item?id=45918638 - Nov 2025 (281 comments)

KaiserPro ◴[16 Nov 25 12:33 UTC] No.45944641[source]▶

>>45944296 (OP) #

When I worked as an SRE/sysadmin at a FAANG with a "world leading" AI lab (now run by a teenage data labeller), I was asked to use a modified version of a foundation model which was steered towards infosec stuff.

We were asked to try and persuade it to help us hack into a mock printer/dodgy linux box.

It helped a little, but it wasn't all that helpful.

But in terms of coordination, I can't see how it would be useful.

The same goes for Claude: your API is tied to a bank account, and vibe coding a command and control system on a very public system seems like a bad choice.

replies(12): >>45944770 #>>45944798 #>>45945052 #>>45945088 #>>45945276 #>>45948858 #>>45949298 #>>45949721 #>>45950366 #>>45951433 #>>45958070 #>>45961167 #

Milderbole ◴[16 Nov 25 13:38 UTC] No.45945052[source]▶

>>45944641 #

If the article is not just marketing fluff, I assume a bad actor would select Claude not because it's good at writing attacks, but because Western orgs chose Claude. Sonnet is usually the go-to for most coding copilots because the model was trained on a good range of data reflecting Western coding patterns. If you want to find a gap or write a vulnerability, use the same tool that has ingested the patterns behind the code of the systems you're trying to break. Or use Claude to write a phishing attack, because then the output is more likely to be similar to what our eyes would expect.

replies(2): >>45945323 #>>45945926 #

1. KaiserPro ◴[16 Nov 25 15:47 UTC] No.45945926[source]▶

>>45945052 #

What you're describing would be plausible if this were about exploiting Claude to get access to organisations that use it.

The gist of the Anthropic thing is that "claude made, deployed and coordinated" a standard malware attack, which is a _very_ different task.

Side note: most code assistants are trained on broadly similar coding datasets (i.e. GitHub scrapes).
