4 points iosifnicolae2 | 2 comments | | HN request time: 0.441s | source
1. tao_oat ◴[] No.45614644[source]
Skills were released in Claude Code, what, yesterday? I doubt there's a simple answer to this -- it'll depend on the model, task, etc.

You could try to get your agent to test its own skills. From https://blog.fsck.com/2025/10/09/superpowers:

> As Claude and I build new skills, one of the things I ask it to do is to "test" the skills on a set of subagents to ensure that the skills were comprehensible, complete, and that the subagents would comply with them. (Claude now thinks of this as TDD for skills and uses its RED/GREEN TDD skill as part of the skill creation skill.)

> The first time we played this game, Claude told me that the subagents had gotten a perfect score. After a bit of prodding, I discovered that Claude was quizzing the subagents like they were on a gameshow. This was less than useful. I asked to switch to realistic scenarios that put pressure on the agents, to better simulate what they might actually do.