(minusx.ai)

469 points samuelstros | 1 comments | 23 Aug 25 19:07 UTC


OtherShrezzing ◴[23 Aug 25 20:07 UTC] No.44998715[source]▶

I think it’s just that the base model is good at real-world coding tasks - as opposed to the types of coding tasks in the common benchmarks.

If you use GitHub Copilot - which has its own system-level prompts - you can hot-swap between models, and Claude outperforms OpenAI’s and Google’s models by such a large margin that the others are functionally useless in comparison.

replies(4): >>44998798 #>>44998867 #>>45001236 #>>45001252 #

1. ec109685 ◴[23 Aug 25 20:18 UTC] No.44998798[source]▶

>>44998715 #

Anthropic has opportunities to optimize their models and prompts during reinforcement learning, so the article's advice to stay close to what works in Claude Code is valid, and it probably applies more to Anthropic models than the same techniques would to others.

With a subscription plan, Anthropic is highly incentivized to be efficient in their loops beyond just making it a better experience for users.


What makes Claude Code so damn good