469 points samuelstros | 1 comments
OtherShrezzing ◴[] No.44998715[source]
I think it’s just that the base model is good at real-world coding tasks, as opposed to the types of coding tasks in the common benchmarks.

If you use GitHub Copilot, which has its own system-level prompts, you can hot-swap between models, and Claude outperforms OpenAI’s and Google’s models by such a large margin that the others are functionally useless in comparison.

replies(4): >>44998798 #>>44998867 #>>45001236 #>>45001252 #
1. ◴[] No.44998867[source]