
469 points | samuelstros | 1 comment
OtherShrezzing ◴[] No.44998715[source]
I think it’s just that the base model is good at real-world coding tasks - as opposed to the types of coding tasks in the common benchmarks.

If you use GitHub Copilot - which has its own system-level prompts - you can hot-swap between models, and Claude outperforms OpenAI’s and Google’s models by such a large margin that the others are functionally useless in comparison.

replies(4): >>44998798 #>>44998867 #>>45001236 #>>45001252 #
badestrand ◴[] No.45001236[source]
I read all the praise about Claude Code, tried it for a month and was very disappointed. For me it doesn't work any better than Cursor's sidebar and has worse UX on top. I wonder if I am doing something wrong because it just makes lots of stupid mistakes when coding for me, in two different code bases.
replies(1): >>45004195 #
1. mnvrth ◴[] No.45004195[source]
I'd suggest giving it another shot. It really is a game changer. I can't tell what you're doing wrong, but for a few people I've seen, it came down to making a psychological switch. I wrote about it a bit here - https://mnvr.in/beginners-mind - sharing in case it helps you see how you might approach it differently.