Claude Sonnet will ship in Xcode

(developer.apple.com)

485 points zora_goron | 2 comments | 29 Aug 25 00:44 UTC


not_your_vase ◴[29 Aug 25 05:23 UTC] No.45060519[source]▶

3 days ago I saw another Claude-praising submission on HN, and I finally signed up for it to compare it with Copilot.

I asked it two things.

1. Create a boilerplate Zephyr project skeleton for the Pi Pico, with ST7789 SPI display drivers configured. It generated a garbage devicetree that didn't even compile. When I pointed it out, it apologized and generated another one that didn't compile either. It also configured nonexistent drivers, and for some reason it enabled monkey test support (but not test support).

2. I asked it to create 7x10 monochromatic pixelmaps, as C integer arrays, for the numeric characters 0-9. I also gave an example. It generated them, but the number eight looked like zero. (There was no cross in either 0 or 8, so it wasn't that. Both were just a ring.)
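
For reference, here is a minimal sketch of what the second request could look like. The poster's own example format is not shown, so the layout is an assumption: one byte per row, bit 6 as the leftmost pixel. The names `GLYPH_ZERO`, `GLYPH_EIGHT`, `glyph_pixel`, `glyph_diff_rows`, and `glyph_print` are hypothetical. With this design, 0 and 8 differ only in the middle bar.

```c
#include <stdint.h>
#include <stdio.h>

#define GLYPH_W 7   /* pixels per row */
#define GLYPH_H 10  /* rows per glyph */

/* Assumed layout: one byte per row, bit 6 = leftmost pixel, bit 0 = rightmost. */
const uint8_t GLYPH_ZERO[GLYPH_H] = {
    0x3E, /* .#####. */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x3E  /* .#####. */
};

const uint8_t GLYPH_EIGHT[GLYPH_H] = {
    0x3E, /* .#####. */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x3E, /* .#####.  middle bar: the only row that differs from 0 */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x41, /* #.....# */
    0x3E  /* .#####. */
};

/* Return 1 if the pixel at column x (0 = left), row y (0 = top) is set. */
int glyph_pixel(const uint8_t *glyph, int x, int y)
{
    return (glyph[y] >> (GLYPH_W - 1 - x)) & 1;
}

/* Count the rows in which two glyphs differ. */
int glyph_diff_rows(const uint8_t *a, const uint8_t *b)
{
    int n = 0;
    for (int y = 0; y < GLYPH_H; y++) {
        if (a[y] != b[y]) {
            n++;
        }
    }
    return n;
}

/* Print a glyph as ASCII art so it can be checked by eye. */
void glyph_print(const uint8_t *glyph)
{
    for (int y = 0; y < GLYPH_H; y++) {
        for (int x = 0; x < GLYPH_W; x++) {
            putchar(glyph_pixel(glyph, x, y) ? '#' : '.');
        }
        putchar('\n');
    }
}
```

Calling `glyph_print(GLYPH_EIGHT)` from a `main` function renders the glyph as text, which is a quick way to catch an 8 that came out as a plain ring.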

What am I doing wrong? Or is this really the state of the art?


1. postalcoder ◴[29 Aug 25 05:56 UTC] No.45060698[source]▶

>>45060519 #

You can no longer answer "what is the state of the art" by pointing to a model.

Generating a state-of-the-art response to your request involves a back-and-forth with the agent about your requirements, having an agent generate and carry out a deep research plan to collect documentation, then having the agent generate and execute a development plan.

So while Claude is not the best model in terms of raw IQ, it's considered the best coding model because it can execute all these steps in one go, which, in aggregate, generates a much better result (and is less likely to lose its mind).

replies(1): >>45061139 #

2. adastra22 ◴[29 Aug 25 07:14 UTC] No.45061139[source]▶

>>45060698 (TP) #

> So while Claude is not the best model in terms of raw IQ

Which one is, and by what metric? I always end up back at Claude after trying other models because it is so much better at real-world applications.
