Claude 3.7 Sonnet and Claude Code

(www.anthropic.com)

jumploops ◴[24 Feb 25 19:09 UTC] No.43163548[source]▶

> "[..] in developing our reasoning models, we’ve optimized somewhat less for math and computer science competition problems, and instead shifted focus towards real-world tasks that better reflect how businesses actually use LLMs.”

This is good news. OpenAI seems to be aiming towards "the smartest model," but in practice, LLMs are used primarily as learning aids, data transformers, and code writers.

Balancing "intelligence" with "get shit done" seems to be the sweet spot, and afaict one of the reasons the current crop of developer tools (Cursor, Windsurf, etc.) prefer Claude 3.5 Sonnet over 4o.

replies(4): >>43163694 #>>43164052 #>>43164203 #>>43164889 #

eschluntz ◴[24 Feb 25 19:59 UTC] No.43164203[source]▶

>>43163548 #

Thanks! We all dogfood Claude every day to do our own work here, and solving our own pain points is more exciting to us than abstract benchmarks.

Getting things done requires a lot of booksmarts, but also a lot of "street smarts" - knowing when to answer quickly, when to double back, etc.

replies(2): >>43164322 #>>43164660 #

LouisSayers ◴[24 Feb 25 20:08 UTC] No.43164322[source]▶

>>43164203 #

Could you tell us a bit about the coding tools you use and how you go about interacting with Claude?

replies(1): >>43164561 #

catherinewu ◴[24 Feb 25 20:28 UTC] No.43164561[source]▶

>>43164322 #

We find that Claude is really good at test-driven development, so we often ask Claude to write tests first and then ask Claude to iterate against the tests.
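
A minimal sketch of what "tests first, then iterate" can look like in practice. This is an illustration only, not a description of any team's actual setup: `slugify`, the test names, and `run_all` are all hypothetical. The tests would be written before the function body exists, and the implementation is then revised until every test passes.

```python
# Hypothetical example: the tests below are written first and fail until
# slugify() is implemented; the implementation is then iterated against them.

def slugify(title: str) -> str:
    """Hypothetical function under test: lower-case words joined by hyphens."""
    # Replace every non-alphanumeric character with a space, then split on whitespace.
    words = "".join(c if c.isalnum() else " " for c in title).split()
    return "-".join(w.lower() for w in words)


def test_lowercases_and_joins_words():
    assert slugify("Hello World") == "hello-world"


def test_drops_punctuation():
    assert slugify("Claude 3.7, Sonnet!") == "claude-3-7-sonnet"


def test_empty_title():
    assert slugify("") == ""


def run_all() -> int:
    """Run each test in turn and return how many ran; raises AssertionError on failure."""
    tests = [
        test_lowercases_and_joins_words,
        test_drops_punctuation,
        test_empty_title,
    ]
    for test in tests:
        test()
    return len(tests)
```

Calling `run_all()` after each change plays the role of the "iterate against the tests" step; a test runner such as pytest would also collect the `test_*` functions without the helper.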

replies(1): >>43164780 #

1. Kerrick ◴[24 Feb 25 20:48 UTC] No.43164780[source]▶

>>43164561 #

Write tests (plural) first, as in write more than one failing test before making it pass?

replies(1): >>43167085 #

2. zarmin ◴[25 Feb 25 01:40 UTC] No.43167085[source]▶

>>43164780 (TP) #

Time to look up TDD, my friend.

replies(2): >>43173040 #>>43175211 #

3. DrammBA ◴[25 Feb 25 15:22 UTC] No.43173040[source]▶

>>43167085 #

One of today's lucky 10,000. His mind is about to expand beyond imagination.

replies(1): >>43180000 #

4. Kerrick ◴[25 Feb 25 18:03 UTC] No.43175211[source]▶

>>43167085 #

Time to actually read Test-Driven Development By Example, my friend. Or if you can't stomach reading a whole book, read this: https://tidyfirst.substack.com/p/canon-tdd

TL;DR - If you're writing more than one failing test at a time, you are not doing Test-Driven Development.

replies(1): >>43178829 #

5. zarmin ◴[25 Feb 25 23:24 UTC] No.43178829{3}[source]▶

>>43175211 #

oh my god, your comment was just a setup for you to be pedantic? all discourse on the internet is worthless. i don't know why i keep engaging.

6. DrammBA ◴[26 Feb 25 02:37 UTC] No.43180000{3}[source]▶

>>43173040 #

I wish I could delete my original comment now that I found out that Kerrick wasn't a lucky 10,000, he's just an asshole...

replies(1): >>43180953 #

7. zarmin ◴[26 Feb 25 05:35 UTC] No.43180953{4}[source]▶

>>43180000 #

Well, you lucky-10,000'd people who didn't know about the 10,000 thing. That's not nothing.
