Claude 3.7 Sonnet and Claude Code

1. meetpateltech ◴[24 Feb 25 19:25 UTC] No.43163764[source]▶

>>43163011 (OP) #

When you ask: 'How many r's are in strawberry?'

Claude 3.7 Sonnet generates a response in a fun and cool way, with React code and a preview in Artifacts.

Check out some examples:

[1] https://claude.ai/share/d565f5a8-136b-41a4-b365-bfb4f4400df5

[2] https://claude.ai/share/a817ac87-c98b-4ab0-8160-feefd7f798e8

replies(3): >>43163937 #>>43164129 #>>43164464 #

2. jasonjmcghee ◴[24 Feb 25 19:39 UTC] No.43163937[source]▶

>>43163764 (TP) #

I'm guessing this is an easter egg, but this was a huge gripe I had with Artifacts, and I eventually disabled the feature (it's now impossible to disable, afaict). I'd ask a question completely unrelated to code, or one clearly not wanting code as output, and I'd have to wait for it to write a program (which you can't stop, afaict: it stops the current artifact, then starts a new one).

(Still, Claude Sonnet is my go-to and favorite model.)

3. falcor84 ◴[24 Feb 25 19:53 UTC] No.43164129[source]▶

>>43163764 (TP) #

A shame the underlying issue still persists:

> There is exactly 1 'r' in "blueberry" [0]

[0] https://claude.ai/share/9202007a-9d85-49e6-9883-a8d8305cd29f

4. OsrsNeedsf2P ◴[24 Feb 25 20:20 UTC] No.43164464[source]▶

>>43163764 (TP) #

This test has always been so stupid since models work at the token level. Claude 3.5 already 5xs your frontend dev speed but people still say "hurr durr it can't count strawberry" as if that's a useful problem
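To illustrate the token-level point, here is a minimal sketch using a made-up vocabulary. The vocabulary and the `toy_tokenize` helper are hypothetical and not any real model's tokenizer; actual tokenizers learn their vocabularies from data and split words differently.

```python
# Hypothetical toy vocabulary, chosen only for illustration.
TOY_VOCAB = {"str", "aw", "berry", "blue"}

def toy_tokenize(word, vocab=TOY_VOCAB):
    """Greedy longest-match split; falls back to single characters."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking down to one character.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                tokens.append(word[i:j])
                i = j
                break
    return tokens

tokens = toy_tokenize("strawberry")
print(tokens)       # the model would see these chunks as opaque IDs
print(len(tokens))  # far fewer units than the 10 letters in the word
```

Under this toy split the word arrives as a few chunks rather than ten letters, so counting a letter means knowing the spelling inside each chunk.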

replies(3): >>43164613 #>>43165334 #>>43167381 #

5. dannyw ◴[24 Feb 25 20:32 UTC] No.43164613[source]▶

>>43164464 #

The problem also comes down to LLMs being confidently wrong when they're wrong.

6. bufferoverflow ◴[24 Feb 25 21:47 UTC] No.43165334[source]▶

>>43164464 #

This test isn't stupid. If it can't count the number of letters in a text, can you rely on it with more important calculations?

replies(2): >>43165641 #>>43166849 #

7. stnmtn ◴[24 Feb 25 22:23 UTC] No.43165641{3}[source]▶

>>43165334 #

You can rely on it for anything that you can validate quickly. And it turns out there are a lot of problems whose solutions are trivial to validate but difficult to build.
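The letter-counting case itself is a small example of cheap validation. A minimal sketch, where `count_letter` and `check_claim` are hypothetical helper names introduced here:

```python
def count_letter(text, letter):
    """Count case-insensitive occurrences of a single letter in text."""
    return text.lower().count(letter.lower())

def check_claim(text, letter, claimed):
    """Return True if a claimed count matches the actual count."""
    return count_letter(text, letter) == claimed

# The "exactly 1 'r' in blueberry" answer quoted upthread fails the check.
print(check_claim("blueberry", "r", 1))
print(count_letter("strawberry", "r"))
```

Checking the answer takes one line, whatever produced it.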

replies(1): >>43165963 #

8. 101008 ◴[24 Feb 25 23:00 UTC] No.43165963{4}[source]▶

>>43165641 #

Coding is not one of those cases, or edge cases wouldn't exist.

9. TeMPOraL ◴[25 Feb 25 01:03 UTC] No.43166849{3}[source]▶

>>43165334 #

Not on calculations that involve counting at a sub-token level. Otherwise, it depends.

10. elicksaur ◴[25 Feb 25 02:28 UTC] No.43167381[source]▶

>>43164464 #

“Already 5xs”

Even AI marketing doesn’t claim this. Totally baseless claim given how many people report negative experiences trying to use AI.

replies(1): >>43169619 #

11. ssijak ◴[25 Feb 25 08:51 UTC] No.43169619{3}[source]▶

>>43167381 #

Some people report some negative experiences for any tool ever brought into existence.