2127 points bakugo | 257 comments
1. bcherny ◴[] No.43163488[source]
Hi everyone! Boris from the Claude Code team here. @eschluntz, @catherinewu, @wolffiex, @bdr and I will be around for the next hour or so and we'll do our best to answer your questions about the product.
replies(82): >>43163527 #>>43163532 #>>43163549 #>>43163554 #>>43163555 #>>43163576 #>>43163585 #>>43163588 #>>43163589 #>>43163592 #>>43163593 #>>43163632 #>>43163642 #>>43163664 #>>43163677 #>>43163733 #>>43163758 #>>43163789 #>>43163803 #>>43163813 #>>43163821 #>>43163893 #>>43163909 #>>43163915 #>>43163921 #>>43163957 #>>43163958 #>>43163992 #>>43164069 #>>43164089 #>>43164102 #>>43164103 #>>43164104 #>>43164111 #>>43164127 #>>43164158 #>>43164329 #>>43164353 #>>43164424 #>>43164482 #>>43164514 #>>43164585 #>>43164616 #>>43164768 #>>43164797 #>>43164819 #>>43164899 #>>43165002 #>>43165057 #>>43165065 #>>43165088 #>>43165091 #>>43165187 #>>43165308 #>>43165355 #>>43165409 #>>43165468 #>>43165499 #>>43165516 #>>43165570 #>>43165578 #>>43165592 #>>43165836 #>>43165884 #>>43165965 #>>43165976 #>>43165995 #>>43166183 #>>43166711 #>>43166748 #>>43167130 #>>43167804 #>>43168626 #>>43168836 #>>43169047 #>>43169107 #>>43169119 #>>43169294 #>>43169310 #>>43173097 #>>43174353 #>>43192161 #
2. ◴[] No.43163527[source]
3. frankfrank13 ◴[] No.43163532[source]
Congrats on the launch! You said it's an important tool for you (Claude Code). How does this fit in with Copilot, Cursor, etc.? Do you/your teammates only rely on Claude Code? What do you reach for for different tasks?
replies(1): >>43163636 #
4. 420gunna ◴[] No.43163549[source]
Are you guys paying Claude for its assistance with your products?
5. pookieinc ◴[] No.43163554[source]
The biggest complaint I (and several others) have is that we continuously hit the limit via the UI after even just a few intensive queries. Of course, we can use the console API, but then we lose the ability to have things like Projects, etc.

Do you foresee these limitations increasing anytime soon?

Quick Edit: Just wanted to also say thank you for all your hard work, Claude has been phenomenal.

replies(4): >>43163771 #>>43163889 #>>43164021 #>>43167940 #
6. light_triad ◴[] No.43163555[source]
Thanks for this - exciting launch. Do you have examples of cool applications or demos that the HN crowd should check out?
replies(3): >>43163691 #>>43163748 #>>43164257 #
7. mike_hearn ◴[] No.43163576[source]
Great, thanks! Could you compare this new tool to Aider?
8. thegeomaster ◴[] No.43163585[source]
Thank you to the team. Looks like a great release. Already switching existing prompts to Claude 3.7 to see the eval results :)
9. oofbaroomf ◴[] No.43163588[source]
Do you think Claude Code is "better", in terms of capabilities and token efficiency, than other tools such as Cline, Cursor, or Aider?
replies(1): >>43163860 #
10. curl-up ◴[] No.43163589[source]
In the console, TPM limit for 3.7 is not shown (I'm tier 4). Does it mean there is no limit, or is it just pending and is "variable" until you set it to some value?
replies(1): >>43164072 #
11. neoromantique ◴[] No.43163592[source]
Thanks for the product! Glad to hear the (so-called) "safety" is being walked back. Previously Claude felt a little like it was treating me like a child; excited to try it out now.
12. jumploops ◴[] No.43163593[source]
From the release you say: "[..] in developing our reasoning models, we’ve optimized somewhat less for math and computer science competition problems, and instead shifted focus towards real-world tasks that better reflect how businesses actually use LLMs."

Can you tell us more about the trade-offs here?

Also, are you using synthetic data for improving the responses here, or are you purely leveraging data from usage/partners' usage?

13. davely ◴[] No.43163632[source]
I'm in the middle of a particularly nasty refactor of some legacy React component code (hasn't been touched in 6 years, old class-based pattern, tons of methods, why, oh, why did we do XYZ) at work. I have been using Aider for the last few days and have been hitting a wall. I've been digging through Aider's source code on GitHub to pull out prompts and try to write my own little helper script.

So, perfect timing on this release for me! I decided to install Claude Code and it is making short work of this. I love the interface. I love the personality ("Ruminating", "Schlepping", etc).

Just an all around fantastic job!

(This makes me especially bummed that I really messed up my OA a while back for you guys. I'll try again in a few months!)

Keep on doing great work. Thank you!

replies(1): >>43163662 #
14. bcherny ◴[] No.43163636[source]
Claude Code is super popular internally at Anthropic. Most engineers like to use it together with an IDE like Cursor, Windsurf, VS Code, Zed, Xcode, etc. Personally I usually start most coding tasks in Code, then move to an IDE for finishing touches.
15. fsndz ◴[] No.43163642[source]
Anthropic is back and cementing its place as the creator of the best coding models—bravo!

With Claude Code, the goal is clearly to take a slice of Cursor and its competitors' market share. I expected this to happen eventually.

The app layer has barely any moat, so any successful app with the potential to generate significant revenue will eventually be absorbed by foundation model companies in their quest for growth and profits.

replies(3): >>43163777 #>>43163840 #>>43165178 #
16. bcherny ◴[] No.43163662[source]
Hey thanks so much! <3
17. Attummm ◴[] No.43163664[source]
Hi Boris,

Would it be possible to bring back sonnet 2024 June?

That model was the most attentive.

Because we lost that model, this release is a loss of value for me personally.

replies(1): >>43163952 #
18. joshuabaker2 ◴[] No.43163677[source]
Hi Boris, love working with Claude! I do have a question—is there a plan to have Claude 3.5 Sonnet (or even 3.7!) made available on ca-central-1 for Amazon Bedrock anytime soon? My company is based in Canada and we deal with customer information that is required to stay within Canada, and the most recent model from Anthropic we have available to us is Claude 3.
replies(1): >>43164092 #
19. eschluntz ◴[] No.43163691[source]
hi! I've been working on demos where I let Claude Code run for hours at a time on a sandboxed project: https://x.com/ErikSchluntz/status/1894104265817284770

TLDR: asking claude to speed up my code once 1.8x'd perf, but putting it in a loop telling it to make it faster for 2 hours led to a 500x speedup!
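
A minimal sketch of what such a "make it faster" loop could look like, not the actual harness from the demo. `propose_variant`, `passes_tests`, and `measure_seconds` are hypothetical callables standing in for the model call, the project's test suite, and the benchmark. The only idea illustrated is that a candidate is kept when it still passes the tests and measures faster than the current best.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")  # whatever represents a version of the code (source text, a function, ...)


def optimize_in_loop(
    baseline: T,
    propose_variant: Callable[[T], T],      # hypothetical: asks the model for a faster version
    passes_tests: Callable[[T], bool],      # hypothetical: runs the project's test suite
    measure_seconds: Callable[[T], float],  # hypothetical: runs the benchmark
    budget_seconds: float,
    max_rounds: int = 1000,
) -> tuple[T, float]:
    """Keep asking for a faster variant until the time budget or round limit is hit.

    Returns the best version found and its speedup over the baseline.
    """
    baseline_time = measure_seconds(baseline)
    best, best_time = baseline, baseline_time
    deadline = time.monotonic() + budget_seconds

    for _ in range(max_rounds):
        if time.monotonic() >= deadline:
            break
        candidate = propose_variant(best)
        if not passes_tests(candidate):
            continue  # a fast but wrong candidate is discarded
        candidate_time = measure_seconds(candidate)
        if candidate_time < best_time:
            best, best_time = candidate, candidate_time

    return best, baseline_time / best_time
```

Gating every candidate on the test suite is the design choice that matters here: without it, the loop would happily reward variants that are fast because they are wrong.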

replies(3): >>43163852 #>>43163913 #>>43180228 #
20. matznerd ◴[] No.43163733[source]
Hi Boris et al, can you comment on increased conversation lengths or limits through the UI? I didn't see that mentioned in the blog post, but it is a continued major concern of $20/month Claude.ai users. Is this an issue that should be fixed now or still waiting on a larger deployment via Amazon or something? If not now, when can users expect the conversation length limitations will be increased?
21. catherinewu ◴[] No.43163748[source]
We built Claude Code with Claude Code!
replies(2): >>43163874 #>>43163895 #
22. LouisSayers ◴[] No.43163758[source]
Awesome work, Claude is amazingly good at writing code that is pretty much plug and play.

Could you speak at all about potential IDE integrations? An integration into Jetbrains IDEs would be super useful - I imagine being able to highlight a bit of code and having a plugin check the code graph to see dependencies, tests etc that might be affected by a change.

Copying and pasting code constantly is starting to seem a bit primitive.

replies(2): >>43163804 #>>43163911 #
23. eschluntz ◴[] No.43163771[source]
We are definitely aware of this (and working on it for the web UI), and that's why Claude Code goes directly through the API!
replies(3): >>43163984 #>>43164057 #>>43167719 #
24. keithwhor ◴[] No.43163777[source]
I think an argument could be reasonably made that the app layer is the only moat. It’s more likely Anthropic eventually has to acquire Cursor to cement a position here than they out-compete it. Where, why, what brand and what product customers swipe their credit cards for matters — a lot.
replies(2): >>43164087 #>>43168006 #
25. Falimonda ◴[] No.43163789[source]
CLAUDE NUMBA ONE!!!

Congrats on the new release!

26. Flux159 ◴[] No.43163803[source]
Is there a way to always accept certain commands across sessions? Specifically for things like reading or updating files I don't want to have to approve that each time I open a new repl.

Also, is there a way to switch models between 3.5-sonnet and 3.5-sonnet-thinking? Got the initial impression that the thinking model is using an excessive amount of tokens on first use.

replies(2): >>43164042 #>>43164053 #
27. eschluntz ◴[] No.43163804[source]
Part of our vision is that because Claude Code is just in the terminal, you can bring it into any IDE (or server) you want! Obviously that has tradeoffs of not having a full GUI of the IDE though
replies(2): >>43163856 #>>43165391 #
28. logicallee ◴[] No.43163813[source]
Can you give some insight into how you chose the reply limit length? It seems to cut off many useful programs that are 80%-90% done and if the limit were just a little higher it would be a source of extraordinary benefit.
replies(1): >>43163845 #
29. bakugo ◴[] No.43163821[source]
Can you let the API team know that the /v1/models endpoint has been broken for hours? Thanks.
replies(1): >>43163976 #
30. eschluntz ◴[] No.43163840[source]
hi! I've been using Claude Code in a very complementary way to my IDE, and one of the reasons we chose the terminal is because you can open it up inside whichever IDE you want!
31. bcherny ◴[] No.43163845[source]
If you can reproduce that, would you mind reporting it with /bug?
replies(1): >>43164708 #
32. LouisSayers ◴[] No.43163852{3}[source]
I assume you had a comprehensive test suite?
replies(1): >>43166590 #
33. elliot07 ◴[] No.43163856{3}[source]
I much prefer the standalone design to being editor integrated.
34. bcherny ◴[] No.43163860[source]
Claude Code is a research preview -- it's more rough, lets you see model errors directly, etc. so it's not as polished as something like Cline. Personally I use all of the above. Engineers here at Anthropic also tend to use Claude Code alongside IDEs like Cursor.
35. Karrot_Kream ◴[] No.43163874{3}[source]
This is super cool and I hope y'all highlight it prominently!
36. clangfan ◴[] No.43163889[source]
This is also my problem. I've only used the UI with the $20 subscription; can I use the same subscription for the CLI? I'm afraid it's like AWS API billing, where there is no limit to how much I can use and then I get a surprise bill.
replies(2): >>43164409 #>>43166498 #
37. kevinz3 ◴[] No.43163893[source]
Hey guys! I was wondering why you chose to build Claude Code as a CLI when many popular choices like Cursor and Windsurf fork VS Code. Do you envision the future of Claude Code to abstract away the codebase entirely?
replies(1): >>43164004 #
38. light_triad ◴[] No.43163895{3}[source]
Best demo - it's Claude Code all the way down. Claude Code === Claude Code
39. babyshake ◴[] No.43163909[source]
One thing I would love to have fixed - I type in a prompt, the model produces 90% or even 100% of the answer, and then shows an error that the system is at capacity and can't produce an answer. And then the response that has already been provided is removed! Please just make it where I can still have access to the response that has been provided, even if it is incomplete.
replies(5): >>43164526 #>>43168390 #>>43168413 #>>43169421 #>>43187266 #
40. ben30 ◴[] No.43163911[source]
JetBrains has an official MCP plugin
replies(1): >>43164521 #
41. light_triad ◴[] No.43163913{3}[source]
YES!! I need infinite credits for infinite Claude Code. Will try it to get Claude to do all my work.
42. pbor ◴[] No.43163915[source]
Hi and congrats on the launch!

Will check out Claude Code soon, but in the meantime one unrelated other feature request: Moving existing chats into a project. I have a number of old-ish but super-useful and valuable chats (that are superficially unrelated) that I would like to bring together in a project.

43. ◴[] No.43163921[source]
44. ac29 ◴[] No.43163952[source]
Seems to still be available via API as claude-3-5-sonnet-20240620
45. ipsum2 ◴[] No.43163957[source]
Why gatekeep Claude Code, instead of releasing the code for it? It seems like a direct increase in revenue/API sales for your company.
replies(1): >>43165177 #
46. Ninjinka ◴[] No.43163958[source]
How is your largest customer, Cursor, taking the news that you'll be competing directly with them?
replies(4): >>43164044 #>>43164273 #>>43165246 #>>43168327 #
47. latetomato ◴[] No.43163976[source]
Hello! Member of the API team here. We're unable to find issues with the /v1/models endpoint—can you share more details about your request? Feel free to email me at suzanne@anthropic.com. Thank you!
replies(1): >>43164023 #
48. smallerfish ◴[] No.43163984{3}[source]
I'm sure many of us would gladly pay more to get 3-5x the limit.

And I'm also sure that you're working on it, but some kind of auto-summarization of facts to reduce the context in order to avoid penalizing long threads would be sweet.

I don't know if your internal users are dogfooding the product that has user limits, so you may not have had this feedback - it makes me irritable/stressed to know that I'm running up close to the limit without having gotten to the bottom of a bug. I don't think stress response in your users is a desirable thing :).

replies(2): >>43165060 #>>43165318 #
49. themgt ◴[] No.43163992[source]
Is there / are you planning a way to set $ limits per API key? Far as I can tell the "Spend limits" are currently per-org only which seems problematic.
replies(2): >>43164404 #>>43165478 #
50. bcherny ◴[] No.43164004[source]
We wanted to bring the model to people where they are without having to commit to a specific tool or radically change their workflows. We also wanted to make a way that lets people experience the model’s coding abilities as directly as possible. This has tradeoffs: it uses a lot of tokens, and is rough (e.g. it shows you tool errors and model weirdness), but it also gives you a lot of power and feels pretty awesome to use.
replies(2): >>43165397 #>>43165399 #
51. punkpeye ◴[] No.43164021[source]
If you are open to alternatives, try https://glama.ai/gateway

We currently serve ~10bn tokens per day (across all models). OpenAI compatible API. No rate limits. Built in logging and tracing.

I work with LLMs every day, so I am always on top of adding models. 3.7 is also already available.

https://glama.ai/models/claude-3-7-sonnet-20250219

The gateway is integrated directly into our chat (https://glama.ai/chat). So you can use most of the things that you are used to having with Claude. And if anything is missing, just let me know and I will prioritize it. If you check our Discord, I have a decent track record of being receptive to feedback and quickly turning around features.

Long term, Glama's focus is predominantly on MCPs, but chat, gateway and LLM routing is integral to the greater vision.

I would love feedback if you are going to give it a try: frank@glama.ai

replies(5): >>43164075 #>>43164764 #>>43167057 #>>43173593 #>>43174149 #
52. bakugo ◴[] No.43164023{3}[source]
It always returns a Not Found error for me. Using the curl command copied directly from the docs:

$ curl https://api.anthropic.com/v1/models --header "x-api-key: $ANTHROPIC_API_KEY" --header "anthropic-version: 2023-06-01"

{"type":"error","error":{"type":"not_found_error","message":"Not found"}}

Edit: Tried creating a different API key and it works with that one. Weird.
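
For anyone who wants to reproduce this outside of curl, here is a rough Python equivalent of the request above using only the standard library. The URL and the two headers are taken from the curl command; the helper names `build_models_request` and `list_models` are made up for illustration, and nothing is assumed about the response body beyond it being JSON.

```python
import json
import urllib.error
import urllib.request

MODELS_URL = "https://api.anthropic.com/v1/models"  # same URL as the curl command


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the GET request with the same two headers the curl command sends."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"x-api-key": api_key, "anthropic-version": "2023-06-01"},
        method="GET",
    )


def list_models(api_key: str) -> dict:
    """Send the request; on an HTTP error, return the JSON error body instead of raising."""
    try:
        with urllib.request.urlopen(build_models_request(api_key), timeout=30) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        # A non-2xx reply, such as the not_found_error payload shown above,
        # still carries a JSON body that is worth inspecting.
        return json.loads(err.read().decode("utf-8"))
```

Calling `list_models(...)` once with each API key makes it easy to compare the two responses side by side, which is what the edit above boils down to.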

replies(1): >>43164202 #
53. eschluntz ◴[] No.43164042[source]
Right now no, but if you run in docker, you can use `--dangerously-skip-permissions`

Some commands could be totally fine in one context, but bad in a different one, e.g. pushing to master

54. behnamoh ◴[] No.43164044[source]
honestly, is this something that anthropic should be worried about? you could ask the same question from all the startups that were destroyed by OpenAI.
55. bcherny ◴[] No.43164053[source]
When you are prompted to accept a bash command, we should be giving you the option to not ask again. If you're not seeing that for a specific bash command, would you mind running /bug or filing an issue on GitHub? https://github.com/anthropics/claude-code/issues

Thinking and not thinking is actually the same model! The model thinks automatically when you ask it to. If you don't explicitly ask it to think, it won't use thinking.

replies(1): >>43166607 #
56. sealthedeal ◴[] No.43164057{3}[source]
I haven't been able to find ClaudeCLI for public access yet. Would love to use it.
replies(2): >>43164217 #>>43165657 #
57. nprateem ◴[] No.43164069[source]
Does this actually have an 8k (or more) output context via the API?

3.5 did with a beta header but while 3.6 claimed to, it always cut its responses after 4k.

IIRC someone reported it on GH but had no reply.

58. catherinewu ◴[] No.43164072[source]
We set the Claude Code rate limits to be usable as a daily driver. We expect hitting rate limits for synchronous usage to be uncommon. Since this is a research preview, we recommend you start small as you try the product though.
replies(1): >>43164162 #
59. airstrike ◴[] No.43164075{3}[source]
The issue isn't API limits, but web UI limits. We can always get around the web interface's limits by using the claude API directly but then you need to have some other interface...
replies(1): >>43164415 #
60. fsndz ◴[] No.43164087{3}[source]
if Claude Code offers a better experience, users will rapidly move from cursor to Claude Code.

Claude is for Code: https://medium.com/thoughts-on-machine-learning/claude-is-fo...

replies(1): >>43164160 #
61. antirez ◴[] No.43164089[source]
One of the silver bullets of Claude, in the context of coding, is that it does NOT use RAG when you use it via the web interface. Sure, you burn your tokens, but the model sees everything, and this lets it reply in a much better way. Is Claude Code doing the same and just doing document-level RAG, so that if a document is relevant and it fits, the whole document is put inside the context window? I really hope so! Also, this means that splitting large code bases into manageable file sizes will make more and more sense. Another Q: is the context size of Sonnet 3.7 the same as that of 3.5? Btw, thank you so much for Claude Sonnet; in recent months it changed the way I work, and I'm able to do a lot more now.
replies(1): >>43164253 #
62. pbronez ◴[] No.43164092[source]
Concur. Models aren’t real until I can run them inside my perimeter.
63. siva7 ◴[] No.43164102[source]
Will Claude be available on Azure?
64. rgomez ◴[] No.43164103[source]
What kind of sorcery did you use to create Claude? Honest question :)
replies(1): >>43164299 #
65. TIPSIO ◴[] No.43164104[source]
What are your thoughts on having a UI/design benchmark?
66. riku_iki ◴[] No.43164111[source]
Are there plans to add a web search function over some core websites (SO, API docs)? Competitors have it, and in my experience this provides very good grounding for coding tasks (far fewer hallucinated API functions).
67. artvandalai ◴[] No.43164127[source]
Any updates on web search?
68. adastra22 ◴[] No.43164158[source]
When are you providing an alternative to email magic login links?
69. keithwhor ◴[] No.43164160{4}[source]
(1) That's a big if. It requires building a team specialized in delivering what Cursor has already delivered which is no small task. There are probably only a handful of engineers on the planet that have or can be incentivized to develop the product intuition the Cursor founders have developed in the market already. And even then; I'm an aspiring engineer / PM at Anthropic. Why would I choose to spend all of my creative energy copying what somebody else is doing for the same pay I'd get working on something greenfield, or more interesting to me, or more likely to get me a promotion?

(2) It's not clear to me that users (or developers) actually behave this way in practice. Engineering is a bit of a cargo cult. Cursor got popular because it was good but it also got popular because it got popular.

replies(2): >>43164578 #>>43165517 #
70. curl-up ◴[] No.43164162{3}[source]
Sorry, I completely missed you're from the Code team. I was actually asking about the vanilla API. Any insights into those limits? It's still missing the TPM number in the console.
71. lebovic ◴[] No.43164202{4}[source]
If you can reproduce the issue with the other API key, I'd also love to debug this! Feel free to share the curl -vv output (excluding the key) with the Anthropic email address in my profile
72. eschluntz ◴[] No.43164217{4}[source]
>>> npm install -g @anthropic-ai/claude-code

>>> claude

73. bcherny ◴[] No.43164253[source]
Right -- Claude Code doesn't use RAG currently. In our testing we found that agentic search out-performed RAG for the kinds of things people use Code for.
replies(1): >>43164503 #
74. logicallee ◴[] No.43164257[source]
>Do you have examples of cool applications or demos that the HN crowd should check out?

Not OP obviously, but I've built so many applications with Claude, here are just a few:

[1]

Mockup of Utopian infrastructure support button (this is just a mockup, the buttons don't do anything): https://claude.site/artifacts/435290a1-20c4-4b9b-8731-67f5d8...

[2]

Robot body simulation: https://claude.site/artifacts/6ffd3a73-43d6-4bdb-9e08-02901d...

[3]

15-piece slider puzzle: https://claude.site/artifacts/4504269b-69e3-4b76-823f-d55b3e...

[4]

Canada joining the U.S., checklist: https://claude.site/artifacts/6e249e38-f891-4aad-bb47-2d0c81...

[5]

Secure encryption and decryption with AES-256-GCM with password-based key derivation:

https://claude.site/artifacts/cb0ac898-e5ad-42cf-a961-3c4bf8...

(Try to decrypt this message

kFIxcBVRi2bZVGcIiQ7nnS0qZ+Y+1tlZkEtAD88MuNsfCUZcr6ujaz/mtbEDsLOquP4MZiKcGeTpBbXnwvSLLbA/a2uq4QgM7oJfnNakMmGAAtJ1UX8qzA5qMh7b5gze32S5c8OpsJ8=

With the password "Hello Hacker News!!" (without quotation marks))

[6]

Supply-demand visualizer under tariffs and subsidies: https://claude.site/artifacts/455fe568-27e5-4239-afa4-051652...

[7]

fortune cookie program: https://claude.site/artifacts/d7cfa4ae-6946-47af-b538-e6f992...

[8]

Household security training for classified household members (includes self-assessment and certificate): https://claude.site/artifacts/7754dae3-a095-4f02-b4d3-26f1a5...

[9]

public service accountability training program: https://claude.site/artifacts/b89a69fb-1e46-4b5c-9e96-2c29dd...

[10]

Nuclear non-proliferation "big brother" agent technical demonstration: https://claude.site/artifacts/555d57ba-6b0e-41a1-ad26-7c90ca...

Dating stuff:

[11]

Dating help: Interest Level Assessment Game (is she interested?) https://claude.site/artifacts/523c935c-274e-4efa-8480-1e09e9...

[12]

Dating checklist: https://claude.site/artifacts/10bf8bea-36d5-407d-908a-c1e156...

75. sebzim4500 ◴[] No.43164273[source]
They probably aren't thrilled, but a lot of users will prefer a UI and I doubt Anthropic has the spare cycles to make a full Cursor competitor.
76. bcherny ◴[] No.43164299[source]
Reticulating...
77. sebzim4500 ◴[] No.43164329[source]
Did you guys ever fix the issue where if UK users wanted to use the API they have to provide a VAT number?
78. posix86 ◴[] No.43164353[source]
Claude is my go to llm for everything, sounds corny but it's literally expanding the circle of what I can reasonably learn, manyfold. Right now I'm attempting to read old philosophical texts (without any background in similar disciplines), and without claude's help to explain the dense language in simpler terms & discuss its ideas, give me historical contexts, explaining why it was written this or that way, compare it against newer ideas - I would've given up many times.

At work I use it many times daily in development. Its concise mode is a breath of fresh air compared to any other LLM I've tried. It has helped me find bugs in foreign code bases, explained the tech stack to me, and written bash scripts, saving me dozens of hours of work & many nerves. It generally lets me reach places I otherwise wouldn't due to time constraints & nerves.

The only nitpick is that the service reliability is a bit worse than others, forcing me sometimes to switch to others. This is probably a hard to answer question, but are there plans to improve that?

79. bcherny ◴[] No.43164404[source]
Good idea! Tracking here: https://github.com/anthropics/claude-code/issues/16
80. eschluntz ◴[] No.43164409{3}[source]
It is API billing like AWS - you pay for what you use. Every time you exit a session we print the cost, and in the middle of a session you can do /cost to see your cost so far that session!

You can track costs in a few ways and set spend limits to avoid surprises: https://docs.anthropic.com/en/docs/agents-and-tools/claude-c...

replies(2): >>43164841 #>>43165653 #
81. punkpeye ◴[] No.43164415{4}[source]
The API still has limits. Even if you are on the highest tier, you will quickly run into those limits when using coding assistants.

The value proposition of Glama is that it combines UI and API.

While everyone focuses on either one or the other, I've been splitting my time equally working on both.

Glama UI would not win against Anthropic if we were to compare them by the number of features. However, the components that I developed were created with craft and love.

You have access to:

* Switch models between OpenAI/Anthropic, etc.

* Side-by-side conversations

* Full-text search of all your conversations

* Integration of LaTeX, Mermaid, rich-text editing

* Vision (uploading images)

* Response personalizations

* MCP

* Every action has a shortcut via cmd+k (ctrl+k)

replies(3): >>43164969 #>>43165930 #>>43166283 #
82. throwaway0123_5 ◴[] No.43164424[source]
I'm curious why there are no results for the "Claude 3.7 Extended Thinking" on SWE-Bench and Agentic tool use.

Are you finding that extended thinking helps a lot when the whole problem can be posed in the prompt, but that it isn't a major benefit for agentic tasks?

It would be a bit surprising, but it would also mirror my experiences, and the benchmarks which show Claude 3.5 being better at agentic tasks and SWE tasks than all other models, despite not being a reasoning model.

83. danso ◴[] No.43164482[source]
Been a long time casual — i.e. happy to fix my code by asking questions and copy/pasting individual snippets via the chat interface. Decided to give the `claude` terminal tool a run and have to admit it looks like a fantastic tool.

Haven't tried to build a modern JS web app in years — it took the claude tool just a few minutes of prompting to convert and refactor an old clunky tool into a proper project structure, and using svelte and vite and tailwind (which I haven't built with before). Trying to learn how to even scaffold a modern app has felt daunting and this eliminates 99% of that friction.

One funny quirk: I asked it to build a test suite (I know zilch about JS testing frameworks, so it picked vitest for me) for the newly refactored app. I noticed that 3 of the 20 tests failed and so I asked it to run vitest for itself and fix the failing things. 2 minutes later, and now 7 tests were failing...

Which is very funny to me, but also not a big deal. Again, it's such a chore to research test libs and then set things up to their conventions. That the claude tool built a very usable scaffold that I can then edit and iterate on is such a huge benefit by itself, I don't need (nor desire) the AI to be a complete turnkey solution.

84. marlott ◴[] No.43164503{3}[source]
Interesting - can you elaborate a little on what you mean by agentic search here?
replies(2): >>43164993 #>>43166130 #
85. bhouston ◴[] No.43164514[source]
Have you seen https://mycoder.ai? Seems quite similar. It was my own invention and it seems that you guys are thinking along similar lines - incredibly similar lines.
replies(1): >>43165107 #
86. LouisSayers ◴[] No.43164521{3}[source]
Thanks, I wasn't aware of the Model Context Protocol!

For anyone interested - you can extend Claude's functionality by allowing it to run commands via a local "MCP server" (e.g. make code commits, create files, retrieve third party library code etc).

Then when you're running Claude it asks for permission to run a specific tool inside your usual Claude UI.

https://www.anthropic.com/news/model-context-protocol

https://github.com/modelcontextprotocol/servers

87. rishikeshs ◴[] No.43164526[source]
This. Claude team, please fix this!
replies(1): >>43166435 #
88. CharlesW ◴[] No.43164578{5}[source]
> It requires building a team specialized in delivering what Cursor has already delivered which is no small task.

There are several AIDEs out there, and based on working with Cursor, VS Code, and Windsurf there doesn't seem to be much of a difference (although I like Windsurf best). What moat does Cursor have?

replies(1): >>43164988 #
89. farco12 ◴[] No.43164585[source]
Thank you for the update!

I recently attempted to use the Google Drive integration but didn't follow through with connecting because Claude wanted access to my entire Google Drive. I understand this simplifies the user experience and reduces time to ship, but is there any way the team can add "reduce the access scope of Google Drive integration" to its backlog? Thank you!

Also, I just caught the new Github integration. Awesome.

90. lintaho ◴[] No.43164616[source]
For the pokemon benchmark, what happened after the Lt Surge gym? Did the model stall or run out of context or something similar?
91. logicallee ◴[] No.43164708{3}[source]
Just tried it with claude 3.7 sonnet, here is the share: https://claude.ai/share/68db540d-a7ba-4e1f-882e-f10adf64be91 and it doesn't finish outputting the program. (It's missing the rest of the application function and the main function).

Here are steps to reproduce.

Background/environment:

ChatGPT helped me build this complete web browser in Python:

https://taonexus.com/publicfiles/feb2025/71toy-browser-with-...

It looks like this, versus the eventual goal: https://imgur.com/a/j8ZHrt1

in 1055 lines. But eventually it couldn't improve on it anymore, ChatGPT couldn't modify it at my request so that inline elements would be on the same line.

If you want to run it just download it and rename it to .py, I like Anaconda as an environment, after reading the code you can install the required libraries with:

conda install -c conda-forge requests pillow urllib3

then run the browser from the Anaconda prompt by just writing "python " followed by the name of the file.

2.

I tried to continue to improve the program with Claude, so that in-line elements would be on the same line.

I performed these reproducible steps:

1. copied the code and pasted it into a Claude chat window with ctrl-v. This keeps it in the chat as paste.

2. Gave it the prompt "This complete web browser works but doesn't lay out inline elements inline, it puts them all on a new line, can you fix it so inline elements are inline?"

It spit out code until it hit section 8 out of 9 which is 70% of the way through and gave the error message "Claude hit the max length for a message and has paused its response. You can write Continue to keep the chat going". Screenshot:

https://imgur.com/a/oSeiA4M

So I wrote "Continue" and it stops when it is 90% of the way done.

Again it got stuck at 90% of the way done, second screenshot in the above album.

So I wrote "Continue" again.

It just gave an answer but it never finished the program. There's no app entry in the program, it completely omitted the rest of the main class itself and the callback to call it, which would be like:

        def run(self):
            self.root.mainloop()
    
    ###############################################################################
    # main
    ###############################################################################
    
    if __name__=="__main__":
        sys.setrecursionlimit(10**6)
        app=ToyBrowser()
        app.run()
so it only output a half-finished program. It explained that it was finished.

I tried telling it "you didn't finish the program, output the rest of it" but doing so just got it stuck rewriting it without finishing it. Again it said it ran into the limit, again I said Continue, and again it didn't finish it.

The program itself is only 1055 lines, it should be able to output that much.

replies(1): >>43166219 #
92. cmdtab ◴[] No.43164764{3}[source]
Do you have deepseek r1 support? I need it for a current product I’m working on.
replies(2): >>43165021 #>>43165209 #
93. swairshah ◴[] No.43164768[source]
Why not just open source Claude Code? people have tried to reverse eng the minified version https://gist.githubusercontent.com/1rgs/e4e13ac9aba301bcec28...
replies(2): >>43166476 #>>43166956 #
94. cowpig ◴[] No.43164797[source]
It would be great if we could upgrade API rate limits. I've tried "contacting sales" a few times and never received a response.

edit: note that my team mostly hits rate limits using things like aider and goose. 80k input tokens is not enough when in a flow, and I would love to experiment with a multi-agent workflow using Claude

95. levocardia ◴[] No.43164819[source]
Which starter pokemon does Claude typically choose?
replies(1): >>43165644 #
96. mindok ◴[] No.43164841{4}[source]
Which is theoretically great, but if anyone can get an Aussie credit card to work, please let me know.
replies(2): >>43165019 #>>43178752 #
97. gwd ◴[] No.43164899[source]
Just started playing with the command-line tool. First reaction (after using it for 5 minutes): I've been using `aider` as a daily driver, with Claude 3.5, for a while now. One of the things I appreciate about aider is that it tells you how much each query cost, and what your total cost is this session. This makes it low-key easy to keep tabs on the cost of what I'm doing. Any chance you could add that to claude-code?

I'd also love to have it in a language that can be compiled, like golang or rust, but I recognize a rewrite might be more effort than it's worth. (Although maybe less with claude code to help you?)

EDIT: OK, 10 minutes in, and it seems to have major issues doing basic patches to my Golang code; the most recent thing it did was add a line with incorrect indentation, then try three times to update it with the correct indentation, getting "String to replace not found in file" each time. Aider with Claude 3.5 does this really well -- not sure what the confounding issue is here, but might be worth taking a look at their prompt & patch format to see how they do it.
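
For what it's worth, a "String to replace not found in file" failure is typical of exact-match search/replace edits when the whitespace in the search string is off. A minimal sketch of a more forgiving patcher, assuming a hypothetical `apply_edit` helper (this is not how Claude Code or aider actually apply patches): try an exact match first, then fall back to comparing lines with surrounding whitespace stripped.

```python
def apply_edit(source: str, old: str, new: str) -> str:
    """Replace `old` with `new` in `source`.

    Hypothetical helper for illustration only; not Claude Code's or aider's logic.
    """
    if not old.strip():
        raise ValueError("empty search string")

    # 1. Exact match, accepted only when it is unambiguous.
    if source.count(old) == 1:
        return source.replace(old, new)

    # 2. Fallback: compare line by line, ignoring leading/trailing whitespace,
    #    so a tabs-vs-spaces mismatch in the search string still matches.
    src_lines = source.splitlines()
    old_lines = [line.strip() for line in old.strip("\n").splitlines()]
    n = len(old_lines)
    matches = [
        i
        for i in range(len(src_lines) - n + 1)
        if [line.strip() for line in src_lines[i:i + n]] == old_lines
    ]
    if len(matches) != 1:
        raise ValueError("String to replace not found in file (or ambiguous)")

    i = matches[0]
    new_lines = new.strip("\n").splitlines()
    result = "\n".join(src_lines[:i] + new_lines + src_lines[i + n:])
    # Preserve the trailing newline if the original had one.
    return result + ("\n" if source.endswith("\n") else "")


go_source = 'func main() {\n\tfmt.Println("hi")\n}\n'
# The search string uses spaces where the file uses a tab.
patched = apply_edit(go_source, '    fmt.Println("hi")', '\tfmt.Println("hello")')
print(patched)
```

The replacement text is inserted as given, so the caller (or the model) is still responsible for indenting `new` correctly.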

replies(2): >>43164923 #>>43164931 #
98. davidbarker ◴[] No.43164923[source]
If you do `/cost` it will tell you how much you've spent during that session so far.
99. eschluntz ◴[] No.43164931[source]
hi! You can do /cost at any time to see what the current session has cost
100. airstrike ◴[] No.43164969{5}[source]
Ok, but that's not the issue the parent was mentioning. I've never hit API limits but, like the original comment mentioned, I too constantly hit the web interface limits particularly when discussing relatively large modules.
replies(1): >>43165299 #
101. aquariusDue ◴[] No.43164988{6}[source]
Just chiming in to say that AIDEs (Artificial Intelligence Development Environments, I suppose) is such a good term for these new tools imo.

It's one thing to retrofit LLMs into existing tools but I'm more curious how this new space will develop as time goes on. Already stuff like the Warp terminal is pretty useful in day to day use.

Who knows, maybe this time next year we'll see more people programming by voice input instead of typing. Something akin to Talon Voice supercharged by a local LLM hopefully.

102. antirez ◴[] No.43164993{4}[source]
I guess it's what is sometimes called "self RAG", that is, the agent looks inside the files the way a human would to find what's relevant.
replies(1): >>43165401 #
103. ◴[] No.43165002[source]
104. robbiep ◴[] No.43165019{5}[source]
I haven’t had an issue with Aussie cards?

But I still hit limits, I use Claudemind with jetbrains stuff and there is a max of input tokens (I believe), I am ‘tier 2’ but it doesn’t look like I can go past this without an enterprise agreement

105. pclmulqdq ◴[] No.43165021{4}[source]
They are just selling a frontend wrapper on other people's services, so if someone else offers deepseek, I'm sure they will integrate it.
106. xianshou ◴[] No.43165057[source]
Any way to parallelize tool use? When I go into a repo and ask "what's in here", I'm aiming for a summary that returns in 20 seconds.
107. andrewchilds ◴[] No.43165065[source]
Hi Boris! Thank you for your work on Claude! My one pet peeve with Claude specifically, if I may: I might be working on a Svelte codebase and Claude will happily ignore that context and provide React code. I understand why, but I’d love to see much less of a deep reliance on React for front-end code generation.
108. PKop ◴[] No.43165088[source]
It would be great to have a C# / .NET SDK available for Claude so it can be integrated into Semantic Kernel [0][1]. Are there any plans for this?

[0] https://github.com/microsoft/semantic-kernel/issues/5690#iss...

[1] https://github.com/microsoft/semantic-kernel/pull/7364

109. timojaask ◴[] No.43165091[source]
Hi! I’ve been using Claude for macOS and iOS coding for a while, and it’s mostly great, but it’s always using deprecated APIs, even if I instruct it not to. It will correct the mistake if I ask it to, but then in later iterations, it will sometimes switch back to using a deprecated API. It also produces a lot of code that just doesn’t compile, so a lot of time is spent fixing the made up or deprecated APIs.
110. handfuloflight ◴[] No.43165107[source]
Have you seen https://www.codebuff.com?
replies(1): >>43167418 #
111. sangnoir ◴[] No.43165177[source]
I'm not affiliated with Anthropic, but it seems like doing this will commoditize Claude (the AIaaS). Hosted AI providers are doing all they can to move away from being interchangeable commodities; it's not good for Anthropic's revenue for users to be able to easily swap out the backend of Claude Code for a local Ollama backend, or a cheaper hosted DeepSeek. Open sourcing Claude Code would make this option 1 or 2 forks/PRs away.
replies(1): >>43167155 #
112. biker142541 ◴[] No.43165178[source]
I wonder if they will offer competitive request counts against Cursor. Right now, at least for me, the biggest downside to Claude is how fast I blow through the limits (Pro) and hit a wall.

At least with Cursor, I can use all "premium" 500 completions and either buy more, or be patient for throttled responses.

replies(1): >>43166379 #
113. kapnap ◴[] No.43165187[source]
Any chance there will be a way to copy and paste the responses into other text boxes (i.e., a new email) and not have to re-jig the formatting?

Lists, numbers, tabs, etc. are all a little time consuming... minor annoyance but thought I'd share.

114. punkpeye ◴[] No.43165209{4}[source]
Indeed we do https://glama.ai/models/deepseek-r1

It is provided by DeepSeek and Avian.

I am also midway through enabling a third provider (Nebius).

You can see all models/providers over at https://glama.ai/models

As another commenter in this thread said, we are just a 'frontend wrapper' around other people's services. Therefore, it is not particularly difficult to add models that are already supported by other providers.

The benefit of using our wrapper is that you can use a single API key and you get one bill for all your AI bills, you don't need to hack together your own logic for routing requests between different providers, failovers, keeping track of their costs, worry what happens if a provider goes down, etc.

The market at the moment is hugely fragmented, with many providers unstable, constantly shifting prices, etc. The benefit of a router is that you don't need to worry about those things.
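
A rough sketch of the routing and failover logic described above, assuming hypothetical `Provider` and `Router` classes, made-up prices, and a crude whitespace token count (this is an illustration, not Glama's actual implementation):

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Provider:
    name: str
    price_per_1k_tokens: float        # hypothetical price, for illustration
    complete: Callable[[str], str]    # takes a prompt, returns text, raises on failure


@dataclass
class Router:
    providers: List[Provider]
    spend: Dict[str, float] = field(default_factory=dict)  # running bill per provider

    def complete(self, prompt: str) -> str:
        errors = []
        # Try the cheapest provider first and fail over to the next on any error.
        for p in sorted(self.providers, key=lambda p: p.price_per_1k_tokens):
            try:
                text = p.complete(prompt)
            except Exception as exc:
                errors.append(f"{p.name}: {exc}")
                continue
            # Crude token estimate (whitespace-separated words), an assumption.
            tokens = len(prompt.split()) + len(text.split())
            cost = tokens / 1000 * p.price_per_1k_tokens
            self.spend[p.name] = self.spend.get(p.name, 0.0) + cost
            return text
        raise RuntimeError("all providers failed: " + "; ".join(errors))


def provider_down(prompt: str) -> str:
    raise ConnectionError("provider is down")


def provider_up(prompt: str) -> str:
    return "ok " + prompt


router = Router([
    Provider("cheap", 0.001, provider_down),
    Provider("pricey", 0.002, provider_up),
])
answer = router.complete("hello world")
print(answer, router.spend)
```

The caller sees one `complete` call and one `spend` ledger, regardless of which backend actually served the request.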

replies(1): >>43165480 #
115. alienthrowaway ◴[] No.43165246[source]
Unless Cursor had agreed to an exclusivity agreement with Anthropic, Anthropic was (and still is) at risk of Cursor moving to a different provider or using their middleman position to train/distill their own model that competes with Anthropic.
116. glenstein ◴[] No.43165299{6}[source]
Right, that's how I read it also. It's not that there's no limits with the API, but that they're appreciably different.
117. wellthisisgreat ◴[] No.43165308[source]
Hi, what are the privacy terms for Claude Code? Is it memorizing the codebase it’s helping with? From an enterprise standpoint
118. justinbaker84 ◴[] No.43165318{4}[source]
This is the main point I always want to communicate to the teams building foundation models.

A lot of people just want the ability to pay more in order to get more.

I would gladly pay 10x more to get relatively modest increases in performance. That is how important the intelligence is.

replies(1): >>43166135 #
119. joevandyk ◴[] No.43165355[source]
It would be amazing to be able to use an API key to submit prompts that use our Project Knowledge. That doesn't seem to be currently possible, right?
120. unshavedyak ◴[] No.43165391{3}[source]
Anyone know how to get access to it? Notably i'm debating purchasing for Claude Code, but being on NixOS i want to make sure i can install it first.

If this Code preview is only open to subscribers it means i have to subscribe before i can even see if the binary works for me. Hmm

edit: Oh, there's a link to "joining the preview" which points to: https://docs.anthropic.com/en/docs/agents-and-tools/claude-c...

121. ◴[] No.43165397{3}[source]
122. unshavedyak ◴[] No.43165399{3}[source]
I like this quite a bit, thank you! I prefer Helix editor and i hate the idea of running VSCode just to access some random Code assistant
123. kadushka ◴[] No.43165401{5}[source]
As opposed to vector search, or…?
replies(2): >>43165926 #>>43166422 #
124. ◴[] No.43165409[source]
125. robbomacrae ◴[] No.43165468[source]
Awesome to see a new Claude model - since 3.5 it's been my go-to for all code related tasks.

I'd really like to use Claude Code in some of my projects vs just sharing snippets via the UI but I'm curious how might doing this from our source directory affect our IP including NDA's, trade secret protections, prior disclosure rules on (future) patents, open source licensing restrictions re: redistribution etc?

Also hi Erik! - Rob

126. l1n ◴[] No.43165478[source]
You can with Workspaces - https://support.anthropic.com/en/articles/9796807-creating-a...
127. cmdtab ◴[] No.43165480{5}[source]
Yeah I am aware. I use open router at the moment but I find it lacks a good UX.
replies(1): >>43165659 #
128. dailykoder ◴[] No.43165499[source]
Folks, let me tell you, AI is a big league player, it's a real winner, believe me. Nobody knows more about AI than I do, and I can tell you, it's going to be huge, just huge. The advancements we're seeing in AI are tremendous, the best, the greatest, the most fantastic. People are saying it's going to change the world, and I'm telling you, they're right, it's going to be yuge. AI is a game-changer, a real champion, and we're going to make America great again with the help of this incredible technology, mark my words.
129. fragmede ◴[] No.43165516[source]
Now that the world's gotten used to the existence of AI, any hope on removing the guardrails on Claude? I don't need it to answer "How do I make meth", but I would like to not have to social engineer my prompts. I'd like it to just write the code I asked for and not judge me on how ethical the code might be.

Eg Claude will refuse to write code to wget a website and parse the html if you ask it to scrape your ex girlfriend's Instagram profile, for ethical and tos reasons, but if you phrase the request differently, it'll happily go off and generate code that does that exact thing.

Asking it to scrape my ex girlfriend's Instagram profile is just a stand in for other times I've hit a problem where I've had to social engineer my way past those guard rails, but does having those guard rails really provide value on a professional level?

replies(1): >>43167006 #
130. Etheryte ◴[] No.43165517{5}[source]
In my opinion you're vastly overestimating how much of a moat Cursor has. In broad strokes, it builds an index of your repo for easier referencing and then adds some handy UI hooks so you can talk to the model, there really isn't that much more going on. Yes, the autocomplete is nice at times, but it's at best like pair programming with a new hire. Every big player in the AI space could replicate what they've done, it's only a matter of whether they consider it worth the investment or not given how fast the whole field is moving.
replies(2): >>43165829 #>>43166057 #
131. luke-stanley ◴[] No.43165570[source]
My key got killed months ago when I tested it on a PDF, and support never got back to me so I am waiting for OpenRouter support!
132. throw83288 ◴[] No.43165578[source]
Serious question: What advice would you give to a Computer Science student in light of these tools?
replies(1): >>43165745 #
133. _cs2017_ ◴[] No.43165592[source]
Your footnote 3 seems to imply that the low number for o1 and Grok3 is without parallelism, but I don't think it's publicly known whether they use internal parallelism? So perhaps the low number already uses parallelism, while the high number uses even more parallelism?

Also, curious if you have any intuition as to why the no-parallelism number for AIME with Claude (61.3%) is quite low (e.g., relative to R1 87.3% -- assuming it is an apples to apples comparison)?

134. lcnPylGDnU4H9OF ◴[] No.43165644[source]
I'd also be interested in stats on Helix Fossil vs. Dome Fossil.
135. danw1979 ◴[] No.43165653{4}[source]
What I really want (as a current Pro subscriber) is a subscription tier ("Ultimate" at ~$120/month ?) that gives me priority access to the usual chat interface, but _also_ a bunch of API credits that would ensure Claude and I can code together for most of the average working month (reasonable estimate would be 4 hours a day, 15 days a month).

i.e I'd like my chat and API usage to be all included under a flat-rate subscription.

Currently Pro doesn't give me any API credits to use with coding assistants (Claude Code included ?) which is completely disjointed. And I need to be a business to use the API still ?

Honestly, Claude is so good, just please take my money and make it easy to do the above !

replies(3): >>43165991 #>>43166003 #>>43166054 #
136. kkarpkkarp ◴[] No.43165657{4}[source]
see https://docs.anthropic.com/en/docs/agents-and-tools/claude-c...
137. punkpeye ◴[] No.43165659{6}[source]
Open router is great.

They have a very solid infrastructure.

Scaling infrastructure to handle billions of tokens is no joke.

I believe they are approaching 1 trillion tokens per week.

Glama is way smaller. We only recently crossed 10bn tokens per day.

However, I have invested a lot more into UX/UI of that chat itself, i.e. while OpenRouter is entirely focused on API gateway (which is working for them), I am going for a hybrid approach.

The market is big enough for both projects to co-exist.

138. danw1979 ◴[] No.43165745[source]
Serious answer: learn to code.

You still need to know what good code looks like to use these tools. If you go forward in your career trusting the output of LLMs without the skills to evaluate the correctness, style, functionality of that code then you will have problems.

People still write low level machine code today, despite compilers having existed for 70+ (?) years.

We'll always need full-stack humans who understand everything down to the electrons even in the age of insane automation that we're entering.

replies(3): >>43166122 #>>43166157 #>>43168981 #
139. keithwhor ◴[] No.43165829{6}[source]
Conversely, I think you're overestimating the impact of the value (or lack thereof) of technology over distribution and market timing.
140. failerk ◴[] No.43165836[source]
I tried signing up to use Claude about 6 months ago and ran into an error on the signup page. For some reason this completely locked me out from signing up since a phone number was tied to the login. I have submitted requests to get removed from this blacklist and heard nothing. The times I have tried to reach out on Twitter were never responded to. Has the customer support improved in the last 6 months?
replies(1): >>43166075 #
141. galaxyLogic ◴[] No.43165884[source]
The thing I would like automated is highlighting a function in my code then ask the AI to move it to a new module-file and import that new module.

I would like this to happen easily like hitting a menu or button without having to write an elaborate "prompt" every time.

Is this possible?

replies(1): >>43166081 #
142. FeepingCreature ◴[] No.43165926{6}[source]
To my knowledge these are the options:

1. RAG: A simple model looks at the question, pulls up some associated data into the context and hopes that it helps.

2. Self-RAG: The model "intentionally"/agentically triggers a lookup for some topic. This can be via a traditional RAG or just string search, ie. grep.

3. Full Context: Just jam everything in the context window. The model uses its attention mechanism to pick out the parts it needs. Best but most expensive of the three, especially with repeated queries.

Aider uses kind of a hybrid of 2 and 3: you specify files that go in the context, but Aider also uses Tree-Sitter to get a map of the entire codebase, ie. function headers, class definitions etc., that is provided in full. On that basis, the model can then request additional files to be added to the context.
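
To illustrate the repo-map idea, here is a minimal sketch that uses Python's standard `ast` module as a stand-in for Tree-sitter, so it only handles Python sources. `repo_map` and `_outline` are hypothetical names, and this is not Aider's actual code:

```python
import ast
from typing import Dict, List


def _outline(body: list, indent: str) -> List[str]:
    """Collect class and function headers from a list of AST statements."""
    out = []
    for node in body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            out.append(f"{indent}def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            out.append(f"{indent}class {node.name}")
            # Recurse so methods appear nested under their class.
            out.extend(_outline(node.body, indent + "  "))
    return out


def repo_map(files: Dict[str, str]) -> str:
    """Build a compact outline of every file: headers only, no bodies.

    `files` maps a path to its Python source text.
    """
    lines = []
    for path in sorted(files):
        lines.append(f"{path}:")
        lines.extend(_outline(ast.parse(files[path]).body, "  "))
    return "\n".join(lines)


example_files = {
    "browser.py": (
        "class ToyBrowser:\n"
        "    def run(self):\n"
        "        pass\n"
        "\n"
        "def main():\n"
        "    pass\n"
    ),
}
print(repo_map(example_files))
```

A map like this costs far fewer tokens than the full source, and gives the model enough to decide which files it wants added to the context.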

replies(1): >>43166736 #
143. Aeolun ◴[] No.43165930{5}[source]
> Even if you are on the highest tier, you will quickly run into those limits when using coding assistants.

Even heavy coding sessions never run into Claude limits, and I’m nowhere near the highest tier.

replies(1): >>43167484 #
144. ◴[] No.43165965[source]
145. cpeterso ◴[] No.43165976[source]
A minor ChatGPT feature I miss with Claude is temporary chats. I use ChatGPT for a lot of random one-off questions and don’t want them filling up my chat history with so many conversations.
146. Aeolun ◴[] No.43165991{5}[source]
I don’t think you need to be a business to use the API? At least I’m fairly certain I’m using it in a personal capacity. You are never going to hit $120/month even with full-time usage (no guarantees of course, but I get to like $40/month).
replies(1): >>43166709 #
147. sha16 ◴[] No.43165995[source]
When I first started using Cursor the default behavior was for Claude to make a suggestion in the chat, and if the user agreed with it, they could click apply or cut and paste the part of it they wanted to use in their larger project. Now it seems the default behavior is for Claude to start writing files to the current working directory without regard for app structure or context (e.g., config files that are defined elsewhere claude likes to create another copy of). Why change the default to this? I could be wrong but I would guess most devs would want to review changes to their repo first.
replies(2): >>43166593 #>>43168098 #
148. istjohn ◴[] No.43166003{5}[source]
You don't need to be a business to use the API.
149. dghlsakjg ◴[] No.43166054{5}[source]
You can do this yourself. Anyone can buy API credits. I literally just did this with my personal credit card using my gmail based account earlier today.

1. Subscribe to Claude Pro for $20 month

2. Separately, Buy $100 worth of API credits.

Now you have a Claude "ultimate" subscription where the credits roll over as an added bonus.

As someone who only uses the APIs, and not the subscription services for AI, I can tell you that $100 is A LOT of usage. Quite frankly, I've never used anywhere close to $20 in a month which is why I don't subscribe. I mostly just use text though, so if you do a lot of image generation that can add up quickly

replies(2): >>43166309 #>>43166332 #
150. Aeolun ◴[] No.43166057{6}[source]
If Zed gets its agentic editing mode in I’m moving away from Cursor again. I’m only with them because they currently have the best experience there. Their moat is zero, and I’d much rather use purely API models than a Cursor subscription.
151. Aeolun ◴[] No.43166075[source]
You can try using it through Github Copilot? Just as a different avenue for usage.
replies(1): >>43168042 #
152. Aeolun ◴[] No.43166081[source]
I think most language servers have a feature like this right?
replies(1): >>43166326 #
153. jackjeff ◴[] No.43166122{3}[source]
Could not agree more! I have 20+ years experience and use Cursor/Sonnet daily. It saves huge amounts of time.

But I can’t imagine this tool in the hands of someone who does not have a solid understanding of programming.

You need to understand when to push back and why. It’s like doing mini code reviews all the time. LLMs are very convincing and will happily generate garbage with the utmost authority.

Don’t trust and absolutely verify.

154. simonw ◴[] No.43166130{4}[source]
Since the Claude Code docs suggest installing Ripgrep, my guess is that they mean that Claude Code often runs searches to find snippets to improve in the context.

I would argue that this is still RAG. There's a common misconception (or at least I think it's a misconception) that RAG only counts if you used vector search - I like to expand the definition of RAG to include non-vector search (like Ripgrep in this case), or any other technique where you use Retrieval techniques to Augment the Generation phase.

IR (Information Retrieval) has been around for many decades before vector search became fashionable: https://en.wikipedia.org/wiki/Information_retrieval
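
A minimal sketch of retrieval without vectors, assuming plain regex search over in-memory files in place of shelling out to Ripgrep. `retrieve` and `build_prompt` are hypothetical helpers, not anything taken from Claude Code:

```python
import re
from typing import Dict, List


def retrieve(files: Dict[str, str], pattern: str,
             context: int = 1, limit: int = 5) -> List[str]:
    """Return up to `limit` snippets: each matching line plus `context` lines around it."""
    rx = re.compile(pattern)
    snippets = []
    for path in sorted(files):
        lines = files[path].splitlines()
        for i, line in enumerate(lines):
            if rx.search(line):
                lo = max(0, i - context)
                hi = min(len(lines), i + context + 1)
                # Header is path:line_number (1-based), like grep -n output.
                snippets.append(f"{path}:{i + 1}\n" + "\n".join(lines[lo:hi]))
                if len(snippets) == limit:
                    return snippets
    return snippets


def build_prompt(question: str, snippets: List[str]) -> str:
    """Augment the question with the retrieved snippets before generation."""
    return "Relevant code:\n\n" + "\n\n".join(snippets) + f"\n\nQuestion: {question}"


example_files = {
    "a.py": "import requests\n\ndef fetch(url):\n    return requests.get(url)\n",
    "b.py": "x = 1\n",
}
found = retrieve(example_files, r"requests\.get")
print(build_prompt("How does this code download things?", found))
```

Retrieval here is exact text matching, so it finds `requests.get` but would miss a vaguer query like "code that downloads stuff" unless the search pattern is chosen well.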

replies(2): >>43168083 #>>43168903 #
155. willsmith72 ◴[] No.43166135{5}[source]
As a growth company, they likely would prefer a larger amount of users even with occasional rate limits, vs smaller pool of power users.

As long as capacity is an issue, you can't have both

replies(1): >>43166920 #
156. simonw ◴[] No.43166157{3}[source]
+1 to this. There has never been a better time to learn to code - the learning curve is being shaved down by these new LLM-based tools, and the amount of value people with programming literacy can produce is going up by an order of magnitude.

People who know both coding and LLMs will be a whole lot more attractive to hire to build software than people who just know LLMs for many years to come.

replies(1): >>43178228 #
157. jiggawatts ◴[] No.43166183[source]
I really want to try your AI models, but "You must have a valid phone number to use Anthropic's services." is a show-stopper for me.

It's the only mainstream AI service that requests this information. After a string of security lapses by many of your competitors, I have zero faith in the ability of a "fast moving" AI-focused company to keep my PII data secure.

replies(2): >>43166848 #>>43167516 #
158. istjohn ◴[] No.43166219{4}[source]
You don't want all that code in one file anyway. Have Claude write the code as several modules. You'll put each module in its own file and then you can import functions and classes from one module to another. Claude can walk you through it.
159. m_kos ◴[] No.43166283{5}[source]
Your chat idea is a little similar to Abacus AI. I wish you had a similarly affordable monthly plan for chat only, but your UI seems much better. I may give it a try!
160. numba888 ◴[] No.43166309{6}[source]
I don't think you can generate images with claude. just asked it for pink elephant: "I can't generate images directly, but I can create an SVG representation of a pink elephant for you." And it did it :)
161. hassleblad23 ◴[] No.43166326{3}[source]
Moving a function or class? Yes. But moving arbitrary lines of code into their own function in a new module is still a PITA, particularly when the lines of code are not consecutive.
replies(1): >>43177802 #
162. dr_kiszonka ◴[] No.43166332{6}[source]
That is a good idea. For something like Claude Code, $100 is not a lot, though.
163. biker142541 ◴[] No.43166379{3}[source]
Reread the blog post, and I suspect Cursor will remain much more competitive on pricing! No specifics, but likely far exceeding typical Cursor costs for a typical developer. Maybe it's worth it, though? Look forward to trying.

>Claude Code consumes tokens for each interaction. Typical usage costs range from $5-10 per developer per day, but can exceed $100 per hour during intensive use.
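
Back-of-the-envelope, assuming 20 working days per month (an assumption, not a figure from the blog post), the quoted daily range can be projected onto a month like this:

```python
def monthly_cost_range(daily_low, daily_high, working_days=20):
    """Project a per-day cost range onto a month; working_days is an assumed figure."""
    return daily_low * working_days, daily_high * working_days


low, high = monthly_cost_range(5, 10)
print(f"${low}-{high} per developer per month")
```

This ignores the "intensive use" case from the quote, where a single hour could cost as much as the low end of that monthly range.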

replies(1): >>43168109 #
164. numba888 ◴[] No.43166422{6}[source]
Does it make sense to use vector search for code? It's more for vague texts. In the code relevant parts can be found by exact name match. (in most cases. both methods aren't exclusive)
replies(1): >>43166718 #
165. cat-snatcher ◴[] No.43166435{3}[source]
The UX team would never allow it. You gotta stay minimal and definitely can't have any acknowledgement that a non-ideal user experience exists.
166. seunosewa ◴[] No.43166476[source]
Claude Code is on github: https://github.com/anthropics/claude-code
replies(2): >>43166710 #>>43166717 #
167. edmundsauto ◴[] No.43166498{3}[source]
I use AnythingLLM so you can still have a "Projects" like RAG.
168. scubbo ◴[] No.43166590{4}[source]
Lol, good one.
169. frohrer ◴[] No.43166593[source]
Cursor has two LLM interaction modes, chat and composer. The chat does what you described first and composer can create/edit/delete files directly. Have you checked which mode you're on? It should be a tab above your chat window.
replies(1): >>43170245 #
170. trees101 ◴[] No.43166607{3}[source]
with Claude coder, how does history work? I used it with my account, ran out of credit then switched to a work account but there was no chat history or other saved context of the work that had been done. I logged back in with my account to try copy it but it was gone.
171. Terretta ◴[] No.43166709{6}[source]
Careful -- a solo dev using it professionally, meaning, coding with it as a pair coder (XP style), can easily spend $1500/week.
replies(1): >>43185623 #
172. simonw ◴[] No.43166710{3}[source]
That repo is just there for issue reporting right now - https://github.com/anthropics/claude-code/issues - it doesn't contain the tool's source code.
173. theptip ◴[] No.43166711[source]
> We’ve also improved the coding experience on Claude.ai. Our GitHub integration is now available on all Claude plans—enabling developers to connect their code repositories directly to Claude

Would love to learn a bit more about how the GitHub integration works. From https://support.anthropic.com/en/articles/10167454-using-the... it seems it’s read only.

Does Claude Code let me take a generated/edited artifact and commit it back as a PR?

replies(1): >>43166750 #
174. rafram ◴[] No.43166717{3}[source]
There’s no source code in that repo.
replies(1): >>43169376 #
175. simonw ◴[] No.43166718{7}[source]
Vector search for code can be quite interesting - I've used it for things like "find me code that downloads stuff" and it's worked well. I think text search is usually better for code though.
176. kadushka ◴[] No.43166736{7}[source]
I'm still not sure I get the difference between 1 and 2. What is "pulls up some associated data into the context" vs ""intentionally"/agentically triggers a lookup for some topic"?
replies(2): >>43168010 #>>43169602 #
177. darkotic ◴[] No.43166748[source]
Love the UI so far. The experience feels very inspired by Aider, which is my current choice. Thanks!
178. simonw ◴[] No.43166750[source]
The https://claude.io/ integration is read-only. Basically you OAuth with GitHub and now you can select a repository, then select files or directories within it to add to either a Claude Project or to an individual prompt.

Claude Code can run commands including "git" commands, so it can create a branch, commit code to that branch and push that branch to GitHub - at which point you can create a PR.

179. AdrianEGraphene ◴[] No.43166848[source]
It's a phone number. It's probably been bought / sold a few times already. Unless you're on the level of Edward Snowden, I wouldn't worry about it. But maybe your sense of privacy is more valuable than the outcome you'd get from Claude. That's fine too.
replies(1): >>43167338 #
180. cruffle_duffle ◴[] No.43166920{6}[source]
If people are paying for use, then why can’t you have both?
replies(1): >>43167140 #
181. bhl ◴[] No.43166956[source]
Paste it into Claude and ask it to make the minified code more readable ;)

Agree the code should just be open source but there's nothing secretive that you can't extract manually.

replies(1): >>43175755 #
182. vohk ◴[] No.43167006[source]
Not having headlines like "Claude Gives Stalker Instructions" has a significant value to their business I would wager.

I'm very much in favour of removing the guardrails but I understand why they're in place. The problem is attribution. You can teach yourself how to engage in all manner of dark deeds with a library or wikipedia or a search engine and some time, but any resulting public outcry is usually diffuse or targeted at the sources rather than the service. When Claude or GPT or Stable Diffusion are used to generate something judged offensive, the outcry becomes an existential threat to the provider.

183. thrdbndndn ◴[] No.43167057{3}[source]
Just tried it, is there a reason why the webUI is so slow?

Try to delete (close) the panel on the right on a side-by-side view. It took a good second to actually close. Creating one isn't much faster.

This is unbearably slow, to be blunt.

184. trees101 ◴[] No.43167130[source]
with Claude coder, how does history work? I used it with my account, ran out of credit then switched to a work account but there was no chat history or other saved context of the work that had been done. I logged back in with my account to try copy it but it was gone.
185. saulpw ◴[] No.43167140{7}[source]
It takes time to grow capacity to meet growing revenue/usage. As parent is saying, if you are in a growth market at time T with capacity X, you would rather have more people using it even if that means they can each use less.
replies(1): >>43168074 #
186. ipsum2 ◴[] No.43167155{3}[source]
It's not hard to make, it's a relatively simple CLI tool so there's no moat. Also, the minified source code is available.
replies(1): >>43168289 #
187. jiggawatts ◴[] No.43167338{3}[source]
It's my phone number... linked to my Google identity... linked to every submitted user prompt... linked to my source code.

There's also been a spate of AI companies rushing to release products and having "oops" moments where they leaked customer chats or whatever.

They're not run like a FAANG, they don't have the same security pedigree, and they generally don't have any real guarantee of privacy.

So yes, my privacy is more valuable.

Conversely: Why is my non-privacy so valuable to Anthropic? Do they plan on selling my data? Maybe not now... but when funding gets a bit tight? Do they plan on selling my information to the likes of Cambridge Analytica? Not just superficial metadata, but also an AI-summarised history of my questions?

The best thing to do would be not to ask. But they are asking.

Why?

Why only them?

replies(2): >>43168081 #>>43169592 #
188. bhouston ◴[] No.43167418{3}[source]
Nice!

It seems very very similar. I open sourced the code to MyCoder here: https://github.com/drivecore/mycoder I'll compare them. Off hand I think both CodeBuff and Claude Coder are missing the web debugging tools I added to MyCoder.

replies(1): >>43188305 #
189. smokeydoe ◴[] No.43167484{6}[source]
I think it’s based on the tools you’re using. If I’m using Cline I don't have to try very hard to hit limits. I’m on the second tier.
190. czk ◴[] No.43167516[source]
I pay for a number from voip.ms and use SMS forwarding. It's very cheap, and it works on Telegram as well, which seemed fairly strict at detecting most VoIP numbers.
191. raylad ◴[] No.43167719{3}[source]
The problem with the API is that it, as it says in the documentation, could cost $100/hr.

I would pay $50/mo or something to be able to have reasonable use of Claude Code in a limited way (but not as limited as through the web UI), but all of these coding tools seem to work only with the API and are therefore either too expensive or too limited.

replies(1): >>43167844 #
192. koolala ◴[] No.43167804[source]
Does the fact it's so ungodly expensive and highly rate limited kind of prove the modern point that AI actually uses tons of water and electricity per prompt? People are used to streaming YouTube while they sleep and it's hard to think of other web technology this intensive. OpenAI is hostile to this subject. Does Claude have plans to tackle this?
replies(1): >>43167832 #
193. golergka ◴[] No.43167832[source]
> People are used to streaming YouTube while they sleep

YouTube is used to showing them ads while they sleep.

194. rudedogg ◴[] No.43167844{4}[source]
> The problem with the API is that it, as it says in the documentation, could cost $100/hr.

I've used https://github.com/cline/cline to get a similar workflow to their Claude Code demo, and yes it's amazing how quickly the token counts add up. Claude seems to have capacity issues so I'm guessing they decided to charge a premium for what they can serve up.

+1 on the too expensive or too limited sentiment. I subscribed to Claude for quite a while but got frustrated because the few times I used it heavily, I'd get stuck on the rate limits.

I could stomach a $20-$50 subscription for something like 3.7 that I could use a lot when coding, and not worry about hitting limits (or I suspect being pushed on to a quantized/smaller model when used too much).

replies(1): >>43172588 #
195. mianos ◴[] No.43167940[source]
I paid for it for a while, but I kept running out of usage limits right in the middle of work every day. I'd end up pasting the context into ChatGPT to continue. It was so frustrating, especially because I really liked it and used it a lot.

It became such an anti-pattern that I stopped paying. Now, when people ask me which one to use, I always say I like Claude more than others, but I don’t recommend using it in a professional setting.

replies(2): >>43170211 #>>43170510 #
196. neal_ ◴[] No.43168006{3}[source]
Cursor has no models; they don't even have an editor, it's just VS Code.
replies(2): >>43168866 #>>43168921 #
197. throwaway314155 ◴[] No.43168010{8}[source]
1. Tends to use embeddings with a similarity search. Sometimes called "retrieval". This is faster but similarity search doesn't always work quite as well as you might want it to.

2. Instead lets the agent decide what to bring into context by using tools on the codebase. Since the tools used are fast enough, this gives you effectively "verified answers" so long as the agent didn't screw up its inputs to the tool (which will happen, most likely).
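
A minimal sketch of the two approaches above, not how any of these tools actually implements them. The file names and contents, `embed`, `similarity_search`, and `grep_tool` are all hypothetical, and the bag-of-words vector is only a stand-in for a real embedding model.

```python
import math
import re
from collections import Counter

# Toy corpus standing in for a codebase (hypothetical file names and contents).
FILES = {
    "auth.py": "def login(user, password): return check_password(user, password)",
    "billing.py": "def charge(card, amount): return gateway.charge(card, amount)",
    "utils.py": "def check_password(user, password): return hash(password) == user.pw_hash",
}


def embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query, k=1):
    """Approach 1: rank files by similarity to the query, done up front."""
    q = embed(query)
    ranked = sorted(FILES, key=lambda name: cosine(q, embed(FILES[name])), reverse=True)
    return ranked[:k]


def grep_tool(pattern):
    """Approach 2: an exact-match tool that the agent can decide to call."""
    return sorted(name for name, src in FILES.items() if re.search(pattern, src))
```

Here `similarity_search("charge the card")` ranks `billing.py` first on word overlap alone, while `grep_tool("check_password")` returns every file that literally contains the identifier. The second result is exact, but only as good as the pattern the agent chose to pass in.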

198. failerk ◴[] No.43168042{3}[source]
I don't want to use the product after having a bad experience. If they cannot create a sign-up page without it breaking for me, why would I want to use this service? Things happen and bugs can occur, but the effort I have put into resolving the issue outweighs the effort of switching to alternatives that I have had no issues using.
199. brador ◴[] No.43168074{8}[source]
If you can’t scale with your customer base fire your CTO.
200. goatsi ◴[] No.43168081{4}[source]
It's an anti-abuse method. A valid phone number will always have a cost for spammers/multi-accounters to obtain en masse, but will have no cost for the desired user base (the assumption is that every worthwhile user already has a phone).

Captchas are trivially broken and you can get access to millions of residential IP addresses, but phone numbers (especially if you filter out VOIP providers) still have a cost.

201. wegfawefgawefg ◴[] No.43168083{5}[source]
RAG is an acronym with a pinned meaning now. just like the word drone. drone didn't really mean drone, but drone means drone now. no amount of complaining will fix it. :[
202. sumedh ◴[] No.43168098[source]
This is a question for the Cursor team.
203. re-thc ◴[] No.43168109{4}[source]
> Reread the blog post, and I suspect Cursor will remain much more competitive on pricing!

Until Cursor burns through their funding and gives up or increases their price.

replies(1): >>43169597 #
204. sangnoir ◴[] No.43168289{4}[source]
> It's not hard to make, it's a relatively simple CLI tool so there's no moat

There are similar open source CLI tools that predate Claude Code. It's reasonable to assume Anthropic chose not to contribute to those projects for reasons other than complexity, and charitably Anthropic likely plans for differentiating features.

> Also, the minified source code is available

The redistribution license - or lack thereof - will be the stumbling block to directly reusing code authored by Anthropic without authorization.

205. aizk ◴[] No.43168327[source]
Anthropic is still making the shovels
206. Imustaskforhelp ◴[] No.43168390[source]
Yup. It's a big issue, which messes things up. Like, c'mon, you were right there at the last line.
207. allpratik ◴[] No.43168413[source]
Plus one for this.
208. tayo42 ◴[] No.43168626[source]
Will you guys ever allow remote work for engineers?
209. bluerobotcat ◴[] No.43168836[source]
What do I need to do to get unbanned? I have filled in the provided Google Docs form 3-4 times to no avail. I got banned almost immediately after joining. My best guess is that I got banned because I used a VPN. https://news.ycombinator.com/item?id=40808815
210. mattwad ◴[] No.43168866{4}[source]
And TypeScript simply doesn't work for me. I have tried uninstalling extensions. It is always "Initializing". I reload windows, etc. It eventually might get there, I can't tell what's going on. At the moment, AI is not worth the trade-off of no TypeScript support.
replies(1): >>43180609 #
211. jcheng ◴[] No.43168903{5}[source]
I agree that retrieval can take many forms besides vector search, but do we really want to call it RAG if the model is directing the search using a tool call? That seems like an important distinction to me and the name "agentic search" makes a lot more sense IMHO.
replies(1): >>43169099 #
212. tomduncalf ◴[] No.43168921{4}[source]
They do actually have custom models for autocomplete (which requires very low latency) and applying edits from the LLM (which turns out to require another LLM step, as they can’t reliably output perfect diffs)
213. pzo ◴[] No.43168981{3}[source]
I will give a slightly more pessimistic answer. Someone studying CS right now probably expects to work in this profession for 30-40 years until retirement, expects it to keep paying much more than the average salary for most devs anywhere (rather than only for elite devs or those in the US), and expects to easily find such a job or switch employers.

I think the best period for software devs will be gone in a few years. Knowing how to code and fix things will still be important, but it will be more important to also be a jack-of-many-trades to provide more value: know a little about SEO, have good taste in design and be able to tweak a simple design, have good taste in how to organise code, and have better soft skills for managing or educating less tech-savvy staff.

Another option is to specialise in some currently difficult subfield (robotics, ML, CUDA, Rust) and try to be that elite dev, with the expectation that you would have to move to SV or a similar tech hub.

The best general recommendation I would give right now to someone who is currently studying (especially someone who is not from the US) is to use the large amount of time you have now, with not much responsibility, to make some product that can provide you with semi-passive monthly income ($5k-$10k) and drag you out of this rat race. Even if you do not succeed, or the revenue stream eventually runs out, you will learn those other skills that will be more important later if you want to be employed (SEO, code and design taste, marketing, soft skills).

Most likely this window of opportunity will only be open for the next few years, in the same way that the best window for mobile apps was the first ~2 years after the App Store launched.

replies(1): >>43171846 #
214. anonym29 ◴[] No.43169047[source]
Not a question but thank you for helping make awesome software that helps us make awesome software, too :)
215. simonw ◴[] No.43169099{6}[source]
Yes, I think that's RAG. It's Retrieval Augmented Generation - you're retrieving content to augment the generation.

Who cares if you used vector search for the retrieval?

The best vector retrieval implementations are already switching to a hybrid between vector and FTS, because it turns out BM25 etc is still a better algorithm for a lot of use-cases.

"Agentic search" makes much less sense to me because the term "agentic" is so incredibly vague.

replies(1): >>43169580 #
216. answer123128 ◴[] No.43169107[source]
Hi @eschluntz, @catherinewu, @wolffiex, @bdr. Glad that you are so plucky and upbeat!

How do you feel about raking in millions while attempting to make us all unemployed?

How do you feel about stealing open source code and stripping the copyright?

217. nomilk ◴[] No.43169119[source]
Small UX suggestion, but could you make submission of prompt via URL parameter work? It used to be possible via https://claude.ai/new?q={query}, but that stopped working. It works for ChatGPT, Grok, and DeepSeek. With Claude you have to go and manually click the submit button.
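
A small sketch of building such a link, assuming only the `?q=` pattern quoted above. `prefill_url` is a hypothetical helper, and whether the site pre-fills or submits the prompt is up to the site, not this code.

```python
from urllib.parse import quote


def prefill_url(query, base="https://claude.ai/new"):
    """Build a link that carries the prompt in the `q` query parameter."""
    # safe="" percent-encodes every reserved character, including "&" and "/".
    return f"{base}?q={quote(query, safe='')}"
```

Percent-encoding matters here because a raw `&` or `#` in the prompt would otherwise cut the query short.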
218. kiraaa ◴[] No.43169294[source]
When there are two commands in a prompt, for example:

do A and then do B.

the model completely ignores the second task, B.

219. antouank ◴[] No.43169310[source]
Hi there. There are lots of phrases/patterns that Claude always uses when writing and it was very frustrating with 3.5. I can see with 3.7 those persist. Is there any way for me to contact you and show those so you can hopefully address them?
220. ◴[] No.43169376{4}[source]
221. throwaway454647 ◴[] No.43169421[source]
I'll be publishing a Firefox extension as a temporary fix, will post it here. (I don't use Chrome.)
replies(2): >>43169763 #>>43239657 #
222. regularfry ◴[] No.43169580{7}[source]
I think it depends who "you" is. In classic RAG the search mechanism is preordained, the search is done up front and the results handed to the model pre-baked. I'd interpret "agentic search" as anything where the model has potentially a collection of search tools that it can decide how to use best for a given query, so the search algorithm, the query, and the number of searches are all under its own control.
replies(2): >>43174140 #>>43183513 #
223. dist-epoch ◴[] No.43169592{4}[source]
Just buy a $5 burner phone number. No need to use your real one.
224. ◴[] No.43169597{5}[source]
225. ◴[] No.43169602{8}[source]
226. ZeroTalent ◴[] No.43169763{3}[source]
I think Tampermonkey code is a better solution?
227. zaptrem ◴[] No.43170211{3}[source]
I have substantial usage via their API using LibreChat and have never run into rate limits. Why not just use that?
replies(1): >>43170881 #
228. ◴[] No.43170245{3}[source]
229. divan ◴[] No.43170510{3}[source]
Same.
230. yarbas89 ◴[] No.43170881{4}[source]
That sounds more expensive than the £18/mo Claude Pro costs?
replies(1): >>43178922 #
231. throw83288 ◴[] No.43171846{4}[source]
I would love to make "side revenue", but frankly I am awful at practical idea generation. I'm not a founder type I think, maybe a technical co-founder I guess.
232. jasonjmcghee ◴[] No.43172588{5}[source]
Claude Code does caching well fwiw. Looking at my costs after a few code sessions (totaling $6 or so) the vast majority is cache read, which is great to see. Without caching it'd be wildly more expensive.

Like $5+ was cache read ($0.05/token vs $3/token) so it would have cost $300+
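
The estimate above can be sketched as a ratio, taking the two quoted rates at face value. They are the commenter's figures, not checked against a price list, and are presumably meant per million tokens; the ratio does not depend on the unit. `cost_without_cache` is a hypothetical helper.

```python
def cost_without_cache(cache_read_spend, cache_read_rate, input_rate):
    """Scale cache-read spend up to what the same tokens would cost uncached."""
    return cache_read_spend * (input_rate / cache_read_rate)


# $5 of cache reads at the quoted 0.05 vs 3 rates.
estimate = cost_without_cache(5.0, 0.05, 3.0)
```

With those inputs the multiplier is 60, which is where the $300+ figure comes from. A different cache-read rate changes the multiplier proportionally.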

233. createaccount99 ◴[] No.43173097[source]
Did you run the Aider benchmarks to get a comparison of Claude Code vs. Aider?
234. tesch1 ◴[] No.43173593{3}[source]
Who is glama.ai though? Could not find company info on the site, the Frank name writing the blog posts seems to be an alias for Popeye the sailor. Am I missing something there? How can a user vet the company?
235. jcheng ◴[] No.43174140{8}[source]
Exactly. Was the extra information pushed to the model as part of the query? It’s RAG. Did the model pull the extra information in via a tool call? Agentic search.
replies(1): >>43186485 #
236. Daniel_Van_Zant ◴[] No.43174149{3}[source]
I see Cohere, is there any support for in-line citations like you can get with their first party API?
237. samstave ◴[] No.43174353[source]
Who the heck is on your UX team?

WHY is a huge % of my UX filled with nothing? I would appreciate metrics, token graphs etc

https://i.imgur.com/VlxLCwI.png

Why so much wasted space? ... >>??

https://i.imgur.com/7LlCLUf.jpeg

238. swairshah ◴[] No.43175755{3}[source]
I did! It's 900% over the context window limit :D I will have to do it function by function. Let's see, a decent project for me and claude-3.7.
239. galaxyLogic ◴[] No.43177802{4}[source]
So is moving a function or class possible? What actions do you need to take to accomplish that? Thanks
replies(1): >>43219860 #
240. throw83288 ◴[] No.43178228{4}[source]
Can you just make a blog post on this explaining your thesis in detail? It's hard for me not to see non-technical "vibe coding" [0] sidelining everyone in the industry except for the most senior of senior devs/PMs.

[0] https://x.com/karpathy/status/1886192184808149383

241. zzygan ◴[] No.43178752{5}[source]
No issue with AU credit card here. Is a credit card and not a debit card though
242. zaptrem ◴[] No.43178922{5}[source]
Yes, but if you want more usage it is reasonable to expect to pay more.
243. airstrike ◴[] No.43180228{3}[source]
dunno who else to tell this but my pet request for the next version of Claude is to have it say "ensure" and "You're absolutely right!" less often
244. baumy ◴[] No.43180609{5}[source]
My entire company of 100+ engineers is using Cursor on multiple large TypeScript repos with zero issues. Must be some kind of local setup issue on your end, it definitely works just fine. In fact I've seen consistently more useful / less junky results from using LLMs for code with TypeScript than any other language, particularly when Cursor's "shadow workspace" option is enabled.
245. simonw ◴[] No.43183513{8}[source]
This is a really useful definition of "agentic search", thanks.
246. dghlsakjg ◴[] No.43185623{7}[source]
$1500 is 100 million output tokens, or 500 million input tokens for Claude 3.7.

The entire LOTR trilogy is ~.55 million tokens (1,200 pages, published).

If you are sending and receiving the text equivalent of several hundred copies of the LOTR trilogy every week, I don't think you are actually using AI for anything useful, or you are providing far too much context.
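
A rough sketch of the arithmetic above. The per-million-token rates are the ones implied by the comment's own numbers ($1500 for 100 million output or 500 million input tokens), and the 0.55 million token figure for the trilogy is the comment's estimate. The function names are hypothetical.

```python
OUTPUT_USD_PER_MTOK = 15.0  # implied by "$1500 is 100 million output tokens"
INPUT_USD_PER_MTOK = 3.0    # implied by "500 million input tokens" for $1500
LOTR_MTOK = 0.55            # the comment's estimate for the whole trilogy


def mtok_for_budget(budget_usd, usd_per_mtok):
    """Millions of tokens that a budget buys at a per-million-token rate."""
    return budget_usd / usd_per_mtok


def lotr_copies(mtok):
    """How many trilogy-sized texts fit into that many millions of tokens."""
    return mtok / LOTR_MTOK


copies_as_output = lotr_copies(mtok_for_budget(1500, OUTPUT_USD_PER_MTOK))
copies_as_input = lotr_copies(mtok_for_budget(1500, INPUT_USD_PER_MTOK))
```

On these assumptions $1500 buys roughly 180 trilogy-sized texts of output or roughly 900 of input, which is the "several hundred copies" range the comment describes.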

247. regularfry ◴[] No.43186485{9}[source]
That's far clearer. Yes.
248. srigi ◴[] No.43187266[source]
To me it doesn’t look like a bug. I believe it is an intended “feature” pushed from upper management: a dark pattern to make plebs pay for an answer that has overflowed the quota.
249. handfuloflight ◴[] No.43188305{4}[source]
What do you do to build context?
250. danskeren ◴[] No.43192161[source]
A bit off topic but I wanted to let you know that Anthropic is currently in violation of EU Directive 98/6/EC:

> The selling price and the unit price must be indicated in an unambiguous, easily identifiable and clearly legible manner for all products offered by traders to consumers (i.e. the final price should include value added tax and all other taxes).

I wanted to see what the annual plan would cost as it was just displaying €170+VAT, and when I clicked the upgrade button to find out (I checked everywhere on the page) then I was automatically subscribed without any confirmation and without ever seeing the final price before the transaction was completed.

replies(1): >>43192381 #
251. cft ◴[] No.43192381[source]
You can stuff your EU directives up your nose, like your bottle caps when you try to drink from a European bottle.
replies(1): >>43192546 #
252. danskeren ◴[] No.43192546{3}[source]
The bottle caps are a joke, but how can anyone in their right mind be against transparent pricing?

You think it's acceptable that a company says the price is €170+VAT and then after the transaction is complete they inform you that the actual price was €206.50?

replies(1): >>43193635 #
253. cft ◴[] No.43193635{4}[source]
No, not OK. In this case, the recourse in the US is simple: contact the company, and if refused a refund, cancel the charge on your credit card with a couple of simple clicks in the app.
254. hassleblad23 ◴[] No.43219860{5}[source]
This is supported natively by most IDEs today.
replies(1): >>43231985 #
255. hassleblad23 ◴[] No.43231985{6}[source]
At least PyCharm is good at it.
256. throwaway454647 ◴[] No.43239657{3}[source]
I've made the extension, but I haven't been able to test it (hence I'd rather not release it). I use Claude daily, but I haven't bumped into the situation yet where the generated output would disappear.
replies(1): >>43319648 #
257. throwaway454647 ◴[] No.43319648{4}[source]
Good news, I caught it today, I'll be able to iterate and at some point I'll publish my extension at Mozilla.