Claude 3.7 Sonnet and Claude Code

(www.anthropic.com)

bcherny ◴[24 Feb 25 19:04 UTC] No.43163488[source]▶

>>43163011 (OP) #

Hi everyone! Boris from the Claude Code team here. @eschluntz, @catherinewu, @wolffiex, @bdr and I will be around for the next hour or so and we'll do our best to answer your questions about the product.

replies(82): >>43163527 #>>43163532 #>>43163549 #>>43163554 #>>43163555 #>>43163576 #>>43163585 #>>43163588 #>>43163589 #>>43163592 #>>43163593 #>>43163632 #>>43163642 #>>43163664 #>>43163677 #>>43163733 #>>43163758 #>>43163789 #>>43163803 #>>43163813 #>>43163821 #>>43163893 #>>43163909 #>>43163915 #>>43163921 #>>43163957 #>>43163958 #>>43163992 #>>43164069 #>>43164089 #>>43164102 #>>43164103 #>>43164104 #>>43164111 #>>43164127 #>>43164158 #>>43164329 #>>43164353 #>>43164424 #>>43164482 #>>43164514 #>>43164585 #>>43164616 #>>43164768 #>>43164797 #>>43164819 #>>43164899 #>>43165002 #>>43165057 #>>43165065 #>>43165088 #>>43165091 #>>43165187 #>>43165308 #>>43165355 #>>43165409 #>>43165468 #>>43165499 #>>43165516 #>>43165570 #>>43165578 #>>43165592 #>>43165836 #>>43165884 #>>43165965 #>>43165976 #>>43165995 #>>43166183 #>>43166711 #>>43166748 #>>43167130 #>>43167804 #>>43168626 #>>43168836 #>>43169047 #>>43169107 #>>43169119 #>>43169294 #>>43169310 #>>43173097 #>>43174353 #>>43192161 #

1. pookieinc ◴[24 Feb 25 19:09 UTC] No.43163554[source]▶

>>43163488 #

The biggest complaint I (and several others) have is that we continuously hit the limit via the UI after even just a few intensive queries. Of course, we can use the console API, but then we lose the ability to have things like Projects, etc.

Do you foresee these limitations increasing anytime soon?

Quick Edit: Just wanted to also say thank you for all your hard work, Claude has been phenomenal.

replies(4): >>43163771 #>>43163889 #>>43164021 #>>43167940 #

2. eschluntz ◴[24 Feb 25 19:25 UTC] No.43163771[source]▶

>>43163554 (TP) #

We are definitely aware of this (and working on it for the web UI), and that's why Claude Code goes directly through the API!

replies(3): >>43163984 #>>43164057 #>>43167719 #

3. clangfan ◴[24 Feb 25 19:35 UTC] No.43163889[source]▶

>>43163554 (TP) #

This is also my problem. I've only used the UI with the $20 subscription; can I use the same subscription for the CLI? I'm afraid it's like AWS API billing, where there is no limit on how much I can use and then I get a surprise bill.

replies(2): >>43164409 #>>43166498 #

4. smallerfish ◴[24 Feb 25 19:42 UTC] No.43163984[source]▶

>>43163771 #

I'm sure many of us would gladly pay more to get 3-5x the limit.

And I'm also sure that you're working on it, but some kind of auto-summarization of facts to reduce the context in order to avoid penalizing long threads would be sweet.

I don't know if your internal users are dogfooding the product that has user limits, so you may not have had this feedback - it makes me irritable/stressed to know that I'm running up close to the limit without having gotten to the bottom of a bug. I don't think stress response in your users is a desirable thing :).

replies(2): >>43165060 #>>43165318 #

5. punkpeye ◴[24 Feb 25 19:45 UTC] No.43164021[source]▶

>>43163554 (TP) #

If you are open to alternatives, try https://glama.ai/gateway

We currently serve ~10bn tokens per day (across all models). OpenAI compatible API. No rate limits. Built in logging and tracing.

I work with LLMs every day, so I am always on top of adding models. 3.7 is also already available.

https://glama.ai/models/claude-3-7-sonnet-20250219

The gateway is integrated directly into our chat (https://glama.ai/chat). So you can use most of the things that you are used to having with Claude. And if anything is missing, just let me know and I will prioritize it. If you check our Discord, I have a decent track record of being receptive to feedback and quickly turning around features.

Long term, Glama's focus is predominantly on MCPs, but chat, gateway and LLM routing is integral to the greater vision.

I would love feedback if you are going to give it a try: frank@glama.ai

replies(5): >>43164075 #>>43164764 #>>43167057 #>>43173593 #>>43174149 #

6. sealthedeal ◴[24 Feb 25 19:48 UTC] No.43164057[source]▶

>>43163771 #

I haven't been able to find the Claude CLI for public access yet. Would love to use it.

replies(2): >>43164217 #>>43165657 #

7. airstrike ◴[24 Feb 25 19:49 UTC] No.43164075[source]▶

>>43164021 #

The issue isn't API limits, but web UI limits. We can always get around the web interface's limits by using the Claude API directly, but then you need to have some other interface...

replies(1): >>43164415 #

8. eschluntz ◴[24 Feb 25 20:01 UTC] No.43164217{3}[source]▶

>>43164057 #

>>> npm install -g @anthropic-ai/claude-code

>>> claude

9. eschluntz ◴[24 Feb 25 20:14 UTC] No.43164409[source]▶

>>43163889 #

It is API billing like AWS - you pay for what you use. Every time you exit a session we print the cost, and in the middle of a session you can do /cost to see your cost so far that session!

You can track costs in a few ways and set spend limits to avoid surprises: https://docs.anthropic.com/en/docs/agents-and-tools/claude-c...

replies(2): >>43164841 #>>43165653 #

10. punkpeye ◴[24 Feb 25 20:15 UTC] No.43164415{3}[source]▶

>>43164075 #

The API still has limits. Even if you are on the highest tier, you will quickly run into those limits when using coding assistants.

The value proposition of Glama is that it combines UI and API.

While everyone focuses on either one or the other, I've been splitting my time equally working on both.

Glama UI would not win against Anthropic if we were to compare them by the number of features. However, the components that I developed were created with craft and love.

You have access to:

* Switch models between OpenAI/Anthropic, etc.

* Side-by-side conversations

* Full-text search of all your conversations

* Integration of LaTeX, Mermaid, rich-text editing

* Vision (uploading images)

* Response personalizations

* MCP

* Every action has a shortcut via cmd+k (ctrl+k)

replies(3): >>43164969 #>>43165930 #>>43166283 #

11. cmdtab ◴[24 Feb 25 20:47 UTC] No.43164764[source]▶

>>43164021 #

Do you have deepseek r1 support? I need it for a current product I’m working on.

replies(2): >>43165021 #>>43165209 #

12. mindok ◴[24 Feb 25 20:56 UTC] No.43164841{3}[source]▶

>>43164409 #

Which is theoretically great, but if anyone can get an Aussie credit card to work, please let me know.

replies(2): >>43165019 #>>43178752 #

13. airstrike ◴[24 Feb 25 21:10 UTC] No.43164969{4}[source]▶

>>43164415 #

Ok, but that's not the issue the parent was mentioning. I've never hit API limits but, like the original comment mentioned, I too constantly hit the web interface limits particularly when discussing relatively large modules.

replies(1): >>43165299 #

14. robbiep ◴[24 Feb 25 21:16 UTC] No.43165019{4}[source]▶

>>43164841 #

I haven’t had an issue with Aussie cards?

But I still hit limits. I use Claudemind with JetBrains stuff and there is a max on input tokens (I believe). I am ‘tier 2’ but it doesn’t look like I can go past this without an enterprise agreement.

15. pclmulqdq ◴[24 Feb 25 21:16 UTC] No.43165021{3}[source]▶

>>43164764 #

They are just selling a frontend wrapper on other people's services, so if someone else offers deepseek, I'm sure they will integrate it.

16. punkpeye ◴[24 Feb 25 21:35 UTC] No.43165209{3}[source]▶

>>43164764 #

Indeed we do https://glama.ai/models/deepseek-r1

It is provided by DeepSeek and Avian.

I am also midway through enabling a third provider (Nebius).

You can see all models/providers over at https://glama.ai/models

As another commenter in this thread said, we are just a 'frontend wrapper' around other people's services. Therefore, it is not particularly difficult to add models that are already supported by other providers.

The benefit of using our wrapper is that you use a single API key and get one bill for all your AI usage. You don't need to hack together your own logic for routing requests between different providers, handling failovers, keeping track of costs, or worrying about what happens if a provider goes down.

The market at the moment is hugely fragmented, with many providers unstable, constantly shifting prices, etc. The benefit of a router is that you don't need to worry about those things.
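As a rough illustration of the routing and failover logic described above, here is a minimal Python sketch. It is not Glama's actual implementation; the `Provider` and `Router` classes, the provider names, and the prices are all hypothetical and exist only to show the idea of trying the cheapest provider first, falling back when one is down, and keeping a single running bill.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class ProviderError(Exception):
    """Raised by a provider callable when a request fails."""


@dataclass
class Provider:
    name: str                    # hypothetical provider label
    price_per_mtok: float        # hypothetical price, USD per million tokens
    call: Callable[[str], str]   # takes a prompt, returns a completion


@dataclass
class Router:
    providers: List[Provider]
    spend: Dict[str, float] = field(default_factory=dict)

    def complete(self, prompt: str, tokens: int) -> str:
        # Try providers from cheapest to most expensive; skip any that fail.
        for p in sorted(self.providers, key=lambda p: p.price_per_mtok):
            try:
                result = p.call(prompt)
            except ProviderError:
                continue
            # Record the cost against the provider that actually served it.
            cost = tokens / 1_000_000 * p.price_per_mtok
            self.spend[p.name] = self.spend.get(p.name, 0.0) + cost
            return result
        raise ProviderError("all providers failed")


def flaky(prompt: str) -> str:
    # Stand-in for a provider that is currently down.
    raise ProviderError("provider is down")


def stable(prompt: str) -> str:
    # Stand-in for a provider that is up; simply echoes the prompt.
    return f"echo: {prompt}"


router = Router([
    Provider("cheap-but-down", 1.0, flaky),
    Provider("pricier-but-up", 2.0, stable),
])
answer = router.complete("hello", tokens=500_000)
```

In this toy run the cheaper provider fails, so the request is served by the second one and the cost is booked under its name in `router.spend`.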

replies(1): >>43165480 #

17. glenstein ◴[24 Feb 25 21:42 UTC] No.43165299{5}[source]▶

>>43164969 #

Right, that's how I read it also. It's not that there are no limits with the API, but that they're appreciably different.

18. justinbaker84 ◴[24 Feb 25 21:45 UTC] No.43165318{3}[source]▶

>>43163984 #

This is the main point I always want to communicate to the teams building foundation models.

A lot of people just want the ability to pay more in order to get more.

I would gladly pay 10x more to get relatively modest increases in performance. That is how important the intelligence is.

replies(1): >>43166135 #

19. cmdtab ◴[24 Feb 25 22:05 UTC] No.43165480{4}[source]▶

>>43165209 #

Yeah, I am aware. I use OpenRouter at the moment but I find it lacks a good UX.

replies(1): >>43165659 #

20. danw1979 ◴[24 Feb 25 22:25 UTC] No.43165653{3}[source]▶

>>43164409 #

What I really want (as a current Pro subscriber) is a subscription tier ("Ultimate" at ~$120/month ?) that gives me priority access to the usual chat interface, but _also_ a bunch of API credits that would ensure Claude and I can code together for most of the average working month (reasonable estimate would be 4 hours a day, 15 days a month).

i.e., I'd like my chat and API usage to be all included under a flat-rate subscription.

Currently Pro doesn't give me any API credits to use with coding assistants (Claude Code included?), which is completely disjointed. And I need to be a business to use the API still?

Honestly, Claude is so good, just please take my money and make it easy to do the above !

replies(3): >>43165991 #>>43166003 #>>43166054 #

21. kkarpkkarp ◴[24 Feb 25 22:25 UTC] No.43165657{3}[source]▶

>>43164057 #

see https://docs.anthropic.com/en/docs/agents-and-tools/claude-c...

22. punkpeye ◴[24 Feb 25 22:25 UTC] No.43165659{5}[source]▶

>>43165480 #

OpenRouter is great.

They have a very solid infrastructure.

Scaling infrastructure to handle billions of tokens is no joke.

I believe they are approaching 1 trillion tokens per week.

Glama is way smaller. We only recently crossed 10bn tokens per day.

However, I have invested a lot more into the UX/UI of the chat itself, i.e. while OpenRouter is entirely focused on the API gateway (which is working for them), I am going for a hybrid approach.

The market is big enough for both projects to co-exist.

23. Aeolun ◴[24 Feb 25 22:56 UTC] No.43165930{4}[source]▶

>>43164415 #

> Even if you are on the highest tier, you will quickly run into those limits when using coding assistants.

Even heavy coding sessions never run into Claude limits, and I’m nowhere near the highest tier.

replies(1): >>43167484 #

24. Aeolun ◴[24 Feb 25 23:03 UTC] No.43165991{4}[source]▶

>>43165653 #

I don’t think you need to be a business to use the API? At least I’m fairly certain I’m using it in a personal capacity. You are never going to hit $120/month even with full-time usage (no guarantees of course, but I get to like $40/month).

replies(1): >>43166709 #

25. istjohn ◴[24 Feb 25 23:04 UTC] No.43166003{4}[source]▶

>>43165653 #

You don't need to be a business to use the API.

26. dghlsakjg ◴[24 Feb 25 23:10 UTC] No.43166054{4}[source]▶

>>43165653 #

You can do this yourself. Anyone can buy API credits. I literally just did this with my personal credit card using my gmail based account earlier today.

1. Subscribe to Claude Pro for $20 month

2. Separately, Buy $100 worth of API credits.

Now you have a Claude "ultimate" subscription where the credits roll over as an added bonus.

As someone who only uses the APIs, and not the subscription services for AI, I can tell you that $100 is A LOT of usage. Quite frankly, I've never used anywhere close to $20 in a month which is why I don't subscribe. I mostly just use text though, so if you do a lot of image generation that can add up quickly
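To put the numbers from this sub-thread side by side, here is a small back-of-the-envelope sketch in Python. It only combines figures already mentioned (a $20 Pro subscription, $100 of API credits, and the parent's estimate of 4 hours a day, 15 days a month); the variable names are illustrative, and actual API spend per hour depends heavily on the tool and workload.

```python
# Figures taken from the comments above; names are illustrative only.
PRO_SUBSCRIPTION_USD = 20.0   # monthly Claude Pro price quoted above
API_CREDITS_USD = 100.0       # separately purchased API credits
HOURS_PER_DAY = 4             # parent comment's usage estimate
DAYS_PER_MONTH = 15           # parent comment's usage estimate

# Total monthly outlay for the do-it-yourself "ultimate" setup.
total_monthly = PRO_SUBSCRIPTION_USD + API_CREDITS_USD

# Hours of coding the credits would need to cover.
coding_hours = HOURS_PER_DAY * DAYS_PER_MONTH

# API budget available per coding hour if the credits must last the month.
credits_per_hour = API_CREDITS_USD / coding_hours

print(f"total: ${total_monthly:.0f}/month, "
      f"{coding_hours} hours, ${credits_per_hour:.2f} of credits per hour")
```

Whether that hourly budget is enough is exactly what the replies disagree on.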

replies(2): >>43166309 #>>43166332 #

27. willsmith72 ◴[24 Feb 25 23:21 UTC] No.43166135{4}[source]▶

>>43165318 #

As a growth company, they likely would prefer a larger number of users even with occasional rate limits, vs. a smaller pool of power users.

As long as capacity is an issue, you can't have both.

replies(1): >>43166920 #

28. m_kos ◴[24 Feb 25 23:41 UTC] No.43166283{4}[source]▶

>>43164415 #

Your chat idea is a little similar to Abacus AI. I wish you had a similarly affordable monthly plan for chat only, but your UI seems much better. I may give it a try!

29. numba888 ◴[24 Feb 25 23:44 UTC] No.43166309{5}[source]▶

>>43166054 #

I don't think you can generate images with Claude. I just asked it for a pink elephant: "I can't generate images directly, but I can create an SVG representation of a pink elephant for you." And it did it :)

30. dr_kiszonka ◴[24 Feb 25 23:47 UTC] No.43166332{5}[source]▶

>>43166054 #

That is a good idea. For something like Claude Code, $100 is not a lot, though.

31. edmundsauto ◴[25 Feb 25 00:08 UTC] No.43166498[source]▶

>>43163889 #

I use AnythingLLM so you can still have a "Projects" like RAG.

32. Terretta ◴[25 Feb 25 00:39 UTC] No.43166709{5}[source]▶

>>43165991 #

Careful -- a solo dev using it professionally, meaning, coding with it as a pair coder (XP style), can easily spend $1500/week.

replies(1): >>43185623 #

33. cruffle_duffle ◴[25 Feb 25 01:17 UTC] No.43166920{5}[source]▶

>>43166135 #

If people are paying for use, then why can’t you have both?

replies(1): >>43167140 #

34. thrdbndndn ◴[25 Feb 25 01:37 UTC] No.43167057[source]▶

>>43164021 #

Just tried it, is there a reason why the webUI is so slow?

Try to delete (close) the panel on the right on a side-by-side view. It took a good second to actually close. Creating one isn't much faster.

This is unbearably slow, to be blunt.

35. saulpw ◴[25 Feb 25 01:50 UTC] No.43167140{6}[source]▶

>>43166920 #

It takes time to grow capacity to meet growing revenue/usage. As parent is saying, if you are in a growth market at time T with capacity X, you would rather have more people using it even if that means they can each use less.

replies(1): >>43168074 #

36. smokeydoe ◴[25 Feb 25 02:44 UTC] No.43167484{5}[source]▶

>>43165930 #

I think it’s based on the tools you’re using. If I’m using Cline I don't have to try very hard to hit limits. I’m on the second tier.

37. raylad ◴[25 Feb 25 03:18 UTC] No.43167719[source]▶

>>43163771 #

The problem with the API is that it, as it says in the documentation, could cost $100/hr.

I would pay $50/mo or something to be able to have reasonable use of Claude Code in a limited way (but not as limited as through the web UI), but all of these coding tools seem to work only with the API and are therefore either too expensive or too limited.

replies(1): >>43167844 #

38. rudedogg ◴[25 Feb 25 03:33 UTC] No.43167844{3}[source]▶

>>43167719 #

> The problem with the API is that it, as it says in the documentation, could cost $100/hr.

I've used https://github.com/cline/cline to get a similar workflow to their Claude Code demo, and yes it's amazing how quickly the token counts add up. Claude seems to have capacity issues so I'm guessing they decided to charge a premium for what they can serve up.

+1 on the too expensive or too limited sentiment. I subscribed to Claude for quite a while but got frustrated that the few times I used it heavily, I'd get stuck due to the rate limits.

I could stomach a $20-$50 subscription for something like 3.7 that I could use a lot when coding, and not worry about hitting limits (or I suspect being pushed on to a quantized/smaller model when used too much).

replies(1): >>43172588 #

39. mianos ◴[25 Feb 25 03:47 UTC] No.43167940[source]▶

>>43163554 (TP) #

I paid for it for a while, but I kept running out of usage limits right in the middle of work every day. I'd end up pasting the context into ChatGPT to continue. It was so frustrating, especially because I really liked it and used it a lot.

It became such an anti-pattern that I stopped paying. Now, when people ask me which one to use, I always say I like Claude more than others, but I don’t recommend using it in a professional setting.

replies(2): >>43170211 #>>43170510 #

40. brador ◴[25 Feb 25 04:07 UTC] No.43168074{7}[source]▶

>>43167140 #

If you can’t scale with your customer base fire your CTO.

41. zaptrem ◴[25 Feb 25 10:33 UTC] No.43170211[source]▶

>>43167940 #

I have substantial usage via their API using LibreChat and have never run into rate limits. Why not just use that?

replies(1): >>43170881 #

42. divan ◴[25 Feb 25 11:17 UTC] No.43170510[source]▶

>>43167940 #

Same.

43. yarbas89 ◴[25 Feb 25 12:09 UTC] No.43170881{3}[source]▶

>>43170211 #

That sounds more expensive than the £18/mo Claude Pro costs?

replies(1): >>43178922 #

44. jasonjmcghee ◴[25 Feb 25 14:52 UTC] No.43172588{4}[source]▶

>>43167844 #

Claude Code does caching well fwiw. Looking at my costs after a few code sessions (totaling $6 or so), the vast majority is cache read, which is great to see. Without caching it'd be wildly more expensive.

Like $5+ was cache read ($0.05/token vs $3/token) so it would have cost $300+

45. tesch1 ◴[25 Feb 25 16:01 UTC] No.43173593[source]▶

>>43164021 #

Who is glama.ai though? Could not find company info on the site, the Frank name writing the blog posts seems to be an alias for Popeye the sailor. Am I missing something there? How can a user vet the company?

46. Daniel_Van_Zant ◴[25 Feb 25 16:40 UTC] No.43174149[source]▶

>>43164021 #

I see Cohere, is there any support for in-line citations like you can get with their first party API?

47. zzygan ◴[25 Feb 25 23:13 UTC] No.43178752{4}[source]▶

>>43164841 #

No issue with an AU credit card here. It is a credit card and not a debit card, though.

48. zaptrem ◴[25 Feb 25 23:38 UTC] No.43178922{4}[source]▶

>>43170881 #

Yes, but if you want more usage it is reasonable to expect to pay more.

49. dghlsakjg ◴[26 Feb 25 17:06 UTC] No.43185623{6}[source]▶

>>43166709 #

$1500 is 100 million output tokens, or 500 million input tokens for Claude 3.7.

The entire LOTR trilogy is ~0.55 million tokens (1,200 pages, published).

If you are sending and receiving the text equivalent of several hundred copies of the LOTR trilogy every week, I don't think you are actually using AI for anything useful, or you are providing far too much context.
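The arithmetic above can be checked with a short Python sketch. The per-million-token prices are the ones implied by the figures in this comment ($1500 for 100 million output tokens or 500 million input tokens), and the LOTR token count is the rough estimate quoted above; treat all three constants as assumptions rather than authoritative pricing.

```python
# Prices implied by the comment's own figures (USD per million tokens).
OUTPUT_PRICE_PER_MTOK = 15.0   # $1500 / 100M output tokens
INPUT_PRICE_PER_MTOK = 3.0     # $1500 / 500M input tokens
LOTR_TOKENS = 0.55e6           # rough trilogy size quoted above


def tokens_for_budget(budget_usd: float, price_per_mtok: float) -> float:
    """Return how many tokens a budget buys at a given per-million price."""
    return budget_usd / price_per_mtok * 1_000_000


weekly_budget = 1500.0
output_tokens = tokens_for_budget(weekly_budget, OUTPUT_PRICE_PER_MTOK)
input_tokens = tokens_for_budget(weekly_budget, INPUT_PRICE_PER_MTOK)

# Express the same volumes as multiples of the LOTR trilogy.
lotr_copies_as_output = output_tokens / LOTR_TOKENS
lotr_copies_as_input = input_tokens / LOTR_TOKENS

print(f"{output_tokens:,.0f} output tokens "
      f"(~{lotr_copies_as_output:.0f} LOTR trilogies)")
print(f"{input_tokens:,.0f} input tokens "
      f"(~{lotr_copies_as_input:.0f} LOTR trilogies)")
```

Under these assumptions the budget works out to somewhere between roughly 180 and 900 trilogies' worth of text per week, depending on the input/output mix, which matches the "several hundred copies" estimate.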
