2127 points by bakugo | 49 comments
1. pookieinc ◴[] No.43163554[source]
The biggest complaint I (and several others) have is that we continuously hit the limit via the UI after even just a few intensive queries. Of course, we can use the console API, but then we lose the ability to have things like Projects, etc.

Do you foresee these limitations increasing anytime soon?

Quick Edit: Just wanted to also say thank you for all your hard work, Claude has been phenomenal.

replies(4): >>43163771 #>>43163889 #>>43164021 #>>43167940 #
2. eschluntz ◴[] No.43163771[source]
We are definitely aware of this (and working on it for the web UI), and that's why Claude Code goes directly through the API!
replies(3): >>43163984 #>>43164057 #>>43167719 #
3. clangfan ◴[] No.43163889[source]
This is also my problem. I've only used the UI with the $20 subscription. Can I use the same subscription to use the CLI? I'm afraid it's like AWS API billing, where there is no limit to how much I can use and then I get a surprise bill.
replies(2): >>43164409 #>>43166498 #
4. smallerfish ◴[] No.43163984[source]
I'm sure many of us would gladly pay more to get 3-5x the limit.

And I'm also sure that you're working on it, but some kind of auto-summarization of facts to reduce the context in order to avoid penalizing long threads would be sweet.

I don't know whether your internal users are dogfooding the version of the product that has user limits, so you may not have had this feedback: it makes me irritable/stressed to know that I'm running up close to the limit without having gotten to the bottom of a bug. I don't think a stress response in your users is a desirable thing :).

replies(2): >>43165060 #>>43165318 #
5. punkpeye ◴[] No.43164021[source]
If you are open to alternatives, try https://glama.ai/gateway

We currently serve ~10bn tokens per day (across all models). OpenAI-compatible API. No rate limits. Built-in logging and tracing.

I work with LLMs every day, so I am always on top of adding models. 3.7 is also already available.

https://glama.ai/models/claude-3-7-sonnet-20250219

The gateway is integrated directly into our chat (https://glama.ai/chat). So you can use most of the things that you are used to having with Claude. And if anything is missing, just let me know and I will prioritize it. If you check our Discord, I have a decent track record of being receptive to feedback and quickly turning around features.

Long term, Glama's focus is predominantly on MCPs, but chat, the gateway, and LLM routing are integral to the greater vision.

I would love feedback if you are going to give it a try: frank@glama.ai

replies(5): >>43164075 #>>43164764 #>>43167057 #>>43173593 #>>43174149 #
6. sealthedeal ◴[] No.43164057[source]
I haven't been able to find the Claude CLI for public access yet. Would love to use it.
replies(2): >>43164217 #>>43165657 #
7. airstrike ◴[] No.43164075[source]
The issue isn't API limits, but web UI limits. We can always get around the web interface's limits by using the Claude API directly, but then you need to have some other interface...
replies(1): >>43164415 #
8. eschluntz ◴[] No.43164217{3}[source]
    npm install -g @anthropic-ai/claude-code

    claude

9. eschluntz ◴[] No.43164409[source]
It is API billing like AWS - you pay for what you use. Every time you exit a session we print the cost, and in the middle of a session you can do /cost to see your cost so far that session!

You can track costs in a few ways and set spend limits to avoid surprises: https://docs.anthropic.com/en/docs/agents-and-tools/claude-c...
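
For anyone estimating spend programmatically, here is a minimal sketch using the Python SDK; the per-token prices are assumptions (Sonnet list pricing of $3/MTok input, $15/MTok output), so check the pricing page before relying on them:

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    message = client.messages.create(
        model="claude-3-7-sonnet-20250219",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Explain prompt caching in one paragraph."}],
    )
    # Every response reports token usage, so a rough per-call cost is easy to derive.
    cost = (message.usage.input_tokens * 3 + message.usage.output_tokens * 15) / 1e6
    print(f"~${cost:.4f} for this call")  # assumes $3/$15 per million tokens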

replies(2): >>43164841 #>>43165653 #
10. punkpeye ◴[] No.43164415{3}[source]
The API still has limits. Even if you are on the highest tier, you will quickly run into those limits when using coding assistants.

The value proposition of Glama is that it combines UI and API.

While everyone focuses on either one or the other, I've been splitting my time equally working on both.

Glama UI would not win against Anthropic if we were to compare them by the number of features. However, the components that I developed were created with craft and love.

You have access to:

* Switching models between OpenAI, Anthropic, etc.

* Side-by-side conversations

* Full-text search of all your conversations

* LaTeX, Mermaid, and rich-text integration

* Vision (image uploads)

* Response personalization

* MCP

* A shortcut for every action via cmd+k (ctrl+k)

replies(3): >>43164969 #>>43165930 #>>43166283 #
11. cmdtab ◴[] No.43164764[source]
Do you have DeepSeek R1 support? I need it for a product I’m currently working on.
replies(2): >>43165021 #>>43165209 #
12. mindok ◴[] No.43164841{3}[source]
Which is theoretically great, but if anyone can get an Aussie credit card to work, please let me know.
replies(2): >>43165019 #>>43178752 #
13. airstrike ◴[] No.43164969{4}[source]
Ok, but that's not the issue the parent was mentioning. I've never hit API limits but, like the original comment mentioned, I too constantly hit the web interface limits particularly when discussing relatively large modules.
replies(1): >>43165299 #
14. robbiep ◴[] No.43165019{4}[source]
I haven’t had an issue with Aussie cards?

But I still hit limits. I use ClaudeMind with JetBrains tools, and there is a cap on input tokens (I believe). I am ‘tier 2’, but it doesn’t look like I can go past this without an enterprise agreement.

15. pclmulqdq ◴[] No.43165021{3}[source]
They are just selling a frontend wrapper on other people's services, so if someone else offers DeepSeek, I'm sure they will integrate it.
16. punkpeye ◴[] No.43165209{3}[source]
Indeed we do https://glama.ai/models/deepseek-r1

It is provided by DeepSeek and Avian.

I am also midway through enabling a third provider (Nebius).

You can see all models/providers over at https://glama.ai/models

As another commenter in this thread said, we are just a 'frontend wrapper' around other people's services. Therefore, it is not particularly difficult to add models that are already supported by other providers.

The benefit of using our wrapper is that you get a single API key and one bill for all your AI usage, and you don't need to hack together your own logic for routing requests between different providers, failovers, keeping track of their costs, worrying about what happens if a provider goes down, etc.

The market at the moment is hugely fragmented, with many providers unstable, constantly shifting prices, etc. The benefit of a router is that you don't need to worry about those things.
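
In practice, "single API key" routing through an OpenAI-compatible gateway looks something like the sketch below; the base URL is illustrative only (check the gateway docs for the real endpoint), and the model IDs are the ones mentioned above:

    from openai import OpenAI

    # Illustrative endpoint; consult the gateway documentation for the real URL.
    client = OpenAI(
        base_url="https://glama.ai/api/gateway/openai/v1",
        api_key="YOUR_GLAMA_API_KEY",
    )

    # One client, one key; the gateway routes to different upstream providers.
    for model in ["claude-3-7-sonnet-20250219", "deepseek-r1"]:
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "Say hi in five words."}],
        )
        print(model, "->", reply.choices[0].message.content)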

replies(1): >>43165480 #
17. glenstein ◴[] No.43165299{5}[source]
Right, that's how I read it also. It's not that there are no limits with the API, but that they're appreciably different.
18. justinbaker84 ◴[] No.43165318{3}[source]
This is the main point I always want to communicate to the teams building foundation models.

A lot of people just want the ability to pay more in order to get more.

I would gladly pay 10x more to get relatively modest increases in performance. That is how important the intelligence is.

replies(1): >>43166135 #
19. cmdtab ◴[] No.43165480{4}[source]
Yeah, I am aware. I use OpenRouter at the moment, but I find it lacks a good UX.
replies(1): >>43165659 #
20. danw1979 ◴[] No.43165653{3}[source]
What I really want (as a current Pro subscriber) is a subscription tier ("Ultimate" at ~$120/month?) that gives me priority access to the usual chat interface, but _also_ a bunch of API credits that would ensure Claude and I can code together for most of the average working month (a reasonable estimate would be 4 hours a day, 15 days a month).

i.e. I'd like my chat and API usage to be all included under a flat-rate subscription.

Currently, Pro doesn't give me any API credits to use with coding assistants (Claude Code included?), which is completely disjointed. And I still need to be a business to use the API?

Honestly, Claude is so good, just please take my money and make it easy to do the above!

replies(3): >>43165991 #>>43166003 #>>43166054 #
21. kkarpkkarp ◴[] No.43165657{3}[source]
see https://docs.anthropic.com/en/docs/agents-and-tools/claude-c...
22. punkpeye ◴[] No.43165659{5}[source]
OpenRouter is great.

They have a very solid infrastructure.

Scaling infrastructure to handle billions of tokens is no joke.

I believe they are approaching 1 trillion tokens per week.

Glama is way smaller. We only recently crossed 10bn tokens per day.

However, I have invested a lot more into the UX/UI of the chat itself; while OpenRouter is entirely focused on the API gateway (which is working for them), I am going for a hybrid approach.

The market is big enough for both projects to co-exist.

23. Aeolun ◴[] No.43165930{4}[source]
> Even if you are on the highest tier, you will quickly run into those limits when using coding assistants.

For me, even heavy coding sessions never run into Claude limits, and I’m nowhere near the highest tier.

replies(1): >>43167484 #
24. Aeolun ◴[] No.43165991{4}[source]
I don’t think you need to be a business to use the API? At least I’m fairly certain I’m using it in a personal capacity. You are never going to hit $120/month even with full-time usage (no guarantees of course, but I get to like $40/month).
replies(1): >>43166709 #
25. istjohn ◴[] No.43166003{4}[source]
You don't need to be a business to use the API.
26. dghlsakjg ◴[] No.43166054{4}[source]
You can do this yourself. Anyone can buy API credits. I literally just did this with my personal credit card, using my Gmail-based account, earlier today.

1. Subscribe to Claude Pro for $20/month.

2. Separately, buy $100 worth of API credits.

Now you have a Claude "Ultimate" subscription, where the credits roll over as an added bonus.

As someone who only uses the APIs, and not the subscription services for AI, I can tell you that $100 is A LOT of usage. Quite frankly, I've never used anywhere close to $20 in a month, which is why I don't subscribe. I mostly just use text, though, so if you do a lot of image generation, that can add up quickly.

replies(2): >>43166309 #>>43166332 #
27. willsmith72 ◴[] No.43166135{4}[source]
As a growth company, they would likely prefer a larger number of users, even with occasional rate limits, over a smaller pool of power users.

As long as capacity is an issue, you can't have both.

replies(1): >>43166920 #
28. m_kos ◴[] No.43166283{4}[source]
Your chat idea is a little similar to Abacus AI. I wish you had a similarly affordable monthly plan for chat only, but your UI seems much better. I may give it a try!
29. numba888 ◴[] No.43166309{5}[source]
I don't think you can generate images with Claude. I just asked it for a pink elephant: "I can't generate images directly, but I can create an SVG representation of a pink elephant for you." And it did it :)
30. dr_kiszonka ◴[] No.43166332{5}[source]
That is a good idea. For something like Claude Code, $100 is not a lot, though.
31. edmundsauto ◴[] No.43166498[source]
I use AnythingLLM, so you can still have a "Projects"-like RAG.
32. Terretta ◴[] No.43166709{5}[source]
Careful: a solo dev using it professionally, i.e. coding with it as a pair programmer (XP style), can easily spend $1500/week.
replies(1): >>43185623 #
33. cruffle_duffle ◴[] No.43166920{5}[source]
If people are paying for use, then why can’t you have both?
replies(1): >>43167140 #
34. thrdbndndn ◴[] No.43167057[source]
Just tried it. Is there a reason why the web UI is so slow?

Try to delete (close) the right-hand panel in a side-by-side view. It took a good second to actually close. Creating one isn't much faster.

This is unbearably slow, to be blunt.

35. saulpw ◴[] No.43167140{6}[source]
It takes time to grow capacity to meet growing revenue/usage. As the parent is saying, if you are in a growth market at time T with capacity X, you would rather have more people using it, even if that means they can each use less.
replies(1): >>43168074 #
36. smokeydoe ◴[] No.43167484{5}[source]
I think it’s based on the tools you’re using. If I’m using Cline I don't have to try very hard to hit limits. I’m on the second tier.
37. raylad ◴[] No.43167719[source]
The problem with the API is that, as the documentation says, it could cost $100/hr.

I would pay $50/mo or something for reasonable use of Claude Code in a limited way (though not as limited as the web UI), but all of these coding tools seem to work only with the API and are therefore either too expensive or too limited.

replies(1): >>43167844 #
38. rudedogg ◴[] No.43167844{3}[source]
> The problem with the API is that it, as it says in the documentation, could cost $100/hr.

I've used https://github.com/cline/cline to get a similar workflow to their Claude Code demo, and yes it's amazing how quickly the token counts add up. Claude seems to have capacity issues so I'm guessing they decided to charge a premium for what they can serve up.

+1 on the too expensive or too limited sentiment. I subscribed to Claude for quite a while but got frustrated the few times I would use it heavily I'd get stuck due to the rate limits.

I could stomach a $20-$50 subscription for something like 3.7 that I could use a lot when coding without worrying about hitting limits (or, I suspect, being pushed onto a quantized/smaller model when used too much).

replies(1): >>43172588 #
39. mianos ◴[] No.43167940[source]
I paid for it for a while, but I kept running out of usage limits right in the middle of work every day. I'd end up pasting the context into ChatGPT to continue. It was so frustrating, especially because I really liked it and used it a lot.

It became such an anti-pattern that I stopped paying. Now, when people ask me which one to use, I always say I like Claude more than others, but I don’t recommend using it in a professional setting.

replies(2): >>43170211 #>>43170510 #
40. brador ◴[] No.43168074{7}[source]
If you can’t scale with your customer base, fire your CTO.
41. zaptrem ◴[] No.43170211[source]
I have substantial usage via their API using LibreChat and have never run into rate limits. Why not just use that?
replies(1): >>43170881 #
42. divan ◴[] No.43170510[source]
Same.
43. yarbas89 ◴[] No.43170881{3}[source]
That sounds more expensive than the £18/mo Claude Pro costs?
replies(1): >>43178922 #
44. jasonjmcghee ◴[] No.43172588{4}[source]
Claude Code does caching well, fwiw. Looking at my costs after a few coding sessions (totaling $6 or so), the vast majority is cache reads, which is great to see. Without caching it'd be wildly more expensive.

$5+ of that was cache reads ($0.05/MTok vs $3/MTok), so it would have cost $300+.
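
For context, prompt caching in the Anthropic API is opt-in via cache_control breakpoints. A minimal sketch, assuming the Python SDK, with a hypothetical file standing in for the large, stable context a coding tool resends on every turn:

    import anthropic

    client = anthropic.Anthropic()
    big_context = open("repo_digest.txt").read()  # hypothetical large, stable prefix

    message = client.messages.create(
        model="claude-3-7-sonnet-20250219",
        max_tokens=1024,
        system=[{
            "type": "text",
            "text": big_context,
            "cache_control": {"type": "ephemeral"},  # mark this prefix as cacheable
        }],
        messages=[{"role": "user", "content": "Where is rate limiting implemented?"}],
    )
    # usage separates (pricier) cache writes from (much cheaper) cache reads
    print(message.usage.cache_creation_input_tokens,
          message.usage.cache_read_input_tokens)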

45. tesch1 ◴[] No.43173593[source]
Who is glama.ai, though? I could not find company info on the site, and the "Frank" writing the blog posts seems to be an alias for Popeye the Sailor. Am I missing something there? How can a user vet the company?
46. Daniel_Van_Zant ◴[] No.43174149[source]
I see Cohere; is there any support for in-line citations like you can get with their first-party API?
47. zzygan ◴[] No.43178752{4}[source]
No issue with an AU credit card here. It is a credit card and not a debit card, though.
48. zaptrem ◴[] No.43178922{4}[source]
Yes, but if you want more usage it is reasonable to expect to pay more.
49. dghlsakjg ◴[] No.43185623{6}[source]
$1500 is 100 million output tokens, or 500 million input tokens for Claude 3.7.

The entire LOTR trilogy is ~0.55 million tokens (1,200 pages, published).

If you are sending and receiving the text equivalent of several hundred copies of the LOTR trilogy every week, I don't think you are actually using AI for anything useful, or you are providing far too much context.