    Gemini CLI (blog.google)
    1339 points by sync | 17 comments
    1. wohoef ◴[] No.44378022[source]
    A few days ago I tested Claude Code by completely vibe coding a simple stock tracker web app in Streamlit (Python). It worked incredibly well, until it didn't. There seems to be a critical project size beyond which it just can't fix bugs anymore. I just tried the same thing with Gemini CLI, and the critical project size it handles well seems to be quite a bit bigger. Where Claude Code started to get lost, I simply told Gemini CLI to "Analyze the codebase and fix all bugs". And after telling it to fix a few more bugs, the application simply works.

    We really are living in the future
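
    (For illustration: a minimal sketch of the kind of app described above. This is hypothetical - the original app wasn't shared - and it assumes yfinance as the price data source.)

        # stock_tracker.py - run with: streamlit run stock_tracker.py
        import streamlit as st
        import yfinance as yf  # assumed data source, not named in the comment

        st.title("Stock Tracker")
        ticker = st.text_input("Ticker symbol", "AAPL")

        if ticker:
            # Fetch six months of daily prices and plot the closing price
            data = yf.download(ticker, period="6mo")
            st.line_chart(data["Close"])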

    replies(8): >>44378198 #>>44378469 #>>44378677 #>>44378994 #>>44379068 #>>44379186 #>>44379685 #>>44384682 #
    2. AJ007 ◴[] No.44378198[source]
    Current best practice for Claude Code is to have the heavy lifting done by Gemini 2.5 Pro or o3/o3-pro. There are ways to do this pretty seamlessly now thanks to MCP support (see Repo Prompt as an example). Sometimes you can also just use Claude, but that requires iterating: plan, integrate while logging everything, then repeat.
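
    (A minimal sketch of such a delegation tool, assuming the official MCP Python SDK and the google-genai client; the server and tool names are hypothetical:)

        # gemini_mcp.py - expose Gemini as a tool that Claude Code can call over MCP
        from mcp.server.fastmcp import FastMCP
        from google import genai

        mcp = FastMCP("second-opinion")  # hypothetical server name
        client = genai.Client()          # reads GEMINI_API_KEY from the environment

        @mcp.tool()
        def ask_gemini(prompt: str) -> str:
            """Send a hard question to Gemini 2.5 Pro and return its answer."""
            response = client.models.generate_content(
                model="gemini-2.5-pro", contents=prompt
            )
            return response.text

        if __name__ == "__main__":
            mcp.run()  # serves over stdio by default

    (Registered with something like "claude mcp add second-opinion -- python gemini_mcp.py", the tool then becomes callable from inside a Claude Code session.)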

    I haven't looked at this Gemini CLI thing yet, but if it's open source, it seems like any model can be plugged in here?

    I can see a pathway where LLMs become commodities. Every big tech company right now wants its own LLM to be the winner and the others to die, but each of them would also really, really prefer a commodity world to one where a competitor is the winner.

    If future use looks more like CLI agents, I'm not sure how some fancy UI wrapper is going to produce a winner-take-all outcome. OpenAI is winning on user count right now through pure brand recognition with ChatGPT, but ChatGPT is clearly an inferior UI for real work.

    replies(3): >>44378680 #>>44383737 #>>44385874 #
    3. dawnofdusk ◴[] No.44378469[source]
    I feel like you get more mileage out of prompt engineering and being specific... I'm not sure "fix all the bugs" is an effective prompt for real-world use.
    4. tvshtr ◴[] No.44378677[source]
    Yeah, and it's variable; it can happen at 250k, 500k, or later. When you interrogate it, the issue usually comes down to it being laser-focused or stuck on one specific problem, and it's very hard to turn it around. For lack of a better comparison, it feels like the AI is on a spectrum...
    5. sysmax ◴[] No.44378680[source]
    I think there are different niches. AI works extremely well for web prototyping because a lot of that work is superficial. Back in the 90s we had Delphi, where you could make GUI applications with a few clicks instead of writing tons of things by hand. The only reason we don't have that for the web is its decentralized nature: every framework vendor has their own vision and their own plan for future updates, so a lot of the work is figuring out how to marry the latest version of component X with the specific version of component Y that component Z requires. LLMs breeze through that.

    But in many other niches (say, embedded), the workflow is different. You add a feature and you get weird readings. You start modelling in your head how the timing would work, do some combination of tracing and breakpoints to narrow down your hypotheses, then try them out and see what works best. I can't see CLI agents doing that kind of work. It depends too much on hunches.

    Sort of like autonomous driving: most highway driving is extremely repetitive and easy to automate, so it got automated. But going on a mountain road in heavy rain, while using your judgment to back off when other drivers start doing dangerous stuff, is still purely up to humans.

    6. ugh123 ◴[] No.44378994[source]
    Claude seems to have trouble extracting code snippets to add to the context as the session gets longer and longer. I've seen it get stuck in a loop just trying to use sed/rg/etc. to pull a few lines out of a file, and eventually give up.
    7. TechDebtDevin ◴[] No.44379068[source]
    Yeah, but this collapses under any real complexity. There is likely an extreme amount of redundant code, and the result would probably be twice as memory-efficient if you just wrote it yourself.

    I'm actually curious whether we'll see a greater-than-usual rise in demand for DRAM because more software is vibe coded than not, or at least built with some form of vibe coding.

    8. crazylogger ◴[] No.44379186[source]
    Ask the AI to document each module in a 100-line markdown file. These docs should be very high level: no implementation detail, just pointers to the relevant files so the AI can find things out by itself. With such a doc as a starting point, the AI has the context to work on any module.

    If a module just can't be documented this way in under 100 lines, it's a good time to refactor. Chances are that if Claude's context window isn't enough to work with a particular module, a human dev can't hold it in their head either. It's all about pointing your LLM precisely at the context that matters.
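
    (A sketch of what one of these module docs might look like; the module and file names are hypothetical:)

        # payments module

        Handles checkout, refunds, and Stripe webhooks.

        Start here:
        - payments/api.py            HTTP endpoints
        - payments/stripe_client.py  all external calls to Stripe
        - payments/models.py         Payment and Refund tables

        Invariants:
        - Amounts are always integer cents.
        - Payment.status only changes inside payments/state_machine.py.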

    9. agotterer ◴[] No.44379685[source]
    I wonder how much of this has to do with context window size? Gemini's window is 5x larger than Claude's.

    I’ve been using Claude for a side project for the past few weeks and I find that we really get into a groove planning or debugging something and then by the time we are ready to implement, we’ve run out of context window space. Despite my best efforts to write good /compact instructions, when it’s ready to roll again some of the nuance is lost and the implementation suffers.

    I’m looking forward to testing if that’s solved by the larger Gemini context window.

    replies(3): >>44382389 #>>44383702 #>>44386731 #
    10. macNchz ◴[] No.44382389[source]
    I definitely think the bigger context window helps. The code quality quite visibly drops across all models I've used as the context fills up, well before the hard limit. The editor tooling also makes a difference—Claude Code pollutes its own context window with miscellaneous file accesses and tool calls as it tries to figure out what to do. Even if it's more manual effort to manage the files that are in-context with Aider, I find the results to be much more consistent when I'm able to micromanage the context.

    Approaching the context window limit in Claude Code, having it start to make more and worse mistakes, then seeing it try to compact the context and keep going, is a major "if you find yourself in a hole, stop digging" situation.

    11. data-ottawa ◴[] No.44383702[source]
    Does /compact help with this? I ran out of context with Claude Code for the first time today, so I'm looking for any tips.

    I'm trying to get better at using /resume and memories to get more value out of the tool.

    replies(2): >>44383913 #>>44385223 #
    12. sagarpatil ◴[] No.44383737[source]
    You might want to give this a try: https://github.com/opencode-ai/opencode
    13. agotterer ◴[] No.44383913{3}[source]
    I thought I read that best practice is to start a new session every time you work on a new feature or task. That's what I've been doing. I also often ask Claude to update my readme and CLAUDE.md with details about the architecture or how something works.

    As for /compact: if I'm nearing the end of my context window (around 15% left) and am still in the middle of something, I'll give /compact very specific details about how and what to compact. Say we're debugging an error - I might write something along the lines of "This session is about to close and we will continue debugging in the next session. We will be debugging this error message [error message...]. Outline everything we've tried that didn't work, make suggestions about what to try next, and outline any architecture or files that will be critical for this work. Everything else from earlier in this session can be ignored." I've had decent success with that - more so with debugging than with handing off all the details of a feature mid-implementation.

    Reminder: you need context space for /compact itself, so leave a little headroom.

    14. tobyhinloopen ◴[] No.44384682[source]
    At some point LLMs just get distracted, and you might be better off throwing the session away and restarting, hah.
    15. nojs ◴[] No.44385223{3}[source]
    The best approach is never to get remotely close to the point where it auto-compacts. Type /clear often, and set up docs, macros, etc. to make it easy to build up the context you need for a new task quickly (see the sketch below).

    If you see that 20% remaining warning, something has gone badly wrong and results will probably not get better until you clear the context and start again.
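
    (One way to set this up in Claude Code is a custom slash command: a markdown file under .claude/commands/ whose contents become a reusable prompt. A hypothetical example, e.g. .claude/commands/prime.md:)

        Read ARCHITECTURE.md and docs/modules/*.md, then give me a one-paragraph
        summary of each area relevant to the task I describe next. Don't read any
        other files until I ask.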

    16. jswny ◴[] No.44385874[source]
    Which MCPs let you do the heavy lifting with external models in Claude Code?
    17. seunosewa ◴[] No.44386731[source]
    I've found that I can quickly get a new AI session up to speed by adding the critical context it's missing. In my largest codebase that's usually a couple of critical functions; once they have the key context, they can do the rest. This of course doesn't work when you can't view the model's thinking process and interrupt it to supply the context it's missing. Opacity doesn't work unless the agent does the right thing every time.