
Tools: Code Is All You Need

(lucumr.pocoo.org)
313 points by Bogdanp | 22 comments
1. mritchie712 No.44454328
> try completing a GitHub task with the GitHub MCP, then repeat it with the gh CLI tool. You'll almost certainly find the latter uses context far more efficiently and you get to your intended results quicker.

This is spot on. I have a "devops" folder with a CLAUDE.md containing bash commands for common tasks (e.g. find prod / staging logs with this integration ID).

When I complete a novel task (e.g. count all the rows that were synced from stripe to duckdb) I tell Claude to update CLAUDE.md with the example. The next time I ask a similar question, Claude one-shots it.

These are the first few lines of the CLAUDE.md:

    This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

    ## Purpose
    This devops folder is dedicated to Google Cloud Platform (GCP) operations, focusing on:
    - Google Cloud Composer (Airflow) DAG management and monitoring
    - Google Cloud Logging queries and analysis
    - Kubernetes cluster management (GKE)
    - Cloud Run service debugging

    ## Common DevOps Commands

    ### Google Cloud Composer
    ```bash
    # View Composer environment details
    gcloud composer environments describe meltano --location us-central1 --project definite-some-id

    # List DAGs in the environment
    gcloud composer environments storage dags list --environment meltano --location us-central1 --project definite-some-id

    # View DAG runs
    gcloud composer environments run meltano --location us-central1 dags list

    # Check Airflow logs
    gcloud logging read 'resource.type="cloud_composer_environment" AND resource.labels.environment_name="meltano"' --project definite-some-id --limit 50
replies(6): >>44454513 >>44455438 >>44456486 >>44456645 >>44458429 >>44495829
2. lsaferite No.44454513
Just as a related aside, you could literally make that bottom section into a super simple stdio MCP Server and attach that to Claude Code. Each of your operations could be a tool and have a well-defined schema for parameters. Then you are giving the LLM a more structured and defined way to access your custom commands. I'm pretty positive there are even pre-made MCP Servers that are designed for just this activity.

Edit: First result when looking for such an MCP Server: https://github.com/inercia/MCPShell
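As a sketch of the general shape such a config takes (field names here are illustrative, not necessarily MCPShell's actual schema; check its README), you'd pair each command template with a typed parameter schema:

```yaml
# Illustrative only: a CLI-wrapping MCP server config in the general
# shape these tools use. Field names are hypothetical.
tools:
  - name: read_composer_logs
    description: Fetch recent Cloud Composer logs for an environment
    params:
      environment:
        type: string
        description: Composer environment name
        required: true
      limit:
        type: integer
        description: Max log entries to return
    run:
      command: >
        gcloud logging read
        'resource.type="cloud_composer_environment"
        AND resource.labels.environment_name="{{ .environment }}"'
        --limit {{ .limit }}
```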

replies(2): >>44454770 >>44475726
3. gbrindisi No.44454770
Wouldn't this defeat the point? Claude Code already has access to the terminal; adding specific instructions to the context is enough.
replies(1): >>44455551
4. jayd16 No.44455438
I feel like I'm taking crazy pills sometimes. You have a file with a set of snippets, and you prefer to ask the AI to hopefully run them instead of just running them yourself?
replies(5): >>44455582 >>44455595 >>44455618 >>44458068 >>44475635
5. lsaferite No.44455551
No. You are giving textual instructions to Claude in the hope that it correctly generates a shell command, versus giving it a tool definition with a clearly defined parameter schema that your MCP server, presumably, enforces BEFORE anything hits your shell. You would be helping Claude in this case, since you're giving it a clearer set of constraints on operation.
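Concretely, an MCP tool definition carries a JSON Schema (the protocol's `inputSchema` field) that arguments can be validated against before anything reaches a shell. A minimal sketch; the tool name and constraints here are invented for illustration:

```json
{
  "name": "read_composer_logs",
  "description": "Fetch recent Cloud Composer logs for an environment",
  "inputSchema": {
    "type": "object",
    "properties": {
      "environment": {"type": "string", "enum": ["meltano"]},
      "limit": {"type": "integer", "minimum": 1, "maximum": 1000}
    },
    "required": ["environment"],
    "additionalProperties": false
  }
}
```

The server can reject out-of-range or unexpected arguments outright, which is the pre-shell enforcement described above.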
replies(2): >>44455956 >>44456349
6. lreeves No.44455582
The commands aren't the special sauce; it's the analytical capabilities of the LLM, which can view the outputs of all those commands and correlate data or whatever. You could accomplish the same by prefilling a gigantic context window with all the logs, but when the commands are presented ahead of time, the LLM can "decide" which one to run based on what it needs to do.
7. light_hue_1 No.44455595
Yes. I'm not the poster but I do something similar.

Because now the capabilities of the model grow over time. And I can ask questions that involve a handful of those snippets. When we get to something new that requires some doing, it becomes another snippet.

I can offload everything I used to know about an API and never have to think about it again.

8. mritchie712 No.44455618
The snippets are examples. You can ask hundreds of variations of similar, but different, complex questions, and the LLM can adjust an example to fit.

I don't have a snippet for "find all 500s for the meltano service for duckdb syntax errors", but it'd easily nail that given the existing examples.
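For instance, an adaptation might look something like this (illustrative; the exact resource type and labels depend on where the service runs):

```bash
# Hypothetical adaptation of the logging example above for that question
gcloud logging read 'resource.type="cloud_run_revision"
  AND resource.labels.service_name="meltano"
  AND httpRequest.status=500
  AND textPayload=~"duckdb.*syntax"' --project definite-some-id --limit 50
```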

replies(1): >>44456100
9. fassssst No.44455956
Either way, it is text instructions used to call a function (via a JSON object for MCP or a shell command for scripts). What works better depends on how the model you're using was post-trained and where in the prompt that info gets injected.
10. dingnuts No.44456100
But if I know enough about the service to write examples, most of the time I already know the command I want, which is less typing, faster, cheaper, and doesn't waste a ton of electricity.

In the other cases I see what the computer outputs, LEARN, and then I don't need help finding it again. Next time I just type the command.

I don't get it.

replies(1): >>44456852
11. wrs No.44456349
Well, with MCP you’re giving textual instructions to Claude in hopes that it correctly generates a tool call for you. It’s not like tool calls have access to some secret deterministic mode of the LLM; it’s still just text.

To an LLM there’s not much difference between the list of sample commands above and the list of tool commands it would get from an MCP server. JSON and GNU-style args are very similar in structure. And presumably the command is enforcing constraints even better than the MCP server would.

replies(1): >>44459511
12. chriswarbo No.44456486
I use a similar file, but just for myself (I've never used an LLM "agent"). I live in Emacs, but this is the only thing I use org-mode for: it lets me fold/unfold the sections, and I can press C-c C-c over any of the code snippets to execute it. Some of them are shell code, some of them are Emacs Lisp code which generates shell code, etc.
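For the unfamiliar, such a file looks roughly like this (the entry is hypothetical); org-babel executes the block under point when you press C-c C-c:

```org
* Composer
** List DAGs in the environment
#+begin_src sh
gcloud composer environments storage dags list \
  --environment meltano --location us-central1
#+end_src
```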
13. stpedgwdgfhgdd No.44456645
I do something similar, but the problem is that CLAUDE.md keeps growing.

To tackle this, I converted a custom prompt into an application, but there is an interesting trade-off: the application is deterministic, so it cannot deal with unknown situations, in contrast to CC, which is way slower but can try alternative ways of dealing with an unknown situation.

I ended up adding an instruction to the custom command to run the application and fix the application code (TDD) if there is a problem. Self-healing software… who would have thought.

14. loudmax No.44456852
LLMs are really good at processing vague descriptions of problems and offering a solution that's reasonably close to the mark. They can be a great guide for unfamiliar tools.

For example, I have a pretty good grasp of regular expressions because I'm an old Perl programmer, but I find processing json using `jq` utterly baffling. LLMs are great at coming up with useful examples, and sometimes they'll even get it perfect the first time. I've learned more about properly using `jq` with the help of LLMs than I ever did on my own. Same goes for `ffmpeg`.
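For example, the sort of one-liner an LLM will happily produce on the first try (the data shape here is invented for illustration):

```bash
# Count status codes in a JSON-lines log and show the five most common
jq -s 'group_by(.status) | map({status: .[0].status, n: length})
       | sort_by(-.n) | .[:5]' requests.jsonl
```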

LLMs are not a substitute for learning. When used properly, they're an enhancement to learning.

Likewise, never mind the idiot CEOs of failing companies looking forward to laying off half their workforce and replacing them with AI. When properly used, AI is a tool to help people become more productive, not replace human understanding.

15. qazxcvbnmlp No.44458068
You don't ask the AI to run the commands. You say "build and test this feature", and the AI iterates back and forth between the build and test commands until the thing works.
16. e12e No.44458429
You're letting the LLM execute privileged API calls against your production/test/staging environment, just hoping it won't corrupt something, like truncate logs, files, databases, etc.?

Or are you asking it to provide example commands that you can sanity check?

I'd be curious to see some more concrete examples.

17. lsaferite No.44459511
Not strictly true. The LLM provider should be running constrained token selection based on the JSON schema of the tool call. That alone makes a massive difference, as you're already discarding invalid tokens during the completion at a low level. Now, if they had a BNF grammar for each CLI tool and enforced token selection based on that, you'd be much better off than with unrestrained token selection.
replies(1): >>44460939
18. wrs No.44460939
Yeah, that's why I said "not much" difference. I don't think it's much, because LLMs do quite well generating JSON without turning on constrained output mode, and I can't remember them ever messing up a bash command line unless the quoting got weird.
19. theshrike79 No.44475635
The AI will run whatever command it figures out might work, which might be wasteful and taint the context with useless crap.

But when you give it a tool that retrieves all client+server logs combined (for a web application), it can just get what it needs as simply as possible.

Or it'll start hunting for a function by digging around code files with grep; if you provide a tool that just lists all functions, their parameters, and their locations, it'll find the exact spot in one go.

20. theshrike79 No.44475726
The problem with MCP is that it's not composable.

With separate tools or command snippets, the LLM can run one command, feed the result to another, and grep that for whatever it needs. One composed command or script and it gets exactly what it needs.

With MCPs it'd need to run every command separately, spending precious context on shuffling data from one MCP tool to another.
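For example, one composed invocation (illustrative; assumes `gh` and `jq` are installed) does in a single step what would otherwise be several MCP round-trips through the context window:

```bash
# List open PRs, keep number and title, and filter for the ones
# mentioning duckdb: one command, one result
gh pr list --state open --json number,title \
  | jq -r '.[] | "\(.number)\t\(.title)"' \
  | grep -i duckdb
```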

21. tayloramurphy No.44495829
Fun to see Meltano mentioned here :)
replies(1): >>44500284
22. mritchie712 No.44500284
meltano4life