(github.com)

116 points rohansood15 | 3 comments | 26 Jul 24 20:34 UTC

Hi HN! We’re Asankhaya and Rohan and we are building Patchwork.

Patchwork tackles development gruntwork—like reviews, docs, linting, and security fixes—through customizable, code-first 'patchflows' using LLMs and modular code management steps, all in Python. Here's a quick overview video: https://youtu.be/MLyn6B3bFMU

From our time building DevSecOps tools, we experienced first-hand the frustrations our users faced as they built complex delivery pipelines. Almost a third of developer time is spent on code management tasks[1], yet backlogs remain.

Patchwork lets you combine well-defined prompts with effective workflow orchestration to automate as much as 80% of these gruntwork tasks using LLMs[2]. For instance, the AutoFix patchflow can resolve 82% of issues flagged by Semgrep using GPT-4 (or 68% with Llama-3.1-8B) without fine-tuning or providing specialized context[3]. Success rates are higher for text-based patchflows like PR Review and Generate Docstring, but lower for more complex tasks like Dependency Upgrades.

We are not a coding assistant or a black-box GitHub bot. Our automation workflows run outside your IDE via the CLI or CI scripts without your active involvement.

We are also not an ‘AI agent’ framework. In our experience, LLM agents struggle with planning and rarely identify the right execution path. Instead, Patchwork requires explicitly defined workflows that provide greater success and full control.
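To illustrate the difference from an agent that plans its own path, here is a minimal sketch of an explicitly defined workflow in plain Python. It is not Patchwork's actual API: `run_workflow`, the three step functions, the report format, and `fake_llm` are all hypothetical names made up for this example. The point is only that the order of steps is fixed by the author, and the model is called at one known place.

```python
from typing import Callable, Dict, List

# A step takes the shared context and returns new keys to merge into it.
Step = Callable[[Dict[str, object]], Dict[str, object]]


def run_workflow(steps: List[Step], inputs: Dict[str, object]) -> Dict[str, object]:
    """Run each step in the declared order; nothing decides the path at runtime."""
    context = dict(inputs)
    for step in steps:
        context.update(step(context))
    return context


def extract_findings(ctx: Dict[str, object]) -> Dict[str, object]:
    # Hypothetical report format: a list of dicts with rule, file, severity.
    return {"findings": [f for f in ctx["report"] if f["severity"] == "high"]}


def build_prompts(ctx: Dict[str, object]) -> Dict[str, object]:
    # One narrowly scoped prompt per finding.
    return {"prompts": [f"Fix {f['rule']} in {f['file']}" for f in ctx["findings"]]}


def call_model(ctx: Dict[str, object]) -> Dict[str, object]:
    # The only step that touches the LLM; the callable is supplied by the caller.
    llm = ctx["llm"]
    return {"patches": [llm(p) for p in ctx["prompts"]]}


def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM endpoint so the sketch runs offline.
    return f"PATCH<{prompt}>"


result = run_workflow(
    [extract_findings, build_prompts, call_model],
    {
        "report": [
            {"rule": "sql-injection", "file": "app.py", "severity": "high"},
            {"rule": "unused-import", "file": "util.py", "severity": "low"},
        ],
        "llm": fake_llm,
    },
)
print(result["patches"])
```

Swapping `fake_llm` for a function that calls a real endpoint would leave the rest of the flow unchanged, which is roughly the kind of control the paragraph above is describing.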

Patchwork is open-source so you can build your own patchflows, integrate your preferred LLM endpoints, and fully self-host, ensuring privacy and compliance for large teams.

As devs, we prefer to build our own ‘AI-enabled automation’ given how easy it is to consume LLM APIs. If you do too, try Patchwork via a simple 'pip install patchwork-cli' or find us on GitHub[4].

Sources:

[1] https://blog.tidelift.com/developers-spend-30-of-their-time-...

[2] https://www.patched.codes/blog/patched-rtc-evaluating-llms-f...

[3] https://www.patched.codes/blog/how-good-are-llms

[4] https://github.com/patched-codes/patchwork

[Sample PRs] https://github.com/patched-demo/sample-injection/pulls


meiraleal ◴[27 Jul 24 03:05 UTC] No.41084187[source]▶

>>41082041 (OP) #

PR reviews are the one thing you sure don't want an LLM doing.

replies(4): >>41084276 #>>41084316 #>>41084513 #>>41086015 #

rohansood15 ◴[27 Jul 24 03:36 UTC] No.41084276[source]▶

>>41084187 #

I agree and disagree. You definitely need someone competent to take a look before merging in code, but you can do a first pass with an LLM to provide immediate feedback on any obvious issues as defined in your internal engineering standards.

Especially helpful if you're a team where there's a wide variance in competency/experience levels.

replies(1): >>41084408 #

1. aaomidi ◴[27 Jul 24 04:11 UTC] No.41084408[source]▶

>>41084276 #

Until that immediate feedback is outright wrong feedback and now you’ve sent them on a wild goose chase.

replies(2): >>41084506 #>>41085521 #

2. rohansood15 ◴[27 Jul 24 04:45 UTC] No.41084506[source]▶

>>41084408 (TP) #

This is where prompting and context are key: you need to keep the scope of the review limited and well-defined. And ideally, you want to validate the review with another LLM before passing it to the dev.

Still won't be perfect, but you'll definitely get to a point where it's a net positive overall - especially with frontier models.
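A rough sketch of that two-pass idea, assuming each model is just a callable that takes a prompt string and returns text. Everything here is hypothetical (`review_with_validation`, the prompt wording, and the two stub models); it is not how Patchwork implements PR Review, only one way the "narrow scope, then validate" pattern could look.

```python
from typing import Callable, List

LLM = Callable[[str], str]


def review_with_validation(
    diff: str, rules: List[str], reviewer: LLM, validator: LLM
) -> List[str]:
    """Review one rule at a time, then keep only comments a second model confirms."""
    kept: List[str] = []
    for rule in rules:
        # Narrow scope: the reviewer sees exactly one rule per call.
        comment = reviewer(
            f"Check this diff against one rule only.\nRule: {rule}\n"
            f"Diff:\n{diff}\nReply NONE if there is no violation."
        )
        if comment.strip().upper() == "NONE":
            continue
        # Second pass: a separate model judges the comment before the dev sees it.
        verdict = validator(
            f"Rule: {rule}\nDiff:\n{diff}\nReview comment: {comment}\n"
            "Is this comment correct? Answer YES or NO."
        )
        if verdict.strip().upper().startswith("YES"):
            kept.append(comment)
    return kept


def stub_reviewer(prompt: str) -> str:
    # Stand-in reviewer; its second answer is deliberately wrong.
    if "no bare except" in prompt:
        return "Line 3 uses a bare except."
    if "no print statements" in prompt:
        return "Line 9 calls print()."
    return "NONE"


def stub_validator(prompt: str) -> str:
    # Stand-in validator: approves only if the named construct is in the diff.
    diff, comment = prompt.split("Diff:\n", 1)[1].split("\nReview comment: ", 1)
    needle = "except:" if "bare except" in comment else "print("
    return "YES" if needle in diff else "NO"


diff = "try:\n    risky()\nexcept:\n    pass"
rules = ["no bare except", "no print statements", "functions need docstrings"]
comments = review_with_validation(diff, rules, stub_reviewer, stub_validator)
print(comments)
```

With these stubs, the wrong comment about `print()` is dropped by the validator and only the bare-except comment reaches the dev. A real validator can of course be wrong too, so this reduces bad feedback rather than eliminating it.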

3. throwthrowuknow ◴[27 Jul 24 10:00 UTC] No.41085521[source]▶

>>41084408 (TP) #

That happens with human review too and often serves as an opportunity to clarify your reasoning to both the reviewer and yourself. If the code is easily misunderstood then you should take a second look at it and do something to make it easier to understand. Sometimes that process even turns up a problem that isn’t a bug now but could become one later when the code is modified by someone in the future.


Show HN: Patchwork – Open-source framework to automate development gruntwork