
262 points lawrencechen | 1 comments

0github.com is a pull request viewer that color-codes every diff line/token by how much human attention it probably needs. Unlike PR-review bots, we flag code not only on "is it a bug?" but also on "is it worth a second look?" (examples: hard-coded secret, weird crypto mode, gnarly logic, ugly code).

To try it, replace github.com with 0github.com in any pull-request URL. Under the hood, we split the PR into individual files, and for each file, we ask an LLM to annotate each line with a data structure that we parse into a colored heatmap.
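As a rough illustration of those two steps (not the actual cmux implementation), here is a minimal Python sketch. The `to_heatmap_url` helper, the `LineAnnotation` type, and the JSON shape with `line`, `score`, and `reason` fields are all assumptions made for this example; the real annotation format isn't described above.

```python
import json
from dataclasses import dataclass
from urllib.parse import urlsplit, urlunsplit


@dataclass
class LineAnnotation:
    line: int     # line number within the file's diff (assumed field)
    score: float  # 0.0 = ignore, 1.0 = definitely look (assumed scale)
    reason: str   # explanation shown on hover (assumed field)


def to_heatmap_url(pr_url: str) -> str:
    """Swap the github.com host for 0github.com, keeping the path as-is."""
    parts = urlsplit(pr_url)
    if parts.netloc.lower() != "github.com":
        raise ValueError(f"not a github.com URL: {pr_url}")
    return urlunsplit(parts._replace(netloc="0github.com"))


def parse_annotations(llm_json: str) -> list[LineAnnotation]:
    """Parse a hypothetical per-file LLM response into line annotations."""
    annotations = []
    for item in json.loads(llm_json):
        # Clamp the score so a sloppy model response cannot break rendering.
        score = min(1.0, max(0.0, float(item["score"])))
        annotations.append(
            LineAnnotation(int(item["line"]), score, str(item.get("reason", "")))
        )
    return annotations
```

Clamping the score is a defensive choice for the sketch: model output is untrusted input, so the parser normalizes it before anything downstream turns it into a color.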

Examples:

https://0github.com/manaflow-ai/cmux/pull/666

https://0github.com/stack-auth/stack-auth/pull/988

https://0github.com/tinygrad/tinygrad/pull/12995

https://0github.com/simonw/datasette/pull/2548

Notice how all the example links have a 0 prepended to github.com. This takes you to our custom diff viewer, which handles the same URL path parameters as github.com. Darker yellows indicate that an area might require more investigation. Hover over the highlights to see the LLM's explanation. There's also a slider in the top left to adjust the "should review" threshold.
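To make the threshold and "darker yellow" behavior concrete, here is a small Python sketch under stated assumptions: scores run from 0.0 to 1.0, the slider is a 0-100 percentage, and the color is an HSL yellow whose lightness drops as the score rises. The `visible_highlights` function and the exact color formula are hypothetical, not taken from the viewer's source.

```python
def highlight_color(score: float) -> str:
    """Map a 0.0-1.0 score to a yellow that darkens as the score rises."""
    lightness = round(90 - 40 * score)  # 90% (pale) down to 50% (strong)
    return f"hsl(50, 100%, {lightness}%)"


def visible_highlights(scores: dict[int, float], threshold_percent: float) -> dict[int, str]:
    """Return {line: css_color} for lines at or above the slider threshold."""
    cutoff = threshold_percent / 100
    return {
        line: highlight_color(score)
        for line, score in scores.items()
        if score >= cutoff
    }
```

For example, `visible_highlights({1: 0.5, 2: 1.0}, 60)` keeps only line 2, since line 1 falls below the 60% cutoff.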

Repo (MIT license): https://github.com/manaflow-ai/cmux

MattyRad ◴[] No.45774928[source]
Seems like a catch-22. For codebases that I'm highly familiar with and regularly perform code review in, I'd say "thanks LLM, but I don't trust you, I'm more familiar with this codebase than you, and I don't need your help." For codebases that I'm not familiar with, I'm not really performing code review (at least not approving MR/PRs or doing the merging).

But still, this is very creative and a nice application of LLMs that isn't strictly barf.

replies(1): >>45775137 #
1. MattyRad ◴[] No.45775137[source]
Ok, I'll bite though, let's try it out as a non-maintainer.

I loaded https://0github.com/laravel/framework/pull/57499. Completely random, it's a PR in the last GitHub repo I had open.

At 60%, it highlights significantly more test code than the material changes that need review. Strike one.

At no threshold (0-100) does it highlight the deleted code in UniqueBroadcastEvent.php, which seems highly important to review. The maintainer even comments about the removal in the actual PR! Strike two.

The only line that gets highlighted at > 50% in the material code diffs is one that hasn't changed. Strike three.

So, honest attempt, but it didn't work out for me.