
728 points freetonik | 60 comments
Waterluvian ◴[] No.44976790[source]
I’m not a big AI fan but I do see it as just another tool in your toolbox. I wouldn’t really care how someone got to the end result that is a PR.

But I also think that if a maintainer asks you to jump before submitting a PR, you politely ask, “how high?”

replies(16): >>44976860 #>>44976869 #>>44976945 #>>44977015 #>>44977025 #>>44977121 #>>44977142 #>>44977241 #>>44977503 #>>44978050 #>>44978116 #>>44978159 #>>44978240 #>>44978311 #>>44978533 #>>44979437 #
1. cvoss ◴[] No.44976945[source]
It does matter how and where a PR comes from, because reviewers are fallible and finite, so trust enters the equation inevitably. You must ask "Do I trust where this came from?" And to answer that, you need to know where it came from.

If trust didn't matter, there wouldn't have been a need for the Linux Kernel team to ban the University of Minnesota for attempting to intentionally smuggle bugs through the PR process as part of an unauthorized social experiment. As it stands, if you / your PRs can't be trusted, they should not even be admitted to the review process.

replies(4): >>44977169 #>>44977263 #>>44978862 #>>44979553 #
2. koolba ◴[] No.44977169[source]
> You must ask "Do I trust where this came from?" And to answer that, you need to know where it came from.

No you don’t. You can’t outsource trust determinations. Especially to the people you claim not to trust!

You make the judgement call by looking at the code and your known history of the contributor.

Nobody cares if contributors use an LLM or a magnetic needle to generate code. They care if bad code gets introduced or bad patches waste reviewers’ time.

replies(3): >>44977245 #>>44977531 #>>44978479 #
3. falcor84 ◴[] No.44977245[source]
Trust is absolutely a thing. Maintaining an open source project is an unreasonably demanding and thankless job, and it would be even more so if you had to treat every single PR as if it's a high likelihood supply-chain attack.
replies(1): >>44977696 #
4. ToucanLoucan ◴[] No.44977445[source]
The sheer amount of entitlement on display by very pro-AI people genuinely boggles the mind.
replies(1): >>44977972 #
5. geraneum ◴[] No.44977531[source]
> Nobody cares if contributors use an LLM or a magnetic needle to generate code.

That’s exactly the opposite of what the author is saying. He mentions that [if the code is not good, or you are a beginner] he will help you get to the finish line, but if it’s LLM code, he shouldn’t be putting in effort because there’s no human on the other side.

It makes sense to me.

replies(1): >>44978344 #
6. fnimick ◴[] No.44977696{3}[source]
While true, we really should be treating every single piece of external code as though it's malicious.
replies(1): >>44978629 #
7. mattgreenrocks ◴[] No.44977972{3}[source]
They genuinely believe their use of chatbots is equivalent to multiple years of production experience in a language. They want to erase that distinction (“democratize”) so they can have the same privileges and status without the work.

Otherwise, what’s the harm in saying AI guides you to the solution if you can attest to it being a good solution?

replies(4): >>44977994 #>>44978056 #>>44978461 #>>45005379 #
8. ToucanLoucan ◴[] No.44977994{4}[source]
I guess it's just different kinds of people. I have used Copilot to generate code I barely understand (stuff for a microcontroller project, nothing important) but I wouldn't in a thousand years say I wrote it. I broadly understand how it works, and like, if someone wanted to see it, I'd show them. But like... how can you take pride in something you didn't make?
replies(1): >>44978053 #
9. mattgreenrocks ◴[] No.44978053{5}[source]
Not making the thing is a point in favor of LLMs for some of these people I suspect. So the pride in work thing is just not high on the list of incentives.

I don’t get it at all. Feels like modernity is oftentimes just inventing pale shadows of things with more addictive hooks to induce needlessly dependent behavior.

replies(1): >>45005303 #
10. macawfish ◴[] No.44978056{4}[source]
That's just not true. I have 20 years of dev experience and am also using these tools. I won't commit slop. I'm open to being transparent about my usage of AI, but tbh right now there's so much bias and vitriol coming from people afraid of these new tools that, in the haze of their fear, I don't trust people to actually take the time to neutrally determine whether or not the code is actually slop. I've had manually written, well thought through, well conceived, rough around the edges code get called "AI slop" by a colleague (who I very much respect and have a good relationship with) who admittedly hadn't had a chance to thoroughly understand the code yet.

If I just vibe-coded something and haven't looked at the code myself, that seems like a necessary thing to disclose. But beyond that, if the code is well understood and solid, I feel that I'd be clouding the conversation by unnecessarily bringing the tools I used into it. If I understand the code and feel confident in it, whether I used AI or not seems irrelevant and distracting.

This policy just sweeps the real problem under the rug. Generative AI is going to require us to come up with better curation/filtering/selection tooling in general. This heuristic of "whether or not someone self-disclosed using LLMs" just doesn't seem very useful in the long run. Maybe it's a piece of the puzzle, but I'm pretty sure there are more useful ways to sift through PRs than that. Line count differences, for example. Whether it was a person with an LLM or a 10x coder without one, a PR that adds 15000 lines is just not likely to be it.
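
As a rough illustration of the kind of line-count triage heuristic being suggested here (a sketch only; the function name and the 1000-line threshold are made up, not taken from any existing tool):

  # Hypothetical triage helper: flag PRs whose diff is too large to review
  # carefully in one sitting, regardless of whether an LLM was involved.
  def needs_extra_scrutiny(additions: int, deletions: int,
                           max_changed_lines: int = 1000) -> bool:
      # Assumed threshold; a real project would tune this to its norms.
      return additions + deletions > max_changed_lines

  # The 15000-line PR from the example above would be flagged immediately:
  assert needs_extra_scrutiny(15000, 0)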

replies(3): >>44978392 #>>44978442 #>>44978444 #
11. macawfish ◴[] No.44978344{3}[source]
"but if it’s LLM code, he shouldn’t be putting effort because there’s no human on the other side"

That's the false equivalence right there

replies(1): >>44978651 #
12. eschaton ◴[] No.44978392{5}[source]
You should not just be open to being transparent, you need to understand that there will be times you will be required to be transparent about the tools you’ve used and the ultimate origin of your contributions, and that trying to evade or even push back against it is a huge red flag that you cannot be trusted to abide by your commitments.

If you’re unwilling to stop using slop tools, then you don’t get to contribute to some projects, and you need to accept that.

replies(2): >>44978632 #>>44978751 #
13. moron4hire ◴[] No.44978432[source]
I think this is an excellent example of how software is different from everything else that is copyright-able. If you look at how GenAI is being applied to "the arts" and how completely destructive it is to the visual mediums, it is clearly a completely different beast to code. "The artists" of visual art don't want AI trained on their work and competing with their work. I completely understand. The point of art is to be a personal interaction between artist and consumer. But even though I'm quite skeptical of AI code gen from a practical standpoint, it really doesn't feel like the same existential threat.
replies(1): >>45006528 #
14. gizmo686 ◴[] No.44978442{5}[source]
> I've had manually written, well thought through, well conceived, rough around the edges code get called "AI slop" by a colleague (who I very much respect and have a good relationship with) who admittedly hadn't had a chance to thoroughly understand the code yet.

This is the core problem with AI that makes so many people upset. In the old days, if you got a substantial submission, you knew a substantial amount of effort went into it. You knew that someone at some point had a mental model of what the submission was. Even if they didn't translate that perfectly, you could still try to figure out what they meant and were thinking. You knew the submitter put forth significant effort. That is a real signal that they are both willing and able to address issues you raise going forward.

The existence of AI slop fundamentally breaks these assumptions. That is why we need enforced social norms around disclosure.

replies(2): >>44978592 #>>45005475 #
15. mattgreenrocks ◴[] No.44978444{5}[source]
I get it. But it’s ultimately up to the maintainer here.
replies(1): >>44978556 #
16. throwawaybob420 ◴[] No.44978461{4}[source]
Democratize X via boiling our oceans!!
replies(1): >>44979507 #
17. eschaton ◴[] No.44978479[source]
You’re completely incorrect. People care a lot about where code came from. They need to be able to trust that code you’re contributing was not copied from a project under AGPLv3, if the project you’re contributing to is under a different license.

Stop trying to equate LLM-generated code with indexing-based autocomplete. They’re not the same thing at all: LLM-generated code is equivalent to code copied off Stack Overflow, which is also something you’d better not be attempting to fraudulently pass off as your own work.

replies(2): >>44979814 #>>44981058 #
18. macawfish ◴[] No.44978556{6}[source]
Of course. I don't actually think the maintainer guidelines here are entirely unreasonable, even if I'd personally modify them to reduce reactive panic.

My little essay up there is more so a response to the heated "LLM people vs pure people" comments I'm reading all over this discussion. Some of this stuff just seems entirely misguided and fear driven.

19. macawfish ◴[] No.44978592{6}[source]
We need better social norms about disclosure, but maybe those don't need to be about "whether or not you used LLMs" and might have more to do with "how well you understand the code you are opening a PR for" (or are reviewing, for that matter). Normalize draft PRs and sharing big messes of code you're not quite sure about but want to start a conversation about. Normalize admitting that you don't fully understand the code you've written / are tasked with reviewing and that this is perfectly fine and doesn't reflect poorly on you at all, in fact it reflects humility and a collaborative spirit.

10x engineers create so many bugs without AI, and vibe coding could multiply that to 100x. But let's not distract from the source of that, which is rewarding the false confidence it takes to pretend we understand stuff that we actually don't.

replies(3): >>44978880 #>>44979072 #>>45005522 #
20. tsimionescu ◴[] No.44978629{4}[source]
No, we shouldn't. We live in a society, and that level of distrust is not just unrealistic, it's disastrous. This doesn't mean you should share your house keys with every drive by PR contributor, but neither should you treat every PR as if it's coming from Jia Tan.
21. macawfish ◴[] No.44978632{6}[source]
Your blanket determination that the tools themselves are slop generators is an attitude I'm definitely not interested in engaging with in collaboration.
replies(1): >>44978669 #
22. tsimionescu ◴[] No.44978651{4}[source]
It's not a false equivalence. You can teach a beginner to become an intermediate (and later a master, if they stick to it). You can't teach an LLM to be better. Every piece of feedback you give to an LLM is like screaming into the void - it wastes your time, and doesn't change the LLM one iota.
replies(1): >>44978685 #
23. macawfish ◴[] No.44978685{5}[source]
"Every piece of feedback you give to an LLM is like screaming into the void - it wastes your time, and doesn't change the LLM one iota."

I think you just haven't gotten the hang of it yet, which is fine... the tooling is very immature and hard to get consistent results with. But this isn't a given. Some people do get good, steerable LLM coding setups.

replies(2): >>44979402 #>>44979604 #
24. ikiris ◴[] No.44978751{6}[source]
What tools did you use to generate this response? Please include the make and model of the devices, and what os and browser you were running, including all browser plugins and other code which had screen access at that time.
replies(1): >>44978785 #
25. macawfish ◴[] No.44978754{8}[source]
I'd way sooner quit my job / not contribute than deal with someone who projects on me the way you have in this conversation.
replies(1): >>44978790 #
26. eschaton ◴[] No.44978790{9}[source]
Enjoy being found out for fraudulently passing off work you didn’t do as your own then.
replies(1): >>44978855 #
27. macawfish ◴[] No.44978855{10}[source]
Ironically as a practice I'm actually quite transparent about how I use LLMs and believe that destigmatising open conversation about use of these tools is actually really important, just not that it's a useful heuristic for whether or not some code is slop.
28. otterley ◴[] No.44978862[source]
If it comes with good documentation and appropriate tests, does that help?
replies(2): >>44979254 #>>44979456 #
29. inferiorhuman ◴[] No.44978880{7}[source]

  but maybe those don't need to be about "whether or not you used LLMs" and might have more to do
  with "how well you understand the code you are opening a PR for" (or are reviewing, for that matter)
AI is a great proxy for how much understanding someone has. If you're writing a PR yourself you're demonstrating some manner of understanding. If you're submitting AI slop you're not.
replies(1): >>44978961 #
30. macawfish ◴[] No.44978961{8}[source]
I've worked with 10x developers who committed a lot of features and a lot of bugs, and who got lots of accolades for all their green squares. They did not use LLM dev tools because those didn't exist then.

If they had used AI, their PRs might have been more understandable / less buggy, and ultimately I would have preferred that.

replies(1): >>44980594 #
31. risyachka ◴[] No.44979072{7}[source]
>> but maybe those don't need to be about "whether or not you used LLMs"

The only reason someone might not want disclosure is if they can’t write anything themselves; then they would have to label all their code as AI-generated and everyone would see their real skill level.

32. mattbee ◴[] No.44979254[source]
The observation that inspired this policy is that if you used AI, it is likely you don't know if the code, the documentation or tests are good or appropriate.
replies(1): >>44979322 #
33. otterley ◴[] No.44979322{3}[source]
What if you started with good documentation that you personally wrote, you gave that to the agent, and you verified the tests were appropriate and passed?
replies(1): >>44979539 #
34. sho_hn ◴[] No.44979402{6}[source]
Steering via prompting isn't the same as fundamentally changing the LLM by teaching, as you can do with humans. I think OP understands this better than you.
replies(1): >>44979589 #
35. explorigin ◴[] No.44979456[source]
I suppose it depends on whether AI is writing the tests and documentation.
36. macawfish ◴[] No.44979507{5}[source]
https://youtu.be/klW65MWJ1PY?t=3234

https://youtu.be/klW65MWJ1PY?t=1320

X sucks and should not be allowed to proceed with what they're doing in Memphis. Nor should Meta be allowed to proceed with multiple Manhattan-sized data centers.

37. mattbee ◴[] No.44979539{4}[source]
I'd extrapolate that the OP's view would be: you've still put in less effort, so your PR is less worthy of his attention than someone who'd done the same without using LLMs.

That's a pretty nice offer from one of the most famous and accomplished free software maintainers in the world. He's promising not to take a short-cut reviewing your PR, in exchange for you not taking a short-cut writing it in the first place.

replies(1): >>44979731 #
38. therealpygon ◴[] No.44979540[source]
Being more trusting of people’s code simply because they didn’t use AI seems as naive as distrusting code contributions simply because they were written with the assistance of AI.

It seems a bit like saying you can’t trust a legal document because it was written on a computer with spellcheck, rather than by a $10 an hour temp with a typewriter.

39. RossBencina ◴[] No.44979553[source]
> "Do I trust where this came from?"

In an open source project I think you have to start with a baseline assumption of "trust nobody." Exceptions possibly if you know the contributors personally, or have built up trust over years of collaboration.

I wouldn't reject or decline to review a PR just because I don't trust the contributor.

replies(1): >>44981560 #
40. macawfish ◴[] No.44979589{7}[source]
Can't tell if you're responding in earnest or not here?

LLMs are trained to be steerable at inference time via context/prompting. Fine tuning is also possible and often used. Both count as "feedback" in my book, and my point is that both can be effective at "changing the LLM" in terms of its behavior at inference time.

replies(1): >>44980193 #
41. david_allison ◴[] No.44979604{6}[source]
As a maintainer, if you're dealing with a contributor who's sending in AI slop, you have no opportunity to prompt the LLM.

The PR effectively ends up being an extremely high-latency conversation with an LLM, via another human who doesn't have the full context/understanding of the problem.

replies(1): >>44980772 #
42. otterley ◴[] No.44979731{5}[source]
> in exchange for you not taking a short-cut writing it in the first place.

This “short cut” language suggests that the quality of the submission is going to be objectively worse by way of its provenance.

Yet, can one reliably distinguish working and tested code generated by a person vs a machine? We’re well past passing Turing tests at this point.

replies(1): >>44980065 #
43. koolba ◴[] No.44979814{3}[source]
I’m not equating any type of code generation. I’m saying that as a maintainer you have to evaluate any submission on the merits, not on a series of yes/no questions provided by the submitter. And your own judgement is influenced by what you know about the submitter.
replies(1): >>44981192 #
44. mattbee ◴[] No.44980065{6}[source]
LLMs can't count letters, their writing is boring, and you can trick them into talking gibberish. That is a long way off the Turing test, even if we were fooled for a couple of weeks in 2022.

IMO when people declare that LLMs "pass" at a particular skill, it's a sign that they don't have the taste or experience to judge that skill themselves. Or - when it's CEOs - they have an interest in devaluing it.

So yes if you're trying to fool an experienced open source maintainer with unrefined LLM-generated code, good luck (especially one who's said he doesn't want that).

replies(1): >>44984592 #
45. sho_hn ◴[] No.44980193{8}[source]
And also clearly not what the OP means, who was trying to make a point that tuning the prompt to an otherwise stateless LLM inference job is nothing at all like teaching a human being. Mechanically, computationally, morally or emotionally. For example, humans aren't just tools; giving feedback to LLMs does little to further their agency.
replies(1): >>44980767 #
46. inferiorhuman ◴[] No.44980594{9}[source]

  If they had used AI, their PRs might have been more understandable / less buggy, and ultimately I would have preferred that.
Sure, and if they had used AI pigs could depart my rectum on a Part 121 flight. One has absolutely nothing to do with the other. Submitting AI slop does not demonstrate any knowledge of the code in question even if you do understand the code.

To address your claim about AI slop improving the output of these mythical 10x coders: doubtful. LLMs can only approximate meaningful output if they've already indexed the solution. If your vaunted 10x coders are working on already-solved problems you're likely wasting their time. If they're working on something novel LLMs are of little use. For instance: I've had the pleasure of working with a notoriously poorly documented crate that's also got a reputation for frequently making breaking changes. I used DDG and Google to see if I could track down someone with a similar use case. If I forgot to append "-ai" to the query I'd get back absolutely asinine results, typically along the lines of "here's an answer with rust and one of the words in your query". At best the first sentence would explain something entirely unrelated about the crate.

Potentially LLMs could be improved by ingesting more and more data, but that's an arms race they're destined to lose. People are already turning to Cloudflare and Anubis en masse to avoid being billed for training LLMs. If Altman and co. had to pay market rate for their training data nobody could afford to use these AI doodads.

47. macawfish ◴[] No.44980767{9}[source]
The false equivalence I pointed at earlier was "LLM code => no human on the other side".

The person driving the LLM is a teachable human who can learn what's going on and learn to improve the code. It's simply not true that there's no person on the other side of the PR.

The idea that we should be comparing "teaching a human" to "teaching an LLM" is yet another instance of this false equivalence.

It's not inherently pointless to provide feedback on a PR with code written using an LLM, that feedback goes to the person using the LLM tools.

People are swallowing this b.s. marketing mystification of "LLMs as non human entities". But really they're fancy compilers that we have a lot to learn about.

replies(1): >>44981579 #
48. macawfish ◴[] No.44980772{7}[source]
You're totally dismissing this person's agency and their ability to learn. You're all but writing off their existence.
49. fluidcruft ◴[] No.44981058{3}[source]
How does an "I didn't use AI" pledge provide any assurance/provenance that submitted code wasn't copied from an AGPLv3 reference?
replies(1): >>44981229 #
50. eschaton ◴[] No.44981192{4}[source]
And I’m saying, as a maintainer, you have to and are doing both, even if you don’t think you are.

For example, you either make your contributors attest that their changes are original or that they have the right to contribute their changes—or you assume this of them and consider it implicit in their submission.

What you (probably) don’t do is welcome contributions that the contributors do not have the right to make.

51. eschaton ◴[] No.44981229{4}[source]
It doesn’t, it provides an assurance (but not provenance) you didn’t use AI.

Assuring you didn’t include any AGPLv3 code in your contribution is exactly the same kind of assurance. It also doesn’t provide any provenance.

Conflating assurance with provenance is bogus because the former is about making a representation that, if false, exposes the person making it to liability. For most situations that’s sufficient that provenance isn’t needed.

52. nullc ◴[] No.44981560[source]
Better to think in terms of distrust rather than trust.

Presumably if a contributor repeatedly made bad PRs that didn't do what they said, introduced bugs, scribbled pointlessly on the codebase, and when you tried to coach or clarify, at best they later forgot everything you said and at worst outright gaslit and lied to you about their PRs... you would reject or decline to review their PRs, right? You'd presumably ban them outright.

Well that's exactly what commercial LLM products, with the aid of less sophisticated users, have already done to the maintainers of many large open source projects. It's not that they're not trusted-- they should be distrusted with ample cause.

So what if the above banned contributor kept getting other people to mindlessly submit their work and even proxy communication through -- evading your well earned distrust and bans? Asking people to at least disclose that they were acting on behalf of the distrusted contributor would be the least you would do, I hope? Or even asking them to disclose if and to what extent their work was a collaboration with a distrusted contributor?

53. nullc ◴[] No.44981579{10}[source]
The person operating the LLM is not a meaningfully teachable human when they're not disclosing that they're using an LLM.

IF they disclose what they've done, provide the prompts, etc., then other contributors can help them get better results from the tools. But that feedback is very different from the feedback you'd give a human who actually wrote the code in question; the latter is unlikely to be of much value (and even less likely to persist).

replies(1): >>44982704 #
54. sho_hn ◴[] No.44982704{11}[source]
Yep, true.

I've done things like share a ChatGPT account with a junior dev to steer them toward better prompts, actually, and that had some merit.

55. otterley ◴[] No.44984592{7}[source]
We’re talking about code here, not prose.

Would you like to take the Pepsi challenge? Happy to put random code snippets in front of you and see whether you can accurately determine whether it was written by a human or an LLM.

56. KritVutGu ◴[] No.45005303{6}[source]
> the pride in work thing is just not high on the list of incentives

Thanks for putting it so well.

That is what hurts. A lot. Taking pride out of work, especially creative work, makes the world a worse place; it makes life less worth living.

> inventing pale shadows of things

Yes.

57. KritVutGu ◴[] No.45005379{4}[source]
> Otherwise, what’s the harm in saying AI guides you to the solution if you can attest to it being a good solution?

For one: it threatens to make an entire generation of programmers lazy and stupid. They stop exercising their creative muscle. Writing and reviewing are different activities; both should be done continuously.

This is perfectly observable with a foreign language. If you stop actively using a foreign language after learning it really well, your ability to speak it fades pretty quickly, while your ability to understand it fades too, but less quickly.

58. KritVutGu ◴[] No.45005475{6}[source]
> The existence of AI slop fundamentally breaks these assumptions. That is why we need enforced social norms around disclosure.

Exactly! The code used to double as "proof of work". Well-formed language used to double as "proof of thinking". And that's what AI breaks: it speaks, but doesn't think. And my core point is that language that does not originate from well-reasoned human effort (i.e., from either writing the language directly, or from manually writing code that generates the language deterministically, for known reasons/intents) does not deserve human attention. Even if the "observable behavior" of such language (when executed as code) looks "alright".

And because I further think that no code should be accepted without human review (which excludes both not reviewing AI-generated code at all and having some other AI review the AI-generated code), I conclude that AI-generated code can never be accepted.

59. KritVutGu ◴[] No.45005522{7}[source]
> Normalize draft PRs and sharing big messes of code you're not quite sure about but want to start a conversation about. Normalize admitting that you don't fully understand the code you've written / are tasked with reviewing and that this is perfectly fine and doesn't reflect poorly on you at all, in fact it reflects humility and a collaborative spirit.

Such behaviors can only be normalized in a classroom / ramp-up / mentorship-like setting. Which is very valid, BUT:

- Your reviewers are always overloaded, so they need some official mandate / approval to mentor newcomers. This is super important, and should be done everywhere.

- Even with the above in place: because you're being mentored with great attention to detail, you owe it to your reviewer not to drown them in AI slop. You must honor them by writing every single line that you ask them to spend their attention on yourself. Ultimately, their educative efforts are invested IN YOU, not (only) in the code that may finally be merged. I absolutely refuse to review or otherwise correct AI slop, while at the same time I'm 100% committed to transfer whatever knowledge I may have to another human.

Fuck AI.

60. KritVutGu ◴[] No.45006528{3}[source]
It is precisely the same existential threat, to code and to software developers, in my eyes. I take the exact same pride in my code (which I want to be free software, BTW) as artists do in their art. Writing code is a form of self-expression and self-realization for me, and as such, it is completely personal, between myself, and those (humans) who read my code.