←back to thread

549 points thecr0w | 1 comments | | HN request time: 0s | source
Show context
thuttinger ◴[] No.46184466[source]
Claude/LLMs in general are still pretty bad at the intricate details of layouts and visual things. There are a lot of problems that are easy to get right for a junior web dev but impossible for an LLM. On the other hand, I was able to write a C program that added gamma color profile support to linux compositors that don't support it (in my case Hyprland) within a few minutes! A - for me - seemingly hard task, which would have taken me at least a day or more if I didn't let Claude write the code. With one prompt Claude generated C code that compiled on first try that:

- Read an .icc file from disk

- parsed the file and extracted the VCGT (video card gamma table)

- wrote the VCGT to the video card for a specified display via amdgpu driver APIs

The only thing I had to fix was the ICC parsing, where it would parse header strings in the wrong byte-order (they are big-endian).

replies(3): >>46184840 #>>46185379 #>>46185476 #
jacquesm ◴[] No.46185379[source]
Claude didn't write that code. Someone else did and Claude took that code without credit to the original author(s), adapted it to your use case and then presented it as its own creation to you and you accepted this. If a human did this we probably would have a word for them.
replies(16): >>46185404 #>>46185408 #>>46185442 #>>46185473 #>>46185478 #>>46185791 #>>46185885 #>>46185911 #>>46186086 #>>46186326 #>>46186420 #>>46186759 #>>46187004 #>>46187058 #>>46187235 #>>46188771 #
mlinsey ◴[] No.46185791[source]
Certainly if a human wrote code that solved this problem, and a second human copied and tweaked it slightly for their use case, we would have a word for them.

Would we use the same word if two different humans wrote code that solved two different problems, but one part of each problem was somewhat analogous to a different aspect of a third human's problem, and the third human took inspiration from those parts of both solutions to create code that solved a third problem?

What if it were ten different humans writing ten different-but-related pieces of code, and an eleventh human piecing them together? What if it were 1,000 different humans?

I think "plagiarism", "inspiration", and just "learning from" fall on some continuous spectrum. There are clear differences when you zoom out, but they are in degree, and it's hard to set a hard boundary. The key is just to make sure we have laws and norms that provide sufficient incentive for new ideas to continue to be created.

replies(6): >>46186125 #>>46186199 #>>46187063 #>>46188272 #>>46189797 #>>46194087 #
nitwit005 ◴[] No.46187063[source]
Ask for something like "a first person shooter using software rendering", and search github for the function names for the rendering functions. Using Copilot I found code simply lifted from implementations of Doom, except that "int" was replaced with "int32_t" and similar.

It's also fun to tell Copilot that the code will violate a license. It will seemingly always tell you it's fine. Safe legal advice.

replies(2): >>46187330 #>>46190083 #
martin-t ◴[] No.46187330[source]
And this is just the stuff you notice.

1) Verbatin copy is first-order plagiarism.

2a) Second-order plagiarism of written text would be replacing words with synonyms. Or taking a book paragraph by paragraph and for each one of them, rephrasing it in your own words. Yes, it might fool automated checkers but the structure would still be a copy of the original book. And most importantly, it would not contain any new information. No new positive-sum work was done. It would have no additional value.

Before LLMs almost nobody did this because the chance that it would help in a lawsuit vs the amount of work was not a good tradeoff. Now it is. But LLMs can do "better":

2b) A different kind of second-order plagiarism is using multiple sources and plagiarizing each of them only in part. Find multiple books on the same topic, take 1 chapter from each and order them in a coherent manner. Make it more granular. Find paragraphs or phrases which fit into the structure of your new book but are verbatim from other books. See how granular you can make it.

The trick here is that doing this by hand is more work than just writing your own book. So nobody did it and copyright law does not really address this well. But with LLMs, it can be automated. You can literally instruct an LLM to do this and it will do it cheaper than any human could. However, how LLMs work internally is yet different:

n) Higher-order plagiarism is taking multiple source books, identifying patterns, and then reproducing them in your "new" book.

If the patterns are sufficiently complex, nobody will ever be able to prove what specifically you did. What previously took creative human work now became a mechanical transformation of input data.

The point is this ability to detect and reproduce patterns is an impressive innovation but it's built on top of the work of hundreds of millions[0] of humans whose work was used without consent. The work done by those employed by the LLM companies is minuscule compared to that. Yet all of the reward goes to them.

Not to mention LLMs completely defear the purpose of (A)GPL. If you can take AGPL code and pass it through a sufficiently complex mechanical transformation that the output does the same thing but copyright no longer applies, then free software is dead. No more freedom to inspect and modify.

[0]: Github alone has 100 million users ( https://expandedramblings.com/index.php/github-statistics/ ) and we have reason to believe all of their data was used in training.

replies(2): >>46187445 #>>46190385 #
jacquesm ◴[] No.46187445{5}[source]
If a human did 2a or 2b we would think that a larger infraction than (1) because it shows intent to obfuscate the origins.

As for your free software is dead argument: I think it is worse than that: it takes away the one payment that free software authors get: recognition. If a commercial entity can take the code, obfuscate it and pass it off as their own copyrighted work to then embrace and extend it then that is the worst possible outcome.

replies(1): >>46187588 #
1. martin-t ◴[] No.46187588{6}[source]
> shows intent to obfuscate the origins

Good point. Reminds me of how if you poison one person, you go to prison, but when a company poisons thousands, it gets a fine... sometimes.

> it takes away the one payment that free software authors get: recognition

I keep flip-flopping on this. I did most of my open source work not caring about recognition but about the principles of GPL and later AGPL. However, I came to realize it was a mistake - people don't judge you by the work you actually do but by the work you appear to do. I have zero respect for people who do something just for the approval of others but I am aware of the necessity of making sure people know your value.

One thing is certain: credit/recognition affect all open source code, user rights (e.g. to inspect and modify) affect only the subset under (A)GPL.

Both are bad in their own right.