553 points bookofjoe | 14 comments
adzm ◴[] No.43654878[source]
Adobe is the one major company trying to be ethical with its AI training data and no one seems to even care. The AI features in Photoshop are the best around in my experience and come in handy constantly for all sorts of touchup work.

Anyway I don't really think they deserve a lot of the hate they get, but I do hope this encourages development of viable alternatives to their products. Photoshop is still pretty much peerless. Illustrator has a ton of competitors catching up. After Effects and Premiere for video editing are getting overtaken by DaVinci Resolve, though for motion graphics it is still hard to beat After Effects. And I do love that Adobe simply uses JavaScript for its expression and scripting language.

replies(36): >>43654900 #>>43655311 #>>43655626 #>>43655700 #>>43655747 #>>43655859 #>>43655907 #>>43657271 #>>43657436 #>>43658069 #>>43658095 #>>43658187 #>>43658412 #>>43658496 #>>43658624 #>>43659012 #>>43659378 #>>43659401 #>>43659469 #>>43659478 #>>43659507 #>>43659546 #>>43659648 #>>43659715 #>>43659810 #>>43660283 #>>43661100 #>>43661103 #>>43661122 #>>43661755 #>>43664378 #>>43664554 #>>43665148 #>>43667578 #>>43674357 #>>43674455 #
f33d5173 ◴[] No.43655907[source]
Adobe isn't trying to be ethical; they are trying to be more legally compliant, because they see that as a market opportunity. On the other hand, artists complain about the legal compliance of AIs not because that is what they care about, but because they see it as their only possible redress against a phenomenon they find distasteful. A legal reality where you can only train AI on content you've licensed would be the worst for everybody bar massive companies, legacy artists included.
replies(7): >>43658034 #>>43658253 #>>43659203 #>>43659245 #>>43659443 #>>43659929 #>>43661258 #
Riverheart ◴[] No.43658253[source]
“A legal reality where you can only train AI on content you've licensed would be the worst for everybody bar massive companies, legacy artists included.”

Care to elaborate?

Also, saying artists only concern themselves with the legality of art used in AI because of distaste when there are legal cases where their art has been appropriated seems like a bold position to take.

It’s a practice founded on scooping everything up without care for origin or attribution, and it’s not like it’s a transparent process. There are people who literally go out of their way to let artists know they’re training on their art and taunt them about it online. Is it unusual that they would assume bad faith from those purporting to train their AI legally, when participation up till now has been either involuntary or opt-out? Rolling out AI features when your customers are artists is tone deaf at best and trolling at worst.

replies(1): >>43658703 #
Workaccount2 ◴[] No.43658703[source]
There is no "scooping up"; the models aren't massive archives of copied art. People either don't understand how these models work or they purposely misrepresent it (or purposely refuse to understand it).

Showing the model a picture doesn't create a copy of that picture in its "brain". It moves a bunch of vectors around that capture an "essence" of what the image is. The next image shown, from a totally different artist with a totally different style, may well move many of those same vectors again. But suffice to say, there is no copy of the picture anywhere inside of it.

This is also why these models hallucinate so much: they are not drawing from a bank of copies, they are working off of a fuzzy memory.
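The "shared vectors, not an archive" point can be illustrated with a toy numpy sketch. This is purely illustrative and not any real image model: the 64-pixel "images", the learning rate, and the update rule are all invented for the example. Training repeatedly nudges one shared set of weights, so what the "model" ends up holding is a blend of everything it saw, not a stored copy of any one input:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "images" from different artists: 64 pixel values each.
img_a = rng.random(64)
img_b = rng.random(64)

# A tiny "model": one shared weight vector, nothing like an archive.
w = np.zeros(64)

# "Training": each image nudges the same shared weights slightly
# (a crude stand-in for gradient descent; no image is stored anywhere).
for _ in range(200):
    for img in (img_a, img_b):
        w += 0.05 * (img - w)

# The weights settle near a blend of everything seen, not a copy of either image.
assert not np.allclose(w, img_a, atol=1e-2)
assert not np.allclose(w, img_b, atol=1e-2)

# And w sits far closer to the average of the inputs than to either original.
assert np.linalg.norm(w - (img_a + img_b) / 2) < np.linalg.norm(w - img_a)
```

Reconstructing either original from `w` alone is impossible here; all that survives is the blended "essence", which is the fuzzy-memory behavior described above.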

replies(3): >>43658755 #>>43658813 #>>43658942 #
1. ToucanLoucan ◴[] No.43658942[source]
Training data at scale unavoidably taints models with vast numbers of references to the same widespread ideas that appear repeatedly in said data. Because the model has "seen" probably millions of photos of Indiana Jones, if you ask for an image of an archeologist who wears a hat and uses a whip, its weighted averages are going to lead it to create something extremely similar to Indiana Jones. Disintegrating IP into trillions of pieces and then responding to an instruction to create it with something so close to the IP as to be barely distinguishable is still infringement.

The flip-side to that is the truly "original" images where no overt references are present all look kinda similar. If you run vague enough prompts to get something new that won't land you in hot water, you end up with a sort of stock-photo adjacent looking image where the lighting doesn't make sense and is completely unmotivated, the framing is strange, and everything has this over-smoothed, over-tuned "magazine copy editor doesn't understand the concept of restraint" look.

replies(1): >>43659527 #
2. tpmoney ◴[] No.43659527[source]
> if you ask for an image of an archeologist who wears a hat and uses a whip, its weighted averages are going to lead it to create something extremely similar to Indiana Jones because it has seen Indiana Jones so much.

If you ask a human artist for an image of "an archeologist who wears a hat and uses a whip", you're also going to get something extremely similar to Indiana Jones unless you explicitly ask for something else. Let's imagine we go to deviantart and ask some folks to draw us something from these prompts:

A blond-haired fighter from a fantasy world who wears a green tunic and green pointy cap and uses a sword and shield.

A foreboding space villain with all black armor, a cape and full face breathing apparatus that uses a laser sword.

A pudgy plumber of Italian descent in blue overalls and a red cap

I don't know about you, but I would expect that with nothing more than that, most of the time you're going to get something very close to Link, Darth Vader and Mario. Link might be the one with the best chance of coming out different, just because the set of publicly known images of "fantasy world heroes" is much more diverse than the sets of "black armored space samurai" and "Italian plumbers".

> Disintegrating IP into trillions of pieces and then responding to an instruction to create it with something so close to the IP as to barely be distinguishable is still infringement.

But it's the person that causes the creation of the infringing material that is responsible for the infringement, not the machine or device itself. A xerox machine is a machine that disintegrates IP into trillions of pieces and then responds to instructions to duplicate that IP almost exactly (or to the best of its abilities). And when that functionality was challenged, the courts rightfully found that a xerox machine in and of itself, regardless of its capability to be used for infringement is not in and of itself infringing.

replies(2): >>43659716 #>>43664912 #
3. Riverheart ◴[] No.43659716[source]
You know why we put up with copyrighted info in the human brain right? Because those are human beings, it’s unavoidable. This? Avoidable.

Also, the model isn’t a human brain. Nobody has invented a human brain.

And the model might not infringe if its inputs are licensed, but that doesn’t seem to be the case for most of them, and the process isn’t transparent enough to tell. If the inputs are bad, the intent of the user is meaningless. I can ask for a generic superhero and not mean to get Superman, but if I do, I can’t blame that on myself; I had no role in it. Heck, even the model doesn’t know what it’s doing, it’s just a function. If I Xerox Superman, my intent is clear.

replies(1): >>43659952 #
4. tpmoney ◴[] No.43659952{3}[source]
> You know why we put up with copyrighted info in the human brain right? Because those are human beings, it’s unavoidable.

I would hope we put up with it because "copyright" is only useful to us insofar as it advances good things that we want in our society. I certainly don't want to live in a world where if we could forcibly remove copyrighted information from human brains as soon as the "license" expired that we would do so. That seems like a dystopian hell worse than even the worst possible predictions of AI's detractors.

> I can ask for a generic super hero and not mean to get superman but if I do I can’t blame that on myself, I had no role in it, heck even the model doesn’t know what it’s doing, it’s just a function.

And if you turn around and discard that output and ask for something else, then no harm has been caused. Just like when artists trace other artists' work for practice: no harm is caused, and while it might be copyright infringement in the "literal meaning of the words" sense, it's also not something that we as a society consider meaningfully infringing. If, on the other hand, said budding artist started selling copies of those traces, or making video games using assets scanned from those traces, then we do consider it infringement worth worrying about.

> If I Xerox Superman my intent is clear.

Is it? If you have a broken xerox machine and you think you have it fixed, grab the nearest papers you can find and as a result of testing the machine xerox Superman, what is your intent? I don't think it was to commit copyright infringement, even if again in the "literal meaning of the words" sense you absolutely did.

replies(1): >>43660326 #
5. Riverheart ◴[] No.43660326{4}[source]
I’m saying that retaining information is a natural, accepted part of being human and operating in society. Don’t know why it needed to be turned into an Orwell sequel.
replies(2): >>43660501 #>>43661530 #
6. tpmoney ◴[] No.43660501{5}[source]
I had assumed when you said that a human retaining information was "unavoidable" and a machine retaining it was "avoidable" that the implication was we wouldn't tolerate humans retaining information if it was also "avoidable". Otherwise I'm unclear what the intent of distinguishing between "avoidable" and "unavoidable" was, and I'm unclear what it has to do with whether or not an AI model that was trained with "unlicensed" content is or isn't copyright infringing on its own.
replies(1): >>43661524 #
7. Riverheart ◴[] No.43661524{6}[source]
I’m in the camp that believes that it’s neither necessary nor desirable to hold humans and software to the same standard of law. Society exists for our collective benefit and we make concessions with each other to ensure it functions smoothly and I don’t think those concessions should necessarily extend to automated processes even if they do in fact mimic humans for the myriad ways in which they differ from us.
replies(1): >>43664300 #
8. CaptainFever ◴[] No.43661530{5}[source]
Appeal to nature fallacy.

https://www.logicallyfallacious.com/logicalfallacies/Appeal-...

replies(2): >>43661581 #>>43668604 #
9. Riverheart ◴[] No.43661581{6}[source]
I’m not saying it’s better because it’s naturally occurring. The objective reality is that we live in a world of IP laws where humans have no choice but to retain copyrighted information to function in society. I don’t care that text or images have been compressed into an AI model as long as it’s done legally, but the fact that they have been has very real consequences for society: unlike a human, the model doesn’t need to eat, sleep, or pay taxes, nor will it ever die, which is constantly ignored in this conversation about what’s best for society.

These tools are optional whether people like to hear it or not. I’m not even against them ideologically, I just don’t think they’re being integrated into society in anything resembling a well thought out way.

10. tpmoney ◴[] No.43664300{7}[source]
So what benefit do we derive as a society from deciding that the capability for copyright infringement is in and of itself infringement? What do we gain by overturning the current protections the law (or society) currently has for technologies like xerox machines, VHS tapes, blank CDs and DVDs, media ripping tools, and site scraping tools? Open source digital media encoding, blank media, site scraping tools and bit-torrent enable copyright infringement on a massive scale to the tune of millions or more dollars in losses every year if you believe the media companies. And yet, I would argue as a society we would be worse off without those tools. In fact, I'd even argue that as a society we'd be worse off without some degree of tolerated copyright infringement. How many pieces of interesting media have been "saved" from the dust bin of history and preserved for future generations by people committing copyright infringement for their own purposes? Things like early seasons of Dr Who or other TV shows that were taped over and so the only extant copies are from people's home collections taped off the TV. The "De-specialized" editions of Star Wars are probably the most high quality and true to the original cuts of the original Star Wars trilogy that exist, and they are unequivocally pure copyright infringement.

Or consider the youtube video "Fan.tasia"[1]. That is a collection of unlicensed video clips, combined with another individual's work which is itself a collection of unlicensed audio clips, mashed together into an amalgamation of sight and sound to produce something new and, I would argue, original, but very clearly also full of copyright infringement and facilitated by a bunch of technologies that enable infringement at scale. It is (IMO) far more obviously copyright infringement than anything an AI model is. Yet I would argue a world in which that media and the technologies that enable it were made illegal, or heavily restricted to only the people who could afford to license all of the things that went into it from the people who created all the original works, would be a worse world for us all. The ability to easily commit copyright infringement at scale enabled the production of new and interesting art that would not have existed otherwise, and almost certainly built skills (like editing and mixing) for the people involved. That, to me, is more valuable to society than ensuring that all the artists and studios whose work went into that media got whatever fractions of a penny they lost from having their works infringed.

[1]: https://www.youtube.com/watch?v=E-6xk4W6N20&pp=ygUJZmFuLnRhc...

replies(1): >>43664710 #
11. Riverheart ◴[] No.43664710{8}[source]
The capability of the model to infringe isn’t the problem. Ingesting unlicensed inputs to create the model is the initial infringement before the model has even output anything and I’m saying that copyright shouldn’t be assigned to it or its outputs. If you train on licensed art and output Darth Vader that’s cool so long as you know better than to try copyrighting that. If you train on licensed art and produce something original and the law says it’s cool to copyright that or there’s just no one to challenge you, also cool.

If you want to ingest unlicensed input and produce copyright-infringing stuff for no profit, just for the love of the source material, well, that's complicated. I'm not saying no good ever came of it, and the tolerance for infringement comes from it happening on a relatively small scale. If I take an artist's work with a very unique style, feed it into a machine, and then mass-produce art for people based on that style, and the artist is someone who makes a living off commissions, I'm obviously doing harm to their business model. Fanfics/fanart of Nintendo characters are probably not hurting Nintendo. It's not black or white. It's about striking a balance, which is hard to do. I can't just give it a pass because large corporations will weather it fine.

That Fantasia video was good. You ever see Pogo's Disney remixes? Incredible musical creativity, but also infringing. I don't doubt the time and effort needed to produce these works; they couldn't just write a prompt and hit a button. I respect that. At the same time, this stuff is special partly because there aren't a lot of things like it. If you made an AI to spit out stuff like this, it would be just another video on the internet. Stepping outside copyright, I would prefer not to see a flood of low-effort work drown out everything that feels unique, whimsical, and personal, but I can understand those who would prefer the opposite. Disney hasn't taken it down in the last 17 years, and god I'm old. https://youtu.be/pAwR6w2TgxY?si=K8vN2epX4CyDsC96

The training of unlicensed inputs is the ultimate issue and we can just agree to disagree on how that should be handled. I think

12. ToucanLoucan ◴[] No.43664912[source]
> But it's the person that causes the creation of the infringing material that is responsible for the infringement, not the machine or device itself.

That's simply not good enough. This is not merely a machine that can be misused by a bad actor; this is a machine that specializes in infringement. It's a machine which is internally biased, by the nature of how it works, toward infringement, because it is inherently "copying": it is copying the weighted averages of millions, perhaps billions, of training images, many of which depict similar things. No, it doesn't explicitly copy one Indiana Jones image or another: it copies a shit ton of Indiana Jones images, mushed together into a "new" image from a technical perspective, but one that inherits all the most prominent features of all of those images, and thus: it remains a copy.

And if you want to disagree with this point, it'd be most persuasive then to explain why, if this is not the case, AI images regularly end up infringing on various aspects of various popular artworks, like characters, styles, intellectual properties, when those things are not being requested by the prompt.

> If you ask a human artist for an image of "an archeologist who wears a hat and uses a whip" you're also going to get something extremely similar to Indiana Jones unless you explicitly ask for something else.

No, you aren't, because an artist is a person who doesn't want to suffer legal consequences for drawing something owned by someone else. Unless you specifically commission "Indiana Jones fanart", I highly doubt you'll get something like him, because an artist will want to use this work to promote themselves to others, and unless you are driven to exist in the copyright gray area of fan-created works, which is inherently legally dicey, you wouldn't do that.

replies(1): >>43669615 #
13. ToucanLoucan ◴[] No.43668604{6}[source]
Firstly, it’s not an appeal to nature fallacy to accurately describe how a product of nature works. Secondly, it’s the peak of lazy online discussion to name a fallacy and leave as though it means something. Fallacies can be applied to tons of good arguments; along with naming the fallacy, you need to explain why the point itself is fallacious.

It’s a philosophical concept not a trap card.

14. tpmoney ◴[] No.43669615{3}[source]
> This is not merely a machine that can be misused if desired by a bad actor, this is a machine that specializes in infringement.

So is a xerox machine. Its whole purpose is to make copies of whatever you put into it, with no regard to whether you have a license to make that copy. Likewise with the record capability on your VCR. Sure, you could hook it up to a camcorder and transfer your home movie from Super-8 to VHS (or, like one I used to own, it might even have a camera accessory and port that you could hook a camera up to directly), and yet I would wager most recordings on most VCRs were made to commit copyright infringement. BitTorrent specializes in facilitating copyright infringement, no matter how many Linux ISOs you download with it. CD ripping software and DeCSS are explicitly about copyright infringement. And let's be real: while MAME is a phenomenal piece of software that has done an amazing job of documenting legacy hardware and its quirks, the entire emulation scene as a whole is built on copyright infringement, and I would wager to a rounding error that none of the folks who write MAME emulators have a license to copy the ROMs they use to do it.

But in all of these cases, the fact that it can (and even usually is) used for copyright infringement is not in and of itself a reason to restrict or ban the technology.

> And if you want to disagree with this point, it'd be most persuasive then to explain why, if this is not the case, AI images regularly end up infringing on various aspects of various popular artworks, like characters, styles, intellectual properties, when those things are not being requested by the prompt.

Well, for starters, I'd like to clarify two axioms:

1) "characters" are a subset of "intellectual properties"

2) "style" is not something you can copyright or infringe under US law. It can be part of a trademark or a design patent, and certainly you can commit fraud if you represent something in someone else's style as being a genuine item from that person, but style itself is not protected and I don't think it should be.

So then to answer the question, I would argue that AI images don't "regularly end up infringing on ... intellectual properties, when those things are not being requested by the prompt". I've generated quite a few AI images myself in exploring the various products out there and not a one of them has generated an infringing work, because none of my prompts have asked it to generate an infringing work. It is certainly possible that a given model with a sufficiently limited training set for a given set of words might be likely to generate an infringing image on a prompt, and that's because with a limited set of options to draw from, the prompt is inherently asking for an infringing image no matter how much you try to scrape the serial numbers off. That is, if I ask for an image of "two Italian plumbers who are brothers and battle turtles", everyone knows what that prompt is asking for. There's not a lot of reference options for that particular set of requirements and so it is more likely to generate an infringing image. It's also partly a function of the current goals of the models. As it stands, for the most part we want a model that takes a vague description and gives us something that matches our imagined output. Give that description to most people and they're going to envision the Mario Brothers, so a "good" image generation model is one that will generate a "Mario Brothers" inspired (or infringing) image.

As the technology improves and we get better about producing models that can take new paths without also generating body horror results, and as the users start wanting models that are more creative, we'll begin to see models that can respond to even that limited training set and generate something more unique and less likely to be infringing.

> No, you aren't, because an artist is a person that doesn't want to suffer legal consequences for drawing something owned by someone else.

Sorry, I think you're wrong. If you commission it for money from someone with enough potential visibility, you might encounter people who go out of their way to avoid anything that could be construed as Indiana Jones, but I bet even then you'd get more "Indiana Jones with the serial numbers filed off" images than not.

But if you just asked random artists to draw that prompt, you're going to get an artist's rendition of Indiana Jones. It's clear that's what you want from the prompt, and that's the single and sole cultural creative reference for that prompt. Though I suppose you and I will have to agree to disagree about what people will do, unless you feel like actually asking a bunch of artists on Fiverr to draw the prompt for you.

And realistically, what do you expect them to draw when you make that request? When that article showed up, EVERYONE reading the headline knew it was talking about an AI generating Indiana Jones. Why did everyone know that? Because of the limited reference for that prompt. "Archeologist who wears a hat and uses a whip" uniquely describes a single character to almost every single person.

There's a reason no one is writing articles about AIs ripping off Studio Ghibli by showing the output of the prompt "raccoon with giant testicles." No one writes articles about how an AI spontaneously generated Garfield knockoffs when prompted to draw an "orange striped cat". There are no articles about AIs churning out truckloads of Superman images when someone asks for "super hero". Those articles don't exist because there are enough variations on those themes out there, enough different combinations of those words describing enough different images, that the words don't instantly conjure the same image and character for everyone. And so it goes for the AI too. Those prompts don't specifically ask for infringing art, so they don't generally generate infringing art.