Educating people about such a technical topic seems very difficult, especially since people get emotional about their work being used.
I know because I'm literally working on setting up Dreambooth to do what I'd otherwise have to pay an artist to do.
And not only is it replacing artists, it's using their own work to do so. None of these could exist without being trained on the original artwork.
Surely you can imagine why they're largely not happy?
We want data privacy, but we also like playing with any sort of leaked information. We like it when we can get music for free but clutch our pearls when Microsoft sells our code back to us. We talk a good game about free speech, but fail to understand that being shouted over, DDoSed, or harassed is a form of censorship. And whenever words are used that reference any of these concepts in ways we haven't considered - i.e. "marginalized voices", or "consent" - we circle the wagons.
The only consistent thing I can infer is that we don't like it when we get a taste of our own medicine.
In this case, technologists figured out how to exploit people's work without compensating them. A camera is possible without the artists it replaces. Generative modeling is not. It's fundamentally different.
If people figured out how to generate this kind of art without exploiting the free labor of uncompensated, unwilling artists, it would be a different story.
We're surrounded by people who don't understand what's happening. They seem to think some kind of art intelligence has been invented.
No, it's the aggregation and interpolation of vast amounts of existing art.
The same thing is happening with software, through Microsoft's Copilot:
https://bugfix-66.com/7a82559a13b39c7fa404320c14f47ce0c304fa...
I think people just don't understand what they're seeing. They have no idea what it is.
They think it's really "intelligence", dreaming and imagining and simulating and feeling and experimenting and...
It's none of these things. It's a sophisticated interpolation, not so different from linear interpolation:
a*x + (1-a)*y
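For reference, here is the formula above as a minimal Python sketch. The function name `lerp` and the sample values are illustrative and not from the thread. It shows only plain linear interpolation between two values or vectors, and makes no claim about how a generative model works internally.

```python
def lerp(a, x, y):
    """Linear interpolation: a*x + (1-a)*y, with weight a on x."""
    return a * x + (1 - a) * y


def lerp_vec(a, xs, ys):
    """Apply the same blend element-wise to two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return [lerp(a, x, y) for x, y in zip(xs, ys)]


# a = 1 returns x, a = 0 returns y, values in between blend the two.
print(lerp(1.0, 4.0, 8.0))
print(lerp(0.0, 4.0, 8.0))
print(lerp(0.25, 4.0, 8.0))
print(lerp_vec(0.5, [0.0, 2.0], [4.0, 6.0]))
```

Every output of this blend lies on the straight line between `x` and `y`.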
No it wouldn't. It would still compete against artists. We'd have worse models in the beginning and it would take time until someone licensed enough images to improve the models, but the capability is there and we know about it, too late to stop.
By the way, Stable Diffusion has been fine-tuned with Midjourney image text pairs. So now we also have AI trained on AI images.
I think both humans and AI without training are stupid. Take a human alone, raised alone, without culture. He/she will be closer to animals than humans. It's the culture that is the locus of intelligence and we're borrowing intelligence from it just like the AIs.
Landscapes are another matter. Try finding any photo of a landscape that is half as sublime as the landscape paintings made by the Hudson River School. An effective painter can improve upon optical reality in a way that beggars belief. They do this with a clever mix of increasing contrast and affinity in a way that would be almost impossible for a photographer.
It doesn't matter, and never did in the first place. All large models (including SD) are already trained on other models' output, since there's simply no possible way to have a high quality tagged dataset of the size they need. Smaller models are used to classify the data for larger ones, then the process is repeated for even larger models, with whatever manual data you have. Humans only select the data sources, and otherwise curate the entire bootstrapping process. This kind of curated training actually produces better results.
These algorithms are specifically non-linear, a far cry from ‘linear interpolation’, unless you want to water down the meaning of interpolation to be so generic it loses its meaning.
Having said all that - the sophistication of the algorithm is beside the point here, as long as what they are generating is substantially transformative (which >99% of the possible outputs are, legally speaking).
Is it? Or does the idea of a photo presuppose the painting? Could a camera have been invented by someone not looking at the world through the lens of a very particular tradition of art?
I suspect their indignation is more to do with their work being consumed without their permission, and then turned into a tool that undermines their value. These tools wouldn't exist without the work of artists. I don't think it's fair to act like no injustice has been done.
And the "idea" of a frozen, materialized, two dimensional projection of what we see with our eyes, aka an image, transcends cultures and tools.
It does not depend on a particular tradition of art. You might argue that it depends on human culture, but that's a different thing.
Also, making a portrait photo does not need concrete instances of portrait paintings.
So I don't really follow your argument.
Also, yes, I'd say the invention of the camera could be motivated by reasons that have nothing to do with art at all (documenting the physical world). The lines get blurry depending on how you define "art".
But none of that implies that the invention of the camera depends on recycling prior art.
An image AI is incapable of depicting something that's entirely missing from its training data.
A quote attributed to Mark Twain says “A photograph is a most important document, and there is nothing more damning to go down to posterity than a silly, foolish smile caught and fixed forever.”
They even gave a linear equation in their example… again, not even close. If we can call what these algorithms do interpolation, we can call what humans do interpolation too. It makes the word that meaningless.
Not sure I would agree with that. Granted, there may be a cultural component in the mix somewhere, but as someone who has painted many faces from observation, I can say the fugitive nature of a smile presents almost insurmountable problems. Frans Hals (below) could do it because he painted insanely quickly.
https://www.art-prints-on-demand.com/a/hals-frans/thelaughin...
https://images.prismic.io/barnebys/a671f804-2e03-4541-afa0-9...
https://az333960.vo.msecnd.net/images-9/laughing-boy-frans-h...
The key issue is that a smile involves the eyes as much as the face. This cannot be faked without the frozen effect, for example:
https://images7.alphacoders.com/694/694598.jpg
As for photography, the long exposures of early photography made the capture of anything fugitive an impossibility. However, the moment that snapshot photography was invented (Kodak's Box Brownie), smiles were being photographed all the time.