
Faking a JPEG

(www.ty-penguin.org.uk)
337 points todsacerdoti | 1 comments
jandrese ◴[] No.44539502[source]
I wonder if you could mess with AI input scrapers by adding fake captions to each image? I imagine something like:

    (big green blob)

    "My cat playing with his new catnip ball".


    (blue mess of an image)

    "Robins nesting"
replies(2): >>44539766 #>>44540012 #
Dwedit ◴[] No.44540012[source]
A well-written scraper would check the image against a CLIP model or other captioning model to see if the text there actually agrees with the image contents.
replies(2): >>44540126 #>>44541347 #
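
The agreement check described above can be sketched as a similarity test between two embeddings. This is only an illustration, not how any particular scraper works: it assumes the image and caption have already been embedded into the same vector space by a CLIP-style model (that step is not shown), and the names `cosine_similarity`, `caption_agrees`, and the `0.25` threshold are hypothetical choices for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector carries no direction, so treat it as "no agreement".
        return 0.0
    return dot / (norm_a * norm_b)


def caption_agrees(image_embedding, caption_embedding, threshold=0.25):
    """Return True if the caption embedding is close enough to the image's.

    The threshold is an assumed placeholder; a real pipeline would have to
    tune it for whichever embedding model it uses.
    """
    return cosine_similarity(image_embedding, caption_embedding) >= threshold
```

A green blob captioned as a cat would, in principle, land below the threshold and be flagged, but only if the embedding model separates the two well enough.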
1. Someone ◴[] No.44541347[source]
Do scrapers actually run such checks on every page they download? I can see them sampling a small fraction of a site to gauge how trustworthy it is, but I would think they’d rather scrape many more pages than spend resources checking every page.

Or is the internet so full of garbage nowadays that it is necessary to do that on every page?
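
The sampling idea above could look roughly like the following sketch. It is not taken from any real scraper: `estimate_mismatch_rate`, `site_looks_trustworthy`, the 5% sample fraction, and the 20% cutoff are all hypothetical, and `is_mismatch` stands in for whatever expensive per-page check (such as a caption/image comparison) the scraper would otherwise run everywhere.

```python
import random


def estimate_mismatch_rate(pages, is_mismatch, sample_fraction=0.05, seed=0):
    """Run the costly check on a random sample and return the failure rate."""
    if not pages:
        return 0.0
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    # Check at least one page, and never more pages than exist.
    sample_size = min(len(pages), max(1, round(len(pages) * sample_fraction)))
    sample = rng.sample(pages, sample_size)
    mismatches = sum(1 for page in sample if is_mismatch(page))
    return mismatches / sample_size


def site_looks_trustworthy(pages, is_mismatch, max_mismatch_rate=0.2, **kwargs):
    """Accept the site if the sampled failure rate stays under the cutoff."""
    rate = estimate_mismatch_rate(pages, is_mismatch, **kwargs)
    return rate <= max_mismatch_rate
```

With 100 pages and the default fraction, the expensive check runs on 5 pages instead of 100; the trade-off is that a site mixing mostly good pages with a little garbage can slip through.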