1503 points participant3 | 4 comments
1. traverseda ◴[] No.43575285[source]
I don't understand why problems like this aren't solved by vector similarity search. Indiana Jones lives in a particular part of vector space.

Too close to one of the licensed properties whose generation you care to censor? Push that vector around. Honestly, detecting whether a given sentence is a thinly veiled reference to Indiana Jones seems to be exactly the kind of thing AI vector search should be good at.
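The check being proposed can be sketched as a nearest-neighbor test over embeddings. Everything here is illustrative: the 3-d vectors stand in for a real text encoder's output, and the 0.85 threshold is an arbitrary choice.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_too_close(prompt_emb, protected_embs, threshold=0.85):
    """Flag a prompt whose embedding lands near any protected property."""
    return any(cosine_similarity(prompt_emb, p) >= threshold
               for p in protected_embs)

# Toy embeddings standing in for a real encoder's output.
indy = np.array([0.9, 0.1, 0.0])
paraphrase = np.array([0.88, 0.15, 0.02])  # "bull-whip archaeologist"
unrelated = np.array([0.0, 0.2, 0.95])

print(is_too_close(paraphrase, [indy]))  # near-identical direction: flagged
print(is_too_close(unrelated, [indy]))   # orthogonal: passes
```

The point of the cosine test is that a paraphrase can avoid every keyword filter yet still point in almost the same direction as the original in embedding space.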

replies(2): >>43575341 #>>43575408 #
2. htrp ◴[] No.43575341[source]
Not worth it to compute the embedding for Indy and a "bull-whip archaeologist"? Most guardrails operate at the input level, it seems.
replies(1): >>43575543 #
3. genericone ◴[] No.43575408[source]
Thinking of it in terms of vector similarity does seem appropriate, and then the definition of similarity suddenly comes up for debate: if you don't get Harrison Ford, but a different well-known actor along with everything else Indiana Jones, what is that? Do you flatten the vector similarity matrix to a single infringement scale?
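The "flattening" question can be made concrete: given per-attribute similarities, one debatable way to collapse them is a weighted average. The attribute names and weights below are entirely hypothetical, which is exactly the commenter's point that the choice itself is up for debate.

```python
def infringement_score(attr_sims: dict, weights: dict) -> float:
    """Collapse per-attribute similarities (actor, costume, setting, ...)
    into one scalar via a weighted average -- one arbitrary choice among
    many for flattening a similarity matrix to a single scale."""
    total = sum(weights.values())
    return sum(weights[k] * attr_sims[k] for k in attr_sims) / total

# A different well-known actor, but everything else Indiana Jones:
sims = {"actor": 0.2, "costume": 0.95, "setting": 0.9}
weights = {"actor": 0.5, "costume": 0.3, "setting": 0.2}
score = infringement_score(sims, weights)  # 0.565 -- infringing or not?
```

Whether 0.565 crosses the line depends entirely on the weights and threshold chosen, which is where the legal debate would live.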
4. gavmor ◴[] No.43575543[source]
> Not worth it to compute the embedding for Indy

If IP holders submit embeddings for their IP, how can image generators "warp" the latent space around a set of embeddings so that future inferences slide around and avoid them--not perfectly, or literally, but as a function of distance, say, following a power curve?

Maybe by "Finding non-linear RBF paths in GAN latent space"[0] to create smooth detours around protected regions.

0. https://openaccess.thecvf.com/content/ICCV2021/papers/Tzelep...
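The distance-dependent "warping" described above can be sketched as a repulsion field: each protected embedding pushes nearby latents away with a strength that decays as a power of distance, so distant latents are barely touched. This is a toy 2-d sketch, not the RBF-path method of the cited paper; the `strength` and `power` parameters are illustrative assumptions.

```python
import numpy as np

def repel(z: np.ndarray, protected, strength=0.05, power=2.0, eps=1e-8):
    """Nudge latent z away from each protected point. The push magnitude
    is strength / dist**power, i.e. it follows a power curve in distance,
    applied along the unit vector pointing away from the protected point."""
    z = z.astype(float).copy()
    for p in protected:
        diff = z - p
        dist = np.linalg.norm(diff) + eps  # eps avoids division by zero
        z += (strength / dist**power) * (diff / dist)
    return z

p = np.array([1.0, 0.0])          # a protected region's embedding
near = np.array([1.1, 0.0])       # latent close to it: deflected hard
far = np.array([5.0, 0.0])        # latent far away: barely moved
moved_near, moved_far = repel(near, [p]), repel(far, [p])
```

A smooth falloff like this avoids a hard exclusion boundary: inferences near a protected embedding slide around it rather than hitting a wall, which is the "not perfectly, or literally" behavior described above.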