
1311 points msoad | 5 comments
sillysaurusx ◴[] No.35393782[source]
On the legal front, I’ve been working with counsel to draft a counterclaim to Meta’s DMCA against llama-dl. (GPT-4 is surprisingly capable, but I’m talking to a few attorneys: https://twitter.com/theshawwn/status/1641841064800600070?s=6...)

An anonymous HN user named L pledged $200k for llama-dl’s legal defense: https://twitter.com/theshawwn/status/1641804013791215619?s=6...

This may not seem like much vs Meta, but it’s enough to get the issue into the court system where it can be settled. The tweet chain has the details.

The takeaway for you is that you’ll soon be able to use LLaMA without worrying that Facebook will knock you offline for it. (I wouldn’t push your luck by trying to use it for commercial purposes though.)

Past discussion: https://news.ycombinator.com/item?id=35288415

I’d also like to take this opportunity to thank all of the researchers at MetaAI for their tremendous work. It’s because of them that we have access to such a wonderful model in the first place. They have no say over the legal side of things. One day we’ll all come together again, and this will just be a small speedbump in the rear view mirror.

EDIT: Please do me a favor and skip ahead to this comment: https://news.ycombinator.com/item?id=35393615

It's from jart, the author of the PR the submission points to. I really had no idea that this was a de facto Show HN, and it's terribly rude to post my comment in that context. I only meant to reassure everyone that they can freely hack on llama, not make a huge splash and detract from their moment on HN. (I feel awful about that; it's wonderful to be featured on HN, and no one should have to share their spotlight when it's a Show HN. Apologies.)

replies(7): >>35393813 #>>35393848 #>>35394028 #>>35394029 #>>35394084 #>>35394156 #>>35394431 #
1. sheeshkebab ◴[] No.35393848[source]
All models trained on public data need to be made public. As it is their outputs are not copyrightable, it’s not a stretch to say models are public domain.
replies(3): >>35393876 #>>35394018 #>>35407677 #
2. sillysaurusx ◴[] No.35393876[source]
I’m honestly not sure. RLHF seems particularly tricky: if someone is shaping a model by hand, it seems reasonable to extend copyright protection to them.

For the moment, I’m just happy to disarm corporations from using DMCAs against open source projects. The long term implications will be interesting.

3. xoa ◴[] No.35394018[source]
You seem to be mixing a few different things together here. There's a huge leap from something not being copyrightable to saying there are grounds for it to be made public. A lack of copyright would greatly limit the ability of model makers to legally restrict distribution once a model reached the public, but they'd be fully within their rights to keep models as trade secrets to the best of their ability. Trade secret law and practice is its own thing, separate from copyright. Lots of places have private data that isn't copyrightable (pure facts), but that's not the same as it being made public. Indeed, part of the historic idea of certain areas of IP like patents was to encourage more stuff to be made public vs kept secret.

>As it is their outputs are not copyrightable, it’s not a stretch to say models are public domain.

With all respect, this is kind of nonsensical. "Public domain" only applies to stuff that is copyrightable; if something simply isn't, then it just never enters into the picture. And it not being patentable or copyrightable doesn't mean there is any requirement to share it. If it does get out, though, then that's mostly the maker's own problem (though depending on jurisdiction and contract, whoever did the leaking might get in trouble), and anyone else is free to figure it out on their own and share that, and the maker can't do anything about it.

replies(1): >>35394844 #
4. sheeshkebab ◴[] No.35394844[source]
Public domain applies to uncopyrightable works, among other things (including previously copyrighted works). In this case models are uncopyrightable, and I think FB (or any of these newfangled AI cos) would have an interesting time proving otherwise, if they ever try.

https://en.m.wikipedia.org/wiki/Public_domain

5. __turbobrew__ ◴[] No.35407677[source]
Aggregating and organizing public knowledge is a fundamentally valuable action which many companies make their business off of.

If I create a website for tracking real estate trends in my area — which is public information — should I not be able to sell that information?

Similarly if a consulting company analyzes public market macro trends are they not allowed to sell that information?

Just because the information being aggregated and organized is public does not necessarily mean that the output product should be public too.