747 points porridgeraisin | 30 comments
1. Syzygies ◴[] No.45063736[source]
Claude assists me in my math research.

The scenario that concerns me is that Claude learns unpublished research ideas from me as we chat and code. Claude then suggests these same ideas to someone else, who legitimately believes this is now their work.

Clearly commercial accounts use AI to assist in developing intellectual product, and privacy is mandatory. The same can apply to individuals.

replies(9): >>45063744 #>>45064034 #>>45064105 #>>45064140 #>>45064248 #>>45064416 #>>45064428 #>>45065522 #>>45065601 #
2. Aurornis ◴[] No.45063744[source]
When you get the pop-up about the new terms, select the “opt out” option. Then your chats will not be used for training.
replies(1): >>45064328 #
3. vdfs ◴[] No.45064034[source]
> Claude assists me in my math research.

> Claude then suggests these same ideas to someone else, who legitimately believes this is now their work.

Won't this mean that Claude assisted you with someone else's work? Sure, it's not from a "chat", but Claude doesn't really know anything other than its training data.

replies(3): >>45064395 #>>45064437 #>>45066937 #
4. andrewmcwatters ◴[] No.45064105[source]
A lot of people doing cat-and-mouse threat detection development are keeping their work outside of public LLMs right now, so it sounds like you’re in the same boat as a lot of us.
5. thisOtterBeGood ◴[] No.45064140[source]
This perfectly describes one of the biggest dilemmas with AI. Where does an AI company stop utilizing human knowledge it does not actually own? Where do they draw the line? Apparently it's possible there aren't any lines drawn at all.
replies(1): >>45064192 #
6. sneak ◴[] No.45064192[source]
You can’t own knowledge. Intellectual property is a legal fiction invented to prop up industries.

You can no more own knowledge or information than you can own the number 2.

replies(2): >>45065253 #>>45068920 #
7. Ardren ◴[] No.45064248[source]
> Claude assists me in my math research.

Pulling up the ladder behind you :-)

replies(1): >>45065747 #
8. Klonoar ◴[] No.45064328[source]
Well, theoretically they won’t.

Anyone who’s worked in an engineering team is familiar with someone forgetting to check ‘if(doNotDoThisCondition)’.

This is why (among many other reasons) opt-in is more user-respecting here than opt-out.
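
A minimal sketch of that failure mode; the `opted_out` field and `build_training_set` function are hypothetical illustrations, not anything from a real provider's pipeline:

    # Hypothetical data-selection step: one missing check is the
    # difference between honoring the opt-out and training on the chat anyway.
    def build_training_set(conversations):
        selected = []
        for convo in conversations:
            # Drop or mistype this one guard and opted-out chats flow
            # into the corpus silently, with no error anyone would notice.
            if convo.get("opted_out"):
                continue
            selected.append(convo["messages"])
        return selected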

replies(1): >>45064436 #
9. iaw ◴[] No.45064395[source]
> Claude doesn't really know anything other than its training data

I've seen cases where Claude demonstrates novel behaviors or combines existing concepts in new ways based on my input. I don't think it's as simple as memorization anymore.

replies(1): >>45064788 #
10. bluecalm ◴[] No.45064416[source]
Math research, or anything new/clever in a particular niche. Imagine you optimized a piece of code to get an advantage, or came up with some clever trick to solve a common problem in your niche, and then everyone gets it for free from Claude, believing, as you pointed out, that it's now their work.

I had this exact conversation with my business partner a few days ago. Our "secret sauce" might not be worth that much after many years but still I am not comfortable exposing it to Claude. Fortunately it's very easy to separate in our project so Claude gets the other parts and is very helpful.

11. Deegy ◴[] No.45064428[source]
If your work was truly novel, wouldn't the odds of it showing up in later models be extremely low, given that these models are probabilistic?

In a sense these machines are outputting the aggregate of the collective thoughts of the commons. In order for concepts to be output they have to be quite common in the training data. Which works out kind of nice for privacy and innovation because by the time concepts are common enough to show up through inference they probably deserve to be part of the public knowledge (IP aside).

replies(1): >>45064597 #
12. SoftTalker ◴[] No.45064436{3}[source]
Forgetting. Riiighht.
13. simpaticoder ◴[] No.45064437[source]
There is a stark difference between using the public web to do research and searching through your colleagues' private notebooks and discussions to do research.
14. bluecalm ◴[] No.45064597[source]
They might optimize learning to weight novel/unexpected parts more in the future. The better the models become (the more they expect), the more value they will get from unexpected/new ideas.
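
One way that could look in training code; this is a sketch under my own assumptions (the weighting scheme and the `surprisal_weights` helper are hypothetical, not something any lab has described):

    import torch

    def surprisal_weights(per_token_loss, baseline=2.0, cap=5.0):
        # Mean loss per example is roughly the model's average surprisal
        # on that text: boilerplate scores low, novel material scores high.
        surprisal = per_token_loss.mean(dim=-1)
        # Up-weight surprising examples, capped so one outlier (or pure
        # noise) cannot dominate the batch.
        return torch.clamp(surprisal / baseline, max=cap)

    # In a training step, with loss of shape [batch, seq_len]:
    #   weights = surprisal_weights(loss.detach())
    #   weighted_loss = (weights * loss.mean(dim=-1)).mean()
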
replies(1): >>45064686 #
15. Deegy ◴[] No.45064686{3}[source]
Good point. But can the models even behave that way? They depend on probability. If they put a greater weight on novel/unexpected outputs, don't they just become undependable hallucination machines? Despite what some people think, these models can't reason about a concept to determine its validity. They depend on recurring data in training to determine what might be true.

That said, it would be interesting to see a model tuned that way. It could be marketed as a 'creativity model' where the user understands there will be a lot of junk hallucination and that it's up to them to reason whether a concept has validity or not.

replies(2): >>45064914 #>>45073377 #
16. ffsm8 ◴[] No.45064788{3}[source]
If I am standing in Finland and look out on the ocean, and the whole sky is green... Is the sky actually green?

You're equating your own perspective with objective truth, which is a very common pitfall and fallacy.

17. ceroxylon ◴[] No.45064914{4}[source]
Temperature plays a large role in tuning model output; you're correct that there is a theoretical sweet spot:

https://towardsdatascience.com/a-comprehensive-guide-to-llm-...
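
For what it's worth, the knob itself is simple; a toy sketch of temperature-scaled sampling over raw logits (not any particular provider's API):

    import numpy as np

    def sample_with_temperature(logits, temperature=1.0):
        # T < 1 sharpens the distribution (more predictable output),
        # T > 1 flattens it (more novelty, more junk); T -> 0 is greedy.
        scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-6)
        probs = np.exp(scaled - scaled.max())  # numerically stable softmax
        probs /= probs.sum()
        return np.random.choice(len(probs), p=probs)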

18. wolvesechoes ◴[] No.45065253{3}[source]
Property itself is a legal fiction. Every other right you enjoy is a legal fiction.

So what?

replies(1): >>45067227 #
19. Syzygies ◴[] No.45065522[source]
To clarify, I see AI as an association engine of immense scope. Others are responding with variations on this model in mind.

It has long been a problem in math research to distinguish between "no one has had this idea" and "one person has had this idea". This used to take months. With the internet, MathSciNet, and arXiv online, it took many iterations of guessing keywords. Now, I've spent six months learning how to coax rare responses from AI. That's not everyone's use case.

What complicates this is AI's ability to generalize. In my best paper, we imagined we were expressing in print what everyone was thinking, when we were in fact connecting the dots on an idea that was latent. This is an interesting paradox: people see you as most original when you're least original, but you're helping them think.

With the right prompts AI can also "connect the dots".

20. dns_snek ◴[] No.45065601[source]
If you talk to a human they're free to discuss your ideas with someone else. Why should LLMs be any different? The likelihood of these models reproducing your ideas word for word is essentially zero anyway.

More to the point, respecting your wishes to keep those conversations confidential would risk stifling human progress, so they have to be disregarded for the greater good.

replies(2): >>45065700 #>>45071173 #
21. dmbche ◴[] No.45065700[source]
Love to see people being directly and fully against the concept of "confidentiality"
replies(1): >>45066332 #
22. notrealyme123 ◴[] No.45065747[source]
Unpublished work vs. published work.
23. dns_snek ◴[] No.45066332{3}[source]
Not in the slightest! The only thing I'm against is hypocrisy.

LLM enthusiasts are staunch defenders of the argument that use of everyone's ideas and labour in LLM training isn't just fair use, but a moral imperative in order to advance science, art, and human progress as a whole.

It's beyond hypocritical for beneficiaries of this paradigm to then turn around and expect special treatment by demanding that "their" ideas, "their" knowledge, "their" labour be excluded from this process.

replies(1): >>45066456 #
24. dmbche ◴[] No.45066456{4}[source]
Gotcha - right with you. Gotta get my sarcasm detector checked.
25. kmacdough ◴[] No.45066937[source]
If you have an idea and are putting it together, you might use Claude for a few things:

- Search the web for related ideas. This could help if someone's already had the idea or if there are things to learn from related ideas.

- Review your writeup or proofs for mistakes and clarity

None of these things make the idea Claude's. Claude merely helped with some of the legwork.

But Claude now has your idea in clear, plain text to train on. The next time someone hits on even a similar idea, Claude might well suggest your idea outright. Not seeing your idea published, the user has no way to know it isn't a novel idea. If the person is less diligent/thorough, they may well publish first and claim it as their own, without any nefarious intent.

26. sneak ◴[] No.45067227{4}[source]
> Property itself is a legal fiction.

Maybe real property (which only exists because of a property record held in a government building), but it is self-evident to me (and, I believe, most people) that personal property is a natural right.

One only need look up some TikTok videos of Americans getting pickpocketed in Europe to see how large groups of people feel on the matter.

replies(1): >>45067882 #
27. mitthrowaway2 ◴[] No.45067882{5}[source]
But you won't feel the same way about a pickpocket who borrows the source code to the software you derive your livelihood from, your sales team's customer list, your would-be-bestselling novel manuscript, your company's secret formula for a rust-proof coating, or that scientific paper that you and your grad students have spent all summer getting ready to submit for publication?

Thank you for your generosity!

28. AvAn12 ◴[] No.45068920{3}[source]
So Anthropic should have no property rights to its own source code?
29. const_cast ◴[] No.45071173[source]
> Why should LLMs be any different?

Because they're a computer program and not a human and humans are special.

Why are humans special? Because we're humans and we make the rules.

It's as inane as saying "why can I eat a burger but I can't chop up my friend and eat him? Why is that any different?"

30. bluecalm ◴[] No.45073377{4}[source]
I think it's happening already. ChatGPT was able to connect my name to my project based on a chess.com profile and one Hacker News post, for example. It's not that hard to imagine that it learns a solution to a rare problem based on one input point. It may see one solution 1000 times and a rare solution 1 time and still be able to reference both.