
747 points porridgeraisin | 5 comments
Syzygies ◴[] No.45063736[source]
Claude assists me in my math research.

The scenario that concerns me is that Claude learns unpublished research ideas from me as we chat and code. Claude then suggests these same ideas to someone else, who legitimately believes this is now their work.

Clearly commercial accounts use AI to assist in developing intellectual product, and privacy is mandatory. The same can apply to individuals.

replies(9): >>45063744 #>>45064034 #>>45064105 #>>45064140 #>>45064248 #>>45064416 #>>45064428 #>>45065522 #>>45065601 #
1. Deegy ◴[] No.45064428[source]
If your work were truly novel, wouldn't the odds of it showing up in later models be extremely low, given that these models are probabilistic?

In a sense these machines output the aggregate of the collective thoughts of the commons. For a concept to show up in the output, it has to be quite common in the training data. That works out kind of nicely for privacy and innovation: by the time a concept is common enough to surface through inference, it probably deserves to be part of public knowledge (IP aside).

replies(1): >>45064597 #
2. bluecalm ◴[] No.45064597[source]
They might optimize training to weight novel/unexpected parts more in the future. The better the models become (the more they expect), the more value they will get from unexpected/new ideas.
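As a rough sketch of what "weighting the unexpected more" could mean, here is a surprisal-weighted loss (entirely hypothetical; the function name, the weighting scheme, and the alpha parameter are my own assumptions, not anything the labs have described):

    # Hypothetical sketch: upweight tokens the model finds surprising,
    # so rare/novel ideas contribute more to the gradient.
    import torch
    import torch.nn.functional as F

    def surprisal_weighted_loss(logits, targets, alpha=1.0):
        # Per-token cross-entropy equals the token's surprisal (negative log-probability).
        ce = F.cross_entropy(logits.view(-1, logits.size(-1)),
                             targets.view(-1), reduction="none")
        # Weight each token by how unexpected it was; detach so the
        # weights themselves are not optimized away.
        weights = (1.0 + alpha * ce).detach()
        return (weights * ce).mean()

    # Toy usage: batch of 2 sequences, 5 tokens each, vocab of 100.
    logits = torch.randn(2, 5, 100, requires_grad=True)
    targets = torch.randint(0, 100, (2, 5))
    loss = surprisal_weighted_loss(logits, targets)
    loss.backward()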
replies(1): >>45064686 #
3. Deegy ◴[] No.45064686[source]
Good point. But can the models even behave that way? They depend on probability. If they put greater weight on novel/unexpected outputs, don't they just become undependable hallucination machines? Despite what some people think, these models can't reason about a concept to determine its validity. They depend on recurring data in training to determine what might be true.

That said, it would be interesting to see a model tuned that way. It could be marketed as a 'creativity model' where the user understands there will be a lot of junk hallucination and that it's up to them to reason whether a concept has validity or not.

replies(2): >>45064914 #>>45073377 #
4. ceroxylon ◴[] No.45064914{3}[source]
Temperature plays a large role in tuning model output; you're correct that there is a theoretical sweet spot:

https://towardsdatascience.com/a-comprehensive-guide-to-llm-...
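For a concrete sense of what that knob does, here is a minimal temperature-sampling sketch (the logits are made-up numbers, meant only to show how low temperature concentrates probability on the common continuation while high temperature spreads mass toward rarer ones):

    # Minimal sketch of temperature-scaled softmax over next-token logits.
    import numpy as np

    def sample_probs(logits, temperature):
        z = np.asarray(logits, dtype=float) / temperature
        z -= z.max()                      # numerical stability
        p = np.exp(z)
        return p / p.sum()

    logits = [5.0, 2.0, 0.5]              # "common", "plausible", "novel" continuations
    for t in (0.2, 1.0, 2.0):
        print(t, sample_probs(logits, t).round(3))

At very low temperature the "common" option dominates almost completely; at higher temperature the "novel" option starts to get sampled, which is roughly the creativity-vs-reliability trade-off being discussed.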

5. bluecalm ◴[] No.45073377{3}[source]
I think it's happening already. ChatGPT was able to connect my name to my project based on a chess.com profile and one Hacker News post, for example. It's not hard to imagine that it learns a solution to a rare problem from a single data point. It may see one solution 1000 times and a rare solution once, and still be able to reference both.
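To make the 1000-to-1 point concrete, here is a toy bigram-style counter (the problem/solution names and counts are invented): because prediction is conditioned on the prompt, a solution observed once for a rare problem is still perfectly recoverable alongside one observed a thousand times.

    # Toy illustration with made-up counts: conditioning on the prompt
    # keeps a rarely seen solution fully addressable.
    from collections import Counter, defaultdict

    pairs = [("common_problem", "common_solution")] * 1000 + \
            [("rare_problem", "rare_solution")]

    table = defaultdict(Counter)
    for problem, solution in pairs:
        table[problem][solution] += 1

    def predict(problem):
        counts = table[problem]
        total = sum(counts.values())
        return {s: c / total for s, c in counts.items()}

    print(predict("common_problem"))  # {'common_solution': 1.0}
    print(predict("rare_problem"))    # {'rare_solution': 1.0}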