Now this would effectively kill off the current AI powered solution, because they have no way of explaining, or even understanding, why a paper may be plagiarized or not, but I'm okay with that.
> The data subject shall have the right not to be subject to a decision based solely on automated processing [...]
[1]: https://gdpr.eu/article-22-automated-individual-decision-mak...
I also think that there should be laws requiring a clear explanation whenever that happens.
Reading the rules quickly, it does seem like you're not entitled to know why the computer flagged you, only that you have the right to "obtain human intervention". That seems a little too soft; I'd like to know exactly which rules I'm being judged under.
That's not correct. Some solutions look at perplexity under specific models, others at n-gram frequencies and similar approaches. Almost all of those can produce a heatmap of "what looks suspicious". I wouldn't expect any of the detection systems to be black boxes that just run an LLM over the whole text.
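To make that concrete, here's a minimal, self-contained sketch (purely illustrative, not any vendor's actual code; the corpus, thresholds, and function names are all made up) of how an n-gram surprisal score could be turned into a per-sentence heatmap that a reviewer can actually inspect:

```python
# Toy n-gram detector: score each sentence of a document by how predictable
# it is under a reference bigram model, so a human sees *where* the text
# looks suspicious instead of getting a single opaque verdict.
from collections import Counter
import math
import re


def tokens(text):
    return re.findall(r"[a-z']+", text.lower())


def train_bigram_model(corpus):
    """Count unigrams and bigrams from a reference corpus."""
    toks = tokens(corpus)
    return Counter(toks), Counter(zip(toks, toks[1:]))


def sentence_surprisal(sentence, unigrams, bigrams, vocab_size):
    """Average per-token surprisal (bits) with add-one smoothing.
    Unusually low surprisal means the text is very predictable under the
    reference model -- one crude signal sometimes associated with
    machine-generated prose."""
    toks = tokens(sentence)
    if len(toks) < 2:
        return None
    total = 0.0
    for prev, cur in zip(toks, toks[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        total += -math.log2(p)
    return total / (len(toks) - 1)


def heatmap(document, unigrams, bigrams):
    """Print a per-sentence score: the 'heatmap' a reviewer could interrogate."""
    vocab_size = len(unigrams)
    for sentence in re.split(r"(?<=[.!?])\s+", document.strip()):
        score = sentence_surprisal(sentence, unigrams, bigrams, vocab_size)
        if score is not None:
            print(f"{score:6.2f} bits/token | {sentence}")


if __name__ == "__main__":
    # Stand-in reference corpus; a real system would train on far more text.
    reference = "the cat sat on the mat. the dog sat on the rug. the cat saw the dog."
    uni, bi = train_bigram_model(reference)
    heatmap("The cat sat on the mat. Quantum turnips orbit loudly.", uni, bi)
```

The point isn't that this toy model is accurate; it's that every flagged sentence comes with a number you can question and explain, which is exactly what a bare end-to-end LLM verdict doesn't give you.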
For anyone interested in learning more about this, I recommend the recent book "AI Snake Oil" by Arvind Narayanan and Sayash Kapoor [1]. It is a critical but nuanced book and helps to see the whole AI hype a little more clearly.
[1] https://press.princeton.edu/books/hardcover/9780691249131/ai....
In any case, if you were to use LLMs, or other black-box solutions, you'd have to yank those out if you were met with a requirement to explain why something is suspicious.
Reliable systems in some areas? - Absolutely, and yes, even facial recognition. I agree, it works very well, but that is a different issue as it does not reveal or try to guess anything about the inner person. There are other problems that arise from the fact that it works so well (surveillance, etc.), but I did not mean that part of the equation.
Really? A spammer is trying to ace a test where my attention is the prize. I don't really see a huge difference between a student/diploma and a spammer/my attention.
Education tech companies have been playing with ML and similar "AI adjacent" tech for decades. If you went to school in the US any time after computers entered the classroom, you probably had some exposure to a machine-generated or machine-scored test. That data was used to tailor lessons to pupil interests, goals, and state curricula. Good software also gave instructors feedback about where each student or cohort was struggling.
LLMs are just an evolution of tech that's been pretty well integrated into academic life for a while now. Was anything in academia prepared for this evolution? No. But banning it outright isn't going to work.
This happens anyways, though? Any service that's useful for alternative / shady / illicit purposes is part of a cat/mouse game. Even if you don't tell the $badActors what you're looking for, they'll learn soon enough what you're not looking for just by virtue of their exploitative behavior still working.
I'm a little skeptical of any "we fight bad guys!" effort that can be completely tanked by telling the bad guys how they got caught.
The problem being discussed here feels like it should be similar in that last regard: any time an automated system is making a serious decision they should be required to have an explanation and review process. If they don’t have sufficient evidence to back up the claim, they need to collect that evidence before making further accusations.
In what world is this fair? Our court systems certainly don't operate under these assumptions.
I had both, over a decade ago in high school. Plagiarism detection is the original AI detection, although they usually told you specifically what you were accused of stealing from. A computer-based English course I took over the summer used automated grading to decide if what you wrote was good enough (IIRC they did have a human look over it at some point).
But if you're a nobody and can't afford to go to court against Deutsche Bank, for example, of course you're SOL. The EU has some good parts, but it's still a human government.
It's especially problematic since a good chunk of those "flagged" are actually doing something nefarious, and both courts and government will consider that "mostly works" is a good outcome. One or ten unlucky citizens are just the way the world works, as long as it's not someone with money or power or fame.
Figuring out who the hell you are in your high school years was hard enough when Kafka was only a reading assignment.
The assumption is that a human reviews the conditions that led the automated system to make that decision.
I think it would be trivial to argue in court that rubberstamping some scalar value that a deep neural net or whatever spit out does not pass that bar. It's still the automated system's decision, the human is just parroting it.
Note that it's easier for the FAANGs to argue such a review has happened, because they have massive amounts of heterogeneous data in which there's bound to be something sufficient to argue with (like having posted something that offended someone).
But a single score? I'd say almost impossible to argue. One would have to demonstrate that the system is near-perfect, and virtually never makes mistakes.
Either there can be an undefeatable AI detector or an undetectable AI writer; both can't exist in the same universe. And my assumption is that, with sufficient advances, there could be a fully human-equivalent AI whose output is not distinguishable from a human's in any way, so in that sense detection will never reliably work.