Do AI detectors work? Students face false cheating accusations

(www.bloomberg.com)

greatartiste ◴[21 Oct 24 06:46 UTC] No.41901335[source]▶

For a human who deals with student work or reads job applications spotting AI generated work quickly becomes trivially easy. Text seems to use the same general framework (although words are swapped around) also we see what I call 'word of the week', where whichever 'AI' engine is in use gets hung up on a particular English word, often an unusual one, and uses it at every opportunity. It isn't long before you realise that the adage that this is just autocomplete on steroids is true.

However, programming a computer to do this isn't easy. In a previous job I dealt with plagiarism detectors and soon realised how garbage they were (and also how easily fooled they are, but that is another story). The staff soon realised what garbage these tools were, so if a student accused of plagiarism decided to argue back, the accusation would be quietly dropped.

replies(14): >>41901440 #>>41901484 #>>41901662 #>>41901851 #>>41901926 #>>41901937 #>>41902038 #>>41902121 #>>41902132 #>>41902248 #>>41902627 #>>41902658 #>>41903988 #>>41906183 #

1. acchow ◴[21 Oct 24 07:09 UTC] No.41901484[source]▶

>>41901335 #

> For a human who deals with student work or reads job applications spotting AI generated work quickly becomes trivially easy. Text seems to use the same general framework (although words are swapped around) also we see what I call 'word of the week'

Easy to catch people that aren't trying in the slightest not to get caught, right? I could instead feed a corpus of my own writing to ChatGPT and ask it to write in my style.

replies(1): >>41901583 #

2. hau ◴[21 Oct 24 07:29 UTC] No.41901583[source]▶

>>41901484 (TP) #

I don't believe detection is possible at all if any effort is made beyond prompting chat-like interfaces to "generate X". Given a hand-crafted corpus of text, even current LLMs could produce a perfect style transfer for a generated continuation. If someone believes it's trivially easy to detect, then they have absolutely no idea what they are dealing with.

I assume most people would make the least amount of effort and simply prompt a chat interface to produce some text; such text is rather detectable. I would like to see some experiments even for this type of detection, though.

replies(1): >>41901673 #

3. hnlmorg ◴[21 Oct 24 07:42 UTC] No.41901673[source]▶

>>41901583 #

Are you then plagiarising if the LLM is just regurgitating stuff you’d personally written?

The point of these detectors is to spot stuff the students didn’t research and write themselves. But if the corpus is your own written material then you’ve already done the work yourself.

replies(2): >>41901696 #>>41901754 #

4. throwaway290 ◴[21 Oct 24 07:49 UTC] No.41901696{3}[source]▶

>>41901673 #

An LLM just regurgitates stuff as a matter of principle. You can request someone else's style. People who are easy to detect simply don't do that. But they will learn quickly.

replies(2): >>41902120 #>>41903123 #

5. hau ◴[21 Oct 24 08:01 UTC] No.41901754{3}[source]▶

>>41901673 #

Oh I agree, using LLMs to produce text that is expected to be written by a human is at least deceptive and probably plagiarism. It's also skipping some important work, if we're talking about some person trying to detect it at all, usually in an education context.

Students don't have to perform research or study for the given task; they need to acquire an example of text suitable for reproducing their style and text structure, to create an impression of being produced by hand, so the original task can be avoided. You have to have at least one corpus of your own work for this to work, or an adequate substitute. And you could still reject works by their content, but we are specifically talking about LLM smell.

I was talking about the task of detecting LLM-generated text, which is incredibly hard if any effort is made, while some people have the impression that it's trivially easy. That leads to unfair outcomes while giving false confidence to e.g. teachers that LLMs are adequately accounted for.

6. A4ET8a8uTh0 ◴[21 Oct 24 09:10 UTC] No.41902120{4}[source]▶

>>41901696 #

Yep, some with fun results. I occasionally amuse myself now by asking for X in the style of writing of fictional figure Y. It does have moments.

7. hnlmorg ◴[21 Oct 24 11:50 UTC] No.41903123{4}[source]▶

>>41901696 #

I’ve found LLMs to be relatively poor at writing in someone else’s style beyond superficial / comical styles like “pirate” or “Shakespeare”.

To get an LLM to generate content in your own writing, there’s going to be no substitute for training it on your own corpus. By which point you might as well do the work yourself.

The whole point of cheating is to avoid doing the work. Building your own corpus requires doing that work.

replies(1): >>41903410 #

8. throwaway290 ◴[21 Oct 24 12:25 UTC] No.41903410{5}[source]▶

>>41903123 #

I meant you don't need to feed it your corpus if it's good enough at mimicking styles. Just ask it to mimic someone else. I don't mean novelty styles like pirate or Shakespeare. Mimic "a student with average ability". Then ask it to ramp up authenticity. Or even use some model or service with this built in so you don't even need to write any prompts. Zero effort.

You're saying it's not good enough at mimicking styles. Others are saying it's good enough. I think if it's not good enough today it'll be good enough tomorrow. Are you betting on it not becoming good enough?

replies(1): >>41903656 #

9. hnlmorg ◴[21 Oct 24 12:55 UTC] No.41903656{6}[source]▶

>>41903410 #

I’m betting on it not becoming good enough at mimicking a specific student’s style without having access to their specific work.

Teachers will notice if a student’s writing style shifts in one piece compared to another.

Nobody disputes that you can get LLMs to mimic other people. However it cannot mimic a specific style it hasn’t been trained on. And very few people who are going to cheat are going to take the time to train an LLM on their writing style since the entire point of plagiarism is to avoid doing work.

replies(1): >>41904878 #

10. throwaway290 ◴[21 Oct 24 14:55 UTC] No.41904878{7}[source]▶

>>41903656 #

How would the teacher know what student's style is if she always uses the LLM? Also do you expect that student's style is fixed forever or teachers are all so invested that they can really tell when the student is trying something new vs use an LLM that was trained to output writing in the style of an average student?

Imagine the teacher saying "this is not your style it's too good" to a student who legit tried killing any motivation to do anything but cheat for remaining life

replies(1): >>41905280 #

11. hnlmorg ◴[21 Oct 24 15:34 UTC] No.41905280{8}[source]▶

>>41904878 #

> How would the teacher know what student's style is if she always uses the LLM?

If the student always uses LLMs then it would be pretty obvious by the fact that they’re failing at the course in all bar the written assessments (ie the stuff they can cheat on).

> Also do you expect that student's style is fixed forever

Of course not. But people’s styles don’t change dramatically on one paper and reset back afterwards.

> teachers are all so invested that they can really tell when the student is trying something new vs use an LLM that was trained to output writing in the style of an average student?

Depends on the size of the classes. When I was at college, I know that teachers did check for changes in writing styles. I know this because one of the kids in my class was questioned about changes in his writing style.

With time, I’m sure anti-cheat software will also check against previous works by the students to check for changes in style.

However this was never my point. My point was that cheaters wouldn’t bother training on their own corpus. You keep pushing the conversation away from that.

> Imagine the teacher saying "this is not your style it's too good" to a student who legit tried killing any motivation to do anything but cheat for remaining life

That’s how literally no good teacher would ever approach the subject. Instead they’d talk about how good the paper was and ask about where the inspiration came from.

replies(2): >>41906979 #>>41911940 #

12. og_kalu ◴[21 Oct 24 18:32 UTC] No.41906979{9}[source]▶

>>41905280 #

>If the student always uses LLMs then it would be pretty obvious by the fact that they’re failing at the course in all bar the written assessments (ie the stuff they can cheat on).

There's nothing stopping students from generating an essay and going over it.

>Of course not. But people’s styles don’t change dramatically on one paper and reset back afterwards.

Takes just a little effort to avoid this.

>With time, I’m sure anti-cheat software will also check against previous works by the students to check for changes in style.

That's never going to happen. Probably because it doesn't make any sense. What's a change in writing style ? Who's measuring that ? And why is that an indicator of cheating ?

>However this was never my point. My point was that cheaters wouldn’t bother training on their own corpus. You keep pushing the conversation away from that.

Training is not necessary in any technical sense. A decent sample of your writing in the context is more than good enough. Probably most cheaters wouldn't bother but some certainly would.

replies(1): >>41908102 #

13. hnlmorg ◴[21 Oct 24 20:17 UTC] No.41908102{10}[source]▶

>>41906979 #

> There's nothing stopping students from generating an essay and going over it.

This then comes back to my original point. If they learn the content and rewrite the output, is it really plagiarism?

> Takes just a little effort to avoid this.

That depends entirely on the size of the coursework.

> That's never going to happen. Probably because it doesn't make any sense. What's a change in writing style ? Who's measuring that ? And why is that an indicator of cheating ?

This entire article and all the conversations that followed are about using writing styles to spot plagiarism. It’s not a new concept nor a claim I made up.

So if you don’t agree with this premise then it’s a little late in the thread to be raising that disagreement.

> Training is not necessary in any technical sense. A decent sample of your writing in the context is more than good enough. Probably most cheaters wouldn't bother but some certainly would.

I think you’d need a larger corpus than the average cheater would be bothered to do. But I will admit I could be waaay off in my estimations of this.

replies(1): >>41908304 #

14. og_kalu ◴[21 Oct 24 20:38 UTC] No.41908304{11}[source]▶

>>41908102 #

>This then comes back to my original point. If they learn the content and rewrite the output, is it really plagiarism?

Who said anything about rewriting? That's not necessary. You can have GPT write your essay and all you do is study it afterwards, maybe ask questions etc. You've saved hours of time and yes that would still be cheating and plagiarism by most.

>This entire article and all the conversations that followed are about using writing styles to spot plagiarism. It’s not a new concept nor a claim I made up.

>So if you don’t agree with this premise then it’s a little late in the thread to be raising that disagreement.

The article is about piping essays into black box neural networks that you can at best hypothesize are looking for similarities between the presented writing and some nebulous "AI" style. It's not comparing styles between your past works and telling you that you just cheated because of some deviation. That's never going to happen.

>I think you’d need a larger corpus than the average cheater would be bothered to do. But I will admit I could be waaay off in my estimations of this.

An essay or two in the context window is fine. I think you underestimate just what SOTA LLMs are capable of.

You don't even need to bother with any of that if all you want is a consistent style. A style prompt with a few instructions to deviate from GPT's default writing style is sufficient.

My point is that it's not this huge effort to have generated writing that doesn't yo-yo in writing style between essays.

replies(1): >>41909182 #

15. hnlmorg ◴[21 Oct 24 22:21 UTC] No.41909182{12}[source]▶

>>41908304 #

> Who said anything about rewriting? That's not necessary. You can have GPT write your essay and all you do is study it afterwards, maybe ask questions etc. You've saved hours of time and yes that would still be cheating and plagiarism by most.

Maybe. But I think we are getting too deep into hypotheticals about stuff that wasn’t even related to my original point.

> The article is about piping essays into black box neural networks that you can at best hypothesize are looking for similarities between the presented writing and some nebulous "AI" style. It's not comparing styles between your past works and telling you that you just cheated because of some deviation. That's never going to happen.

You cannot postulate your own hypothetical scenarios and deny other people the same privilege. That’s just not an honest way to debate.

> My point is that it's not this huge effort to have generated writing that doesn't yo-yo in writing style between essays.

I get your point. It’s just your point requires a bunch of assumptions and hypotheticals to work.

In theory you’re right. But, and at risk of continually harping on about my original point, I think the effort involved in doing it well would be beyond the effort required for the average person looking to cheat.

And that’s the real crux of it. Not whether something can be done, because hypothetically speaking anything is possible in AI with sufficient time, money and effort. But that doesn’t mean it’s actually going to happen.

But since this entire argument is a hypothetical, it’s probably better we agree to disagree.

16. throwaway290 ◴[22 Oct 24 07:12 UTC] No.41911940{9}[source]▶

>>41905280 #

> pretty obvious by the fact that they’re failing at the course in all bar the written assessments (ie the stuff they can cheat on).

performing badly under pressure is not a thing in your world

> My point was that cheaters wouldn’t bother training on their own corpus. You keep pushing the conversation away from that.

My point was cheaters don't need to train on their corpus. That's why it's zero effort. You keep trying to wave that away

> That’s how literally no good teacher would ever approach the subject.

Now we only need to eliminate bad teachers

replies(1): >>41913974 #

17. hnlmorg ◴[22 Oct 24 13:21 UTC] No.41913974{10}[source]▶

>>41911940 #

>performing badly under pressure is not a thing in your world

No need to be rude.

Pressure presents different characteristics. Plus lecturers would be working with failing students so would understand the difference between pressure and cheating.

> My point was cheaters don't need to train on their corpus. That's why it's zero effort. You keep trying to wave that away

My entire point was that most cheats wouldn't bother training their corpus!

With the greatest of respect, have you actually read my comments?

> Now we only need to eliminate bad teachers

Well that's a whole other discussion :)

replies(1): >>41923623 #

18. throwaway290 ◴[23 Oct 24 10:29 UTC] No.41923623{11}[source]▶

>>41913974 #

> My entire point was that most cheats wouldn't bother training their corpus!

Good, because they don't need a custom corpus to cheat with LLMs with most normal teachers.

And if a teacher reduced your grade saying you are using an LLM because your style doesn't match, you just report them for it and say you were trying a new style (the teacher would probably be wrong 50% of the time anyway).

replies(1): >>41929099 #

19. hnlmorg ◴[23 Oct 24 20:42 UTC] No.41929099{12}[source]▶

>>41923623 #

> Good, because they don't need a custom corpus to cheat with LLMs with most normal teachers.

I think you're underestimating the capabilities of normal teachers. And I say this as someone with a large percentage of teachers in their family.

Also this topic was about using LLMs to spot LLMs. Not teachers spotting LLMs.

> And if a teacher reduced your grade saying you are using an LLM because your style doesn't match, you just report them for it and say you were trying a new style (the teacher would probably be wrong 50% of the time anyway).

You're drifting off topic again. I'm not going to discuss handling false positives because that's going to come down to the policies of each institution.
