Most active commenters

hnlmorg(8)
throwaway290(5)
A4ET8a8uTh0(5)
red_admiral(4)
ben_w(3)
tonypace(3)

Popular/hot comments

>>41901662 #
>>41901914 #
>>41901926 #
>>41902038 #
>>41904027 #

←back to thread

Do AI detectors work? Students face false cheating accusations

(www.bloomberg.com)

1. greatartiste ◴[21 Oct 24 06:46 UTC] No.41901335[source]▶

>>41896973 (OP) #

For a human who deals with student work or reads job applications spotting AI generated work quickly becomes trivially easy. Text seems to use the same general framework (although words are swapped around) also we see what I call 'word of the week' where whichever 'AI' engine seems to get hung up on a particular English word which is often an unusual one and uses it at every opportunity. It isn't long before you realise that the adage that this is just autocomplete on steroids is true.

However programming a computer to do this isn't easy. In a previous job I had dealing with plagiarism detectors and soon realised how garbage they were (and also how easily fooled they are - but that is another story). The staff soon realised what garbage these tools are so if a student accused of plagiarism decided to argue back then the accusation would be quietly dropped.

replies(14): >>41901440 #>>41901484 #>>41901662 #>>41901851 #>>41901926 #>>41901937 #>>41902038 #>>41902121 #>>41902132 #>>41902248 #>>41902627 #>>41902658 #>>41903988 #>>41906183 #

2. ClassyJacket ◴[21 Oct 24 07:03 UTC] No.41901440[source]▶

>>41901335 (TP) #

How are you verifying you're correct? How do you know you're not finding false positives?

replies(1): >>41901515 #

3. acchow ◴[21 Oct 24 07:09 UTC] No.41901484[source]▶

>>41901335 (TP) #

> For a human who deals with student work or reads job applications spotting AI generated work quickly becomes trivially easy. Text seems to use the same general framework (although words are swapped around) also we see what I call 'word of the week'

Easy to catch people that aren't trying in the slightest not to get caught, right? I could instead feed a corpus of my own writing to ChatGPT and ask it to write in my style.

replies(1): >>41901583 #

4. Etheryte ◴[21 Oct 24 07:15 UTC] No.41901515[source]▶

>>41901440 #

Have you tried reading AI-generated code? Most of the time it's painfully obvious, so long as the snippet isn't short and trivial.

replies(1): >>41901632 #

5. hau ◴[21 Oct 24 07:29 UTC] No.41901583[source]▶

>>41901484 #

I don't believe it's possible at all if any effort is made beyond prompting chat-like interfaces to "generate X". Given a hand crafted corpus of text even current llms could produce perfect style transfer for a generated continuation. If someone believes it's trivially easy to detect, then they absolutely have no idea what they are dealing with.

I assume most people would make least amount of effort and simply prompt chat interface to produce some text, such text is rather detectable. I would like to see some experiments even for this type of detection though.

replies(1): >>41901673 #

6. thih9 ◴[21 Oct 24 07:36 UTC] No.41901632{3}[source]▶

>>41901515 #

To me it is not obvious. I work with junior level devs and have seen a lot of non-AI junior level code.

replies(1): >>41901914 #

7. tessierashpool9 ◴[21 Oct 24 07:41 UTC] No.41901662[source]▶

>>41901335 (TP) #

the students are too lazy and dumb to do their own thinking and resort to ai. the teachers are also too lazy and dumb to assess the students' work and resort to ai. ain't it funny?

replies(4): >>41901734 #>>41901918 #>>41902146 #>>41902152 #

8. hnlmorg ◴[21 Oct 24 07:42 UTC] No.41901673{3}[source]▶

>>41901583 #

Are you then plagiarising if the LLM is just regurgitating stuff you’d personally written?

The point of these detectors is to spot stuff the students didn’t research and write themselves. But if the corpus is your own written material then you’ve already done the work yourself.

replies(2): >>41901696 #>>41901754 #

9. throwaway290 ◴[21 Oct 24 07:49 UTC] No.41901696{4}[source]▶

>>41901673 #

LLM is just regurgitating stuff as a principle. You can request someone else's style. People who are easy to detect simply don't do that. But they will learn quickly

replies(2): >>41902120 #>>41903123 #

10. miningape ◴[21 Oct 24 07:56 UTC] No.41901734[source]▶

>>41901662 #

It's truly a race to the bottom.

11. hau ◴[21 Oct 24 08:01 UTC] No.41901754{4}[source]▶

>>41901673 #

Oh I agree, producing text by llms which is expected to be produced by human is at least deceiving and probably plagiarising. It's also skipping some important work, if we're talking about some person trying to detect it at all, usually in education context.

Student don't have to perform research or study for the given task, they need to acquire an example of text suitable for reproducing their style, text structure, to create an impression of being produced by hand, so the original task could be avoided. You have to have at least one corpus of your own work for this to work, or an adequate substitute. And you still could reject works by their content, but we are specifically talking about llm smell.

I was talking about the task of detecting llm generated text which is incredibly hard if any effort is made, while some people have an impression that it's trivially easy. It leads to unfair outcomes while giving false confidence to e.g. teachers that llms are adequately accounted for.

12. aleph_minus_one ◴[21 Oct 24 08:16 UTC] No.41901851[source]▶

>>41901335 (TP) #

> The staff soon realised what garbage these tools are so if a student accused of plagiarism decided to argue back then the accusation would be quietly dropped.

I ask myself when the time comes that some student will accuse the stuff of libel or slander becuase of false AI plagiarism accusations.

replies(1): >>41902481 #

13. llmthrow102 ◴[21 Oct 24 08:28 UTC] No.41901914{4}[source]▶

>>41901632 #

You mean, you work with devs who are using AI to generate their code.

replies(4): >>41901987 #>>41902164 #>>41902461 #>>41904228 #

14. llmthrow102 ◴[21 Oct 24 08:29 UTC] No.41901918[source]▶

>>41901662 #

To be fair, using humans to spend time sifting through AI slop determining what is and isn't AI generated is not a fight that the humans are going to win.

15. sumo89 ◴[21 Oct 24 08:32 UTC] No.41901926[source]▶

>>41901335 (TP) #

My other half is a non-native English speaker. She's fluent but and since ChatGPT came out she's found it very helpful having somewhere to paste a paragraph and get a better version back rather than asking me to rewrite things. That said, she'll often message me with some text and I've got a 100% hit rate for guessing if she's put it through AI first. Once you're used to how they structure sentences it's very easy to spot. I guess the hardest part is being able to prove it if you're in a position of authority like a teacher.

replies(3): >>41902119 #>>41903286 #>>41903629 #

16. p0w3n3d ◴[21 Oct 24 08:36 UTC] No.41901937[source]▶

>>41901335 (TP) #

> For a human who deals with student work or reads job applications spotting AI generated work quickly becomes trivially easy

So far. Unless there is a new generation of teachers who are no longer able to learn on non-AI generated texts because all they get is grammatically corrected by AI for example...

Even I am using Grammarly here (as being non-native), but I usually tend to ignore it, because it removes all my "spoken" style, or at least what I think is a "spoken style"

replies(1): >>41903455 #

17. ◴[21 Oct 24 08:44 UTC] No.41901987{5}[source]▶

>>41901914 #

18. JoshTriplett ◴[21 Oct 24 08:54 UTC] No.41902038[source]▶

>>41901335 (TP) #

> also we see what I call 'word of the week' where whichever 'AI' engine seems to get hung up on a particular English word which is often an unusual one and uses it at every opportunity

So do humans. Many people have pet phrases or words that they use unusually often compared to others.

replies(3): >>41902139 #>>41902297 #>>41903277 #

19. ben_w ◴[21 Oct 24 09:10 UTC] No.41902119[source]▶

>>41901926 #

My partner and I are both native English speakers in Germany; if I use ChatGPT to make a sentence in German, he also spots it 100% of the time.

(Makes me worry I'm not paying enough attention, that I can't).

20. A4ET8a8uTh0 ◴[21 Oct 24 09:10 UTC] No.41902120{5}[source]▶

>>41901696 #

Yep, some with fun results. I occasionally amuse myself now by asking for X in the style of writing of fictional figure Y. It does have moments.

21. Veen ◴[21 Oct 24 09:10 UTC] No.41902121[source]▶

>>41901335 (TP) #

The ones that are easy to spot are easy to spot. You have no idea how much AI-generated work you didn't spot, because you didn't spot it.

22. xmodem ◴[21 Oct 24 09:11 UTC] No.41902132[source]▶

>>41901335 (TP) #

One course I took actually provided students with the output of the plagiarism detector. It was great at correctly identifying where I had directly quoted (and attributed) a source.

It would also identify random 5-6 word phrases and attribute them to different random texts on completely different topics where those same 5 words happened to appear.

23. blitzar ◴[21 Oct 24 09:11 UTC] No.41902139[source]▶

>>41902038 #

No cap.

24. A4ET8a8uTh0 ◴[21 Oct 24 09:13 UTC] No.41902146[source]▶

>>41901662 #

I suppose we all get from school what we put into it.

I forgot the name of the guy, who said it, but he was some big philosophy lecturer at Harvard and his view on the matter ( heavy reading course and one student left a course review - "not reading assigned reading did not hurt me at all") was ( paraphrased):

"This guy is an idiot if he thinks the point of paying $60k a semester of parents money is to sit here and learn nothing.'

replies(1): >>41903681 #

25. sensanaty ◴[21 Oct 24 09:14 UTC] No.41902152[source]▶

>>41901662 #

It's a race to the bottom, though. Why should the humans waste their time reading through AI-generated slop that took 11ms to generate, when it can take an hour or more to manually review it?

26. ben_w ◴[21 Oct 24 09:15 UTC] No.41902164{5}[source]▶

>>41901914 #

Not saying where, but well before transformers were invented, I saw an iOS project that had huge chunks of uncompiled Symbian code in the project "for reference", an entire pantheon of God classes, entire files duplicated rather than changing access modifiers, 1000 lines inside an always true if block, and 20% of the 120,000 lines were:

And no, those were not generally followed by a real comment.

replies(1): >>41903546 #

27. SilverBirch ◴[21 Oct 24 09:30 UTC] No.41902248[source]▶

>>41901335 (TP) #

I did engineering at a university, one of the courses that was mandatory was technical communication. The prof understood that the type of person that went into engineering was not necessarily going to appreciate the subtleties of great literature, so they're course work was extremely rote. It was like "Write about a technical subject, doesn't matter what, 1500 words, here's the exact score card". And the score card was like "Uses a sentence to introduce the topic of the paragraph". The result was that you write extremely formulaic prose. Now, I'm not sure that was going to teach people to ever be great communicators, but I think it worked extremely well to bring someone who communicated very badly up to some basic minimum standard. It could be extremely effective applied to the (few) other courseworks that required prose too - partly because by being so formulaic you appealed the overworked PhD student who was likely marking it.

It seems likely that a suitably disciplined student could look a lot like ChatGPT and the cost of a false accusation is extremely high.

replies(2): >>41903648 #>>41904741 #

28. jachee ◴[21 Oct 24 09:37 UTC] No.41902297[source]▶

>>41902038 #

In the mid 90s (yes I’m dating myself here. :P) I had a classmate who was such a big NIN fan that she worked the phrase “downward spiral” into every single essay she wrote for the entire year.

29. michaelt ◴[21 Oct 24 10:00 UTC] No.41902461{5}[source]▶

>>41901914 #

Actually some of us have been in the industry for more than 22 months.

30. red_admiral ◴[21 Oct 24 10:02 UTC] No.41902481[source]▶

>>41901851 #

Or of racism. There was a thing during the pandemic where automated proctoring tools couldn't cope with people of darker skin than they were trained on; I imagine the first properly verified and scientifically valid examples of AI-detection racism will be found soon.

replies(2): >>41902556 #>>41904671 #

31. Iulioh ◴[21 Oct 24 10:15 UTC] No.41902556{3}[source]▶

>>41902481 #

The "dark skin problem" is mostly the camera sensors, not only the training...

Low light scenarios are just a thing, you would need more expensive hardware do deal with it.

replies(1): >>41902726 #

32. wrasee ◴[21 Oct 24 10:29 UTC] No.41902627[source]▶

>>41901335 (TP) #

> trivially easy

That’s the problem. It is trivially easy, 99% of the time. But that misses the entire point of the article.

If I got 99% on an exam I’d say that was trivially easy. But making one mistake in a hundred is not ok when it’s someone else’s livelihood.

33. shusaku ◴[21 Oct 24 10:34 UTC] No.41902658[source]▶

>>41901335 (TP) #

What are you asking your applicants to do that LLM use is a problem? I see no issue with having a machine compile one’s history into a resume. Is their purpose statement not original enough /s?

34. 15155 ◴[21 Oct 24 10:44 UTC] No.41902726{4}[source]▶

>>41902556 #

> mostly the camera sensors

Could it be mostly just be..reality? More expensive hardware doesn't somehow make a darker surface reflect more energy in the visible spectrum. "Low light" is not the same condition as "dark surface in well-lit environment."

Leaving the visible spectrum is one possible solution, but it's substantially more error-prone and costly. This is still not the same solution as classical CV with "more expensive hardware."

replies(1): >>41903142 #

35. hnlmorg ◴[21 Oct 24 11:50 UTC] No.41903123{5}[source]▶

>>41901696 #

I’ve found LLMs to be relatively poor at writing in someone else’s style beyond superficial / comical styles like “pirate” or “Shakespeare”.

To get an LLM to generate content in your own writing, there’s going to be no substitute for training it on your own corpus. By which point you might as well do the work yourself.

The whole point cheating is to avoid doing the work. Building your own corpus requires doing that work.

replies(1): >>41903410 #

36. red_admiral ◴[21 Oct 24 11:53 UTC] No.41903142{5}[source]▶

>>41902726 #

If you're building a system to proctor students, then part of your job is to get it to work under all reasonable real-world conditions you might encounter: low light, students with standard webcams or just the one built into their laptop, students with darker skin etc. Reality might make this harder for some cases, but solving that is what you are being paid for.

Also, this could have been handled much better in the cases that came up in the media if there had been proper human review of all cases before prosecuting the students.

replies(1): >>41904027 #

37. pessimizer ◴[21 Oct 24 12:09 UTC] No.41903277[source]▶

>>41902038 #

People have their favorite phrases or words, but also as readers we fixate on words that we don't personally use, and project that onto the writer.

But as a second language learner, you notice that people get stuck on particular words during writing sessions. If I run into a very unusual (and unnecessary) word, I know they're going to use it again within a page or two, maybe once after that, then never again.

I blame it on the writer remembering a cool word, or finding a cool word in a thesaurus, then that word dropping out of their active vocabulary after they tried it out a couple times. There's probably an analogue in LLMs, if just because that makes unusual words more likely to repeat themselves in a particular passage.

replies(1): >>41912239 #

38. tonypace ◴[21 Oct 24 12:11 UTC] No.41903286[source]▶

>>41901926 #

It looked like black magic at first. But then you started to see the signs.

39. throwaway290 ◴[21 Oct 24 12:25 UTC] No.41903410{6}[source]▶

>>41903123 #

I meant you don't need to feed it your corpus if it's good enough at mimicking styles. Just ask to mimic someone else. I don't mean novelty like pirate or shakespeare. Mimic "a student with average ability". Then ask to ramp up authenticity. Or even use some model or service with this built in so you don't even need to write any prompts. Zero effort

You're saying it's not good enough at mimicking styles. others saying it's good enough. I think if it's not good enough today it'll be good enough tomorrow. Are you betting on it not becoming good enough?

replies(1): >>41903656 #

40. tonypace ◴[21 Oct 24 12:31 UTC] No.41903455[source]▶

>>41901937 #

It definitely flattens your style.

41. tonypace ◴[21 Oct 24 12:41 UTC] No.41903546{6}[source]▶

>>41902164 #

And yet, I have an unfortunately clear mental picture of the human that did this. In itself, that is a very specific coding style. I don't imagine an LLM would do that. Chat would instead take a couple of the methods from the Symbian codebase and use them where they didn't exist. The God classes would merely be mined for more non-existent functions. The true if block would become a function. And the # lines would have comments on them. Useless comments, but there would be text following every last one of them. Totally different styles.

replies(1): >>41903997 #

42. VeninVidiaVicii ◴[21 Oct 24 12:52 UTC] No.41903629[source]▶

>>41901926 #

Are you guys using free versions of terrible tools? Asking it just to rewrite the whole thing? I use it every day for checking academic figure legends and such, and get extremely minor edits — such as a capitalization or italicization.

43. VeninVidiaVicii ◴[21 Oct 24 12:54 UTC] No.41903648[source]▶

>>41902248 #

This is my exact issue. ChatGPT seems formulaic in part, because so much of the work it’s trained on is also formulaic or at least predictable.

44. hnlmorg ◴[21 Oct 24 12:55 UTC] No.41903656{7}[source]▶

>>41903410 #

I’m betting on it not becoming good enough at mimicking a specific students style without having access to their specific work.

Teachers will notice if students writing style shifts in one piece compared to another.

Nobody disputes that you can get LLMs to mimic other people. However it cannot mimic a specific style it hasn’t been trained on. And very few people who are going to cheat are going to take the time to train an LLM on their writing style since the entire point of plagiarism is to avoid doing work.

replies(1): >>41904878 #

45. lupire ◴[21 Oct 24 12:58 UTC] No.41903681{3}[source]▶

>>41902146 #

He's paying for the degree and the professional network. Studying would be a waste of time.

replies(1): >>41903795 #

46. A4ET8a8uTh0 ◴[21 Oct 24 13:11 UTC] No.41903795{4}[source]▶

>>41903681 #

I hope it will not sound too preachy. You are right in a sense that it is what he thinks he is paying for, but is actually missing out on untapped value. He will not be able to discuss death as a concept throughout the lens of various authors. He will not wrestle with questions of cognition and its human limitations ( which amusingly is a relevant subject these days ). He will not learn anything. He is and will remain an adult child in adult daycare.

I could go on like this, but I won't. Each of us has a choice how we play the cards we are dealt.

I accept your point, but this point reinforces a perspective I heard from my accountant family member, who clearly can identify price, but has a hard time not equating it with value. I hesitate to use the word wrong, because it is pragmatic, but it is also rather wasteful ( if not outright dumb ).

replies(1): >>41909551 #

47. Buttons840 ◴[21 Oct 24 13:32 UTC] No.41903988[source]▶

>>41901335 (TP) #

Students who use the "word of the week" can easily explain it by saying they used an AI in their studies.

"You asked us to write an essay on the Civil War. The first thing I did was ask an AI to explain it to me, and I asked the AI some follow-up questions. Then I did some research using other sources and wrote my paper."

It might even be a true story, and in such a case it's not surprising that the student would repeat words they encountered while studying.

48. ben_w ◴[21 Oct 24 13:33 UTC] No.41903997{7}[source]▶

>>41903546 #

Depends on the LLM.

I've seen exactly what you describe and worse *, and I've also seen them keep to one style until I got bored of prompting for new features to add to the project.

* one standard test I have is "make a tetris game as a single page web app", and one model started wrong and then suddenly flipped from Tetris in html/js to ML in python.

49. VancouverMan ◴[21 Oct 24 13:36 UTC] No.41904027{6}[source]▶

>>41903142 #

The last time I got an ID photo taken, I got to wait and watch as the dark-skinned Indian photographer repeatedly struggled to take a suitable passport photo of the light-skinned white woman who was in line directly ahead of me.

This was at a long-established mall shop that specialized in photography products and services. The same photographer had taken suitable photos of some other people in line ahead of us rather quickly.

The studio area was professional enough, with a backdrop, with dedicated photography lighting, with ample lighting in the shop beyond that, and with an adjustable stool for the subject to sit on.

The camera appeared to be a DSLR with a lens and a lens hood, similar enough to what I've seen professional wedding photographers use. It was initially on a tripod, although the photographer eventually removed it during later attempts.

Despite being in a highly-controlled purpose-built environment, and using photography equipment much better than that of a typical laptop or phone camera, the photographer still couldn't take a suitable photo of this particular woman, despite repeated attempts and adjustments to the camera's settings and to the environment.

Was the photographer "racist"? I would guess not, given the effort he put in, and the frustration he was exhibiting at the lack of success.

Was the camera "racist"? No, obviously not.

Sometimes it can just be difficult to take a suitable photo, even when using higher-end equipment in a rather ideal environment.

It has nothing to do with "racism".

replies(3): >>41904168 #>>41905360 #>>41906955 #

50. red_admiral ◴[21 Oct 24 13:53 UTC] No.41904168{7}[source]▶

>>41904027 #

I think this comes down to there being different definitions of racism, that are sometimes flat out contradictory.

I don't think anyone is saying that the universities or the software companies have some kind of secret agenda to keep black people out. As far as I can tell there's good evidence they're mostly trying to get more black people in (and in some cases to keep Asians out, but that's another story). I also don't think anyone here was acting out of fear or hatred of black people.

What I am claiming is that the universities in question ended up with a proctoring product that was more likely to produce false positives for students with darker skin colors, and did not apply sufficient human review and/or giving people the benefit of the doubt to cancel out those effects. It is quite likely that whatever model-training and testing the software companies did, was mostly on fair-skinned people in well-lit environments, otherwise they would have picked up this problem earlier on. This is not super-woke Ibram X Kendi applied antiracism, this is doing your job properly to make sure your product works for all students, especially as the students don't have any choice to opt out of using the proctoring software beyond quitting their college.

To me it's on the same level as having a SQL injection vulnerability: maybe you didn't intend to get your users' data exposed - about 100% of the time when this happens, the company involved very much did not intend to have a data breach - but it happened anyway, you were incompetent at the job and your users are now dealing with the consequences.

And to the extent that those consequences here fall disproportionately on skin colors (and so, by correlation, on ethnicities) that have historically been disadvantaged, calling this a type of racism seems appropriate. It's very much not the KKK type of racism, but it could very well still meet legal standards for discrimination.

replies(1): >>41905562 #

51. max51 ◴[21 Oct 24 13:58 UTC] No.41904228{5}[source]▶

>>41901914 #

I saw a lot of unbelievably bad code when I was teaching in university. I doubt that my undergrad students who couldn't code had access to LLMs in 2011.

52. jjmarr ◴[21 Oct 24 14:39 UTC] No.41904671{3}[source]▶

>>41902481 #

https://www.nature.com/articles/s41586-024-07856-5

LLMs already discriminates against African-American English. You could argue a human grader would as well, but all tested models were more consistent in assigning negative adjectives to hypothetical speakers of that dialect.

replies(1): >>41905451 #

53. jjmarr ◴[21 Oct 24 14:45 UTC] No.41904741[source]▶

>>41902248 #

Extremely disciplined students always feed papers into AI detectors before submitting and then revise their work until it passes.

Dodging the detector is done regardless of whether or not one has used AI to write that paper.

54. throwaway290 ◴[21 Oct 24 14:55 UTC] No.41904878{8}[source]▶

>>41903656 #

How would the teacher know what student's style is if she always uses the LLM? Also do you expect that student's style is fixed forever or teachers are all so invested that they can really tell when the student is trying something new vs use an LLM that was trained to output writing in the style of an average student?

Imagine the teacher saying "this is not your style it's too good" to a student who legit tried killing any motivation to do anything but cheat for remaining life

replies(1): >>41905280 #

55. hnlmorg ◴[21 Oct 24 15:34 UTC] No.41905280{9}[source]▶

>>41904878 #

> How would the teacher know what student's style is if she always uses the LLM?

If the student always uses LLMs then it would be pretty obvious by the fact that they’re failing at the cause in all bar the written assessments (ie the stuff they can cheat on).

> Also do you expect that student's style is fixed forever

Of course not. But people’s styles don’t change dramatically on one paper and reset back afterwards.

> teachers are all so invested that they can really tell when the student is trying something new vs use an LLM that was trained to output writing in the style of an average student?

Depends on the size of the classes. When I was at college I do know that teachers did check for changes in writing styles. I know this because one of the kids on my class was questioned about his changes in his writing style.

With time, I’m sure anti-cheat software will also check again previous works by the students to check for changes in style.

However this was never my point. My point was that cheaters wouldn’t bother training on their own corpus. You keep pushing the conversation away from that.

> Imagine the teacher saying "this is not your style it's too good" to a student who legit tried killing any motivation to do anything but cheat for remaining life

That’s how literally no good teacher would ever approach the subject. Instead they’d talk about how good the paper was and ask about where the inspiration came from.

replies(2): >>41906979 #>>41911940 #

56. realitychx2020 ◴[21 Oct 24 15:43 UTC] No.41905360{7}[source]▶

>>41904027 #

>> It has nothing to do with "racism".

Every major system in the US academic system is aimed to reducing Asian population. It often comes in the guise of DEI with a very wide definition of "Diversity" that rarely includes Asian.

These systems will use subtle features to blackbox racism. They may just be overt and leak over metadata to achieve it, or get smart and using writing styles.

57. kayodelycaon ◴[21 Oct 24 15:54 UTC] No.41905451{4}[source]▶

>>41904671 #

This is entirely unsurprising to me. As taught to me, written English (in the US) has a much stricter structure and vocabulary. African-American English was used as the primary example of incorrect and unprofessional writing.

replies(1): >>41908779 #

58. zahlman ◴[21 Oct 24 16:04 UTC] No.41905562{8}[source]▶

>>41904168 #

>What I am claiming is that the universities in question ended up with a proctoring product that was more likely to produce false positives for students with darker skin colors, and did not apply sufficient human review and/or giving people the benefit of the doubt to cancel out those effects.

The issue is that, for most people, the term "racism" connotes a moral failing comparable to the secret agendas, fear and hatred, etc. Specifically, an immoral act motivated by a deliberately applied, irrational prejudice.

Using it to refer to this sort of "disparate impact" is at best needlessly vague, and at worst a deliberate conflation known to be useful to (and used by) the "super-woke Ibram X Kendi" types - equivocating (per https://en.wikipedia.org/wiki/Motte-and-bailey_fallacy) in order to attach the spectre of moral outrage to a problem not caused by any kind of malice.

If you're interested in whether someone might have a legal case, you should be discussing that in an appropriate forum - not with lay language among laypeople.

replies(1): >>41912664 #

59. rahimnathwani ◴[21 Oct 24 17:14 UTC] No.41906183[source]▶

>>41901335 (TP) #

  For a human who deals with student work or reads job applications spotting AI generated work quickly becomes trivially easy.

When evaluating job applications we don't have ground truth labels, so we cannot possibly know the precision or recall of our classification.

60. dpkirchner ◴[21 Oct 24 18:30 UTC] No.41906955{7}[source]▶

>>41904027 #

If the outcome of a system is biased against people with darker or lighter skin, it's obviously racist and should be adjusted or eliminated. It doesn't really matter what the cause of the problem is when making this determination -- we can't just say "lol sorry, some people can't get passport photos."

> Despite being in a highly-controlled purpose-built environment

Frankly it sounds like the environment was not purpose-built at all. It was built to meet insufficient standards, perhaps.

replies(1): >>41912730 #

61. og_kalu ◴[21 Oct 24 18:32 UTC] No.41906979{10}[source]▶

>>41905280 #

>If the student always uses LLMs then it would be pretty obvious by the fact that they’re failing at the cause in all bar the written assessments (ie the stuff they can cheat on).

There's nothing stopping students from generating an essay and going over it.

>Of course not. But people’s styles don’t change dramatically on one paper and reset back afterwards.

Takes just a little effort to avoid this.

>With time, I’m sure anti-cheat software will also check again previous works by the students to check for changes in style.

That's never going to happen. Probably because it doesn't make any sense. What's a change in writing style ? Who's measuring that ? And why is that an indicator of cheating ?

>However this was never my point. My point was that cheaters wouldn’t bother training on their own corpus. You keep pushing the conversation away from that.

Training is not necessary in any technical sense. A decent sample of your writing in the context is more than good enough. Probably most cheaters wouldn't bother but some certainly would.

replies(1): >>41908102 #

62. hnlmorg ◴[21 Oct 24 20:17 UTC] No.41908102{11}[source]▶

>>41906979 #

> There's nothing stopping students from generating an essay and going over it.

This then comes back to my original point. If they learn the content and rewrite the output, is it really plagiarism?

> Takes just a little effort to avoid this.

That depends entirely on the size of the coursework.

> That's never going to happen. Probably because it doesn't make any sense. What's a change in writing style ? Who's measuring that ? And why is that an indicator of cheating ?

This entire article and all the conversations that followed are about using writing styles to spot plagiarism. It’s not a new concept nor a claim I made up.

So if you don’t agree with this premise then it’s a little late in the thread to be raising that disagreement.

> Training is not necessary in any technical sense. A decent sample of your writing in the context is more than good enough. Probably most cheaters wouldn't bother but some certainly would.

I think you’d need a larger corpus than the average cheater would be bothered to do. But I will admit I could be waaay off in my estimations of this.

replies(1): >>41908304 #

63. og_kalu ◴[21 Oct 24 20:38 UTC] No.41908304{12}[source]▶

>>41908102 #

>This then comes back to my original point. If they learn the content and rewrite the output, is it really plagiarism?

Who said anything about rewriting? That's not necessary. You can have GPT write your essay and all you do is study it afterwards, maybe ask questions etc. You've saved hours of time and yes that would still be cheating and plagiarism by most.

>This entire article and all the conversations that followed are about using writing styles to spot plagiarism. It’s not a new concept nor a claim I made up.

>So if you don’t agree with this premise then it’s a little late in the thread to be raising that disagreement.

The article is about piping essays into black box neural networks that you can at best hypothesize is looking for similarities between the presented writing and some nebulous "AI" style. It's not comparing styles between your past works and telling you just cheated because of some deviation. That's never going to happen.

>I think you’d need a larger corpus than the average cheater would be bothered to do. But I will admit I could be waaay off in my estimations of this.

An essay or two in the context window is fine. I think you underestimate just what SOTA LLMs are capable of.

You don't even need to bother with any of that if all you want is a consistent style. A style prompt with a few instructions to deviate from GPT's default writing style is sufficient.

My point is that it's not this huge effort to have generated writing that doesn't yo-yo in writing style between essays.

replies(1): >>41909182 #

64. selimthegrim ◴[21 Oct 24 21:37 UTC] No.41908779{5}[source]▶

>>41905451 #

I think that it’s a little more complicated than that as the comment from Brad Daniels at this link would show - https://www.takeourword.com/TOW145/page4.html

NB: I am not African-American, nor did I grew up on an African-American community, and I performed very well on all sorts of verbal tests. Yet, even I made the all intensive purposes mistake until well into adulthood. Probably a Midwestern thing.

replies(1): >>41917709 #

65. hnlmorg ◴[21 Oct 24 22:21 UTC] No.41909182{13}[source]▶

>>41908304 #

> Who said anything about rewriting? That's not necessary. You can have GPT write your essay and all you do is study it afterwards, maybe ask questions etc. You've saved hours of time and yes that would still be cheating and plagiarism by most.

Maybe. But I think we are getting too deep into hypotheticals about stuff that wasn’t even related to my original point.

> The article is about piping essays into black box neural networks that you can at best hypothesize is looking for similarities between the presented writing and some nebulous "AI" style. It's not comparing styles between your past works and telling you just cheated because of some deviation. That's never going to happen.

You cannot postulate your own hypothetical scenarios and deny other people the same privilege. That’s just not an honest way to debate.

> My point is that it's not this huge effort to have generated writing that doesn't yo-yo in writing style between essays.

I get your point. It’s just your point requires a bunch of assumptions and hypotheticals to work.

In theory you’re right. But, and at risk of continually harping on about my original point, I think the effort involved in doing it well would be beyond the effort required for the average person looking to cheat.

And that’s the real crux of it. Not whether something can be done, because hypothetically speaking anything is possible in AI with sufficient time, money and effort. But that doesn’t mean it’s actually going to happen.

But since this entire argument is a hypothetical, it’s probably better we agree to disagree.

66. recursive ◴[21 Oct 24 23:16 UTC] No.41909551{5}[source]▶

>>41903795 #

It's not wasteful if: the student "values" the credential but not the learning. Some people have no appetite for philosophy or the perspective of notable authors on death. He will probably learn something though.

Your characterization of an adult child does not seem fair. What makes someone an adult? If it's academic discourse, then why is it valuable?

I mean, if you're into it, more power to you. Go nuts with finally figuring out what makes a human. Just don't claim it's more virtuous than anyone else's hobby, unless you can find a reason.

To call it "wasteful" says that something of "value" is being squandered, but the value is perceived by each of us differently.

replies(1): >>41912331 #

67. throwaway290 ◴[22 Oct 24 07:12 UTC] No.41911940{10}[source]▶

>>41905280 #

> pretty obvious by the fact that they’re failing at the cause in all bar the written assessments (ie the stuff they can cheat on).

performing badly under pressure is not a thing in your world

> My point was that cheaters wouldn’t bother training on their own corpus. You keep pushing the conversation away from that.

My point was cheaters don't need to train on their corpus. That's why it's zero effort. You keep trying to wave that away

> That’s how literally no good teacher would ever approach the subject.

Now we only need to eliminate bad teachers

replies(1): >>41913974 #

68. autumnstwilight ◴[22 Oct 24 08:10 UTC] No.41912239{3}[source]▶

>>41903277 #

I do this when I write, to the point where I have to go back and edit myself after using a slightly unusual word several times in quick succession.

I think words I've used recently are easier to access, as if there's a cache for items recently retrieved from deeper layers of memory.

69. A4ET8a8uTh0 ◴[22 Oct 24 08:32 UTC] No.41912331{6}[source]▶

>>41909551 #

<< It's not wasteful if: the student "values" the credential but not the learning.

Hmm, I could try repeating the value argument from the other end, but lets approach it differently. You mention a credential itself being a value, which is not an unreasonable position to take, which is part of the reason I am not dismissing it outright.

But what is a credential? Oracle that is Google defines it as "a qualification, achievement, personal quality, or aspect of a person's background, typically when used to indicate that they are suitable for something." Are they qualified simply, because they 'put in their time' at an institution of supposed higher learning? If so, that credential is not only wasteful, it is also worthless. It exists for now only because it is riding on the glory of its past.

At least previous poster's argument was more direct: it is not about learning at all. It is a social club, where kids of already successful people are sent. That network of kids of the already successful has its uses, but if it is a social club AND, as we established in previous paragraph, it is a learning institution in name only ( used only for credential ), then it is really just that: a social club, which I characterized in a flight of fancy as an adult daycare. I do stand by this phrasing, because the more I think about it, the more it fits.

<< Your characterization of an adult child does not seem fair. What makes someone an adult?

Hmm, good question, but I will leave it unanswered. I want to see how you respond to the previous point.

<< If it's academic discourse, then why is it valuable?

Academic discourse is not valuable. Frankly, at its core, nothing is inherently valuable, because any value is the value we ascribe to things. You might think that this is me saying:

'ok, so anyone can value anything and thus the kid doing for the credential is just as valid in their choice.'

It is a choice. It is valid. It just also happens to be, well, wasteful ( and maybe a little immoral, but that ship has sailed ) as the kid in question leaves the school with a credential that does not reflect that AND then goes into the world making decisions with a weight of that credential behind him. Thank goodness he is not an actual engineer. Future would look pretty grim then.

<< Just don't claim it's more virtuous than anyone else's hobby, unless you can find a reason.

You may be hanging onto my anecdote, but since that was the only professor at Harvard I had a chance to listen to on the matter, I thought it was relevant and any virtuosity in it is purely coincidental. The point he made was genuinely pretty apt. The point of education at Harvard ( or other notable names ) is not spend your parents money parking your keister for 4 years while waiting for that credential.

<< To call it "wasteful" says that something of "value" is being squandered, but the value is perceived by each of us differently.

No. Just because we perceive things in a certain way, does not automatically mean that there is no objective reality. It just means we don't perceive it ( accurately or otherwise ). This is where I believe this conversation could get interesting, because I think this is what we actually disagree on.

Wasteful is 'expending value carelessly'. Even if we value things differently, using Shelby GT for pizza delivery seems wasteful. I technically have no problem with anyone doing that ( you got money burning your pocket, who am I to judge ), but I am also not going to pretend it is a sensible thing to do.

From where I sit this is not that different from going to Harvard for a credential. Or network. It is just a wasted potential.

And it is sad. For Harvard. Edit: Or society as a whole. I am not sure now.

Ok. I am going to stop here. It is 3am and I clearly typed too much.

replies(1): >>41915606 #

70. red_admiral ◴[22 Oct 24 09:37 UTC] No.41912664{9}[source]▶

>>41905562 #

I agree with your point that we should have two different words for two different concepts (even though they can lead to the same effects), especially as one is motivated by malice and one is not.

But from the point of view of a black person who has not got a job / college place / tenancy that a comparable white person would have got, I guess it makes sense to say "whatever the cause, I want this problem fixed" and give the symptom rather than the cause the name "racism".

71. Iulioh ◴[22 Oct 24 09:50 UTC] No.41912730{8}[source]▶

>>41906955 #

At a certain point reality has a refletivity bias

72. hnlmorg ◴[22 Oct 24 13:21 UTC] No.41913974{11}[source]▶

>>41911940 #

>performing badly under pressure is not a thing in your world

No need to be rude.

Pressure presents different characteristics. Plus lecturers would be working with failing students so would understand the difference between pressure and cheating.

> My point was cheaters don't need to train on their corpus. That's why it's zero effort. You keep trying to wave that away

My entire point was that most cheats wouldn't bother training their corpus!

With the greatest of respect, have you actually read my comments?

> Now we only need to eliminate bad teachers

Well that's a whole other discussion :)

replies(1): >>41923623 #

73. recursive ◴[22 Oct 24 15:56 UTC] No.41915606{7}[source]▶

>>41912331 #

I'm not sure whether my point of view is going to stand up to someone who has put as much into this as you have. But anyway.

> If so, that credential is not only wasteful, it is also worthless.

I think "worth" is a synonym for "value".

> Frankly, at its core, nothing is inherently valuable, because any value is the value we ascribe to things.

I can't reconcile these two statements of yours.

> as we established in previous paragraph, it is a learning institution in name only ( used only for credential )

I've never been, but I imagine there is actually more to it, but some people may not participate in the true learning part. Like the combo meal has fries. I'm just choosing not to eat them.

> The point of education at Harvard ( or other notable names ) is not spend your parents money parking your keister for 4 years while waiting for that credential.

Perhaps Harvard wishes that were not the case, and some of the time it probably isn't. But I find it hard to believe that no one pays tuition for that purpose.

There probably is an objective reality for all know. I even refer to it sometimes. But all I have to work with are these five senses, and this faulty brain.

replies(1): >>41957857 #

74. selimthegrim ◴[22 Oct 24 19:20 UTC] No.41917709{6}[source]▶

>>41908779 #

*grow up in

75. throwaway290 ◴[23 Oct 24 10:29 UTC] No.41923623{12}[source]▶

>>41913974 #

> My entire point was that most cheats wouldn't bother training their corpus!

Good, because they don't need a custom corpus to cheat with LLMs with most normal teachers.

And if a teacher reduced your grade saying you are using LLM because your style doesn't match you just report them for it and say you were trying a new style (teacher would probably will be wrong 50% of the time anyway)

replies(1): >>41929099 #

76. hnlmorg ◴[23 Oct 24 20:42 UTC] No.41929099{13}[source]▶

>>41923623 #

> Good, because they don't need a custom corpus to cheat with LLMs with most normal teachers.

I think you're underestimating the capabilities of normal teachers. And I say this as someone who a large percentage of their family are teachers.

Also this topic was about using LLMs to spot LLMs. Not teachers spotting LLMs.

> And if a teacher reduced your grade saying you are using LLM because your style doesn't match you just report them for it and say you were trying a new style (teacher would probably will be wrong 50% of the time anyway)

You're drifting off topic again. I'm not going to discuss handling false positives because that's going to come down the policies of each institution.

77. A4ET8a8uTh0 ◴[26 Oct 24 21:28 UTC] No.41957857{8}[source]▶

>>41915606 #

Thank you. This was a good exchange for me.

↑