Report: Universities Fail to Catch Most ChatGPT-Generated Cheating on Exams

Universities reportedly fail to catch exam answers generated by OpenAI’s popular AI chatbot ChatGPT, and those answers typically earn higher scores than the work real students turn in.

A staggering 94 percent of ChatGPT-written exam submissions for a psychology degree went undetected as AI-generated, according to a report by New Scientist.

University of Reading professor Peter Scarfe and his colleagues used ChatGPT to answer 63 test questions from the school’s psychology department, then submitted the AI-generated answers on behalf of 33 fake students for exams that students were allowed to take at home.

The individuals grading the exams were not told that 33 fake students handing in AI-generated work were mixed in with the real ones. Scarfe and his colleagues found that a mere six percent of the AI-generated submissions were flagged.

“On average, the AI responses gained higher grades than our real student submissions,” Scarfe said. “Current AI tends to struggle with more abstract reasoning and integration into information.”

Moreover, the AI-generated answers had an 83.4 percent chance of scoring higher than the work real students turned in.

The researchers insist that their experiment is the largest and most rigorous study of its kind to date.

While the study only examined work for the University of Reading’s psychology degree, Scarfe says the results are concerning for academia as a whole.

“I have no reason to think that other subject areas wouldn’t have just the same kind of issue,” the professor said.

Imperial College London senior teaching fellow Thomas Lancaster said, “The results show exactly what I’d expect to see. We know that generative AI can produce reasonable sounding responses to simple, constrained textual questions.”

Lancaster also pointed out that unsupervised exams have always been susceptible to cheating.

“Time-pressured markers of short answer questions are highly unlikely to raise AI misconduct cases on a whim,” Lancaster added. “I am sure this isn’t the only institution where this is happening.”

Scarfe, meanwhile, believes that tackling AI-generated submissions at their source is near-impossible, saying, “I think it’s going to take the sector as a whole to acknowledge the fact that we’re going to have to be building AI into the assessments we give to our students.”

As Breitbart News reported, data shows that ChatGPT is mainly used for cheating on homework, as its usage fell noticeably during the summer months, only to rebound when American students went back to school.

Last year, students in an elite academic program at a Florida high school were accused of cheating by using ChatGPT to write their essays.

You can follow Alana Mastrangelo on Facebook and X at @ARmastrangelo, and on Instagram.
