Professors Ask Computers to Catch Cheating

By C. Ramsey Fahs, Crimson Staff Writer

In an age when a problem set solution may be one Google search away, Harvard professors are increasingly using algorithms to detect plagiarism in student assignments.

From the decades-old “MOSS” system to instructor-invented algorithms, such software allows professors to identify possible copying between submitted problem sets or Scantron sheets.

When Christopher L. Foote, a professor who teaches Economics 1010b, heard concerns that there may have been undetected copying on his exams this semester, he decided to add some safeguards to his testing procedure.

After a quick Google search, Foote found an October 2015 paper from the National Bureau of Economic Research called “Catching Cheating Students.” Written by National Taiwan University Economics professor Ming-Jen Lin and Steven D. Levitt ’89, an economics professor at the University of Chicago and co-author of the book “Freakonomics,” the paper includes a “simple algorithm for detecting exam cheating between students who copy off one another’s exam.”

After writing his own version of the algorithm in the Stata programming language, Foote announced to his students Monday that he would be running the algorithm on the class’s upcoming midterm exams.

“I wanted to do what I could to protect the students who are working hard,” Foote said, emphasizing that he has no reason to believe cheating is “rampant” in Ec1010b.

Foote also said the process of determining whether similar solutions indicate plagiarism would still require human input and analysis.

While Foote’s use of an automated plagiarism detector may be a novelty among Economics professors, Computer Science professors have been using cheating checkers for more than a decade.

For instance, CS50, the College’s introductory computer science course famous for sending high numbers of students to the Administrative Board, has used such a system since before the course’s current instructor, David J. Malan ’99, took over in 2007.

According to associate Computer Science professor James Mickens, computer science classes are uniquely susceptible to potential plagiarism. Mickens said that though the recent boom in popularity for computer science has drawn many converts, some do not expect the amount of “technical” knowledge required to succeed, which could lead stressed students to resort to plagiarism.

“It’s not because they’re stupid, but a lot of times until you do something, you don’t know what that thing is,” Mickens said.

Fortunately for instructors, Computer Science professors are naturally good at creating crafty means of detecting plagiarism, Mickens said.

“Typically CS professors like puzzles, so it does seem kind of fun to sort of, in this very dark way, to say, ‘how could I catch these malcontents?’” Mickens said. “It’s sort of like one of these CSI shows...you have to think like the criminal.”

Computer Science professor Jelani Nelson, who uses a plagiarism detection algorithm, said the logistics of running the program are enough of a “headache” that he only runs it a few times per semester.

While plagiarism detectors are well-known, they are not ubiquitous in Harvard’s Computer Science classes. Some courses, like the infamously difficult Computer Science 161: “Operating Systems,” choose to forgo a code similarity algorithm entirely, opting instead for a more structured in-class advising infrastructure to give struggling students an option besides plagiarism. According to Mickens, who co-teaches the course with Computer Science professor Margo I. Seltzer ’83, in many cases plagiarism is detectable without a computer’s assistance.

“One of the big signs of plagiarism is that some student has submitted something that they don’t seem to fully understand,” Mickens said. “A lot of times, you don’t even have to resort to sort of complicated plagiarism finders.”
