AI Detection Software as a Tool Against Plagiarism?
I recently learned of a new AI-detection tool that I was curious to test.
What a Human Eye Can Pick Up
This spring, two faculty members asked me for help in determining whether student papers might have been generated by AI.
I found a few non-determinative clues, such as the lack of footnotes for key concepts, the lack of pinpoint citations in footnotes, and writing that is generalized, high-level, or non-analytical.
But up until now, I hadn’t heard of any AI detection software that could help.
AI to the Rescue?
So I was curious to try out QuillBot – Free AI Detector to see whether it might be a tool in the toolbox.
Here’s a blurb from the QuillBot website:
“Meet our AI content detector tool. Trained to identify certain patterns, our detection tool will flag AI-generated, paraphrased & human-written content in your text. AI-generated content is likely to contain repetitive words, awkward phrasing, and an unnatural, choppy flow. When these indicators are present, QuillBot’s AI Detector will flag the text for further inspection.”
There is a common understanding that, as of yet, there is no reliable software that can detect AI-generated content. In an age of AI, I don’t think I’ll ever use the word “never” again when it comes to technology. It makes sense to me that, so long as the detection software is built on the same dataset as the model used to generate the text, an AI might be able to reverse engineer the output and identify it.
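For the curious, here is roughly how that theory could cash out in practice. The sketch below illustrates perplexity scoring, one common detection heuristic (and not necessarily what QuillBot actually does): score the text with a language model, and treat text the model finds highly predictable as more likely to be machine-generated. The model choice (“gpt2”) and the threshold are illustrative assumptions only.

```python
# A minimal sketch of perplexity-based AI detection, assuming the
# "same dataset" theory above: if the detector's language model resembles
# the one that generated the text, machine-written text will look
# unusually predictable (low perplexity) to it.
# Requires: pip install torch transformers
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under GPT-2; lower suggests more machine-like prose."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)
    return torch.exp(out.loss).item()

sample_text = "The doctrine of stare decisis requires courts to follow precedent."
# The cut-off of 40 is made up for illustration; real detectors calibrate
# thresholds against labelled human and AI corpora.
if perplexity(sample_text) < 40:
    print("Flag for further inspection (text is suspiciously predictable)")
```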
So I did a small test, and the results were interesting.
Test Results
Example 1: Entered a report that I had written.
QuillBot Finding: 100% human-written.
Example 2: Entered a report written by a colleague.
QuillBot Finding: 100% human-written.
Example 3: Entered a law journal article from 2011.
QuillBot Finding: 100% human-written.
Example 4: Entered output from ChatGPT-4.
QuillBot Finding: 87% AI-generated + 13% human-written.
Example 5: Entered output from Scite.ai.
QuillBot Finding: 55% AI-generated + 45% human-written.
Example 6: Entered output from Perplexity.
QuillBot Finding: 100% AI-generated.
Example 7: Entered output from Jurisage (case summary of a Canadian case in IRAC format).
QuillBot Finding: 60% AI-generated + 40% human-written.
Summary
Of course, this sample is far too small to draw conclusions from.
However, I might be on to something with my theory that QuillBot is best at detecting AI-generated text from platforms trained on a dataset similar to its own.
For example, its AI-generated score was roughly cut in half when reviewing output from Scite, which is trained on a database of open-access online scholarship.
Similarly, its AI-generated score dropped when reviewing output from Jurisage, which is trained on case law.
Of course, we can’t build an academic misconduct case on the basis of a QuillBot prediction; however, it may be another tool in the marking toolbox.
Faculty Toolbox
Ever since my class went wholly online during the pandemic, I’ve reserved the right in my syllabus to conduct an oral assessment with a student in addition to the normal assessment approaches. At the time, I was desperate to find an alternate way of testing student knowledge, since there was almost no way to detect whether students were nefariously co-drafting assignments.
So here are a few thoughts about strategies we might consider when anticipating AI products being used to research or draft student papers.
In the classroom:
Invite the law librarians to speak to the class about good-quality literature and research approaches for essay writing.
If faculty members do not emphasize finding and referring to quality sources, the library’s efforts toward building and maintaining legal information literacy in future lawyers will be lost in the age of AI. We must all emphasize this consistently for the message to be reinforced.
In a rubric:
a. Explicitly allocate marks for analysis and unique value added to the topic, while decreasing the marks allocated to generalized summaries (background, introduction, conclusion, etc.).
b. Increase the marks designated for breadth of resources referred to.
c. Increase the marks designated for adherence to citation style guides (see below for one example).
d. Deduct marks for failure to provide pinpoint citations.
In a syllabus:
a. Explicitly refer to any campus-wide policies pertaining to AI, including academic integrity bylaws and policies.
b. Help the students understand how the policy will specifically be applied in this course as it pertains to AI usage. Explicitly state whether students are permitted to use AI when writing essays. Specify any limits or purposes that AI may be used for.
c. If AI generation will be permitted, explicitly require a footnote wherever it is used. The Open Access Legal Citation Guide* that was released by law librarians across Canada last week contains instructions for how to do this. Note that this is a good habit to get students into, considering our various court practice directions that include this requirement.
d. If permitted on your campus, make explicit that you may use AI detection software as a tool to aid with marking.
e. Explicitly reserve the right to conduct a supplementary oral assessment of student learning.
May the force be with us this September as more and more legal AI comes online!
*COAL-RJAL Editorial Group, “Canadian Open Access Legal Citation Guide”, Canadian Legal Information Institute, 2024 CanLIIDocs 830, ch 8, retrieved 2024-06-13, online.
Much generative AI is trained in an “adversarial” paradigm, in which the generative model is locked in a pitched battle, at machine-computational speeds, with a roughly complementary AI-detector model.
Both the gen-AI and the gen-AI-detector then “machine learn” to improve their capabilities from their interaction, perhaps guided by occasional nudges in the right direction from humans.
As a consequence, a key feature that emerges in most such gen-AI tools is that they have specifically “learned” to avoid detection.
It’s a constant cat-and-mouse game, and the speed at which detection tools would need to learn is, almost by definition, always rivalled by the gen-AI tool’s capability to trick them.
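To make the paradigm concrete, here is a minimal sketch of that generator-versus-detector training loop, in the style of a generative adversarial network (GAN). It is illustrative only: the architectures, data, and hyperparameters are stand-ins, and large language models are in fact usually trained by next-token prediction rather than adversarially, but the dynamic described above is the one shown.

```python
# A minimal GAN-style training loop: generator G learns to fool detector D,
# while D learns to tell G's output from real (human) data. All shapes,
# data, and hyperparameters here are illustrative stand-ins.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))  # noise -> fake sample
D = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))   # sample -> logit("real")
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 8)          # stand-in for real human-written data
    fake = G(torch.randn(64, 16))      # generator's attempt to imitate it

    # Detector step: label real data 1, generated data 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: adjust G so that D labels its output as real,
    # i.e. G literally "learns" to avoid detection.
    g_loss = bce(D(G(torch.randn(64, 16))), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```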
Consider also this simple approach from the student: “Gen-AI, please make 10 (or 100) essays about dinosaurs.” … “Detector, please submit the one of these essays least likely to be flagged as created by Gen-AI.”
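That “best of N” strategy takes only a few lines to express. In the sketch below, generate_essay() and detector_score() are hypothetical stand-ins for any text generator and any detector; neither is a real API.

```python
import random

def generate_essay(prompt: str) -> str:
    # Hypothetical stand-in for any text generator (e.g., a chat-model call).
    return f"Essay draft {random.randint(0, 99)} about {prompt}."

def detector_score(text: str) -> float:
    # Hypothetical stand-in for any detector's "probability AI-generated".
    return random.random()

def least_detectable_essay(prompt: str, n: int = 10) -> str:
    """Generate n drafts and return the one the detector flags least."""
    drafts = [generate_essay(prompt) for _ in range(n)]
    return min(drafts, key=detector_score)

print(least_detectable_essay("dinosaurs", n=10))
```

With n = 100, even a fairly accurate detector will usually let at least one draft slip through.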