Burstiness & Perplexity
Teachers are rightfully concerned about their students’ use of Large Language Models (LLMs). We consider what essay-writing has meant to us, and what the impact of this technology might be.
As a student, I loved writing essays. It’s how I’d really come to understand concepts and primary sources. As a teacher… I did not love them as much.
Reading a classroom’s worth of essays by people who have not yet learned how to write well is a lot of work, but it’s a great way to get to know your students. I imagine they continue to surprise you even after decades of teaching the same books.
LLMs are changing academia, and there’s no putting the genie back in the lamp. It’s important to find the right relationship with the technology. We found these two terms in Edward Tian’s research for GPTZero — burstiness and perplexity — which we think can serve as the basis for a fruitful collaboration with ChatGPT and similar technology.
GPTZero is a tool made to analyze text and determine whether it was written by a human or an AI. Detecting student plagiarism is a long-standing problem. Tian’s initial model was tested on BBC articles, but we think the way it works will be even more helpful in an academic context.
Burstiness
First, we developed and applied a ‘burstiness’ check to analyze how closely the text resembles AI patterns of writing. A human-written document will have changes in style and tone throughout the text, whereas AI content remains similar throughout the document.
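To make the idea concrete, here is a minimal sketch of a burstiness score. It is not GPTZero's method, which the description above does not spell out. As a stand-in for "changes in style and tone" it assumes one crude proxy: how much sentence length varies across the text, measured as standard deviation divided by mean. The function names `sentence_lengths` and `burstiness` are our own, chosen for illustration.

```python
import re
import statistics


def sentence_lengths(text):
    """Return the word count of each sentence in the text.

    Assumption: sentences end with '.', '!' or '?' followed by whitespace.
    This is a crude splitter, good enough for a sketch.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def burstiness(text):
    """Coefficient of variation of sentence length.

    0.0 means every sentence has the same length; larger values mean
    the text swings between short and long sentences.
    """
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0


uniform = "One two three. Four five six. Seven eight nine."
varied = "No. This sentence is considerably longer than the first one. Why?"
print(burstiness(uniform), burstiness(varied))
```

A real check would likely look at more than sentence length, but even this toy version scores the evenly paced text at zero and the uneven one well above it.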
Perplexity
Our perplexity test reverse-engineers the generative AI model. We’ve developed an AI model similar to ChatGPT. After each word in the text, our AI model develops suggestions of what word is coming next. It then checks whether those suggestions match the word that actually appears in the text. The more often the text surprises the model, the higher its perplexity.
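Here is a rough sketch of how a perplexity score can be computed. The standard definition is the exponential of the average negative log-probability the model assigned to each word that actually came next. The model described above is similar to ChatGPT; for illustration we swap in a tiny bigram model with add-one smoothing, which only looks at the previous word. The class `BigramModel` and the function `perplexity` are hypothetical names of our own, not part of any tool.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens are good enough for a sketch.
    return text.lower().split()


class BigramModel:
    """Toy next-word predictor: P(word | previous word), add-one smoothed."""

    def __init__(self, corpus):
        tokens = tokenize(corpus)
        self.vocab = set(tokens)
        self.contexts = Counter(tokens[:-1])  # how often each word precedes another
        self.bigrams = Counter(zip(tokens, tokens[1:]))

    def prob(self, prev, word):
        # The extra +1 in the vocabulary size reserves one slot for unseen words.
        v = len(self.vocab) + 1
        return (self.bigrams[(prev, word)] + 1) / (self.contexts[prev] + v)


def perplexity(model, text):
    """exp of the mean negative log-probability of each actual next word."""
    tokens = tokenize(text)
    if len(tokens) < 2:
        raise ValueError("need at least two words to score")
    pairs = list(zip(tokens, tokens[1:]))
    log_sum = sum(math.log(model.prob(prev, word)) for prev, word in pairs)
    return math.exp(-log_sum / len(pairs))


model = BigramModel("the cat sat on the mat the cat sat on the rug")
print(perplexity(model, "the cat sat on the mat"))   # word order the model has seen
print(perplexity(model, "mat the on sat cat the"))   # same words, unfamiliar order
```

The second sentence uses exactly the same words as the first, but in an order the model never saw, so it scores a higher perplexity. Text that keeps matching the model's guesses scores low.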
An essay written by a student looks different from one written by an expert. Experts in a given field will generally be consistent with each other in the vocabulary they use and in how they relate concepts. Students are more likely to bring in their own vernacular, and to make unexpected connections between concepts. This gives students higher burstiness and perplexity.
A student using a tool like ChatGPT as a collaborator rather than a surrogate will pass the authenticity test of a tool like GPTZero, and that’s a good thing. I think the ideal situation here is that students are given the stylistic freedom to use their individual voice in tying together concepts that an LLM has made clear to them.