How Do AI Detectors Work? Breaking Down the Algorithm
(Last updated: 6 November 2024)
AI detectors are designed to identify whether a piece of text has been generated by ChatGPT or similar AI tools. University professors often use them to judge whether an academic text, such as an essay or dissertation, has been written by AI. However, AI detectors are still experimental, meaning they are not always accurate or reliable. In this blog, we explain how AI detectors work and close with a takeaway message on their reliability.
How Do AI Detectors Work and What Does AI Detection Look For?
AI detectors work by analysing text for patterns, structures, or anomalies that suggest it was produced by AI. Their goal is to distinguish human-written from AI-generated content. Like AI writing tools, AI detectors are built on machine learning and statistical models trained on large datasets. In essence, a detector scans a text and asks, “Is this something I would have written?” If the answer is yes, it concludes that the text was generated by AI, either fully or in part. The algorithm scans for two properties of the text, perplexity and burstiness, which are explained next.
What is Perplexity?
Perplexity measures how unpredictable a text is, that is, how surprised the detector’s underlying language model is by each successive word. A text is considered natural when its sentences end in predictable ways. For example, if a sentence starting with “I went running last…” ends with “I went running last night” or “I went running last night along the quiet streets of my neighbourhood”, AI detectors would label it as natural and predictable. A text containing many sentences of this kind would get a low perplexity score and, because it flows so smoothly, would be labelled as potentially AI-generated. This is because AI tools generate text by repeatedly choosing highly probable next words, so their output tends to read as smooth and flawless.
In contrast, a text would be considered unnatural and would receive a high perplexity score if its sentences end in unpredictable ways, as in “I went running last Tuesday because of the way the weather was that time”. If a text contains many sentences of this kind, which are coherent but unusually structured and long-winded, AI detectors would label it as human-written. That is because humans often make idiosyncratic word choices and subtle errors when writing.
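To make this concrete, here is a minimal sketch of how a detector might compute a perplexity score. It assumes the Hugging Face transformers library with GPT-2 as the scoring model; real detectors use their own models and thresholds, so treat this purely as an illustration.

```python
# A minimal sketch of perplexity scoring, assuming the Hugging Face
# transformers library and GPT-2 as the scoring model. Real detectors
# use their own (often proprietary) models and calibrated thresholds.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Return exp(average negative log-likelihood) of the text under GPT-2."""
    encodings = tokenizer(text, return_tensors="pt")
    input_ids = encodings.input_ids
    with torch.no_grad():
        # Passing labels makes the model return the cross-entropy loss,
        # i.e. the average negative log-likelihood per token.
        loss = model(input_ids, labels=input_ids).loss
    return float(torch.exp(loss))

predictable = "I went running last night along the quiet streets of my neighbourhood."
unusual = "I went running last Tuesday because of the way the weather was that time."

for sentence in (predictable, unusual):
    print(f"{perplexity(sentence):8.1f}  {sentence}")
# A lower score means the model found the sentence more predictable,
# which detectors treat as a weak signal of AI generation.
```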
What is Burstiness?
Burstiness measures the variation in sentence structure and length, focusing on how much sentences differ from one another. Unlike perplexity, which assesses unpredictability at the word level, burstiness evaluates the diversity and complexity of sentence patterns. AI detectors treat texts with little variation in sentence structure and length as having low burstiness, which they take as indicative of AI use. This is because AI tools predict the most probable next word, which tends to produce sentences of average length (typically 10–20 words) and conventional structure.
In contrast, texts with greater variation in sentence structure and length are treated as having high burstiness. AI detectors label these texts as human-written because humans build more variation and unpredictability into their writing. This natural variation reflects the human tendency to deviate from predictable patterns and to add unique stylistic touches to a text.
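Burstiness has no single standard formula. One simple way to approximate it is the spread of sentence lengths, as in the sketch below; the sentence-splitting rule and the interpretation of the score are assumptions made for demonstration, not the method used by any particular detector.

```python
# An illustrative burstiness proxy: the standard deviation of sentence
# lengths in words. Real detectors use richer measures of structural
# variation; this is a simplified stand-in for demonstration only.
import re
import statistics

def burstiness(text: str) -> float:
    """Standard deviation of sentence lengths (in words)."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths)

uniform = ("The study used a survey. The survey had ten items. "
           "The items used a Likert scale. The scale ranged from one to five.")
varied = ("I ran a survey. It had ten items, each scored on a five-point "
          "Likert scale, which took most participants under four minutes. "
          "Response rates surprised me.")

print(f"Low burstiness (uniform sentences): {burstiness(uniform):.2f}")
print(f"High burstiness (varied sentences): {burstiness(varied):.2f}")
```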
Are AI Content Detectors Accurate?
In simple terms, AI detectors label text as human-written if it is somewhat clumsily written and varies in sentence structure and length, and as AI-generated if it is written flawlessly and shows little such variation. However, skilled writers often write flawlessly themselves, and many texts, such as the methodology sections of reports and dissertations, require uniform paragraphs that follow conventional structures. For this reason, human-written texts are often flagged as AI-produced, which undermines the reliability of AI detectors.
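Putting the two signals together, a toy detector might apply threshold rules like the sketch below. The thresholds here are invented for illustration, which is precisely the weakness described above: flawless human prose or a deliberately uniform methodology section can easily fall on the wrong side of them.

```python
# A toy decision rule combining perplexity and burstiness scores such as
# those sketched earlier. The thresholds are invented for illustration;
# real detectors are trained classifiers, not hand-tuned rules like this.
def classify(perplexity_score: float, burstiness_score: float) -> str:
    smooth = perplexity_score < 40    # text the model finds very predictable
    uniform = burstiness_score < 4    # little variation in sentence length
    if smooth and uniform:
        return "likely AI-generated"
    if smooth or uniform:
        return "uncertain"
    return "likely human-written"

# A polished, uniformly structured methodology section can score as both
# smooth and uniform, which is how human writing gets falsely flagged.
print(classify(perplexity_score=32.5, burstiness_score=2.1))  # likely AI-generated
print(classify(perplexity_score=85.0, burstiness_score=9.7))  # likely human-written
```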