Detecting AI-Generated Essays with Synthetic Data Generation and Two-Stage Fine-tuning
We tackle the challenge of detecting AI-generated essays without a substantial corpus of human-written examples or AI-generated essays from the specific generative models used in the test set. We circumvent this limitation by generating a large dataset from the SlimPajama collection, creating texts from 10-word seeds with a suite of diverse language models (Falcon-7B, Mistral-7B, Llama2-7B). This yields a combined dataset of 1 million documents representative of both human and AI-generated text. We first fine-tune a classifier on this broad dataset to distinguish general Internet text produced by humans from text produced by AI. To hone the model's specificity for academic essays, we further fine-tune it on a smaller corpus of human essays and AI-generated essays created by various autoregressive language models. The resulting model, based on DeBERTa-v3-large, identifies AI-generated essays with high accuracy, achieving an AUC-ROC score of 0.965 on a Kaggle test dataset. This approach demonstrates an effective strategy for developing AI detection tools with limited data resources, addressing a critical need in the preservation of academic integrity in the face of advancing AI-generated content.
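The seed-based generation step above can be sketched as follows. This is a minimal illustration, not the paper's actual code: the function name `make_seed` and the placeholder document are assumptions, and only the seed extraction is shown (each seed would then be passed as a prompt to a causal language model such as Falcon-7B, Mistral-7B, or Llama2-7B to produce the AI-generated counterpart of the human document).

```python
def make_seed(document: str, n_words: int = 10) -> str:
    """Take the first n_words of a human-written document as a generation prompt.

    In the pipeline described in the abstract, each seed is fed to a causal
    language model to generate an AI-written continuation, giving paired
    human and AI texts that start from the same prefix.
    """
    return " ".join(document.split()[:n_words])

# Example with a placeholder document (not from the actual corpus):
doc = ("The history of aviation spans more than two centuries of "
       "human ingenuity and persistence worldwide")
seed = make_seed(doc)
# seed == "The history of aviation spans more than two centuries of"
```

Using the same 10-word prefix for both the human original and the generated continuation keeps topic and opening style matched, so the classifier must rely on distributional cues rather than subject matter.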
Date: 2024-04-30
Academic Program:
- Computer Science