arXiv Bans Authors for Year Over Unverified AI Content

arXiv, a major research preprint repository, will ban authors for one year if they fail to verify AI-generated content in their submissions. The move aims to combat the rise of low-quality, AI-produced papers.

Laura Roberts
Laura Roberts covers space & aerospace for Techawave.

The preprint repository arXiv will now penalize authors with a one-year ban if they fail to verify content generated by artificial intelligence (AI) in their research submissions. This new policy, announced by Thomas Dietterich, chair of arXiv’s computer science section, targets the increasing presence of low-quality and potentially inaccurate AI-generated papers circulating on the platform.

arXiv is widely used by researchers in fields such as computer science and mathematics to share findings early, and it has become a significant source of data on scientific trends. Papers are posted before formal peer review, so the site plays a crucial role in the rapid circulation of new research. The repository, now transitioning to an independent nonprofit after more than two decades under Cornell University, is also seeking additional funding to address issues like AI-generated inaccuracies.

Dietterich stated on Thursday that submissions exhibiting "incontrovertible evidence that the authors did not check the results of LLM generation" will be flagged. Such evidence could include fabricated references or conversational exchanges with an AI model left in the text. If such evidence is found, Dietterich explained, a paper's authors will face "a 1-year ban from arXiv followed by the requirement that subsequent arXiv submissions must first be accepted by a reputable peer-reviewed venue."

Authors Remain Responsible for All Content

The updated policy does not prohibit AI tools outright; it emphasizes author accountability. Dietterich clarified that researchers remain "fully responsible" for their submitted content, regardless of how it was generated. Authors who paste problematic material directly from a large language model (LLM), such as inappropriate language, plagiarized text, biased information, errors, or misleading citations, will be held accountable for it.

The measure is described as a "one-strike" rule, but a ban is imposed only after moderators identify the issue and section chairs confirm the evidence. Authors can appeal such decisions. The repository had previously introduced other safeguards, such as requiring first-time submitters to obtain an endorsement from an established author, to help filter out potentially unreliable submissions.

The move by arXiv reflects a broader concern within the academic and scientific communities regarding the integrity of research in the age of advanced AI. As AI models become more sophisticated, the ease with which they can generate plausible-sounding text and data poses a challenge to traditional verification processes. Ensuring that researchers maintain critical oversight of AI-assisted work is seen as vital for preserving the credibility of scientific discourse and preventing the spread of misinformation.
