ArXiv, the open-access repository of preprint academic research, will ban authors of papers for a year if they submit obviously AI-generated work.
Late Thursday evening, Thomas Dietterich, chair of the computer science section of ArXiv, wrote on X: “If generative AI tools generate inappropriate language, plagiarized content, biased content, errors, mistakes, incorrect references, or misleading content, and that output is included in scientific works, it is the responsibility of the author(s). We have recently clarified our penalties for this. If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can’t trust anything in the paper.”
Examples of incontrovertible evidence, he wrote, include “hallucinated references, meta-comments from the LLM (‘here is a 200 word summary; would you like me to make any changes?’; ‘the data in this table is illustrative, fill it in with the real numbers from your experiments’).”
“The penalty is a 1-year ban from arXiv followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue,” Dietterich wrote.
Dietterich told me in an email on Friday morning that this is a one-strike rule—meaning authors caught just once including AI slop in submissions will be banned—but that decisions will be open to appeal. “I want to emphasize that we only apply this to cases of incontrovertible evidence,” he said. “I should also add that our internal process requires first a moderator to document the problem and then for the Section Chair to confirm before imposing the penalty.”
In November 2025, arXiv announced it would no longer accept computer science review articles and position papers because it was being “flooded” with AI slop. “Generative AI/large language models have added to this flood by making papers—especially papers not introducing new research results—fast and easy to write. While categories across arXiv have all seen a major increase in submissions, it’s particularly pronounced in arXiv’s CS category,” arXiv wrote in a press release about the change at the time.
And in January, it announced first-time submitters would need an endorsement from an established author due to a rise in fraudulent submissions.
AI-generated, fabricated citations are a huge problem in research. A recent study by Columbia University researchers examined 2.5 million biomedical papers across three years and found that one in 277 papers published in the first seven weeks of 2026 contained fabricated references, up from one in 458 in 2025 and one in 2,828 in 2023. AI-generated citations and papers are already straining the peer-review process, and more and more papers are making it through the pipeline with those meta-comments and hallucinated data intact.
ArXiv is managed by Cornell Tech, but this July, it will become an independent nonprofit corporation. Greg Morrisett, dean and vice provost of Cornell Tech, told Science.org that this change will help arXiv raise more money from a wider range of donors, which Morrisett said is needed to deal with the emergence of “AI slop.”


