January 27, 2026

How AI is transforming research: More papers, less quality, and a strained review system

Featured Researchers:

Mathijs De Vaan, Associate Professor, Management of Organizations

Toby E. Stuart, Professor, Management of Organizations

By Scott Morrison

Image: NuPenDekDee/Adobe Stock

Mathijs De Vaan has long recognized the labor required to effectively communicate scientific findings. For the Netherlands native and UC Berkeley Haas associate professor of management, data and analyses are only half the battle; the rest is a time-intensive process of ensuring that complex, nuanced ideas are articulated with the precision that top-tier research demands.

When large language models (LLMs) like ChatGPT emerged in late 2022, De Vaan saw a significant opportunity to bridge that gap. The technology acted as a sophisticated writing partner, polishing prose and sharpening phrasing to ensure that scientific merit—rather than a researcher’s primary language—remained the focus.

Yet, as these tools moved from novelty to norm across academia, a deeper question began to nag at De Vaan and his colleagues: How was AI impacting the broader scientific landscape? Were they witnessing a simple productivity upgrade, or a more fundamental shift that could complicate the very future of research?

A new study by researchers at UC Berkeley Haas and Cornell University, published in the journal Science, reveals that AI is rapidly reshaping scientific research. Even as AI tools help researchers write more papers faster, many of these studies are of marginal scientific merit. The resulting flood of polished but potentially superficial work is making it harder for reviewers, funders, and policymakers to separate worthy papers from unimportant and potentially misleading work.

“The use of AI by scientists is stressing the system. It’s creating a giant bottleneck and making it really hard for evaluators to keep up,” said De Vaan, one of the paper’s co-authors. “This could affect decisions about what science we should support and fund.”


The study, co-authored by UC Berkeley Haas professor Toby Stuart and by Keigo Kusumegi, Xinyu Yang, Paul Ginsparg, and Yian Yin of Cornell University’s Department of Data Science, examined more than 2 million papers uploaded between January 2018 and June 2024 to three major preprint websites spanning mathematics, physics, biology, the social sciences, and the humanities. Researchers use the three sites—arXiv, bioRxiv, and the Social Science Research Network (SSRN)—to post scientific papers that have not yet undergone peer review.

Using sophisticated detection algorithms, the team identified which scientists were likely using AI to write papers and compared how many papers they produced before and after adopting AI. They also applied a formula to quantitatively measure the complexity of the writing. Finally, they determined which papers were later published in scientific journals.
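The article does not reproduce the study’s actual formula, but as a rough illustration of how writing complexity can be quantified, here is a toy score combining average sentence length and average word length. The function and the metric are illustrative assumptions for this article, not the authors’ method.

```python
import re

def complexity_score(text):
    """Toy writing-complexity score: mean sentence length (in words)
    weighted by mean word length (in characters). Illustrative only;
    the published study uses its own, more sophisticated measure."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    avg_sentence_len = len(words) / len(sentences)   # words per sentence
    avg_word_len = sum(len(w) for w in words) / len(words)  # chars per word
    return avg_sentence_len * avg_word_len
```

On such a measure, short declarative prose scores low while long sentences full of polysyllabic words score high, which is the kind of signal the researchers could compare against each paper’s eventual publication outcome.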

“We found an inverse relationship between writing sophistication and quality,” Stuart said. “If you’re human, the more complicated the writing is, the better the paper is. But if you’re a robot, the more complicated the writing is, the less good the paper is.”

The findings reveal just how quickly AI is reshaping science.


Productivity surges across all fields

One of the paper’s striking findings concerns productivity. Scientists who appeared to adopt LLMs saw their manuscript output jump dramatically compared with scientists who weren’t getting a boost by using AI. The increase on bioRxiv and SSRN was greater than 50%, and on arXiv, more than one-third.

The effect was most pronounced among scientists whose first language is not English. Researchers with Asian names affiliated with Asian institutions who used LLMs experienced productivity gains approaching 90% in biology and social sciences. By contrast, researchers with Western names at English-speaking institutions saw more modest but still significant increases of 24%-46%.

This disparity points to what could be one of AI’s most consequential impacts on a scientific enterprise in which English is the language of record: leveling the playing field for non-native English speakers and potentially shifting scientific productivity toward regions of the world previously disadvantaged by the language barrier.

Broadening the knowledge base

One positive finding is that AI-powered search tools such as Bing Chat—the first widely used LLM-integrated search engine—are better at finding newer publications and relevant books than traditional search tools, which tend to surface older, more commonly referenced material.

“There’s more and more science being produced and it’s hard for scientists to keep up with research that is relevant to their work,” says De Vaan. “LLMs allow us to search a broader base and delve much deeper.”

Complexity versus quality

But the productivity surge points to significant challenges ahead. For decades, writing quality has served as an imperfect but useful signal of scientific rigor. Papers with clear, sophisticated language tend to be stronger and more frequently cited, on the premise that a team that can precisely articulate complex ideas likely has command of its subject. Such well-written papers were also the most likely to clear peer review and be published in a scientific journal.

The researchers found that for AI-assisted manuscripts, this traditional relationship doesn’t just disappear—it completely reverses. More complex papers written with AI assistance were actually less likely to be published in peer-reviewed venues, suggesting that polished AI prose often masks weak science rather than signaling strong research.

“The robots now write more complex and sophisticated science than many human scientists,” Stuart said. “But what our analysis shows is that scientific articles that were mostly automated are of substantially lower quality than human-written papers.”

Looking ahead

The flood of AI-generated papers that look compelling but lack substance is already straining the review system to its breaking point, forcing the scientific community to find new ways to distinguish genuine research from marginal contributions.

The study’s authors emphasize that their findings capture the impact of earlier AI models, and they caution that AI’s effect on science is likely to grow substantially as these tools improve and scientists develop new ways to use them.

The paper suggests that AI itself might offer a stop-gap solution. Specialized “reviewer agents” could act as filters to make sure research papers meet certain quality thresholds before they reach a human reviewer.

But the researchers argue that broader institutional change will be required to cope with AI’s impact on science. De Vaan urges universities, funders, and policymakers to start now, despite uncertainty about the coming changes.

“The institutions that begin experimenting now with new evaluation criteria, new funding models, and new approaches to verification will be better positioned than those that wait for the full impact to become undeniable,” says De Vaan.

Read the full paper:

Scientific production in the era of large language models
By Keigo Kusumegi, Xinyu Yang, Paul Ginsparg, Mathijs de Vaan, Toby Stuart, and Yian Yin
Science, December 2025