With LLMs for science out there (e.g. Galactica from Meta) we need some new ethics guidelines for scientific publication. Existing rules regarding plagiarism, fraud, and authorship need to be rethought for LLMs. This is necessary to safeguard public trust in science.
I’ll state the obvious because it doesn’t seem to be obvious to everyone. Science depends on public trust. The public funds basic research and leaves scientists alone to decide what to study and how to study it. This is an amazing system that works. But it only works if scientists and the public each uphold their part of the deal.
The pressure on scientists today to publish has never been greater. Publication and citation metrics are widely used for evaluation. In this environment some small number of scientists will cheat to increase the number of papers they write. Automated tools like Galactica will assist them.
Many have argued that warning notices on Galactica are sufficient to prevent misuse. This ignores the fact that there are people who want to misuse it. They will see it as a shortcut to write more papers. Those papers will definitely not carry a warning that the text comes from an LLM.
Some of these papers will slip through the review process. They will include incorrect or biased information. Reviews of the literature will be slightly off. Results will be fabricated. Other authors will be influenced by these papers. Science funding is not so plentiful that society, and scientists, can waste it pursuing dead ends based on fake papers.
There will be press articles about paper mills using LLMs to create fake papers at scale. When this happens, public trust erodes. When public support declines, so does political support for science funding. Society needs science today as much or more than it ever has. We have big problems to address and we can’t afford to squander public trust.
People ask me: shouldn’t the review process catch these fake articles? Are reviewers so easily duped? In fact, people are easily fooled by natural-sounding text; some even come to believe that LLMs are sentient. Reviewers are already overloaded, and they cannot take on the added task of rooting out a flood of LLM-generated papers.
LLMs are not going away. Even though the Galactica demo was taken down, the code remains online and new models will appear. Scientific reviewing and publishing is now in a battle against misuse of these machines. So here are some thoughts about peer review in the age of LLMs.
It’s obvious that copying text from Wikipedia without reference is plagiarism. But it’s also relatively easy to detect with a web search. Now, what if an LLM is trained on Wikipedia and someone uses the trained model to generate text? It’s impossible to search for the text without knowing the prompt, making detection difficult. Is this plagiarism?
I think it is. If you argue that using LLMs is not plagiarism, it must be because the model has created something novel and the author is not “copying” existing text. But if another human wrote something novel that went into your paper, that person would be considered an author. By the same logic, an LLM that generates text for a paper should be listed as an “author”.
Authors, however, are responsible for the contents of the paper. That is, they are responsible for fraud and errors. If an LLM is an author, who takes responsibility if what it generates is wrong? Of course, authors have other responsibilities.
If a paper’s results are challenged, the authors need to be able to explain how the results were obtained and produce evidence to support their claims. Can LLMs live up to that responsibility? Can they explain themselves? Authors also need to disclose conflicts that might bias their work. What biases does an LLM have and can it disclose them?
The above suggests that LLMs cannot be “authors” today. The only viable solution is to require citation of all text generated by LLMs using the same rules we apply to quoting text from any traditional source. The text goes in quotes and the source is cited. I’d be fine with this. It’s transparent.
Of course, it is unlikely that anyone will do this. What some people want is to have the computer write their paper and then pass it off as their own work. That is scientific fraud. The other alternative is to ban the use of LLMs in scientific publications. This is unenforceable, but that doesn’t mean we shouldn’t impose it: it puts people on notice and provides a mechanism for punishing detected violations.
It may not sound like it, but I think research on LLMs is important. I use LLMs in my own research, and the last thing I want is to slow that research down. So what can we do? I call for three things: (1) responsible dissemination of these tools that takes the risks into account, (2) changes to the peer-review process that address those risks, and (3) research into “antidotes”.
Today, only large companies can afford to train LLMs. They can also afford to train adversarial networks to detect fake science. If a company releases a science LLM, they should develop a companion network to differentiate its output from real science. They should make this network available to publishers for free.
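To make the “companion network” idea concrete, here is a heavily simplified sketch of what such a detector looks like at its core: a binary classifier over text. Everything in it — the toy example sentences, the labels, and the choice of TF-IDF features with logistic regression — is an illustrative assumption on my part; a real detector would be trained at scale, ideally adversarially, on the science LLM’s own outputs versus genuine papers.

```python
# Toy sketch of a machine-text detector: a binary classifier that
# scores text as machine-generated (1) or human-written (0).
# The training data below is hypothetical filler, not real examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "the results demonstrate significant improvements across all metrics",
    "we observe that the model achieves state of the art performance",
    "our experiments were limited by a broken cryostat in february",
    "we thank the field station staff for repairing the seismometer",
]
labels = [1, 1, 0, 0]  # 1 = machine-generated, 0 = human-written

# TF-IDF n-gram features feeding a logistic-regression classifier.
detector = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(),
)
detector.fit(texts, labels)

# Score an unseen sentence; the output is P(machine-generated).
score = detector.predict_proba(
    ["the model demonstrates significant performance improvements"]
)[0, 1]
print(score)
```

A publisher-facing version of this would run each submission through the company-provided detector and flag high-scoring papers for closer human review, rather than rejecting anything automatically.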
Acting now to introduce safeguards is necessary to protect the integrity of scientific publishing, prevent an undue burden on reviewers, limit fraud, and defend the public trust in science.