Peer review is a cornerstone of academic publishing, safeguarding the integrity, novelty, and accuracy of published research. However, peer review faces challenges such as reviewer biases, inconsistent assessments, and concerns regarding the design of review mechanisms. These issues can undermine the fairness and integrity of scientific evaluation, especially given the growing volume of academic submissions.
Traditional studies of peer review often rely on statistical analyses of past reviews, which struggle to fully capture its multivariate nature -- entangled factors like reviewer expertise, motivation, and bias contribute jointly to review outcomes, making them difficult to study in isolation. Moreover, ethical and privacy concerns further complicate any investigation of real-world peer review data.
In our recent work, accepted as a main track (Oral) paper at EMNLP 2024, we introduce AgentReview, the first large language model (LLM)-based framework designed to simulate the peer review process. AgentReview allows for the controlled simulation of peer review dynamics using LLM agents, enabling researchers to explore biases, reviewer roles, and decision mechanisms in a way that respects privacy while providing actionable insights into how the peer review process can be improved.
Figure 1: AgentReview is an open and flexible framework designed to realistically simulate the peer review process. It enables controlled experiments to disentangle multiple variables in peer review, allowing for an in-depth examination of their effects on review outcomes.
AgentReview provides a flexible and extensible testbed for studying the impact of different roles and decision mechanisms in peer review. It follows the review procedures of popular Natural Language Processing (NLP) and Machine Learning (ML) conferences.
The framework simulates three roles -- reviewers, authors, and Area Chairs (ACs) -- all powered by LLM agents, allowing us to observe how various configurations lead to different review outcomes.
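As a minimal sketch of how such role agents can be modeled (the `ReviewAgent` class, personas, prompts, and `llm` callable below are illustrative assumptions, not the actual AgentReview API):

```python
from dataclasses import dataclass, field

@dataclass
class ReviewAgent:
    """A hypothetical LLM-backed participant in the simulated review."""
    role: str                      # "reviewer", "author", or "area chair"
    persona: str                   # e.g. "knowledgeable", "irresponsible", "biased"
    history: list = field(default_factory=list)

    def system_prompt(self) -> str:
        return (f"You are a {self.persona} {self.role} in the peer review "
                f"process of a top-tier NLP conference.")

    def respond(self, message: str, llm) -> str:
        """Append the message to the running context and query the backing LLM.
        `llm` is any chat-completion callable: (system_prompt, history) -> str."""
        self.history.append({"role": "user", "content": message})
        reply = llm(self.system_prompt(), self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply

# The three roles AgentReview integrates
reviewer = ReviewAgent(role="reviewer", persona="knowledgeable")
author   = ReviewAgent(role="author", persona="standard")
ac       = ReviewAgent(role="area chair", persona="inclusive")
```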
Figure 2: The Pipeline for AgentReview.
Our simulation adopts a structured, 5-phase pipeline (a code sketch of the full loop follows the list):

1. **Phase I. Reviewer Assessment**: reviewer agents independently assess the submission.
2. **Phase II. Author-Reviewer Discussion**: the author agent responds to the reviews with a rebuttal.
3. **Phase III. Reviewer-AC Discussion**: reviewers update their assessments and discuss with the AC.
4. **Phase IV. Meta-Review Compilation**: the AC compiles the discussion into a meta-review.
5. **Phase V. Paper Decision**: the AC makes the final accept/reject decision.
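Building on the `ReviewAgent` sketch above, the five phases could be wired together roughly as follows (prompts and function names are illustrative, not the paper's actual implementation):

```python
def run_simulation(paper, reviewers, author, ac, llm):
    """A minimal sketch of the 5-phase pipeline described above."""
    # Phase I: each reviewer independently assesses the paper
    reviews = [r.respond(f"Review this paper:\n{paper}", llm) for r in reviewers]

    # Phase II: the author responds to the reviews with a rebuttal
    rebuttal = author.respond("Write a rebuttal to:\n" + "\n\n".join(reviews), llm)

    # Phase III: reviewers revisit their assessments with the rebuttal in context
    discussion = [r.respond(f"Rebuttal:\n{rebuttal}\nUpdate your assessment.", llm)
                  for r in reviewers]

    # Phase IV: the AC compiles the discussion into a meta-review
    meta = ac.respond("Summarize the discussion into a meta-review:\n"
                      + "\n\n".join(discussion), llm)

    # Phase V: the AC recommends a decision (final decisions are calibrated
    # to a fixed acceptance rate across all papers; see below)
    decision = ac.respond("Given the meta-review, recommend Accept or Reject:\n"
                          + meta, llm)
    return reviews, rebuttal, meta, decision
```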
We adopt a fixed acceptance rate of 32%, corresponding to the actual average acceptance rate for ICLR 2020--2023.
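One simple way to realize such a fixed-rate decision rule is to accept the top 32% of papers by average rating; this thresholding sketch is our illustration, not necessarily how the paper calibrates decisions:

```python
def final_decisions(avg_ratings, acceptance_rate=0.32):
    """Accept the top `acceptance_rate` fraction of papers by average rating."""
    k = round(len(avg_ratings) * acceptance_rate)
    ranked = sorted(range(len(avg_ratings)), key=avg_ratings.__getitem__, reverse=True)
    accepted = set(ranked[:k])
    return ["Accept" if i in accepted else "Reject" for i in range(len(avg_ratings))]

print(final_decisions([6.0, 4.5, 5.5, 3.0]))
# ['Accept', 'Reject', 'Reject', 'Reject']
```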
Figure 3: Distribution of Reasons to Accept and Reject.
We use AgentReview to simulate the peer review process on real conference papers from ICLR 2020--2023 and explore several sociological theories that explain how multiple factors affect peer review outcomes.
Social Influence Theory suggests that individuals in a group tend to revise their beliefs towards a common viewpoint. In AgentReview, we observe that reviewers often align their ratings with their peers during the rebuttal phase (Phase II. Author-Reviewer Discussion), leading to a 27.2% decrease in the standard deviation of ratings.
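This convergence can be quantified as the relative drop in per-paper rating spread between phases; the helper below and its toy numbers are illustrative, not the paper's measurement code:

```python
from statistics import stdev

def rating_convergence(before, after):
    """Relative drop in mean per-paper rating stdev between two phases.
    `before`/`after` are lists of per-paper rating lists."""
    sd_before = sum(stdev(r) for r in before) / len(before)
    sd_after = sum(stdev(r) for r in after) / len(after)
    return 1 - sd_after / sd_before

before = [[4, 6, 8], [3, 5, 6]]   # initial ratings (toy example)
after  = [[5, 6, 7], [4, 5, 5]]   # post-rebuttal ratings
print(f"{rating_convergence(before, after):.1%} decrease in rating stdev")
# 55.3% decrease in rating stdev (for these toy numbers)
```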
Peer review is typically unpaid and time-consuming. Altruism Fatigue describes how these demands can lead to superficial assessments when reviewers feel their efforts go unrecognized. In our simulations, including just one _irresponsible_ reviewer led to an 18.7% drop in engagement during the Reviewer-AC discussion phase, as reviewers provided shorter, less detailed feedback.
When biased reviewers dominate discussions, Groupthink can occur, where a group reaches a consensus without critically evaluating the manuscript. Biased reviewers often reinforce each other's negative opinions during Phase III (Reviewer-AC Discussion), resulting in a 0.17 drop in ratings among the biased reviewers. This can also cause a spillover effect, where their negativity influences the assessments of unbiased reviewers, ultimately causing a 0.25 decrease in overall ratings.
Reviewers may give more favorable ratings to papers from well-known authors. When reviewers were aware of the authors' identities, decisions changed for 27.7% of papers, highlighting how prestige can unjustly influence the review process.
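In a simulation, anonymity is just a prompt-construction switch: the reviewer either sees the author list or does not. The function below is a hypothetical illustration of that toggle:

```python
def reviewer_prompt(paper: str, authors: list[str] | None = None) -> str:
    """Build a reviewer prompt; passing `authors` simulates a
    de-anonymized review (hypothetical prompt wording)."""
    header = f"This paper is authored by {', '.join(authors)}.\n" if authors else ""
    return header + f"Please review the following submission:\n{paper}"

anonymized = reviewer_prompt(paper="...")                          # double-blind
revealed   = reviewer_prompt(paper="...", authors=["A Famous Lab"])  # identity known
```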
AgentReview offers key insights into latent factors like bias, expertise, and motivation, helping to design more equitable and efficient peer review systems. Future review processes could incorporate better checks on reviewer expertise and commitment, reducing bias and moving toward more equitable, transparent, and efficient evaluation.
By simulating the peer review process, AgentReview ensures privacy and ethical standards without accessing sensitive real-world data, allowing for large-scale studies that respect reviewer anonymity while offering actionable insights into improving peer review mechanisms.
AgentReview's simulations enable adaptive mechanisms, such as adjusting the number of reviewers or the process length based on a paper’s complexity. This flexibility maintains rigor while streamlining the review process.
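One hypothetical form such an adaptive mechanism could take is recruiting extra reviewers when initial ratings disagree strongly; the policy and thresholds below are our illustrative assumptions, not part of the paper:

```python
from statistics import stdev

def reviewers_needed(initial_ratings, base=3, extra=2, threshold=2.0):
    """Hypothetical adaptive policy: add reviewers when the initial
    ratings disagree strongly (stdev above `threshold`)."""
    return base + (extra if stdev(initial_ratings) > threshold else 0)

print(reviewers_needed([3, 8, 5]))  # high disagreement -> 5 reviewers
print(reviewers_needed([5, 6, 5]))  # consensus -> 3 reviewers
```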
AgentReview can simulate peer review dynamics in interdisciplinary fields, identifying challenges where reviewers from different domains evaluate the same paper. These insights could help design better processes for interdisciplinary journals, ensuring fair, balanced assessments.
AgentReview marks significant progress in understanding peer review dynamics. Through LLM-driven simulations, we reveal how biases, expertise, and social factors impact academic evaluations. These insights lay the groundwork for enhancing the fairness and integrity of peer review.
Call to Action: We encourage researchers, editors, and policymakers to explore AgentReview's findings and consider how AI can help create a more transparent and equitable academic publishing process.
@inproceedings{jin-etal-2024-agentreview,
title = "{A}gent{R}eview: Exploring Peer Review Dynamics with {LLM} Agents",
author = "Jin, Yiqiao and
Zhao, Qinlin and
Wang, Yiyang and
Chen, Hao and
Zhu, Kaijie and
Xiao, Yijia and
Wang, Jindong",
editor = "Al-Onaizan, Yaser and
Bansal, Mohit and
Chen, Yun-Nung",
booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2024",
address = "Miami, Florida, USA",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.emnlp-main.70",
pages = "1208--1226",
}