SciJudge-4B: AI for Scientific Paper Evaluation
SciJudge-4B, developed by OpenMOSS-Team, is a specialized 4-billion-parameter language model for evaluating scientific papers. Given the metadata of two academic papers (title, abstract, and publication date), it predicts which one will achieve the higher citation count, using that comparison as a proxy for research impact and "scientific taste." The model is a key component of the research described in the paper "AI Can Learn Scientific Taste".
Key Capabilities
- Scientific Impact Prediction: Predicts which of two papers will receive more citations.
- Research Evaluation: Acts as a proxy for assessing the potential impact and "taste" of scientific work.
- Metadata-driven Analysis: Bases its predictions only on titles, abstracts, and publication dates.
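To make the pairwise setup concrete, here is a minimal sketch of how two papers' metadata might be packed into a single comparison prompt and how an "A"/"B" reply might be read back. The prompt wording, the `Paper` class, and the `build_prompt` and `parse_choice` helpers are illustrative assumptions, not the model's documented input format; consult the model's own documentation for the exact template it expects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Paper:
    title: str
    abstract: str
    date: str  # publication date as text, e.g. "2023-05-17"


def build_prompt(a: Paper, b: Paper) -> str:
    """Format two papers' metadata into one comparison prompt.

    The layout is a hypothetical example, not the official template.
    """
    def block(label: str, p: Paper) -> str:
        return (
            f"Paper {label}\n"
            f"Title: {p.title}\n"
            f"Date: {p.date}\n"
            f"Abstract: {p.abstract}"
        )

    return (
        "Which paper will receive more citations? Answer with A or B.\n\n"
        f"{block('A', a)}\n\n{block('B', b)}\n\nAnswer:"
    )


def parse_choice(reply: str) -> Optional[str]:
    """Return 'A' or 'B' from a reply, scanning from the end; None if absent.

    Assumes the reply ends with the chosen letter (e.g. "The answer is B.").
    """
    for token in reversed(reply.upper().split()):
        cleaned = token.strip(".,:;!()\"'")
        if cleaned in ("A", "B"):
            return cleaned
    return None
```

Because a pairwise judge can be sensitive to ordering, one reasonable practice is to query both orderings (A/B and B/A) and check that the two answers agree.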
Training Details
SciJudge-4B was fine-tuned from Qwen3-4B-Instruct-2507 using GRPO (Group Relative Policy Optimization) with DAPO loss. It was trained on 720,341 preference pairs derived from arXiv papers, in bfloat16 precision with an effective batch size of 1024.
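The core idea behind GRPO can be sketched in a few lines: several responses are sampled for the same prompt, each is scored, and each score is standardized against the others in its group to obtain an advantage. The sketch below assumes a preference label taken from which paper has more citations and a binary reward for a correct prediction; both are illustrative assumptions, since the exact labeling rule and reward used for SciJudge-4B are not stated here. The function names are hypothetical, and the DAPO-specific parts of the loss are not shown.

```python
import statistics
from typing import List


def preference_label(citations_a: int, citations_b: int) -> str:
    """Label a pair by which paper has more citations (assumed rule)."""
    if citations_a == citations_b:
        raise ValueError("tied pairs carry no preference signal")
    return "A" if citations_a > citations_b else "B"


def reward(predicted: str, label: str) -> float:
    """Binary reward: 1.0 for a correct prediction, else 0.0 (assumed)."""
    return 1.0 if predicted == label else 0.0


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Standardize rewards within one group of responses to the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps keeps the division finite when every response gets the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Four sampled answers for one pair whose label is "A".
label = preference_label(120, 45)
rewards = [reward(p, label) for p in ["A", "B", "A", "A"]]
advantages = group_relative_advantages(rewards)
```

In this example the three correct answers receive a positive advantage and the incorrect one a negative advantage. When every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.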
Good for
- Researchers and institutions interested in automated scientific paper evaluation.
- Developing tools for assessing potential research impact.
- Exploring AI's ability to understand and predict "scientific taste."