OpenMOSS-Team/SciJudge-4B-2605
Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Tool Calling: Supported | Published: Jul 2, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights
SciJudge-4B-2605 is a 4-billion-parameter model developed by OpenMOSS-Team, fine-tuned from Qwen3-4B-Instruct-2507 for scientific paper evaluation. Given the titles, abstracts, and publication dates of two papers, it predicts which one has the higher citation impact, within a 32,768-token context window. It is part of research on whether AI can learn scientific judgment, or 'taste'.
SciJudge-4B-2605: AI for Scientific Paper Evaluation
SciJudge-4B-2605 is a 4-billion-parameter model developed by OpenMOSS-Team, fine-tuned from Qwen3-4B-Instruct-2507. Its core function is to evaluate scientific papers: given two papers (each with a title, abstract, and publication date), it predicts which one will achieve the higher citation count.
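The pairwise input described above could be assembled roughly as follows. This is a minimal sketch, not the official template: the prompt layout, the "A or B" answer format, and the names `Paper`, `build_prompt`, and `parse_verdict` are all assumptions made for illustration. The model's own documentation should be consulted for the real prompt format.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Paper:
    title: str
    abstract: str
    published: str  # publication date as text, e.g. "2024-03-15"


def build_prompt(a: Paper, b: Paper) -> str:
    """Format two papers into one comparison prompt (hypothetical layout)."""
    def block(label: str, p: Paper) -> str:
        return (
            f"Paper {label}\n"
            f"Title: {p.title}\n"
            f"Published: {p.published}\n"
            f"Abstract: {p.abstract}\n"
        )

    return (
        "Which paper is likely to receive more citations? "
        "Answer with A or B.\n\n"
        + block("A", a) + "\n" + block("B", b)
    )


def parse_verdict(text: str) -> Optional[str]:
    """Heuristic: take the last standalone 'A' or 'B' in the reply, else None."""
    matches = re.findall(r"\b([AB])\b", text)
    return matches[-1] if matches else None
```

The resulting string would be sent as the user message to whatever inference endpoint serves the model; `parse_verdict` is deliberately lenient and may need tightening once the actual output format is known.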
Key Capabilities & Features
- Citation Impact Prediction: Specialized in comparing two scientific papers and determining which is likely to have greater future citation impact.
- Contextual Analysis: Utilizes paper titles, abstracts, and publication dates for its comparative judgment.
- Base Model: Fine-tuned from Qwen3-4B-Instruct-2507, indicating a strong foundation in instruction following and general language understanding.
- Training Methodology: Trained using GRPO with DAPO loss and an external preference reward, leveraging 720,341 preference pairs from the SciJudgeBench dataset.
- Performance: Achieves an average accuracy of 77.3% on the SciJudgeBench test split (MAIN_1000 in-domain evaluation set), compared with 58.1% for its base model.
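The group-relative idea behind GRPO, mentioned in the training bullet above, can be sketched as follows. This illustrates the general technique only and is not the authors' training code. The binary reward (1.0 when a sampled verdict matches the preferred paper, 0.0 otherwise) is an assumption, and `group_advantages` is a hypothetical helper name.

```python
from statistics import mean, pstdev
from typing import List


def group_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of samples drawn for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # If every sample earns the same reward, all advantages are zero,
    # so that group contributes no learning signal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Assumed reward: 1.0 if the sampled verdict matches the preferred paper, else 0.0.
preferred = "A"
sampled_verdicts = ["A", "B", "A", "A"]
rewards = [1.0 if v == preferred else 0.0 for v in sampled_verdicts]
advantages = group_advantages(rewards)
```

Samples that beat the group average receive a positive advantage and the rest a negative one, so no separate value model is needed to supply a baseline.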
Good For
- Scientific Research Analysis: Ideal for tasks requiring an AI to assess the potential impact or 'taste' of scientific publications.
- Academic Trend Prediction: Useful for researchers or institutions interested in forecasting the influence of new scientific work.
- Benchmarking: Serves as a smaller, efficient model for evaluating scientific judgment tasks, complementing its larger counterpart, SciJudge-30B-2605.
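For benchmarking, accuracy on pairwise data is the fraction of pairs where the predicted verdict matches the label. The sketch below uses made-up verdicts, not SciJudgeBench data. `pairwise_accuracy` is a hypothetical helper, and counting unparseable replies (`None`) as wrong is an assumption.

```python
from typing import List, Optional


def pairwise_accuracy(predicted: List[Optional[str]], gold: List[str]) -> float:
    """Fraction of pairs where the predicted verdict equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("need at least one labeled pair")
    # A None prediction never equals a label, so it counts as incorrect.
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Illustrative values only: 2 of the 4 predictions match their labels.
predicted = ["A", "B", None, "A"]
gold = ["A", "B", "B", "B"]
accuracy = pairwise_accuracy(predicted, gold)
```

Running the same harness over the base model and the fine-tuned model on identical pairs gives directly comparable numbers.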