CompassVerifier-3B by OpenCompass is a 3.1-billion-parameter verifier model built on the Qwen series, designed for robust evaluation of LLM outputs and for use as an outcome reward model. It covers math, knowledge, and diverse reasoning tasks, handles a variety of answer types, and identifies abnormal responses. Because it judges whether an LLM's output is correct, it can also supply the reward signal in reinforcement learning.
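To make the reward-model use concrete, here is a minimal sketch of turning a verifier's verdict into a scalar outcome reward. Everything in it is an assumption for illustration, not the model's documented interface: the prompt layout, the three single-letter labels (`A` for correct, `B` for incorrect, `C` for invalid or abnormal), the reward values, and the helper names `build_prompt`, `parse_verdict`, and `outcome_reward` are all hypothetical. The verifier itself is passed in as a plain callable, so any inference backend can be wrapped to fit.

```python
from typing import Callable

# Assumed label set and reward values (hypothetical, for illustration only):
# "A" = correct, "B" = incorrect, "C" = invalid or abnormal response.
REWARDS = {"A": 1.0, "B": 0.0, "C": 0.0}


def build_prompt(question: str, gold: str, response: str) -> str:
    """Assemble a judging prompt. This layout is an assumption, not the official template."""
    return (
        "Judge the candidate answer against the reference answer.\n"
        f"Question: {question}\n"
        f"Reference answer: {gold}\n"
        f"Candidate answer: {response}\n"
        "Reply with A (correct), B (incorrect), or C (invalid)."
    )


def parse_verdict(text: str) -> str:
    """Read the first non-space character as the label; anything unrecognized counts as 'C'."""
    label = text.strip()[:1].upper()
    return label if label in REWARDS else "C"


def outcome_reward(
    question: str,
    gold: str,
    response: str,
    verifier: Callable[[str], str],
) -> float:
    """Query the verifier once and map its verdict to a scalar reward."""
    verdict = parse_verdict(verifier(build_prompt(question, gold, response)))
    return REWARDS[verdict]


if __name__ == "__main__":
    # Stand-in verifier that always answers "A"; swap in a real model call here.
    always_correct = lambda prompt: "A"
    print(outcome_reward("What is 2 + 2?", "4", "The answer is 4.", always_correct))
```

Treating unrecognized verifier output as invalid keeps the reward conservative: a malformed verdict never earns a positive reward.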