IAAR-Shanghai/xVerify-7B-I
IAAR-Shanghai/xVerify-7B-I is a 7.6 billion parameter evaluation tool fine-tuned from a pre-trained large language model, designed to verify answers for objective questions with a single correct answer. Developed by IAAR-Shanghai, it excels at extracting final answers from complex reasoning processes and judging equivalence across various expression formats. This model is particularly suited for evaluating tasks like math problems, multiple-choice questions, and classification, supporting both Chinese and English responses.
xVerify-7B-I: An Efficient Answer Verifier
xVerify-7B-I is a 7.6 billion parameter model developed by IAAR-Shanghai, specifically fine-tuned as an evaluation tool for objective questions. Presented in the paper "xVerify: Efficient Answer Verifier for Reasoning Model Evaluations" (arXiv:2504.10481), its primary function is to accurately extract final answers from reasoning processes and efficiently determine equivalence across different forms of expressions.
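As a rough illustration of that extract-then-judge workflow, the sketch below loads the checkpoint through the Hugging Face `transformers` library and asks it for a verdict. Only the repository name comes from this page. The prompt layout in `build_prompt`, the "Correct"/"Incorrect" labels expected by `parse_verdict`, and the helper names themselves are assumptions made for illustration; the official prompt template may differ and should be preferred where available. The sketch also assumes the checkpoint loads as a standard causal language model and that `accelerate` is installed for `device_map="auto"`.

```python
from typing import Optional

MODEL_ID = "IAAR-Shanghai/xVerify-7B-I"  # repository name taken from this page


def build_prompt(question: str, llm_output: str, reference: str) -> str:
    """Hypothetical prompt layout; not the official xVerify template."""
    return (
        "You are a grader for objective questions.\n"
        f"Question: {question}\n"
        f"Model output: {llm_output}\n"
        f"Correct answer: {reference}\n"
        "Judgement (Correct or Incorrect):"
    )


def parse_verdict(text: str) -> Optional[bool]:
    """Map generated text to True/False, or None when no label is recognised.

    The label strings are an assumption about the model's output format.
    """
    lowered = text.strip().lower()
    if lowered.startswith("incorrect"):
        return False
    if lowered.startswith("correct"):
        return True
    return None


def load_verifier():
    """Load tokenizer and model once; reuse them across calls."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    return tokenizer, model


def judge(tokenizer, model, question, llm_output, reference, max_new_tokens=16):
    """Return True/False/None for one (question, output, reference) triple."""
    prompt = build_prompt(question, llm_output, reference)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return parse_verdict(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Typical use would be `tokenizer, model = load_verifier()` followed by repeated calls to `judge(...)`. Greedy decoding (`do_sample=False`) is chosen here so that the same input always yields the same verdict.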
Key Capabilities
- Broad Applicability: Suitable for diverse objective question evaluation scenarios, including mathematical problems, multiple-choice questions, classification tasks, and short-answer questions.
- Handles Long Reasoning Chains: Capable of processing extensive reasoning steps to extract the final answer, regardless of the complexity of the intermediate steps.
- Multilingual Support: Primarily supports Chinese and English responses, with compatibility for other languages.
- Powerful Equivalence Judgment: Features robust capabilities for recognizing equivalence, including:
  - Basic transformations (e.g., letter case, Greek letter conversions).
  - Equivalent mathematical expressions (e.g., LaTeX, fractions, scientific notation).
  - Semantic equivalence in natural language answers.
  - Matching multiple-choice responses by content rather than just option identifiers.
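To make the equivalence categories above concrete, here is a small hand-written baseline in Python. It is not how xVerify works and does not come from the paper: the example pairs in `CASES` and both matching functions are illustrative assumptions. The point is only to show which kinds of pairs plain string comparison misses, and where simple rules stop helping.

```python
from fractions import Fraction

# Illustrative (predicted, reference) pairs; invented for this example.
CASES = [
    ("letter case", "Paris", "paris"),
    ("fraction vs decimal", "1/2", "0.5"),
    ("scientific notation", "1.5e3", "1500"),
    ("option content vs identifier", "B", "the mitochondria"),
]


def exact_match(predicted: str, reference: str) -> bool:
    """Strict comparison after trimming whitespace."""
    return predicted.strip() == reference.strip()


def rule_based_match(predicted: str, reference: str) -> bool:
    """Case-insensitive comparison with a numeric fallback."""
    a, b = predicted.strip().lower(), reference.strip().lower()
    if a == b:
        return True
    try:
        # Fraction parses "1/2", "0.5" and "1.5e3" into exact rationals.
        return Fraction(a) == Fraction(b)
    except (ValueError, ZeroDivisionError):
        return False


if __name__ == "__main__":
    for label, predicted, reference in CASES:
        print(label, exact_match(predicted, reference), rule_based_match(predicted, reference))
```

The last pair cannot be resolved by either function, because deciding whether option "B" means "the mitochondria" requires the question's option list. Cases like that, along with LaTeX and free-text paraphrases, are where a learned verifier is intended to help.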
Good For
- Automated evaluation of LLM outputs on objective tasks.
- Verifying correctness in educational or assessment systems.
- Applications requiring precise answer extraction and equivalence checking from complex text.
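For the automated-evaluation use case, a scoring loop might look like the sketch below. The record fields (`question`, `llm_output`, `reference`) and the `score` helper are hypothetical names chosen for this example. The judge is passed in as a plain callable, so any verifier that returns `True`, `False`, or `None` (undecided) can be plugged in.

```python
from typing import Callable, Dict, Iterable, Optional

Judge = Callable[[str, str, str], Optional[bool]]


def score(records: Iterable[Dict[str, str]], judge: Judge) -> Dict[str, float]:
    """Tally verdicts over records and report accuracy.

    Undecided verdicts (None) are counted separately and treated as
    not correct when computing accuracy.
    """
    correct = incorrect = undecided = 0
    for record in records:
        verdict = judge(record["question"], record["llm_output"], record["reference"])
        if verdict is True:
            correct += 1
        elif verdict is False:
            incorrect += 1
        else:
            undecided += 1
    total = correct + incorrect + undecided
    return {
        "total": total,
        "correct": correct,
        "incorrect": incorrect,
        "undecided": undecided,
        "accuracy": correct / total if total else 0.0,
    }


def stub_judge(question: str, llm_output: str, reference: str) -> Optional[bool]:
    """Stand-in judge for demonstration: checks if the reference appears in the output."""
    return reference in llm_output


if __name__ == "__main__":
    demo = [
        {"question": "2+2?", "llm_output": "Adding gives 4.", "reference": "4"},
        {"question": "Capital of France?", "llm_output": "It is Lyon.", "reference": "Paris"},
    ]
    print(score(demo, stub_judge))
```

Reporting undecided cases separately makes it easier to spot prompts where the verifier's output could not be parsed, instead of silently folding them into the error count.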