UNA-POLAR-10.7B-InstructMath-v2 Overview
UNA-POLAR-10.7B-InstructMath-v2 is a 10.7-billion-parameter instruction-tuned model developed by fblgit. It is derived from the UNA-SOLAR-10.7B-Instruct-1.0 model and is further optimized for mathematical tasks.
Key Capabilities & Differentiators
- Mathematical Reasoning: The model was fine-tuned with Direct Preference Optimization (DPO) on the MathPILE Books dataset, with the aim of improving its performance on mathematical tasks.
- Instruction Following: As an instruction-tuned model, it is designed to follow user instructions, particularly for math-related queries.
- Performance Benchmarks: On the Open LLM Leaderboard the model has an average score of 74.07. Individual scores include 70.73 on the AI2 Reasoning Challenge (ARC), 66.03 on MMLU, and 64.75 on GSM8k, the benchmark of grade-school math word problems that is most directly tied to mathematical problem solving.
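To make the DPO fine-tuning mentioned in the first bullet more concrete, here is a minimal sketch of the standard per-pair DPO loss in plain Python. It is not the author's training code. The function names, the argument names, and the `beta=0.1` default are illustrative assumptions, and this overview does not say how the preference pairs were built from MathPILE Books or which hyperparameters were used.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    # Branch on the sign so that math.exp never receives a large positive argument.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, not taken from the model card
) -> float:
    """DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    # The loss shrinks as the chosen response gains more than the rejected one.
    margin = beta * (chosen_reward - rejected_reward)
    return -log_sigmoid(margin)
```

When the policy and the reference assign identical log-probabilities, the margin is zero and the loss equals log 2. Raising the likelihood of the preferred response relative to the rejected one pushes the loss below that value.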
Ideal Use Cases
- Mathematical Problem Solving: Suited to applications that need the model to interpret and solve mathematical problems.
- Educational Tools: Can be integrated into tools for learning or tutoring in mathematics.
- Research & Development: Suitable for researchers exploring advanced mathematical reasoning in LLMs, especially those interested in the impact of DPO with specialized datasets like MathPILE.
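For the problem-solving and tutoring use cases above, an application usually has to compare the model's free-text answer with a known numeric result. The sketch below shows one simple way this might be done. The helper names `extract_final_number` and `is_correct` are hypothetical, and the "take the last number in the response" heuristic is an assumption made for illustration, not the scoring method used by the Open LLM Leaderboard.

```python
import re
from typing import Optional

# Optional sign, digits with optional thousands separators, optional decimals.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(text: str) -> Optional[float]:
    """Return the last number that appears in `text`, or None if there is none."""
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    # Drop thousands separators (and any trailing comma) before converting.
    return float(matches[-1].replace(",", ""))


def is_correct(model_response: str, reference: float, tol: float = 1e-6) -> bool:
    """Check whether the response's final number matches the reference answer."""
    predicted = extract_final_number(model_response)
    return predicted is not None and abs(predicted - reference) <= tol
```

A tutoring tool might call `is_correct(response_text, 18)` after generation and fall back to human review when it returns `False`. The heuristic misreads responses that end with an unrelated number, so a production system would likely ask the model for a fixed answer format instead.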