fblgit/UNA-POLAR-10.7B-InstructMath-v2

Warm
Public
10.7B
FP8
4096
1
Jan 2, 2024
License: cc-by-nc-nd-4.0
Hugging Face
Overview

UNA-POLAR-10.7B-InstructMath-v2 Overview

The UNA-POLAR-10.7B-InstructMath-v2 is a 10.7 billion parameter instruction-tuned model developed by fblgit. It is a specialized version derived from the UNA-SOLAR-10.7B-Instruct-1.0 base model, distinguished by its targeted optimization for mathematical tasks.

Key Capabilities & Differentiators

  • Mathematical Reasoning: The model has been fine-tuned using Direct Preference Optimization (DPO) over the extensive MathPILE Books dataset, specifically designed to improve its performance in mathematical contexts.
  • Instruction Following: As an instruct-tuned model, it is designed to follow user instructions effectively, particularly for math-related queries.
  • Performance Benchmarks: Evaluation on the Open LLM Leaderboard shows a competitive average score of 74.07. Notable scores include 70.73 on AI2 Reasoning Challenge, 66.03 on MMLU, and 64.75 on GSM8k, indicating its proficiency in reasoning and mathematical problem-solving.

Ideal Use Cases

  • Mathematical Problem Solving: Excellent for applications requiring the model to understand and solve mathematical problems.
  • Educational Tools: Can be integrated into tools for learning or tutoring in mathematics.
  • Research & Development: Suitable for researchers exploring advanced mathematical reasoning in LLMs, especially those interested in the impact of DPO with specialized datasets like MathPILE.