DART-Math-Mistral-7B-Prop2Diff Overview
This is a 7-billion-parameter model based on the Mistral architecture, developed by hkust-nlp and fine-tuned with the DART-Math (Difficulty-Aware Rejection Tuning) methodology. DART-Math targets a limitation of conventional rejection sampling when it is used to build mathematical training data: the resulting datasets are often biased towards easier problems. With the 'Prop2Diff' (proportional to difficulty) sampling strategy, the training data gives more challenging queries greater representation, which leads to improved performance on complex mathematical tasks.
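The idea of sampling "proportional to difficulty" can be illustrated with a small sketch. It assumes each query already has a difficulty score between 0 and 1 (for example, the fraction of sampled responses that were wrong), and it splits a fixed budget of training responses across queries in proportion to that score. The function name, the score definition, and the budget are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
def prop2diff_targets(difficulty, total_budget):
    """Split total_budget responses across queries in proportion to difficulty.

    difficulty: dict mapping a query id to a non-negative difficulty score
                (hypothetical; e.g. an estimated fail rate in [0, 1]).
    total_budget: total number of responses to allocate across all queries.
    """
    total = sum(difficulty.values())
    if total <= 0:
        raise ValueError("at least one query needs a positive difficulty score")
    # Harder queries receive a proportionally larger share of the budget.
    return {q: round(total_budget * d / total) for q, d in difficulty.items()}


scores = {"easy": 0.1, "medium": 0.3, "hard": 0.6}
print(prop2diff_targets(scores, 100))
# → {'easy': 10, 'medium': 30, 'hard': 60}
```

Because each share is rounded independently, the allocated counts may not sum exactly to the budget for arbitrary scores; a production pipeline would need to decide how to handle that remainder.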
Key Capabilities and Performance
- Enhanced Mathematical Reasoning: Achieves strong results on a range of mathematical benchmarks, including MATH, GSM8K, CollegeMath, DeepMind Mathematics, OlympiadBench-Math, and TheoremQA.
- Outperforms Baselines: Demonstrates superior or competitive performance compared to other models in its size class, such as Mistral-7B-MetaMath, on challenging out-of-domain math problems.
- Difficulty-Aware Training: Builds its training data with Difficulty-Aware Rejection Sampling (DARS), a dataset construction method that counters the severe bias towards easy queries found in standard mathematical datasets.
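To make the contrast with conventional rejection sampling concrete, the sketch below shows one way a difficulty-aware collection loop could look. Conventional rejection sampling typically draws a fixed number of responses per query and keeps the correct ones, so hard queries end up with few examples. Here, sampling instead continues until a per-query target of correct responses is reached or an attempt cap is hit. The `sample_fn` interface, the stub sampler, and the cap are all assumptions made for illustration and do not come from the DART-Math code.

```python
import random


def collect_correct(query, answer, target, sample_fn, max_attempts=1000):
    """Sample responses until `target` distinct correct ones are kept.

    sample_fn(query) is a hypothetical callable returning a
    (response_text, predicted_answer) pair from some generator model.
    Returns the kept responses and the number of attempts used.
    """
    kept = []
    attempts = 0
    while len(kept) < target and attempts < max_attempts:
        response, predicted = sample_fn(query)
        attempts += 1
        # Rejection step: discard wrong answers and exact duplicates.
        if predicted == answer and response not in kept:
            kept.append(response)
    return kept, attempts


def make_stub_sampler(answer, success_rate, seed=0):
    """Stand-in for a real model: correct with probability success_rate."""
    rng = random.Random(seed)
    counter = 0

    def sample(query):
        nonlocal counter
        counter += 1
        predicted = answer if rng.random() < success_rate else "wrong"
        return f"attempt {counter}: the answer is {predicted}", predicted

    return sample


# A hard query (low success rate) simply consumes more attempts
# before its target number of correct responses is reached.
kept, attempts = collect_correct("2 + 2 = ?", "4", 3, make_stub_sampler("4", 0.2))
print(len(kept), attempts)
```

The attempt cap matters in practice: without it, a query the generator can never solve would keep the loop running indefinitely.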
Training Details
- Base Model: Mistral-7B.
- Training Data: Synthetic datasets derived from MATH and GSM8K training sets, processed with Difficulty-Aware Rejection Sampling.
- Prompt Template: Uses the Alpaca prompt template.
- Max Sequence Length: 4096 tokens.
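Since the model expects Alpaca-style prompts, queries should be wrapped in that template before generation. The sketch below uses the widely circulated no-input Alpaca template; the exact wording used for this model should be checked against its official model card. The helper names are hypothetical, and treating 4096 tokens as a shared budget for prompt plus response is an assumption based on the training sequence length listed above.

```python
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

MAX_SEQ_LEN = 4096  # training max sequence length stated above


def build_prompt(question: str) -> str:
    """Wrap a math question in the Alpaca no-input template."""
    return ALPACA_TEMPLATE.format(instruction=question.strip())


def generation_budget(n_prompt_tokens: int, max_len: int = MAX_SEQ_LEN) -> int:
    """Tokens left for the response if prompt and response share max_len."""
    return max(0, max_len - n_prompt_tokens)


prompt = build_prompt("What is the sum of the first 10 positive integers?")
print(prompt)
print(generation_budget(n_prompt_tokens=120))
# → 3976
```

The token count passed to `generation_budget` would come from the model's own tokenizer; it is passed in as a plain integer here so that the sketch stays free of external dependencies.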
When to Use This Model
This model is well suited to applications that need robust mathematical problem solving across problems of varying difficulty. Because its training data deliberately emphasizes harder queries, it is a strong candidate for tasks where other models struggle with complex or less common mathematical problems.