hkust-nlp/dart-math-mistral-7b-prop2diff

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quantization: FP8 · Context Length: 4k · Published: Jun 5, 2024 · License: apache-2.0 · Architecture: Transformer

The hkust-nlp/dart-math-mistral-7b-prop2diff model is a 7 billion parameter Mistral-based language model developed by hkust-nlp and fine-tuned with DART-Math (Difficulty-Aware Rejection Tuning). The model is optimized for mathematical problem solving, with strong results on in-domain benchmarks and on challenging out-of-domain math benchmarks. Its 'Prop2Diff' sampling strategy counters the bias towards easy queries that plain rejection sampling introduces into synthetic training datasets, making it particularly effective on complex mathematical reasoning tasks.


DART-Math-Mistral-7B-Prop2Diff Overview

This model is a 7 billion parameter variant of the Mistral architecture, developed by hkust-nlp and fine-tuned with the DART-Math (Difficulty-Aware Rejection Tuning) methodology. DART-Math addresses a key limitation of vanilla rejection sampling for mathematical data: because easy queries are answered correctly far more often, the resulting synthetic datasets end up heavily biased towards easy problems. By allocating the synthesis budget with a 'Prop2Diff' (proportional to difficulty) strategy, this model is trained on data in which challenging queries are better represented, improving performance on complex mathematical tasks.
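
The allocation idea can be sketched in a few lines of Python. This is a minimal illustration, assuming difficulty is estimated as a sampler model's observed failure rate on each query; the function and variable names below are hypothetical, not taken from the DART-Math codebase.

```python
# Hypothetical sketch of Prop2Diff budget allocation: harder queries
# (higher observed failure rate) receive proportionally more of the
# total synthetic-response budget.

def prop2diff_budget(fail_rates, total_budget, min_per_query=1):
    """Return a per-query response count proportional to failure rate."""
    total_difficulty = sum(fail_rates)
    if total_difficulty == 0:
        return [min_per_query] * len(fail_rates)  # all queries trivially easy
    return [
        max(min_per_query, round(fr / total_difficulty * total_budget))
        for fr in fail_rates
    ]

# Three queries with failure rates 0.1, 0.5, 0.9 split a budget of
# 150 responses roughly as 10 / 50 / 90.
print(prop2diff_budget([0.1, 0.5, 0.9], total_budget=150))
```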

Key Capabilities and Performance

  • Enhanced Mathematical Reasoning: Achieves strong results on a range of mathematical benchmarks, including MATH, GSM8K, CollegeMath, DeepMind Mathematics, OlympiadBench-Math, and TheoremQA.
  • Outperforms Baselines: Demonstrates superior or competitive performance compared to other models in its size class, such as Mistral-7B-MetaMath, on challenging out-of-domain math problems.
  • Difficulty-Aware Training: Constructs its training data with DARS (Difficulty-Aware Rejection Sampling), which counters the severe bias towards easy queries found in standard rejection-sampled mathematical datasets; a minimal sketch of the rejection step follows this list.
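
The rejection step itself can be sketched as follows: candidate solutions are sampled per query, and only those whose final answer matches the reference are kept, until the difficulty-derived target count is reached. `generate_solution` and `extract_answer` are hypothetical stand-ins for the sampler model and answer parser, not names from the DART-Math codebase.

```python
# Minimal rejection-sampling sketch: keep only candidates whose final
# answer matches the reference, up to a per-query target count.

def collect_correct_responses(query, reference_answer, target_count,
                              generate_solution, extract_answer,
                              max_attempts=1024):
    kept = []
    attempts = 0
    while len(kept) < target_count and attempts < max_attempts:
        attempts += 1
        candidate = generate_solution(query)            # one sampled solution
        if extract_answer(candidate) == reference_answer:
            kept.append(candidate)                      # survives the filter
    return kept
```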

Training Details

  • Base Model: Mistral-7B.
  • Training Data: Synthetic datasets derived from MATH and GSM8K training sets, processed with Difficulty-Aware Rejection Sampling.
  • Prompt Template: Uses the Alpaca prompt template (see the inference sketch after this list).
  • Max Sequence Length: 4096 tokens.
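
Below is a minimal inference sketch, assuming the standard Alpaca instruction template and the Hugging Face transformers API; the example question and decoding settings are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hkust-nlp/dart-math-mistral-7b-prop2diff"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

question = "What is the smallest positive multiple of 7 greater than 50?"
# Standard Alpaca instruction template.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    f"### Instruction:\n{question}\n\n### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)  # well within the 4096-token context
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```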

When to Use This Model

This model is well-suited to applications requiring robust mathematical problem solving across a wide range of difficulty levels. Because its training data deliberately over-represents hard queries, it is a strong candidate for tasks where comparably sized models struggle with more complex or less common mathematical problems.