Name: abacusai/MetaMath-Bagel-DPO-34B API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: abacusai

abacusai/MetaMath-Bagel-DPO-34B: DPO Fine-tuned for Enhanced Reasoning

abacusai/MetaMath-Bagel-DPO-34B is a 34-billion-parameter language model built on the MetaMath SFT (supervised fine-tuning) model. It was further refined with Direct Preference Optimization (DPO) on the Truthy DPO dataset, which is designed to improve truthfulness and reasoning.

Key Capabilities & Performance

This model is tuned for tasks requiring logical reasoning and factual accuracy. Its reported evaluation results across a suite of benchmarks are:

Average Score (mean of the six benchmarks below): 75.54
MMLU (Massive Multitask Language Understanding): 76.46
GSM8K (Grade School Math 8K): 72.78
TruthfulQA: 67.58
ARC (AI2 Reasoning Challenge): 69.20
HellaSwag: 84.34
Winogrande: 82.87
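
As a quick sanity check, the average listed above matches the unweighted mean of the six individual scores. The sketch below assumes a plain arithmetic mean; the listing does not state how the average was computed, and the dictionary name is illustrative only.

```python
# Benchmark scores copied from the list above.
scores = {
    "MMLU": 76.46,
    "GSM8K": 72.78,
    "TruthfulQA": 67.58,
    "ARC": 69.20,
    "HellaSwag": 84.34,
    "Winogrande": 82.87,
}

# Assumption: the reported average is an unweighted mean of the six scores.
average = sum(scores.values()) / len(scores)
print(f"Average: {average:.2f}")  # Average: 75.54
```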

With a context length of 32,768 tokens, it can take long inputs in a single prompt, which suits detailed, multi-step problem-solving.
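
In practice, the context length acts as a budget. A minimal sketch of that arithmetic follows, assuming (as is typical for decoder-only models, though not stated in this listing) that prompt tokens and generated tokens share the same window. The helper name `max_completion_tokens` is hypothetical, and real token counts would have to come from the model's own tokenizer.

```python
CONTEXT_LENGTH = 32768  # tokens, as stated in the listing


def max_completion_tokens(prompt_tokens: int,
                          context_length: int = CONTEXT_LENGTH) -> int:
    """Hypothetical helper: tokens left for the completion.

    Assumes the prompt and the generated output share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Never return a negative budget when the prompt already overflows.
    return max(context_length - prompt_tokens, 0)


print(max_completion_tokens(30000))  # 2768
```

A prompt that already exceeds the window yields a budget of 0, which signals that the input must be shortened before sending.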

Good For

Applications requiring strong mathematical and logical reasoning.
Tasks where factual accuracy and truthfulness are critical.
Complex question answering and analytical workloads.
