Overview
argilla/distilabeled-Marcoro14-7B-slerp is a 7-billion-parameter language model developed by Argilla. It is a Direct Preference Optimization (DPO) fine-tune of the mlabonne/Marcoro14-7B-slerp base model. Fine-tuning used a custom-filtered version of the argilla/distilabel-intel-orca-dpo-pairs dataset, itself derived from the original Intel Orca DPO pairs but with additional quality filtering: ties removed, only pairs with chosen_score >= 8 kept, and examples overlapping the GSM8k training split excluded.
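The three filtering rules above can be sketched in plain Python. This is an illustrative reconstruction, not the actual preprocessing script: the field names (`status`, `chosen_score`, `in_gsm8k_train`) are assumptions about how the dataset labels ties, quality scores, and GSM8k overlap.

```python
# Illustrative sketch of the dataset filtering described above.
# Field names are assumptions, not confirmed from the actual dataset schema.
pairs = [
    {"status": "tie", "chosen_score": 9.0, "in_gsm8k_train": False},  # dropped: tie
    {"status": "ok", "chosen_score": 8.0, "in_gsm8k_train": False},   # kept
    {"status": "ok", "chosen_score": 5.0, "in_gsm8k_train": False},   # dropped: low score
    {"status": "ok", "chosen_score": 10.0, "in_gsm8k_train": True},   # dropped: GSM8k overlap
]

def keep(row):
    # Remove ties, keep only high-quality chosen responses (score >= 8),
    # and drop rows that overlap with the GSM8k training split.
    return (
        row["status"] != "tie"
        and row["chosen_score"] >= 8
        and not row["in_gsm8k_train"]
    )

filtered = [row for row in pairs if keep(row)]
```

With a Hugging Face `datasets.Dataset`, the same predicate could be passed directly to `.filter(keep)`.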
Key Capabilities & Performance
This model shows consistent improvements over its base model, Marcoro14-7B-slerp, across several benchmarks. For instance, on the Nous benchmark suite (popularized by Teknium and NousResearch) it achieved:
- AGIEval: 45.4 (vs. 44.66 for base)
- GPT4ALL: 76.47 (vs. 76.24 for base)
- TruthfulQA: 65.46 (vs. 64.15 for base)
- Bigbench: 47.19 (vs. 45.64 for base)
- Average: 58.63 (vs. 57.67 for base)
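As a quick sanity check, the reported average is the unweighted mean of the four task scores:

```python
# Verify that the reported Nous-suite average (58.63) is the unweighted
# mean of the four individual task scores listed above.
scores = {
    "AGIEval": 45.4,
    "GPT4ALL": 76.47,
    "TruthfulQA": 65.46,
    "Bigbench": 47.19,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # → 58.63
```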
On the Open LLM Leaderboard, the model scores an average of 73.63, including 65.22 on MMLU and 71.19 on GSM8k. Training was efficient: a single A100 80GB GPU for under an hour.
When to Use This Model
This model is suitable for applications that need a 7B-parameter model with improved reasoning and truthfulness, particularly where instruction-following quality is critical. Its DPO fine-tuning on a quality-filtered preference dataset makes it a strong candidate for general conversational agents and for tasks that benefit from better factual accuracy and alignment.