Name: allenai/llama-3-tulu-2-dpo-70b API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: allenai

Model Overview

allenai/llama-3-tulu-2-dpo-70b is a 70-billion-parameter language model from AllenAI, built on Meta's Llama 3 architecture. It is part of the Tulu series of models, which are designed to act as helpful assistants. Training proceeded in two stages: fine-tuning on a diverse mix of publicly available, synthetic, and human-created datasets, followed by alignment with Direct Preference Optimization (DPO) on the UltraFeedback dataset.
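
For readers unfamiliar with DPO, the following is a minimal sketch of the standard per-example DPO objective, not the training code used for this model. The function name, the argument names, and the default beta of 0.1 are assumptions made for illustration; the listing does not state the hyperparameters that were actually used.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss from summed sequence log-probabilities.

    beta is an assumed value here; the source does not give one.
    """
    # How much more the policy favors each response than the frozen reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(margin)), written as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

The loss shrinks as the policy assigns relatively more probability to the preferred response than the reference model does, which is the sense in which DPO pushes the model toward preferred outputs.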

Key Capabilities & Performance

The model is primarily English-centric and has been evaluated on benchmarks including MMLU, GSM8k, BBH, and HumanEval. It scores 0.754 on MMLU (5-shot), 0.860 on GSM8k (8-shot, chain-of-thought), and 0.878 on Codex HumanEval (Pass@10). The DPO phase on UltraFeedback is intended to make the model more likely to generate preferred responses.

Intended Uses & Limitations

Use Cases: Ideal for conversational AI, instruction following, and general assistant-like applications.
Input Format: For best generation quality, prompts should follow the format <|user|> Your message here! <|assistant|>.
Limitations: The model has not undergone extensive safety alignment (for example, in-the-loop filtering of responses) and may produce problematic outputs, especially when explicitly prompted to do so. The exact composition of the base Llama 3 training corpus is not fully disclosed.
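
The input format above can be sketched as a small helper. This is an illustrative sketch, not an official client: the function name build_prompt is hypothetical, and the newline placement (one after each tag and after the message) is an assumption based on the usual Tulu chat layout, since the listing shows the format on a single line.

```python
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"


def build_prompt(message: str) -> str:
    """Wrap a single user message in the Tulu-style chat format.

    Assumption: tags and message are separated by newlines, and the prompt
    ends with a newline after the assistant tag so generation starts cleanly.
    """
    text = message.strip()
    if not text:
        raise ValueError("message must not be empty")
    return f"{USER_TAG}\n{text}\n{ASSISTANT_TAG}\n"


if __name__ == "__main__":
    # Show the escaped string so the newline placement is visible.
    print(repr(build_prompt("Your message here!")))
```

The resulting string would be sent as the raw prompt to a text-completion endpoint. If the serving stack applies its own chat template, this wrapping should be skipped to avoid duplicating the tags.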
