Llama-3.1-Tulu-3-70B-DPO: An Advanced Instruction-Following Model
allenai/Llama-3.1-Tulu-3-70B-DPO is a 70 billion parameter instruction-following model from the Allen Institute for AI's Tülu3 family. It is fine-tuned from Meta's Llama 3.1 base model using Direct Preference Optimization (DPO) and is designed to provide state-of-the-art performance across a variety of tasks.
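To make the DPO step concrete, here is a minimal sketch of the Direct Preference Optimization objective for a single preference pair. This is an illustrative formula-level example, not the Tülu 3 training code (which is open-sourced by allenai); the function name and the beta value are assumptions for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a full response under
    the trainable policy or the frozen reference model; beta controls how
    far the policy may drift from the reference. (Illustrative sketch;
    the actual Tulu 3 recipe lives in the released allenai code.)
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)): small when the policy prefers the chosen
    # response more strongly than the reference model does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Minimizing this loss pushes the policy to assign relatively more probability to the chosen response than the reference does, without an explicit reward model.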
Key Capabilities & Features
- Instruction Following: Excels in understanding and executing complex instructions.
- Mathematical Reasoning: Demonstrates strong performance on benchmarks like MATH and GSM8K.
- Precise Instruction Following: Achieves competitive results on IFEval (Instruction-Following Evaluation), which tests adherence to verifiable instruction constraints.
- Open-Source Approach: Part of a family that provides fully open-source data, code, and recipes for post-training techniques, fostering transparency and research.
- Llama 3.1 Foundation: Built upon the robust Llama 3.1 architecture, benefiting from its extensive pre-training.
Performance Highlights
Across the reported 70B-scale benchmark suite, Tülu 3 DPO 70B achieves an average score of 75.9, with notable results including:
- PopQA (15 shot): 46.3
- TruthfulQA (6 shot): 67.9
- MATH (4 shot CoT, Flex): 62.3
- GSM8K (8 shot, CoT): 93.5
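The "n shot, CoT" settings above mean the model is shown n worked examples with reasoning before the test question. A generic sketch of how such a prompt is assembled follows; the exact formatting used in the Tülu 3 evaluations lives in the project's open-sourced evaluation code, so treat this function as a hypothetical illustration.

```python
def build_fewshot_cot_prompt(examples, question):
    """Assemble an n-shot chain-of-thought prompt (generic sketch, not
    the exact Tulu 3 evaluation format). Each example is a tuple of
    (question, reasoning, final_answer)."""
    parts = []
    for q, reasoning, answer in examples:
        parts.append(f"Question: {q}\nAnswer: {reasoning} The answer is {answer}.")
    # The test question ends with "Answer:" so the model continues with
    # its own chain of thought.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)
```

For GSM8K's 8-shot setting, `examples` would contain eight solved grade-school math problems with step-by-step reasoning.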
Usage Considerations
This model primarily supports English and is released under the Llama 3.1 Community License Agreement. It is intended for research and educational use and has received only limited safety training, so it may produce problematic outputs if specifically prompted to do so. The model's chat template is embedded in the tokenizer, so conversations can be formatted with tokenizer.apply_chat_template.
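For illustration, the conversation format can be approximated locally as below. The authoritative template is the one shipped inside the tokenizer and applied via tokenizer.apply_chat_template; the <|user|>/<|assistant|> markers here follow the published Tülu convention and should be treated as an assumption, not a substitute for the real template.

```python
def render_tulu_chat(messages, add_generation_prompt=True):
    """Approximate the Tulu chat format for illustration only.

    Assumption: mirrors the <|user|>/<|assistant|> convention used by
    Tulu models; in practice, call tokenizer.apply_chat_template so the
    tokenizer's embedded template is the source of truth.
    """
    rendered = ""
    for msg in messages:
        rendered += f"<|{msg['role']}|>\n{msg['content']}\n"
    if add_generation_prompt:
        # Cue the model to respond as the assistant.
        rendered += "<|assistant|>\n"
    return rendered

prompt = render_tulu_chat([{"role": "user", "content": "What is 2 + 2?"}])
```

In real use you would instead load the tokenizer with transformers' AutoTokenizer and pass the same messages list to tokenizer.apply_chat_template with add_generation_prompt=True.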