Llama-3.1-Tulu-3-70B-DPO: An Advanced Instruction-Following Model
allenai/Llama-3.1-Tulu-3-70B-DPO is a 70 billion parameter instruction-following model from the Allen Institute for AI's Tülu3 family. It is fine-tuned from Meta's Llama 3.1 base model using Direct Preference Optimization (DPO) and is designed to provide state-of-the-art performance across a variety of tasks.
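To make the DPO step concrete, here is a minimal sketch of the Direct Preference Optimization objective for a single preference pair. This is an illustrative formula-level example, not the Tülu 3 training code (which is open-sourced by allenai); the function name and the beta value are assumptions for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a full response under
    the trainable policy or the frozen reference model; beta controls how
    far the policy may drift from the reference. (Illustrative sketch;
    the actual Tulu 3 recipe lives in the released allenai code.)
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)): small when the policy prefers the chosen
    # response more strongly than the reference model does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Minimizing this loss pushes the policy to assign relatively more probability to the chosen response than the reference does, without an explicit reward model.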
Key Capabilities & Features
- Instruction Following: Excels in understanding and executing complex instructions.
- Mathematical Reasoning: Demonstrates strong performance on benchmarks like MATH and GSM8K.
- Precise Instruction Following: Achieves competitive results on IFEval (Instruction-Following Evaluation), which tests adherence to verifiable instruction constraints.
- Open-Source Approach: Part of a family that provides fully open-source data, code, and recipes for post-training techniques, fostering transparency and research.
- Llama 3.1 Foundation: Built upon the robust Llama 3.1 architecture, benefiting from its extensive pre-training.
Performance Highlights
Across the reported 70B-scale benchmark suite, Tülu 3 DPO 70B achieves an average score of 75.9, with notable results including:
- PopQA (15 shot): 46.3
- TruthfulQA (6 shot): 67.9
- MATH (4 shot CoT, Flex): 62.3
- GSM8K (8 shot, CoT): 93.5
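The "n shot, CoT" settings above mean the model is shown n worked examples with reasoning before the test question. A generic sketch of how such a prompt is assembled follows; the exact formatting used in the Tülu 3 evaluations lives in the project's open-sourced evaluation code, so treat this function as a hypothetical illustration.

```python
def build_fewshot_cot_prompt(examples, question):
    """Assemble an n-shot chain-of-thought prompt (generic sketch, not
    the exact Tulu 3 evaluation format). Each example is a tuple of
    (question, reasoning, final_answer)."""
    parts = []
    for q, reasoning, answer in examples:
        parts.append(f"Question: {q}\nAnswer: {reasoning} The answer is {answer}.")
    # The test question ends with "Answer:" so the model continues with
    # its own chain of thought.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)
```

For GSM8K's 8-shot setting, `examples` would contain eight solved grade-school math problems with step-by-step reasoning.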
Usage Considerations
This model primarily supports English and is released under the Llama 3.1 Community License Agreement. It is intended for research and educational use and has received only limited safety training, so it may produce problematic outputs if specifically prompted to do so. The model's chat template is embedded in the tokenizer, so conversations can be formatted with tokenizer.apply_chat_template.
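For illustration, the conversation format can be approximated locally as below. The authoritative template is the one shipped inside the tokenizer and applied via tokenizer.apply_chat_template; the <|user|>/<|assistant|> markers here follow the published Tülu convention and should be treated as an assumption, not a substitute for the real template.

```python
def render_tulu_chat(messages, add_generation_prompt=True):
    """Approximate the Tulu chat format for illustration only.

    Assumption: mirrors the <|user|>/<|assistant|> convention used by
    Tulu models; in practice, call tokenizer.apply_chat_template so the
    tokenizer's embedded template is the source of truth.
    """
    rendered = ""
    for msg in messages:
        rendered += f"<|{msg['role']}|>\n{msg['content']}\n"
    if add_generation_prompt:
        # Cue the model to respond as the assistant.
        rendered += "<|assistant|>\n"
    return rendered

prompt = render_tulu_chat([{"role": "user", "content": "What is 2 + 2?"}])
```

In real use you would instead load the tokenizer with transformers' AutoTokenizer and pass the same messages list to tokenizer.apply_chat_template with add_generation_prompt=True.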