The allenai/Llama-3.1-Tulu-3-70B-SFT is a 70 billion parameter instruction-following model developed by Allen Institute for AI, fine-tuned from Meta's Llama-3.1-70B base model. It is designed for state-of-the-art performance across diverse tasks, including mathematical reasoning (MATH, GSM8K) and instruction following (IFEval). This model offers fully open-source data, code, and recipes, serving as a guide for modern post-training techniques. It is primarily English-language and has a context length of 32768 tokens.
Loading preview...
Overview
allenai/Llama-3.1-Tulu-3-70B-SFT is a 70 billion parameter instruction-following model developed by the Allen Institute for AI. It is fine-tuned from the meta-llama/Llama-3.1-70B base model and is part of the Tülu3 family, which emphasizes fully open-source data, code, and recipes for post-training techniques. This model is designed to achieve strong performance across a variety of tasks, including complex reasoning and instruction adherence.
Key Capabilities & Performance
- Instruction Following: Excels in instruction-following tasks, as indicated by strong IFEval scores (82.1 for SFT, 83.2 for the final Tülu 3 70B model).
- Mathematical Reasoning: Demonstrates robust performance on mathematical benchmarks like MATH (53.7 for SFT) and GSM8K (91.1 for SFT).
- General Benchmarks: Achieves a competitive average score of 72.6 on a suite of benchmarks, with notable strengths in PopQA (48.6) and BigBenchHard (82.7).
- Safety: Shows high safety scores (94.4 for SFT) across 6 tasks, though it has limited safety training and can produce problematic outputs if prompted.
Training & Usage
The model is primarily English-language and was fine-tuned with a learning rate of 2E-6, an effective batch size of 128, and a maximum sequence length of 4096 over 2 epochs. It can be loaded using HuggingFace AutoModelForCausalLM or served with VLLM. The chat template follows a specific user/assistant format, and while a default system prompt is provided for demos, the model was not trained with a specific system prompt in mind.
License
This model is released under Meta's Llama 3.1 Community License Agreement and is intended for research and educational use.