allenai/Llama-3.1-Tulu-3-70B

Warm
Public
70B
FP8
32768
License: llama3.1
Hugging Face
Overview

Overview

Llama-3.1-Tulu-3-70B is a 70 billion parameter instruction-following model from AllenAI, built upon Meta's Llama 3.1 base model. It is part of the Tülu 3 family, which emphasizes fully open-source data, code, and training recipes. The model is primarily English-language and is licensed under the Llama 3.1 Community License Agreement.

Key Capabilities

  • Instruction Following: Designed for state-of-the-art performance across a diversity of tasks, including general chat.
  • Mathematical Reasoning: Shows strong performance on benchmarks like MATH and GSM8K.
  • Instruction Following Evaluation (IFEval): Excels in complex instruction following scenarios.
  • Open-Source Approach: Provides a comprehensive post-training package with open-source data, code, and recipes.

Performance Highlights

On a range of benchmarks, the Tülu 3 70B model achieves an average score of 76.0, outperforming Llama 3.1 70B Instruct (73.4) and Qwen 2.5 72B Instruct (71.5). Notable scores include:

  • PopQA (15 shot): 46.5
  • BigBenchHard (3 shot, CoT): 82.0
  • MATH (4 shot CoT, Flex): 63.0
  • GSM8K (8 shot, CoT): 93.5
  • Safety (6 task avg.): 88.3

Usage Considerations

The model has limited safety training and does not include in-the-loop filtering, meaning it can produce problematic outputs. It is intended for research and educational use, and its fine-tuning involved datasets with outputs from third-party models, subject to their respective terms of use.